NewSQL is a class of modern relational database management systems that seek to provide the same scalable performance as NoSQL systems for online transaction processing (read-write) workloads while still maintaining the ACID guarantees of a traditional single-node database system.[1][2][3]
History
The term was first used by 451 Group analyst Matthew Aslett in a 2011 research paper discussing the rise of new database systems as challengers to established vendors.[1] Many enterprise systems that handle high-profile data (e.g., financial and order processing systems) also need to scale but cannot use NoSQL solutions because they cannot give up strong transactional and consistency requirements.[1][4] Previously, the only options available to these organizations were to purchase a more powerful single-node machine or to develop custom middleware that distributes queries over traditional DBMS nodes. Both approaches are prohibitively expensive and therefore not an option for many. In this paper, Aslett discusses how NewSQL upstarts are poised to challenge the supremacy of commercial vendors, in particular Oracle.
Systems
Although NewSQL systems vary greatly in their internal architectures, the two distinguishing features common amongst them are that they all support the relational data model and use SQL as their primary interface.[5] One of the first known NewSQL systems is the H-Store parallel database system.[6][7]
NewSQL systems can be loosely grouped into three categories:[8][9]
New architectures
The first type of NewSQL system is a completely new database platform. Although these databases take different design approaches, two primary categories are emerging.
Distribute query fragments to data nodes
These are designed to operate in a distributed cluster of shared-nothing nodes, in which each node typically owns a subset of the data. SQL queries are split into query fragments and sent to the nodes that own the data. These databases are able to scale linearly as additional nodes are added.
- General purpose databases - These maintain the full functionality of traditional databases, handling all query types. They are complete rewrites built on the assumption of a distributed system and include components such as distributed concurrency control, flow control, and a distributed query processor. This category includes Google Spanner, Clustrix, and NuoDB.
- In-memory databases - The applications targeted by these NewSQL systems are characterized as having a large number of transactions that (1) are short-lived (i.e., no user stalls), (2) touch a small subset of data using index look-ups (i.e., no full table scans or large distributed joins), and (3) are repetitive (i.e., executing the same queries with different inputs).[10] These NewSQL systems achieve high performance and scalability by eschewing much of the legacy architecture of the original System R design, such as heavyweight recovery or concurrency control algorithms.[11] VoltDB is the primary database in this category.
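The fragment-distribution approach described above can be illustrated with a minimal Python sketch. It is not taken from any of the systems named here: the `DataNode` and `Coordinator` classes, the modulo partitioning on an integer `customer_id`, and the count-and-sum query are all assumptions chosen for illustration. Each node filters and partially aggregates only the rows it owns, and the coordinator merges the partial results.

```python
class DataNode:
    """Hypothetical shared-nothing node that owns a disjoint subset of rows."""

    def __init__(self, name):
        self.name = name
        self.rows = []

    def run_fragment(self, predicate):
        # Query fragment: filter and partially aggregate local rows only.
        matched = [r for r in self.rows if predicate(r)]
        return len(matched), sum(r["amount"] for r in matched)


class Coordinator:
    """Hypothetical coordinator that splits a query and merges the partials."""

    def __init__(self, nodes):
        self.nodes = nodes

    def node_for(self, customer_id):
        # Assumed partitioning scheme: integer key modulo the node count.
        return self.nodes[customer_id % len(self.nodes)]

    def insert(self, row):
        self.node_for(row["customer_id"]).rows.append(row)

    def count_and_sum(self, predicate):
        # Send the same fragment to every node, then combine the results.
        partials = [node.run_fragment(predicate) for node in self.nodes]
        return (sum(c for c, _ in partials), sum(s for _, s in partials))


cluster = Coordinator([DataNode(f"node{i}") for i in range(3)])
for cid in range(6):
    cluster.insert({"customer_id": cid, "amount": 10 * cid})

print(cluster.count_and_sum(lambda r: r["amount"] >= 20))  # (4, 140)
```

Because no node touches another node's rows, adding nodes spreads both the data and the fragment work, which is the intuition behind the linear scaling claim. A real system would also need distributed concurrency control, which this sketch leaves out.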
Pull data to node processing query
These database systems have a single primary source of data (possibly replicated). A set of nodes acts as transaction processing nodes, each pulling all the data required for a particular query to the node that receives the query. Some optimizations are performed to pull as little data as possible. Two nodes trying to write to the same data have to move that data between them. These systems provide high availability and some scalability, but they do not scale linearly under contentious loads or OLAP queries.
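The cost of contention in this design can be sketched in a few lines of Python. The `PrimaryStore` and `TransactionNode` classes below are hypothetical, and the sketch makes a simplifying assumption that every access takes exclusive ownership of a record. It does not model any particular product. It only shows that when two nodes alternate writes to the same record, each write forces the record to move.

```python
class PrimaryStore:
    """Hypothetical primary data source that tracks which node holds each record."""

    def __init__(self, records):
        self.records = dict(records)
        self.owner = {}      # key -> node currently holding the record
        self.transfers = 0   # number of times a record moved between nodes

    def acquire(self, key, node):
        prev = self.owner.get(key)
        if prev is not None and prev is not node:
            # Write the previous holder's copy back and invalidate its cache.
            self.records[key] = prev.cache.pop(key)
            self.transfers += 1
        self.owner[key] = node
        return self.records[key]


class TransactionNode:
    """Hypothetical processing node that pulls records into a local cache."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.cache = {}

    def read(self, keys):
        # Pull only the records that are not already cached locally.
        for key in keys:
            if key not in self.cache:
                self.cache[key] = self.store.acquire(key, self)
        return {key: self.cache[key] for key in keys}

    def add(self, key, delta):
        self.read([key])
        self.cache[key] += delta
        return self.cache[key]


store = PrimaryStore({"balance": 100})
a = TransactionNode("a", store)
b = TransactionNode("b", store)

a.add("balance", 5)   # a pulls the record from the primary store
b.add("balance", 1)   # the record moves from a to b
a.add("balance", 1)   # the record moves back from b to a
print(store.transfers)  # 2
```

Repeated writes by a single node are served from its cache, while alternating writers pay one transfer per write. This is one way to see why such systems do not scale linearly under contentious loads.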
Other notable systems include VMware's SQLFire.
MySQL engines
The second category consists of highly optimized storage engines for MySQL. These systems provide the same programming interface as MySQL but scale better than built-in engines such as InnoDB. Examples of these new storage engines include ScaleDB, TokuDB, MemSQL, and Akiban.[12]
Transparent sharding
These systems provide a sharding middleware layer that automatically splits databases across multiple nodes. Examples of this type of system include dbShards, Scalearc, and ScaleBase.
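As a rough illustration of what such a middleware layer does, the following Python sketch routes statements to one of several independent databases. It uses in-memory SQLite connections from the standard library as stand-in shards. The `ShardingMiddleware` class, the `orders` table, and the modulo routing on `customer_id` are assumptions made for this example and do not reflect the API of any product listed above.

```python
import sqlite3


class ShardingMiddleware:
    """Hypothetical routing layer that hides several shards behind one interface."""

    def __init__(self, shard_count):
        # Each shard is an independent database; in-memory SQLite stands in here.
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for db in self.shards:
            db.execute("CREATE TABLE orders (customer_id INTEGER, total INTEGER)")

    def _shard_for(self, customer_id):
        # Assumed sharding key: customer_id modulo the number of shards.
        return self.shards[customer_id % len(self.shards)]

    def insert_order(self, customer_id, total):
        self._shard_for(customer_id).execute(
            "INSERT INTO orders VALUES (?, ?)", (customer_id, total)
        )

    def orders_for(self, customer_id):
        # A query on the sharding key touches exactly one shard.
        cur = self._shard_for(customer_id).execute(
            "SELECT total FROM orders WHERE customer_id = ? ORDER BY total",
            (customer_id,),
        )
        return [row[0] for row in cur]

    def total_revenue(self):
        # A query without the sharding key must fan out to every shard.
        return sum(
            db.execute("SELECT COALESCE(SUM(total), 0) FROM orders").fetchone()[0]
            for db in self.shards
        )


mw = ShardingMiddleware(shard_count=2)
for customer_id, total in [(1, 10), (2, 20), (1, 5), (3, 7)]:
    mw.insert_order(customer_id, total)

print(mw.orders_for(1))    # [5, 10]
print(mw.total_revenue())  # 42
```

The application calls one object and never names a shard, which is the sense in which the sharding is transparent. Cross-shard transactions and rebalancing are not addressed by this sketch.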
See also
References
- ^ a b c Aslett, Matthew (2011). "How Will The Database Incumbents Respond To NoSQL And NewSQL?". 451 Group (published 2011-04-04). http://www.cs.brown.edu/courses/cs227/papers/newsql/aslett-newsql.pdf. Retrieved 2012-07-06.
- ^ Stonebraker, Michael (2011-06-16). "NewSQL: An Alternative to NoSQL and Old SQL for New OLTP Apps". Communications of the ACM. http://cacm.acm.org/blogs/blog-cacm/109710-new-sql-an-alternative-to-nosql-and-old-sql-for-new-oltp-apps/fulltext. Retrieved 2012-07-06.
- ^ Hoff, Todd (2012-09-24). "Google Spanner's Most Surprising Revelation: NoSQL is Out and NewSQL is In". http://highscalability.com/blog/2012/9/24/google-spanners-most-surprising-revelation-nosql-is-out-and.html. Retrieved 2012-10-07.
- ^ Lloyd, Alex (2012). "Building Spanner". Berlin Buzzwords (published 2012-06-05). http://berlinbuzzwords.de/sessions/keynote-0. Retrieved 2012-10-07.
- ^ Cattell, Rick (May 2011). "Scalable SQL and NoSQL data stores". SIGMOD Record (Association for Computing Machinery) 39 (4). Retrieved 2012-10-06.
- ^ Aslett, Matthew (2008). "Is H-Store the future of database management systems?" (published 2008-03-04). http://blogs.the451group.com/information_management/2008/03/04/is-h-store-the-future-of-database-management-systems/. Retrieved 2012-07-05.
- ^ Dignan, Larry (2008). "H-Store: Complete destruction of the old DBMS order?". http://www.zdnet.com/blog/btl/h-store-complete-destruction-of-the-old-dbms-order/8055. Retrieved 2012-07-05.
- ^ Venkatesh, Prasanna (2012). "NewSQL - The New Way to Handle Big Data" (published 2012-01-30). http://www.linuxforu.com/2012/01/newsql-handle-big-data/. Retrieved 2012-10-07.
- ^ Levari, Doron (2011). "The NewSQL Market Breakdown". http://www.scalebase.com/the-story-of-newsql/. Retrieved 2012-04-08.
- ^ Stonebraker, Mike; et al. (2007). "The end of an architectural era: (it's time for a complete rewrite)" (PDF). VLDB '07: Proceedings of the 33rd international conference on Very large data bases. Vienna, Austria. http://hstore.cs.brown.edu/papers/hstore-endofera.pdf.
- ^ Stonebraker, Michael (2011-06-16). "Ten Rules For Scalable Performance In 'Simple Operation' Datastores". Communications of the ACM. pp. 72–80. http://cacm.acm.org/magazines/2011/6/108651-10-rules-for-scalable-performance-in-simple-operation-datastores/fulltext. Retrieved 2012-10-07.
- ^ Darrow, Barb (2012). "Akiban goes wider with its cool NewSQL database". http://gigaom.com/cloud/akiban-goes-wider-with-its-cool-newsql-database/. Retrieved 2012-10-09.