H-Store

Developer(s): Brown, CMU, MIT, Yale
Stable release: June 2016 (June 3, 2016)
Written in: C++, Java
Operating system: Linux, Mac OS X
Type: Database management system
License: BSD License, GPL
Website: hstore.cs.brown.edu

H-Store is an experimental database management system (DBMS) designed for online transaction processing applications. It was developed in 2007 by a team at Brown University, Carnegie Mellon University, the Massachusetts Institute of Technology, and Yale University, [1] [2] including researchers Michael Stonebraker, Sam Madden, Andy Pavlo and Daniel Abadi. [3] [4] [5]

Architecture

H-Store was promoted as part of a new class of parallel database management systems, called NewSQL, [6] that provide the high throughput and high availability of NoSQL systems without giving up the transactional guarantees of a traditional DBMS, known as ACID (atomicity, consistency, isolation and durability). [7] Such systems operate across multiple machines rather than on a single, more powerful, more expensive machine. [8]

H-Store is able to execute transaction processing with high throughput by forgoing many features of traditional relational database management systems.

H-Store was designed as a parallel system to run on a cluster of shared-nothing, main memory executor nodes (processor + memory + storage). [9] The database is partitioned into disjoint subsets, each assigned to a single-threaded execution engine that runs on one core of one node. Each engine has exclusive access to all of the data in its partition. Because the engine is single-threaded, only one transaction at a time can access the data stored on that partition. The system includes no physical locks or latches, and once a transaction has started, it never stalls waiting for another transaction to complete. Throughput is increased by adding nodes to the system and reducing partition sizes. [10]
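
The partition-per-engine design described above can be illustrated with a small sketch. This is not H-Store's code or API: the class `PartitionedStore`, its methods, and the choice of hash partitioning are all hypothetical, and in-process threads stand in for the cores and nodes of a real cluster. The sketch only covers transactions that touch a single partition. Each partition owns a plain, unsynchronized map and exactly one thread, so transactions on the same partition run one after another without any locks.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Illustrative sketch only; names and structure are assumptions, not H-Store's API. */
public class PartitionedStore {
    /** One partition: private data plus the single thread allowed to touch it. */
    private static final class Partition {
        final Map<String, Long> data = new HashMap<>(); // plain map, no locks or latches
        final ExecutorService engine = Executors.newSingleThreadExecutor();
    }

    private final List<Partition> partitions = new ArrayList<>();

    public PartitionedStore(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new Partition());
        }
    }

    /** Hash partitioning (an assumption); any deterministic key-to-partition rule works. */
    public int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    /** Queues a single-partition transaction on the engine that owns the key. */
    public <T> CompletableFuture<T> submit(String key, Function<Map<String, Long>, T> txn) {
        Partition p = partitions.get(partitionFor(key));
        // The engine runs queued transactions serially, so txn sees a consistent map.
        return CompletableFuture.supplyAsync(() -> txn.apply(p.data), p.engine);
    }

    /** Stops all engine threads so the JVM can exit. */
    public void shutdown() {
        for (Partition p : partitions) {
            p.engine.shutdown();
        }
    }

    public static void main(String[] args) {
        PartitionedStore store = new PartitionedStore(4);
        List<CompletableFuture<Long>> pending = new ArrayList<>();
        // 1000 increments of the same key all land on one engine and run in order.
        for (int i = 0; i < 1000; i++) {
            pending.add(store.submit("acct-42", data -> data.merge("acct-42", 1L, Long::sum)));
        }
        pending.forEach(CompletableFuture::join);
        long balance = store.submit("acct-42", data -> data.get("acct-42")).join();
        System.out.println(balance); // prints 1000
        store.shutdown();
    }
}
```

Because every update to a key is queued on the one engine that owns it, the counter reaches 1000 without synchronization on the map. In this sketch, more partitions (and in a real deployment, more nodes) would let transactions on different keys run in parallel, which mirrors the scaling argument in the paragraph above.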

Licensing

H-Store was licensed under the BSD and GPL licenses. By 2009, the VoltDB company had developed a commercial version, and the H-Store research group shut down in 2016. [11]

References

  1. "H-Store - Next Generation OLTP DBMS Research". Retrieved 2011-08-07.
  2. Van Couvering, David (2008-02-18). "Stonebraker's H-Store: There's something happenin' here" (published 2011-03-11). Retrieved 2012-07-18.
  3. Stonebraker, Mike; et al. (2007). "The end of an architectural era: (it's time for a complete rewrite)" (PDF). VLDB '07: Proceedings of the 33rd international conference on Very large data bases. Vienna, Austria.
  4. Kallman, Robert; Kimura, Hideaki; Natkins, Jonathan; Pavlo, Andrew; Rasin, Alexander; Zdonik, Stanley; Jones, Evan P. C.; Madden, Samuel; Stonebraker, Michael; Zhang, Yang; Hugg, John; Abadi, Daniel J. (2008). "H-Store: a high-performance, distributed main memory transaction processing system" (PDF). Proc. VLDB Endow. 1 (2): 1496–1499. doi:10.14778/1454159.1454211. ISSN 2150-8097.
  5. Monash, Curt (2008). "Mike Stonebraker calls for the complete destruction of the old DBMS order" (published 2008-02-18). Retrieved 2012-07-18.
  6. Aslett, Matthew (2010). "How Will The Database Incumbents Respond To NoSQL And NewSQL?" (PDF). 451 Group (published 2011-04-04). Archived from the original (PDF) on 2012-01-27. Retrieved 2012-07-06.
  7. Thomas, Nigel (2008-03-01). "H-Store - a new architectural era, or just a toy?". Retrieved 2012-07-05.
  8. Aslett, Matthew (2008-03-04). "Is H-Store the future of database management systems?". Archived from the original on 2012-05-06. Retrieved 2012-07-05.
  9. "H-Store - Architecture Overview". Retrieved 2011-08-07.
  10. Dignan, Larry (2008). "H-Store: Complete destruction of the old DBMS order?". ZDNet. Retrieved 2012-07-05.
  11. Monash, Curt (2009). "H-Store is now VoltDB". Retrieved 2011-07-14.