Developer(s) | Brown, CMU, MIT, Yale |
---|---|
Stable release | June 3, 2016 |
Repository | |
Written in | C++, Java |
Operating system | Linux, Mac OS X |
Type | Database Management System |
License | BSD License, GPL |
Website | hstore |
H-Store is an experimental database management system (DBMS) designed for online transaction processing applications. It was developed in 2007 by a team at Brown University, Carnegie Mellon University, the Massachusetts Institute of Technology, and Yale University, [1] [2] including researchers Michael Stonebraker, Sam Madden, Andy Pavlo and Daniel Abadi. [3] [4] [5]
H-Store was promoted as an example of a new class of parallel database management systems, called NewSQL, [6] that provide the high throughput and high availability of NoSQL systems without giving up the transactional guarantees of a traditional DBMS, known as ACID (atomicity, consistency, isolation and durability). [7] Such systems operate across multiple machines, as opposed to a single, more powerful, more expensive machine. [8]
H-Store is able to execute transaction processing with high throughput by forgoing many features of traditional relational database management systems.
H-Store was designed as a parallel system to run on a cluster of shared-nothing, main-memory executor nodes (processor + memory + storage). [9] The database is partitioned into disjoint subsets, each assigned to a single-threaded execution engine that runs on one core of one node. Each engine has exclusive access to all of the data in its partition. Because the engine is single-threaded, only one transaction at a time can access the data stored on that partition. The system uses no physical locks or latches, and once a transaction has started, it cannot stall waiting for another transaction to complete. Throughput is raised by adding nodes to the system and reducing partition sizes. [10]
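The partition-per-engine design described above can be illustrated with a minimal Python sketch. This is not H-Store's code or API: the names `PartitionEngine`, `Cluster`, and `deposit`, the use of a CRC32 hash for routing, and the in-process threads standing in for cluster nodes are all assumptions made for illustration. It shows only the core idea that each partition's data is touched by exactly one thread, so transactions on a partition run serially without locks.

```python
import queue
import threading
import zlib
from concurrent.futures import Future


class PartitionEngine:
    """Hypothetical single-threaded engine owning one disjoint slice of data."""

    def __init__(self, partition_id):
        self.partition_id = partition_id
        self.data = {}               # touched only by this engine's thread
        self._inbox = queue.Queue()  # pending transactions, run in FIFO order
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._inbox.get()
            if item is None:         # shutdown sentinel
                return
            procedure, args, result = item
            try:
                # Each transaction runs to completion before the next one
                # starts, so no lock or latch guards self.data.
                result.set_result(procedure(self.data, *args))
            except Exception as exc:
                result.set_exception(exc)

    def submit(self, procedure, *args):
        result = Future()
        self._inbox.put((procedure, args, result))
        return result

    def stop(self):
        self._inbox.put(None)
        self._thread.join()


class Cluster:
    """Routes each single-key transaction to the engine that owns the key."""

    def __init__(self, num_partitions):
        self.engines = [PartitionEngine(i) for i in range(num_partitions)]

    def partition_for(self, key):
        # Stable hash (an assumption of this sketch) so that a key always
        # maps to the same partition.
        return zlib.crc32(str(key).encode()) % len(self.engines)

    def execute(self, key, procedure, *args):
        engine = self.engines[self.partition_for(key)]
        return engine.submit(procedure, key, *args)

    def stop(self):
        for engine in self.engines:
            engine.stop()


def deposit(data, account, amount):
    """Example transaction: read-modify-write on a single partition."""
    data[account] = data.get(account, 0) + amount
    return data[account]


if __name__ == "__main__":
    cluster = Cluster(num_partitions=4)
    pending = [cluster.execute("alice", deposit, 10) for _ in range(100)]
    print("final balance:", pending[-1].result(timeout=5))
    cluster.stop()
```

The sketch covers only transactions that touch a single partition. Transactions spanning several partitions, durability, and replication are outside its scope.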
H-Store was licensed under the BSD and GPL licenses. By 2009, the company VoltDB had developed a commercial version, and the H-Store research group shut down in 2016. [11]
In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data. The DBMS additionally encompasses the core facilities provided to administer the database. The sum total of the database, the DBMS and the associated applications can be referred to as a database system. Often the term "database" is also used loosely to refer to any of the DBMS, the database system or an application associated with the database.
Ingres Database is a proprietary SQL relational database management system intended to support large commercial and government applications.
Data engineering refers to the building of systems to enable the collection and usage of data. This data is usually used to enable subsequent analysis and data science, which often involves machine learning. Making the data usable usually involves substantial compute and storage, as well as data processing.
Online transaction processing (OLTP) is a type of database system used in transaction-oriented applications, such as many operational systems. "Online" refers to the fact that such systems are expected to respond to user requests and process them in real-time. The term is contrasted with online analytical processing (OLAP) which instead focuses on data analysis.
C-Store is a column-oriented database management system (DBMS) developed by a team at Brown University, Brandeis University, Massachusetts Institute of Technology and the University of Massachusetts Boston, including Michael Stonebraker, Stanley Zdonik, and Samuel Madden. The last release of the original code was in 2006; Vertica, a commercial fork, lives on.
Replication in computing involves sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.
SAP IQ is a column-based, petabyte scale, relational database software system used for business intelligence, data warehousing, and data marts. Produced by Sybase Inc., now an SAP company, its primary function is to analyze large amounts of data in a low-cost, highly available environment. SAP IQ is often credited with pioneering the commercialization of column-store technology.
Exasol is an analytics database management software company. Its product, also called Exasol, is an in-memory, column-oriented, relational database management system.
Michael Ralph Stonebraker is an American computer scientist specializing in database systems. Through a series of academic prototypes and commercial startups, Stonebraker's research and products are central to many relational databases. He is also the founder of many database companies, including Ingres Corporation, Illustra, Paradigm4, StreamBase Systems, Tamr, Vertica and VoltDB, and served as chief technical officer of Informix. For his contributions to database research, Stonebraker received the 2014 Turing Award, often described as "the Nobel Prize for computing."
Vertica is an analytic database management software company. Vertica was founded in 2005 by the database researcher Michael Stonebraker with Andrew Palmer as the founding CEO. Ralph Breslauer and Christopher P. Lynch served as CEOs later on.
In database theory, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that any distributed data store can provide only two of the following three guarantees: consistency, availability, and partition tolerance.
Volt Active Data is an in-memory database designed by Michael Stonebraker, Sam Madden, and Daniel Abadi.
SingleStore is a proprietary, cloud-native database designed for data-intensive applications. A distributed, relational, SQL database management system (RDBMS) that features ANSI SQL support, it is known for speed in data ingest, transaction processing, and query processing.
Oracle NoSQL Database is a NoSQL-type distributed key-value database from Oracle Corporation. It provides transactional semantics for data manipulation, horizontal scalability, and simple administration and monitoring.
NewSQL is a class of relational database management systems that seek to provide the scalability of NoSQL systems for online transaction processing (OLTP) workloads while maintaining the ACID guarantees of a traditional database system.
The Yahoo! Cloud Serving Benchmark (YCSB) is an open-source specification and program suite for evaluating retrieval and maintenance capabilities of computer programs. It is often used to compare the relative performance of NoSQL database management systems.
Azure Cosmos DB is a globally distributed, multi-model database service offered by Microsoft. It is designed to provide high availability, scalability, and low-latency access to data for modern applications. Unlike traditional relational databases, Cosmos DB is a NoSQL and vector database, which means it can handle unstructured, semi-structured, structured, and vector data types.
In database theory, the PACELC theorem is an extension to the CAP theorem. It states that in case of network partitioning (P) in a distributed computer system, one has to choose between availability (A) and consistency (C), but else (E), even when the system is running normally in the absence of partitions, one has to choose between latency (L) and loss of consistency (C).
Database scalability is the ability of a database to handle changing demands by adding or removing resources. Databases use a host of techniques to cope. According to Marc Brooker: "a system is scalable in the range where marginal cost of additional workload is nearly constant." Serverless technologies fit this definition, but the total cost of ownership must be considered, not just the infrastructure cost.
Daniel Abadi is the Darnell-Kanal Professor of Computer Science at University of Maryland, College Park. His primary area of research is database systems, with contributions to stream databases, distributed databases, graph databases, and column-store databases. He helped create C-Store, a column-oriented database, and HadoopDB, a hybrid of relational databases and Hadoop. Both database systems were commercialized by companies.