VoltDB

Volt Active Data
Developer(s) Volt Active Data
Initial release May 25, 2010
Stable release 11.3 / April 21, 2022
Written in Java, C++
Operating system Linux, macOS
Platform Java
Type RDBMS
License GNU Affero General Public License v3, VoltDB Proprietary License
Website voltactivedata.com

Volt Active Data (formerly VoltDB) is an in-memory database designed by Michael Stonebraker, Sam Madden, and Daniel Abadi.

It is an ACID-compliant RDBMS that uses a shared-nothing architecture, and is derived from work done by Stonebraker on OLTP system performance [1] and optimization. [2]

It is available in both enterprise and community editions. The community edition is licensed under the GNU Affero General Public License.

Architecture

VoltDB is a NewSQL OLTP [3] relational database that supports SQL access from within pre-compiled Java stored procedures.

While direct SQL access is supported, [4] the most efficient form of interaction is using stored procedure calls, [5] as it involves fewer network trips. Stored procedures are written in Java by extending a class called ‘VoltProcedure’ and implementing a ‘run()’ method that includes both SQL statements and supporting Java logic. Internally, data is managed by a C++ core to avoid garbage collection issues. [6]

VoltDB relies on horizontal partitioning down to the individual hardware thread to scale, k-safety (synchronous replication) to provide high availability, and a combination of continuous snapshots and command logging for durability (crash recovery).
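The k-safety idea mentioned above can be illustrated with a small, self-contained sketch. It assumes that a k-safety value of k means each partition is kept in k + 1 synchronous copies, so data stays available while at most k copies are lost. The class and method names (`KSafetySketch`, `failReplica`) are hypothetical and are not part of any VoltDB API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch only: one partition kept in k + 1 synchronous copies.
 * Class and method names are hypothetical, not part of any VoltDB API.
 */
final class KSafetySketch {
    private final List<Map<String, String>> replicas = new ArrayList<>();
    private final boolean[] alive;

    KSafetySketch(int k) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be >= 0");
        }
        alive = new boolean[k + 1];
        for (int i = 0; i <= k; i++) {
            replicas.add(new HashMap<>());
            alive[i] = true;
        }
    }

    /** Synchronous replication: the write reaches every live copy before returning. */
    void put(String key, String value) {
        requireAvailable();
        for (int i = 0; i < alive.length; i++) {
            if (alive[i]) {
                replicas.get(i).put(key, value);
            }
        }
    }

    /** Reads from the first surviving copy. */
    String get(String key) {
        requireAvailable();
        for (int i = 0; i < alive.length; i++) {
            if (alive[i]) {
                return replicas.get(i).get(key);
            }
        }
        throw new AssertionError("unreachable: requireAvailable() passed");
    }

    /** Simulates the loss of one copy. */
    void failReplica(int index) {
        alive[index] = false;
    }

    boolean isAvailable() {
        for (boolean a : alive) {
            if (a) {
                return true;
            }
        }
        return false;
    }

    private void requireAvailable() {
        if (!isAvailable()) {
            throw new IllegalStateException("all copies of the partition are lost");
        }
    }

    public static void main(String[] args) {
        KSafetySketch partition = new KSafetySketch(1); // k = 1: two copies
        partition.put("order-42", "shipped");
        partition.failReplica(0);                       // lose one copy
        System.out.println(partition.get("order-42"));  // value is still readable
        partition.failReplica(1);                       // lose the last copy
        System.out.println(partition.isAvailable());
    }
}
```

The sketch ignores rejoining failed copies and network partitions, both of which a real cluster has to handle.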

VoltDB is based on H-Store. It uses a shared-nothing architecture to scale. Data and the processing associated with it are distributed across the CPU cores within the servers composing a single VoltDB cluster. By extending its shared-nothing foundation to the per-core level, VoltDB scales with the increasing core-per-CPU counts on multi-core servers.
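As a rough illustration of per-core horizontal partitioning, the sketch below maps a partition-key value to one of `hosts * partitionsPerHost` partitions. The modulo-of-hash routing and the name `PartitionRouterSketch` are assumptions chosen for brevity; the article does not describe the hashing scheme VoltDB actually uses.

```java
import java.util.List;

/**
 * Illustrative sketch only: routes rows to partitions by hashing a partition key.
 * The hash function and all names are assumptions, not VoltDB's actual scheme.
 */
final class PartitionRouterSketch {
    private final int partitionCount;

    PartitionRouterSketch(int hosts, int partitionsPerHost) {
        if (hosts <= 0 || partitionsPerHost <= 0) {
            throw new IllegalArgumentException("hosts and partitionsPerHost must be positive");
        }
        // One partition per hardware thread, summed over all servers in the cluster.
        this.partitionCount = hosts * partitionsPerHost;
    }

    int partitionCount() {
        return partitionCount;
    }

    /** Maps a partition-key value to a partition index in [0, partitionCount). */
    int partitionFor(Object key) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        PartitionRouterSketch router = new PartitionRouterSketch(3, 8);
        for (String customerId : List.of("alice", "bob", "carol")) {
            System.out.println(customerId + " -> partition " + router.partitionFor(customerId));
        }
    }
}
```

Because the same key always maps to the same partition, all rows and procedure calls for that key can be handled by a single core without coordinating with the others.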

By making stored procedures the unit of transaction and executing them at the partition containing the necessary data, it is possible to eliminate round trip messaging between SQL statements. Stored procedures are executed serially and to completion in a single thread without locking or latching, similar to the LMAX architecture. [7] Because data is in memory and local to the partition, a stored procedure can execute in microseconds. VoltDB's stored procedure initiation scheme allows all nodes to initiate stored procedures while avoiding a single serializable global order. [8]
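The serial execution model described above can be sketched with a single-threaded executor per partition. This is a simplified stand-in, not VoltDB code: `SerialPartitionSketch`, `call`, and `callAndWait` are hypothetical names, and a plain `HashMap` plays the role of the partition's in-memory data. Since only one worker thread ever touches the map, no locks or latches are required.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/**
 * Illustrative sketch only: a partition that runs "procedures" one at a time
 * on a single thread. All names are hypothetical, not part of any VoltDB API.
 */
final class SerialPartitionSketch implements AutoCloseable {
    // Accessed only by the partition's single worker thread, so no locking is needed.
    private final Map<String, Long> data = new HashMap<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues a procedure; it runs to completion before the next one starts. */
    <T> Future<T> call(Function<Map<String, Long>, T> procedure) {
        Callable<T> task = () -> procedure.apply(data);
        return worker.submit(task);
    }

    /** Convenience wrapper: runs a procedure and blocks until its result is ready. */
    <T> T callAndWait(Function<Map<String, Long>, T> procedure) {
        try {
            return call(procedure).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    @Override
    public void close() {
        worker.shutdown();
    }

    public static void main(String[] args) {
        try (SerialPartitionSketch partition = new SerialPartitionSketch()) {
            // 1000 increments are queued; the worker applies them strictly in order.
            for (int i = 0; i < 1000; i++) {
                partition.call(rows -> rows.merge("acct-1", 1L, Long::sum));
            }
            long balance = partition.callAndWait(rows -> rows.get("acct-1"));
            System.out.println("balance = " + balance); // prints "balance = 1000"
        }
    }
}
```

The trade-off is that a slow procedure blocks everything queued behind it on that partition, which is why this model suits short, in-memory transactions.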

VoltDB is ACID compliant. Data is written to durable storage. Durability is ensured by continuous snapshots; asynchronous command logging, which creates both snapshots and a log of transactions between snapshots; and synchronous command logging, which logs transactions after the transaction completes and before it is committed to the database. This ensures that no transactions are committed that are not logged and that no transactions are lost.
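The interplay of snapshots and a command log can be sketched as follows. This is a deliberately simplified in-memory model under stated assumptions: an `ArrayList` stands in for a durable log file, the command is recorded before `execute` returns (a stand-in for "logged before committed"), and the names `CommandLogSketch`, `takeSnapshot`, and `recover` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch only: recovery from a snapshot plus a command log.
 * In-memory collections stand in for durable storage; names are hypothetical.
 */
final class CommandLogSketch {
    /** One logged invocation: add delta to the value stored under key. */
    private static final class Command {
        final String key;
        final long delta;

        Command(String key, long delta) {
            this.key = key;
            this.delta = delta;
        }
    }

    private final Map<String, Long> state = new HashMap<>();
    private Map<String, Long> snapshot = new HashMap<>();
    private final List<Command> logSinceSnapshot = new ArrayList<>();

    /** Records the command and applies it before returning to the caller. */
    void execute(String key, long delta) {
        Command c = new Command(key, delta);
        logSinceSnapshot.add(c); // stand-in for a durable append to the command log
        apply(state, c);
    }

    /** Captures the full state and truncates the log entries it makes redundant. */
    void takeSnapshot() {
        snapshot = new HashMap<>(state);
        logSinceSnapshot.clear();
    }

    /** Rebuilds state after a simulated crash: load the snapshot, then replay the log. */
    Map<String, Long> recover() {
        Map<String, Long> recovered = new HashMap<>(snapshot);
        for (Command c : logSinceSnapshot) {
            apply(recovered, c);
        }
        return recovered;
    }

    int pendingLogSize() {
        return logSinceSnapshot.size();
    }

    private static void apply(Map<String, Long> target, Command c) {
        target.merge(c.key, c.delta, Long::sum);
    }

    public static void main(String[] args) {
        CommandLogSketch db = new CommandLogSketch();
        db.execute("a", 5);
        db.takeSnapshot();   // snapshot holds a = 5, log is now empty
        db.execute("a", 3);  // logged after the snapshot
        db.execute("b", 7);  // logged after the snapshot
        System.out.println(db.recover());
    }
}
```

Replaying only the commands logged since the last snapshot is what keeps recovery time bounded: the snapshot supplies the bulk of the data and the log supplies the remainder.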

History

VoltDB v5.0 introduced a database monitoring and management tool, the VoltDB Management Center (VMC for short). VMC provides browser-based one-stop monitoring and configuration management of the VoltDB database, including graphs for cluster throughput and latency as well as CPU and memory usage for the current server.

VoltDB version 5.1, released in March 2015, introduced database replication (DR) functionality, removing any single point of failure. DR provides simultaneous, parallel replication of multiple partitions and binary logs of transaction results, saving the replica from having to replay the transaction.

V6.0 [9] introduced geospatial datatypes.

V6.1 [10] added streams, which can be inserted into, with support for aggregation in materialized views on the streaming data.

V6.6 [11] added support for XDCR between clusters running mixed versions of Volt and having different sizes and configurations.

V7.1 [12], released in March 2017, introduced support for TLS encryption for client networking.

V7.5 [13], released July 28, 2017, introduced the kafkaloader, which consumes streaming events from Kafka directly into a database table or into a stored procedure for processing.

V7.6 [14] (August 28, 2017) introduced User-Defined SQL Functions, allowing customers to write custom functions in Java and make them callable from a SQL statement. V8.0 [15] (February 6, 2018) introduced TLS encryption for networking between clusters using DR and XDCR and for intra-cluster communication.

V8.2 [16] (July 12, 2018) introduced the TTL feature, which allows applications to define a "time to live" on a timestamp column in a table. Once the expiration time is reached, an internal process removes the expired records from the database.

V8.4 [17] (December 29, 2018) introduced Long-Term Supported (LTS) versions to Volt customers. An LTS version extends support from one year to three years, allowing customers to stay on a version and receive critical updates for stability, security and correctness. V9.0 [18] (April 11, 2019) introduced new streaming functionality, including migration of data to a stream upon expiration and change-data capture.

V9.3 [19] (May 1, 2020), a Long-Term Supported (LTS) release, introduced Scheduled Tasks, a way to automate repetitive tasks and procedure calls from within Volt. Scheduled Tasks has an easy-to-use interface for calling pre-defined procedures and can also be fully customized in Java to create more complex schedules.

V10.0 [20] (August 2020) introduced a Volt Operator for Kubernetes and Helm charts, offering a complete solution for running VoltDB databases in a Kubernetes cloud environment. In addition, V10.0 provided a Prometheus agent for collection and graphing of metrics.

V10.2 [21] (January 2021) introduced VoltDB Topics, which combine VoltDB's existing import and export capabilities with the flexibility of Kafka-like streams. Topics allow both inbound and outbound streaming to multiple client producers and consumers, and they allow intelligent processing and manipulation of the data as it passes through the pipeline. V10.2 is an LTS release.

V11.0 [22] (April 21, 2022) introduced connectivity to Datadog, support for Java 17, compatibility with Kubernetes 22.0, and priority transactions.

In February 2022 the product was renamed to "Volt Active Data". [23]

References

  1. "OLTP Through the Looking Glass, and What We Found There" (PDF). nms.csail.mit.edu.
  2. "The End of an Architectural Era (It's Time for a Complete Rewrite)" (PDF). nms.csail.mit.edu.
  3. "High Performance RDBMS for Fast Data Applications Requiring Smart Streaming with Transactions" (PDF). voltdb.com.
  4. "JDBC Interface". voltdb.com.
  5. "Designing Stored Procedures to Access the Database". voltdb.com.
  6. "Debunking Myths About the VoltDB In-Memory Database - DZone Java". dzone.com. Retrieved 2020-11-13.
  7. "The LMAX Architecture". martinfowler.com. Retrieved 2019-04-07.
  8. "DB Developer Central". VoltDB. Retrieved 2019-04-07.
  9. "VoltDB 6 Release Notes". voltactivedata.com.
  10. "VoltDB 6 Release Notes". voltactivedata.com.
  11. "VoltDB 6 Release Notes". voltactivedata.com.
  12. "VoltDB 7 Release Notes". voltactivedata.com.
  13. "VoltDB 7 Release Notes". voltactivedata.com.
  14. "VoltDB 7 Release Notes". voltactivedata.com.
  15. "VoltDB 8 Release Notes". voltactivedata.com.
  16. "VoltDB 8 Release Notes". voltactivedata.com.
  17. "VoltDB 8 Release Notes". voltactivedata.com.
  18. "VoltDB 9 Release Notes". voltactivedata.com.
  19. "VoltDB 9 Release Notes". voltactivedata.com.
  20. "VoltDB 10 Release Notes". voltactivedata.com.
  21. "VoltDB 10 Release Notes". voltactivedata.com.
  22. "VoltDB 10 Release Notes". voltactivedata.com.
  23. "As of today, we are Volt Active Data". voltactivedata.com.