H2 (database)

Last updated
H2 Database Engine
Initial releaseDecember 2005;18 years ago (2005-12)
Stable release
2.2.220 / July 4, 2023;9 months ago (2023-07-04)
Repository
Written in Java
Operating system Cross-platform
Type Relational Database Management System
License Eclipse Public License or Mozilla Public License 2.0
Website h2database.com

H2 is a relational database management system written in Java. It can be embedded in Java applications or run in client-server mode. [1]

Contents

The software is available as open source software Mozilla Public License 2.0 or the original Eclipse Public License.[ citation needed ]

History

The development of the H2 database engine started in May 2004, and first published in December 2005. The database engine was written by Thomas Mueller. He also developed the Java database engine Hypersonic SQL. [2] In 2001, the Hypersonic SQL project was stopped, and the HSQLDB Group was formed to continue work on the Hypersonic SQL code. The name H2 stands for Hypersonic 2, however H2 does not share code with Hypersonic SQL or HSQLDB. H2 is built from scratch. [3]

Main features

Use of SQL

A subset of the SQL (Structured Query Language) standard is supported. The main programming APIs are SQL and JDBC, however the database also supports using the PostgreSQL ODBC driver by acting like a PostgreSQL server. [4]

Table types

It is possible to create both in-memory tables, as well as disk-based tables. Tables can be persistent or temporary. Index types are hash table and tree for in-memory tables, and b-tree for disk-based tables. All data manipulation operations are transactional. Table level locking and multiversion concurrency control are implemented. The 2-phase commit protocol is supported as well, but no standard API for distributed transactions is implemented.[ citation needed ]

Security features

The security features of the database are: role based access rights, encryption of the password using SHA-256 and data using the AES or the Tiny Encryption Algorithm, XTEA. The cryptographic features are available as functions inside the database as well. SSL / TLS connections are supported in the client-server mode, as well as when using the console application.[ citation needed ]

The database supports protection against SQL injection by enforcing the use of parameterized statements. In H2, this feature is called 'disabling literals'. [5]

Full text search capability

Two full text search implementations are included, a native implementation and one using Lucene.[ citation needed ]

High availability

A simple form of high availability is implemented: when used in the client-server mode, the database engine supports hot failover (this is commonly known as clustering). However, the clustering mode must be enabled manually after a failure. [6]

Cloud version

Since version 1.1.111, H2 in-memory database can run inside the Google App Engine. [7]

Challenges affecting durability of relational databases

The H2 documentation explains in detail several ways in which problems in underlying hardware and in particular power systems can impact durability of relational databases.

According to the H2 documentation, such problems are related not only to the DB engine design, but also to the storage caching mechanism. As storage devices use write cache in order to increase speed, in a situation of power failure, data in the device's cache is lost. Administrators have then to assess the common trade off between speed and data loss risks in the context of the business requirements and must carefully consider the design of the power supply and UPS of critical servers.

It is possible in some cases to force the storage to write cache frequently or even immediately using fsync which slows the writing process or one can accept that caching in the device buffer introduces some risk of data loss in case of power failure. The effectiveness of using fsync is limited by the fact that many HDD have write caching enabled by factory default in which case there is nothing about the design or settings of the database nor OS level commands that will be able to eliminate the chance of lost or inconsistent data in the event of a sudden power failure. Working with the OS and hardware settings to disable all caching so as to write data in real time can have significant impacts on performance in that only around 100 write operations per second would be achievable when using spinning disks.

The H2 documentation makes an effort to describe in detail [8] potential problems with durability (part of ACID) resulting from the potential data loss of committed transactions in case of a power failure.

Considering the hardware limitations regarding preservation of data in the event of sudden power loss and the ineffectiveness of approaches commonly employed by developers to prevent these sorts of losses, many database engines do not by default call FileDescriptor.sync() nor FileChannel.force() nor fsync or equivalents for every commit because they significantly degrade system performance without significantly increasing durability.

HSQLDB documentation references workarounds to a similar set of issues in their documentation. [9]

The Microsoft Knowledge Base describes the impact issues like power failures, write caching, etc. can have on performance and durability. [10] The knowledge base discusses the trade offs between performance and the vulnerability of disk write caching as well as settings that an administrator can use to balance these.

Utilities

An embedded web server with a browser based console application is included, as well as command line tools to start and stop a server, backup and restore databases, and a command line shell tool.[ citation needed ]

See also

Related Research Articles

<span class="mw-page-title-main">PostgreSQL</span> Free and open-source object relational database management system

PostgreSQL, also known as Postgres, is a free and open-source relational database management system (RDBMS) emphasizing extensibility and SQL compliance. PostgreSQL features transactions with atomicity, consistency, isolation, durability (ACID) properties, automatically updatable views, materialized views, triggers, foreign keys, and stored procedures. It is supported on all major operating systems, including Linux, FreeBSD, OpenBSD, macOS, and Windows, and handles a range of workloads from single machines to data warehouses or web services with many concurrent users.

<span class="mw-page-title-main">Ingres (database)</span> Database software

Ingres Database is a proprietary SQL relational database management system intended to support large commercial and government applications.

<span class="mw-page-title-main">SQLite</span> Serverless relational database management system (RDBMS)

SQLite is a database engine written in the C programming language. It is not a standalone app; rather, it is a library that software developers embed in their apps. As such, it belongs to the family of embedded databases. It is the most widely deployed database engine, as it is used by several of the top web browsers, operating systems, mobile phones, and other embedded systems.

Oracle TimesTen In-Memory Database is an in-memory, relational database management system with persistence and high availability. Originally designed and implemented at Hewlett-Packard labs in Palo Alto, California, TimesTen spun out into a separate startup in 1996 and was acquired by Oracle Corporation in 2005.

<span class="mw-page-title-main">HSQLDB</span> Java-based database engine

HSQLDB is a relational database management system written in Java. It has a JDBC driver and supports a large subset of SQL-92, SQL:2008, SQL:2011, and SQL:2016 standards. It offers a fast, small database engine which offers both in-memory and disk-based tables. Both embedded and server modes are available.

ADO.NET is a data access technology from the Microsoft .NET Framework that provides communication between relational and non-relational systems through a common set of components. ADO.NET is a set of computer software components that programmers can use to access data and data services from a database. It is a part of the base class library that is included with the Microsoft .NET Framework. It is commonly used by programmers to access and modify data stored in relational database systems, though it can also access data in non-relational data sources. ADO.NET is sometimes considered an evolution of ActiveX Data Objects (ADO) technology, but was changed so extensively that it can be considered an entirely new product.

The following tables compare general and technical information for a number of relational database management systems. Please see the individual products' articles for further information. Unless otherwise specified in footnotes, comparisons are based on the stable versions without any add-ons, extensions or external programs.

The Access Database Engine is a database engine on which several Microsoft products have been built. The first version of Jet was developed in 1992, consisting of three modules which could be used to manipulate a database.

Multi-master replication is a method of database replication which allows data to be stored by a group of computers, and updated by any member of the group. All members are responsive to client data queries. The multi-master replication system is responsible for propagating the data modifications made by each member to the rest of the group and resolving any conflicts that might arise between concurrent changes made by different members.

Apache Derby is a relational database management system (RDBMS) developed by the Apache Software Foundation that can be embedded in Java programs and used for online transaction processing. It has a 3.5 MB disk-space footprint.

In computing, a materialized view is a database object that contains the results of a query. For example, it may be a local copy of data located remotely, or may be a subset of the rows and/or columns of a table or join result, or may be a summary using an aggregate function.

A relational database management system uses SQL MERGE statements to INSERT new records or UPDATE existing records depending on whether condition matches. It was officially introduced in the SQL:2003 standard, and expanded in the SQL:2008 standard.

sync is a standard system call in the Unix operating system, which commits all data from the kernel filesystem buffers to non-volatile storage, i.e., data which has been scheduled for writing via low-level I/O system calls. Higher-level I/O layers such as stdio may maintain separate buffers of their own.

In relational databases, the information schema is an ANSI-standard set of read-only views that provide information about all of the tables, views, columns, and procedures in a database. It can be used as a source of the information that some databases make available through non-standard commands, such as:

 => SELECT count(table_name) FROM information_schema.tables;  count   -------  99    => SELECT column_name, data_type, column_default, is_nullable  FROM information_schema.columns WHERE table_name='alpha';  column_name | data_type | column_default | is_nullable   -------------+-----------+----------------+-------------  foo | integer | | YES  bar | character | | YES    => SELECT * FROM information_schema.information_schema_catalog_name;  catalog_name   --------------  johnd  

Microsoft SQL Server Compact is a discontinued relational database produced by Microsoft for applications that run on mobile devices and desktops. Prior to the introduction of the desktop platform, it was known as SQL Server for Windows CE and SQL Server Mobile Edition.

Microsoft SQL Server is a proprietary relational database management system developed by Microsoft. As a database server, it is a software product with the primary function of storing and retrieving data as requested by other software applications—which may run either on the same computer or on another computer across a network. Microsoft markets at least a dozen different editions of Microsoft SQL Server, aimed at different audiences and for workloads ranging from small single-machine applications to large Internet-facing applications with many concurrent users.

An embedded database system is a database management system (DBMS) which is tightly integrated with an application software; it is embedded in the application. It is a broad technology category that includes:

A hierarchical query is a type of SQL query that handles hierarchical model data. They are special cases of more general recursive fixpoint queries, which compute transitive closures.

DataBlitz is a general purpose main memory database management system, developed by Lucent Bell Labs Research from 1993 to 1995. It replaced various home-grown database products used throughout Lucent beginning in 1997.

MEMORY is a storage engine for MySQL and MariaDB relational database management systems, developed by Oracle and MariaDB. Before the version 4.1 of MySQL it was called Heap.

References

  1. "Presentation and use of H2 Database Engine". 6 August 2010.
  2. Hypersonic SQL project page at SourceForge
  3. "Write Your Own Database, Again". thecodist.com.
  4. "H2 Database supports PostgreSQL ODBC driver". Archived from the original on 2016-12-09. Retrieved 2010-08-24.
  5. "SQL Injections: How Not To Get Stuck".
  6. "H2 Clustering". Archived from the original on 2010-09-24.
  7. "H2 Database on GAE". gaevfs.
  8. "Advanced" . Retrieved 30 October 2014.
  9. "Chapter 9. SQL Syntax". Archived from the original on 19 May 2018. Retrieved 30 October 2014.
  10. "Slow Disk Performance When Write Caching Is Enabled". Archived from the original on 31 December 2014. Retrieved 21 December 2014.