LevelDB

Developer(s) Jeffrey Dean, Sanjay Ghemawat, Google Inc.
Stable release 1.23 [1] / 23 February 2021
Written in C++
Size 350 kB (binary size)
Type Database library
License New BSD License
Website github.com/google/leveldb

LevelDB is an open-source on-disk key-value store written by Google fellows Jeffrey Dean and Sanjay Ghemawat. [2] [3] It was inspired by Bigtable. [4] Its source code is hosted on GitHub under the New BSD License, and it has been ported to a variety of Unix-based systems, macOS, Windows, and Android. [5]

Features

LevelDB stores keys and values as arbitrary byte arrays, and data is sorted by key. It supports batched writes, forward and backward iteration, and compression of the data via Google's Snappy compression library.
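The ordering and iteration behaviour described above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: it is not LevelDB's API, it stores nothing on disk, and it does no compression. The class name `SortedByteStore` and its methods are hypothetical names chosen for illustration. It relies on the fact that Python compares `bytes` values byte by byte, which gives a sorted order over arbitrary byte-array keys.

```python
import bisect


class SortedByteStore:
    """Toy model of an ordered key-value store (hypothetical, not LevelDB's API)."""

    def __init__(self):
        self._keys = []   # bytes keys, kept in sorted order
        self._data = {}   # key -> value

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def delete(self, key: bytes) -> None:
        if key in self._data:
            del self._data[key]
            del self._keys[bisect.bisect_left(self._keys, key)]

    def get(self, key: bytes, default=None):
        return self._data.get(key, default)

    def write_batch(self, ops) -> None:
        """Apply a list of ("put", key, value) / ("delete", key) operations."""
        # Validate the whole batch first so a malformed batch changes nothing.
        for op in ops:
            if op[0] not in ("put", "delete"):
                raise ValueError(f"unknown operation: {op[0]!r}")
        for op in ops:
            if op[0] == "put":
                self.put(op[1], op[2])
            else:
                self.delete(op[1])

    def scan(self, start: bytes = b"", reverse: bool = False):
        """Yield (key, value) pairs with key >= start, ascending or descending."""
        keys = self._keys[bisect.bisect_left(self._keys, start):]
        if reverse:
            keys = reversed(keys)
        for key in keys:
            yield key, self._data[key]


store = SortedByteStore()
store.write_batch([
    ("put", b"b", b"2"),
    ("put", b"a", b"1"),
    ("put", b"c", b"3"),
    ("delete", b"b"),
])
forward = list(store.scan())               # keys in ascending order
backward = list(store.scan(reverse=True))  # same keys in descending order
```

Because the keys are kept sorted, a range scan is a binary search followed by a sequential walk, which is the access pattern that forward and backward iteration over a sorted store makes possible.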

LevelDB is not an SQL database. Like other NoSQL and dbm stores, it does not have a relational data model and it does not support SQL queries. Also, it has no support for indexes. Applications use LevelDB as a library, as it does not provide a server or command-line interface.

MariaDB 10.0 comes with a storage engine which allows users to query LevelDB tables from MariaDB. [6]

History

LevelDB is based on concepts from Google's Bigtable database system. The table implementation for the Bigtable system was developed starting in about 2004, and is based on a different Google internal code base than the LevelDB code. That code base relies on a number of Google code libraries that are not themselves open sourced, so directly open sourcing that code would have been difficult. Jeff Dean and Sanjay Ghemawat wanted to create a system resembling the Bigtable tablet stack that had minimal dependencies and would be suitable for open sourcing, and also would be suitable for use in Chrome for the IndexedDB implementation. They wrote LevelDB starting in early 2011, with the same general design as the Bigtable tablet stack, but not sharing any of the code. [7]

Usage

LevelDB is used as the backend database for Google Chrome's IndexedDB and is one of the supported backends for Riak. [8] Additionally, Bitcoin Core and go-ethereum store the blockchain metadata using a LevelDB database. [9] Minecraft Bedrock Edition uses a modified version for chunk and entity data storage. [10] Autodesk AutoCAD 2016 also uses LevelDB.

Performance

Google has provided benchmarks comparing LevelDB's performance to SQLite and Kyoto Cabinet in different scenarios. [11] LevelDB outperforms both SQLite and Kyoto Cabinet in write operations and sequential-order read operations. LevelDB also excels at batch writes, but is slower than SQLite when dealing with large values. The currently published benchmarks were updated after SQLite configuration mistakes were noted in an earlier version of the results. [12] Updated benchmarks [13] show that LevelDB also outperforms Berkeley DB. The same tests show that OpenLDAP LightningDB (LMDB) is much faster (about 10 times in some scenarios) in read operations and some write types (e.g. batch and synchronous writes), and is almost equal in the remaining tests.

All of the above benchmarks date from 2011 to 2014 and may be of historical significance only, as SQLite, for instance, has since become significantly more efficient. [14]

Bugs and reliability

LevelDB has a history of database corruption bugs. [15] [16] [17] [18] [19] [20] A 2014 study found that, on non-checksummed file systems, the database could become corrupted after a crash or power failure. [21]

References

  1. "Release 1.23". 23 February 2021. Retrieved 13 March 2021.
  2. "Google Research Scientists and Engineers: Jeffrey Dean". Google, Inc.
  3. "Research Scientists and Engineers: Sanjay Ghemawat". Google, Inc.
  4. "Google Open-Sources NoSQL Database Called LevelDB". ReadWriteWeb. July 30, 2011. Archived from the original on August 16, 2011. Retrieved July 30, 2011.
  5. "Google Open Source Blog: LevelDB: A Fast Persistent Key-Value Store". Google, Inc.
  6. LevelDB storage engine
  7. Jeff Dean. "LevelDB mailing list: "Current Status of LevelDB"".
  8. LevelDB. Docs.basho.com. Retrieved on 2013-09-18.
  9. Andreas M. Antonopoulos. "Chapter 7. The Blockchain". Retrieved 8 January 2015.
  10. "Bedrock Edition level format". Minecraft Wiki. Retrieved 24 September 2023.
  11. "LevelDB Benchmarks". Google, Inc. Archived from the original on 2011-08-20.
  12. "LevelDB Benchmark discussion".
  13. Database Microbenchmarks. Archived 2014-08-09 at the Wayback Machine. Symas Corp., 2012-09. Retrieved 22 October 2016.
  14. "Measuring and Reducing CPU Usage in SQLite".
  15. Repairing LevelDB
  16. Issues · google/leveldb · GitHub
  17. Unrecoverable corruption in Chromium
  18. Corruption in syncthing
  19. Corruption after power loss
  20. Corruption in Ethereum
  21. All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications. 2014. pp. 433–448. ISBN 9781931971164.