DataBlitz

DataBlitz Main Memory RDBMS
Developer(s): Bell Labs
Initial release: 1997
Stable release: 7.1 / April 4, 2010
Operating system: Linux, Solaris
Type: RDBMS
License: Proprietary

DataBlitz is a general-purpose main-memory database management system developed by Lucent Bell Labs Research from 1993 to 1995. Beginning in 1997, it replaced various home-grown database products used throughout Lucent.


It was originally named "Dali" and provided recovery and concurrency control features. Dali was later renamed "DataBlitz".

DataBlitz provides a platform for building high-performance shared-memory applications that can survive failures or organize large amounts of data, with features suited to a wide range of applications.

Applications for DataBlitz include:

Features of DataBlitz

Relational

The DataBlitz Relational Manager is a C++ class library interface to a relational system with SQL support limited to definition statements. Schema information is stored in tables, and can be queried using the relational API itself. Indices may be created on arbitrary subsets of the attributes in a table. Referential integrity is supported (foreign key constraints), as are null values, date and time attribute types, and variable length fields. Navigation is supported through iterators over a single table. A conjunctive query may be specified for the iterator, and automatic index selection is performed. Both fine-grained and multi-granularity locking strategies are used for high concurrency without incurring too much overhead. Also, locks obtained by iterators avoid the "phantom" anomaly...
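The iterator behaviour described above (a conjunctive query evaluated over one table, with the index chosen automatically) can be illustrated with a minimal Python sketch. This is not the DataBlitz C++ API: the `Table` class, its method names, and the selection rule (use the index that covers the most attributes fixed by the query) are assumptions made for illustration, and locking is not modelled.

```python
from collections import defaultdict


class Table:
    """Toy in-memory table with hash indices on attribute subsets (hypothetical)."""

    def __init__(self, attributes):
        self.attributes = tuple(attributes)
        self.rows = []
        self.indices = {}  # sorted attribute tuple -> {key tuple: [row positions]}

    def create_index(self, attrs):
        key_attrs = tuple(sorted(attrs))
        index = defaultdict(list)
        for pos, row in enumerate(self.rows):
            index[tuple(row[a] for a in key_attrs)].append(pos)
        self.indices[key_attrs] = index

    def insert(self, **values):
        # Attributes that are not supplied are stored as None (a null value).
        row = {a: values.get(a) for a in self.attributes}
        pos = len(self.rows)
        self.rows.append(row)
        for key_attrs, index in self.indices.items():
            index[tuple(row[a] for a in key_attrs)].append(pos)

    def choose_index(self, predicate):
        """Assumed rule: the widest index whose attributes are all fixed by the query."""
        usable = [k for k in self.indices if set(k) <= set(predicate)]
        return max(usable, key=len, default=None)

    def scan(self, **predicate):
        """Iterate over rows that satisfy every equality condition in the predicate."""
        key_attrs = self.choose_index(predicate)
        if key_attrs is None:
            candidates = range(len(self.rows))  # no usable index: full scan
        else:
            key = tuple(predicate[a] for a in key_attrs)
            candidates = self.indices[key_attrs].get(key, [])
        for pos in candidates:
            row = self.rows[pos]
            if all(row[a] == v for a, v in predicate.items()):
                yield row


calls = Table(["caller", "callee", "day"])
calls.create_index(["caller"])
calls.create_index(["caller", "day"])
calls.insert(caller="alice", callee="bob", day=1)
calls.insert(caller="alice", callee="carol", day=2)
calls.insert(caller="dave", callee="bob", day=2)
matches = list(calls.scan(caller="alice", day=2))
```

In this sketch the query on `caller` and `day` is answered through the two-attribute index, while a query on `callee` alone falls back to a full scan because no index covers it.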

Collections and Indices

DataBlitz also provides higher-layer interfaces for grouping related data items, and performing scans as well as associative access (via indices) on data items in a group...

Storage Manager

Each database file in DataBlitz consists of segments, which are contiguous page-aligned units of allocation, similar to clusters in a file system. A chunk is a collection of segments. The recovery characteristics of memory (transient, zeroed, or persistent) are specified on a per-chunk basis when the chunk is created. Zeroed memory remains allocated upon recovery, but each byte is set to zero. With transient memory, the data are no longer allocated upon recovery. Users allocate within a chunk and do not specify a particular segment. Since segments can be arbitrarily large (within the size of the database), arbitrarily large objects can be stored contiguously. Upon allocation within a chunk, the system returns a standard DataBlitz pointer to the space, which specifies the offset within the file. The elements that link together the segments of a chunk are themselves stored in a special chunk used for control information. Storing control information separately from the data reduces the likelihood of it being corrupted by stray application pointers...
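The per-chunk recovery characteristics can be made concrete with a small Python simulation. It is only a sketch under stated assumptions: the class names, the 4096-byte page size, and the one-segment-per-allocation policy are hypothetical, and the "persistent" case is assumed to mean that both the allocation and its contents survive recovery, which the text above implies but does not spell out.

```python
from enum import Enum


class Recovery(Enum):
    TRANSIENT = "transient"    # allocations are gone after recovery
    ZEROED = "zeroed"          # allocations survive, every byte reset to zero
    PERSISTENT = "persistent"  # assumption: allocations and contents both survive


class Chunk:
    def __init__(self, db, recovery):
        self.db = db
        self.recovery = recovery
        self.allocations = {}  # offset within the database file -> size in bytes

    def allocate(self, size):
        # The caller names only the chunk; the "pointer" returned is a file offset.
        offset = self.db.reserve(size)
        self.allocations[offset] = size
        return offset


class Database:
    PAGE = 4096  # hypothetical page size

    def __init__(self):
        self.data = bytearray()
        self.chunks = []

    def create_chunk(self, recovery):
        chunk = Chunk(self, recovery)
        self.chunks.append(chunk)
        return chunk

    def reserve(self, size):
        """Append a page-aligned segment large enough for `size`; return its offset."""
        offset = len(self.data)
        pages = -(-size // self.PAGE)  # ceiling division
        self.data.extend(bytes(pages * self.PAGE))
        return offset

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload

    def read(self, offset, size):
        return bytes(self.data[offset:offset + size])

    def recover(self):
        """Apply each chunk's recovery characteristic, as after a restart."""
        for chunk in self.chunks:
            if chunk.recovery is Recovery.TRANSIENT:
                chunk.allocations.clear()
            elif chunk.recovery is Recovery.ZEROED:
                for offset, size in chunk.allocations.items():
                    self.data[offset:offset + size] = bytes(size)


db = Database()
persistent = db.create_chunk(Recovery.PERSISTENT)
zeroed = db.create_chunk(Recovery.ZEROED)
transient = db.create_chunk(Recovery.TRANSIENT)

a = persistent.allocate(5)
db.write(a, b"hello")
b = zeroed.allocate(5)
db.write(b, b"world")
c = transient.allocate(10)

db.recover()
```

After `recover()`, the persistent allocation still reads back `b"hello"`, the zeroed allocation is still allocated but reads as zero bytes, and the transient chunk has no allocations left.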

Replication

In DataBlitz, data can be replicated across multiple DataBlitz instances running on machines connected by a network in a distributed environment. The primary benefits of data replication are higher availability and improved performance. For instance, if a table is stored only at a single site in a distributed setting, and if that site crashes or becomes unavailable due to a network failure, then the table would become inaccessible to other sites in the system. DataBlitz provides support for data replication at the granularity of tables. Each table can be replicated at any subset of sites in the system...
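The availability argument for table-level replication can be sketched in a few lines of Python. The `Cluster` class, the site names, and the table names are hypothetical, and the sketch models only where copies live and which sites are up; it does not model how DataBlitz keeps replicas consistent.

```python
class Cluster:
    """Tracks which sites are up and which sites hold a copy of each table."""

    def __init__(self, sites):
        self.up = {site: True for site in sites}
        self.replicas = {}  # table name -> set of sites holding a copy

    def replicate(self, table, sites):
        # A table may be placed on any subset of the known sites.
        unknown = set(sites) - set(self.up)
        if unknown:
            raise ValueError(f"unknown sites: {sorted(unknown)}")
        self.replicas[table] = set(sites)

    def crash(self, site):
        self.up[site] = False

    def readable_from(self, table):
        """Sites that are up and hold a copy of the table."""
        return {s for s in self.replicas[table] if self.up[s]}

    def is_available(self, table):
        return bool(self.readable_from(table))


cluster = Cluster(["A", "B", "C"])
cluster.replicate("subscribers", ["A"])   # stored at a single site
cluster.replicate("routes", ["A", "C"])   # replicated at two sites
cluster.crash("A")
```

Once site A crashes, `subscribers` is no longer available anywhere, while `routes` can still be read from site C.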
