Bitemporal modeling

Last updated November 26, 2024

Bitemporal modeling is a specific case of temporal database modeling: an information modeling technique designed to handle historical data along two different timelines.^[1] This makes it possible to rewind the information to "as it actually was" in combination with "as it was recorded" at some point in time. To make this possible, information cannot be discarded even if it is erroneous. In financial reporting, for example, it is often desirable to be able to recreate an old report both as it actually looked at the time of creation and as it should have looked given corrections made to the data after its creation.
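The two timelines can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen for illustration only: the Fact class, the LEDGER list, the lookup function and the sample figures. It assumes an append-only store in which a correction is written as a new row, so a report can be recreated either as it was first recorded or as it should have looked after the correction.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    key: str
    value: float
    valid_from: date   # start of real-world validity ("as it actually was")
    valid_to: date     # exclusive end of real-world validity
    recorded_on: date  # when the row was written ("as it was recorded")


FAR_FUTURE = date.max  # sentinel for "still valid"; an assumption of this sketch

# Rows are only ever appended; the second row corrects the first.
LEDGER = [
    Fact("revenue-2023Q4", 100.0, date(2023, 10, 1), FAR_FUTURE, date(2024, 1, 5)),
    Fact("revenue-2023Q4", 120.0, date(2023, 10, 1), FAR_FUTURE, date(2024, 3, 1)),
]


def lookup(key: str, valid_at: date, known_at: date) -> Optional[float]:
    """Value that held at valid_at, according to what was recorded by known_at."""
    candidates = [
        f for f in LEDGER
        if f.key == key
        and f.valid_from <= valid_at < f.valid_to
        and f.recorded_on <= known_at
    ]
    if not candidates:
        return None
    # The most recently recorded matching row wins.
    return max(candidates, key=lambda f: f.recorded_on).value
```

Calling lookup with a known_at date before the correction reproduces the old report; calling it with a later known_at date yields the corrected figure for the same real-world date.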

Bitemporal modeling can be implemented using relational databases and graph databases. It is considered different from dimensional modeling and complementary to database normalization. The SQL:2011 standard provides language constructs for working with bitemporal data. However, as of 2011 many solutions were still vendor-specific.
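A relational layout can be sketched with Python's built-in sqlite3 module. This sketch does not use the SQL:2011 temporal constructs, which SQLite does not implement; it emulates them with four ordinary columns. The table name price, the column names, the sentinel date and the helper price_as_of are all hypothetical. Dates are stored as ISO 8601 strings so that string comparison matches chronological order.

```python
import sqlite3

OPEN = "9999-12-31"  # sentinel meaning "no end known yet" (assumption of this sketch)

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE price (
        product    TEXT NOT NULL,
        amount     REAL NOT NULL,
        valid_from TEXT NOT NULL,
        valid_to   TEXT NOT NULL,
        tx_from    TEXT NOT NULL,
        tx_to      TEXT NOT NULL
    )
    """
)

# On 2024-02-10 a price change effective 2024-02-01 is recorded retroactively.
# The original row is closed on the transaction timeline, not deleted.
rows = [
    ("widget", 9.99, "2024-01-01", OPEN, "2024-01-01", "2024-02-10"),
    ("widget", 9.99, "2024-01-01", "2024-02-01", "2024-02-10", OPEN),
    ("widget", 12.49, "2024-02-01", OPEN, "2024-02-10", OPEN),
]
conn.executemany("INSERT INTO price VALUES (?, ?, ?, ?, ?, ?)", rows)


def price_as_of(product, valid_at, known_at):
    """Price in effect at valid_at, as the database stood at known_at."""
    row = conn.execute(
        """
        SELECT amount FROM price
        WHERE product = ?
          AND valid_from <= ? AND ? < valid_to
          AND tx_from    <= ? AND ? < tx_to
        """,
        (product, valid_at, valid_at, known_at, known_at),
    ).fetchone()
    return row[0] if row else None
```

Both periods are treated as half-open intervals (start inclusive, end exclusive), a common convention that keeps adjacent periods from overlapping.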

Philosophy

Bitemporal modeling uses bitemporal structures as the basic components. This results in databases that have a consistent type of temporality for all data. One example is to include both an event date-timestamp (EDTS) and a load datestamp (LDS).
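As a rough illustration of the EDTS and LDS pair, the Python sketch below attaches both stamps to every record. The Reading type, the sample sensor data and the timeline helper are hypothetical; the only idea taken from the text is that each record carries an event date-timestamp and a load datestamp.

```python
from datetime import date, datetime
from typing import List, NamedTuple


class Reading(NamedTuple):
    sensor: str
    celsius: float
    edts: datetime  # event date-timestamp: when the measurement happened
    lds: date       # load datestamp: when the record was loaded


readings = [
    Reading("s1", 20.5, datetime(2024, 5, 1, 9, 0), date(2024, 5, 1)),
    Reading("s1", 21.0, datetime(2024, 5, 1, 12, 0), date(2024, 5, 3)),  # loaded late
    Reading("s1", 19.8, datetime(2024, 5, 2, 9, 0), date(2024, 5, 2)),
]


def timeline(loaded_by: date) -> List[float]:
    """Readings in real-world order, limited to what had been loaded by a date."""
    visible = [r for r in readings if r.lds <= loaded_by]
    return [r.celsius for r in sorted(visible, key=lambda r: r.edts)]
```

Because every record carries the same two stamps, a late-arriving reading slots into its correct real-world position without hiding the fact that it was not yet known on earlier load dates.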

Benefits of bitemporal modeling

By focusing on completeness and accuracy of data, bitemporal modeling facilitates the creation of complete audit trails. All data becomes immutable. Specifically, this allows for queries that provide:

The most accurate data possible as we know it now
Data as we knew it at any point in time
When and why the most accurate data we had changed
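The three kinds of query listed above can be sketched in Python over an immutable, append-only history. For brevity the sketch tracks only the recorded timeline of a single attribute; the Version class, its reason field, the sample addresses and the three helper functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Version:
    value: str
    recorded_on: date
    reason: str  # why this version was written


# Append-only history of one customer's address, ordered by recorded_on.
HISTORY = (
    Version("12 Oak St", date(2024, 1, 10), "initial load"),
    Version("12 Oak Street, Apt 4", date(2024, 2, 2), "customer reported missing unit number"),
    Version("98 Elm Ave", date(2024, 6, 15), "customer moved"),
)


def best_known_now() -> str:
    """The most accurate data as we know it now."""
    return HISTORY[-1].value


def as_known_on(day: date) -> Optional[str]:
    """The data as we knew it on a given day, or None if nothing was recorded yet."""
    known = [v for v in HISTORY if v.recorded_on <= day]
    return known[-1].value if known else None


def change_log() -> List[Tuple[date, str, str, str]]:
    """When and why the best-known value changed: (date, old, new, reason)."""
    return [
        (new.recorded_on, old.value, new.value, new.reason)
        for old, new in zip(HISTORY, HISTORY[1:])
    ]
```

Since no version is ever modified or removed, the change log doubles as an audit trail.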

Implementations in notable products

MarkLogic introduced bitemporal data support in version 8.0. Timestamps for valid time and system time are stored in JSON or XML documents.^[2]
XTDB^[3] (formerly Crux) is an open source database that indexes documents using an EAV data model and provides point-in-time bitemporal SQL and Datalog queries.
TerminusDB is an open source document-oriented graph database that uses delta encoding and provides bitemporal functionality.^[4]

Related Research Articles

In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data. The DBMS additionally encompasses the core facilities provided to administer the database. The sum total of the database, the DBMS and the associated applications can be referred to as a database system. Often the term "database" is also used loosely to refer to any of the DBMS, the database system or an application associated with the database.

In computing, online analytical processing, or OLAP, is an approach to quickly answer multi-dimensional analytical (MDA) queries. The term OLAP was created as a slight modification of the traditional database term online transaction processing (OLTP). OLAP is part of the broader category of business intelligence, which also encompasses relational databases, report writing and data mining. Typical applications of OLAP include business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas, with new applications emerging, such as agriculture.

A surrogate key in a database is a unique identifier for either an entity in the modeled world or an object in the database. The surrogate key is not derived from application data, unlike a natural key.

A query language, also known as data query language or database query language (DQL), is a computer language used to make queries in databases and information systems. In database systems, query languages rely on strict theory to retrieve information. A well known example is the Structured Query Language (SQL).

A temporal database stores data relating to time instances. It offers temporal data types and stores information relating to past, present and future time. Temporal databases can be uni-temporal, bi-temporal or tri-temporal.

In temporal databases, valid-time is the time period when an event happened or something was true in the real world, or more formally when a fact was valid in the modeled reality.

In temporal databases, transaction time is the time when some data has been loaded into a database. The time when a transaction is valid can be called the transaction time-period. It is a technical timeline controlled by an integration layer. More formally, it is the period during which a fact stored in the database is considered to be true.

A spatial database is a general-purpose database that has been enhanced to include spatial data that represents objects defined in a geometric space, along with tools for querying and analyzing such data.

An entity–attribute–value model (EAV) is a data model optimized for the space-efficient storage of sparse—or ad-hoc—property or data values, intended for situations where runtime usage patterns are arbitrary, subject to user variation, or otherwise unforeseeable using a fixed design. The use-case targets applications which offer a large or rich system of defined property types, which are in turn appropriate to a wide set of entities, but where typically only a small, specific selection of these are instantiated for a given entity. Therefore, this type of data model relates to the mathematical notion of a sparse matrix. EAV is also known as object–attribute–value model, vertical database model, and open schema.

A database model is a type of data model that determines the logical structure of a database. It fundamentally determines in which manner data can be stored, organized and manipulated. The most popular example of a database model is the relational model, which uses a table-based format.

A document-oriented database, or document store, is a computer program and data storage system designed for storing, retrieving and managing document-oriented information, also known as semi-structured data.

NoSQL is an approach to database design that focuses on providing a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Instead of the typical tabular structure of a relational database, NoSQL databases house data within one data structure. Since this non-relational database design does not require a schema, it offers rapid scalability to manage large and typically unstructured data sets. NoSQL systems are also sometimes called "Not only SQL" to emphasize that they may support SQL-like query languages or sit alongside SQL databases in polyglot-persistent architectures.

A graph database (GDB) is a database that uses graph structures for semantic queries with nodes, edges, and properties to represent and store data. A key concept of the system is the graph. The graph relates the data items in the store to a collection of nodes and edges, the edges representing the relationships between the nodes. The relationships allow data in the store to be linked together directly and, in many cases, retrieved with one operation. Graph databases hold the relationships between data as a priority. Querying relationships is fast because they are perpetually stored in the database. Relationships can be intuitively visualized using graph databases, making them useful for heavily inter-connected data.

SQL:2011 or ISO/IEC 9075:2011 is the seventh revision of the ISO (1987) and ANSI (1986) standard for the SQL database query language. It was formally adopted in December 2011. The standard consists of 9 parts which are described in detail in SQL. The next version is SQL:2016.

The following is provided as an overview of and topical guide to databases:

Cypher is a declarative graph query language that allows for expressive and efficient data querying in a property graph.

Semantic queries allow for queries and analytics of associative and contextual nature. Semantic queries enable the retrieval of both explicitly and implicitly derived information based on syntactic, semantic and structural information contained in data. They are designed to deliver precise results or to answer more fuzzy and wide open questions through pattern matching and digital reasoning.

In the field of database design, a multi-model database is a database management system designed to support multiple data models against a single, integrated backend. In contrast, most database management systems are organized around a single data model that determines how data can be organized, stored, and manipulated. Document, graph, relational, and key–value models are examples of data models that may be supported by a multi-model database.

A semantic triple, or RDF triple or simply triple, is the atomic data entity in the Resource Description Framework (RDF) data model. As its name indicates, a triple is a sequence of three entities that codifies a statement about semantic data in the form of subject–predicate–object expressions.

GQL is a standardized query language for property graphs first described in ISO/IEC 39075, released in April 2024 by ISO/IEC.

References

[1] "Temporal Patterns". martinfowler.com. Retrieved 2017-04-04.
[2] Bridgwater, Adrian (24 November 2014). "Data Is Good, 'Bidirectionalized Bitemporal' Data Is Better". Forbes.
[3] "XTDB". 9 September 2021.
[4] "terminusdb/terminusdb.pdf at dev · terminusdb/terminusdb". GitHub. Retrieved 2021-09-12.

