Polyglot persistence

Last updated

Polyglot persistence is a term that refers to using multiple data storage technologies within a single system, in order to meet varying data storage needs. Such a system may consist of multiple applications, or it may be a single application with smaller components.

Contents

Description

Polyglot persistence shares its origins with how the term polyglot programming is defined. Polyglot programming is a term coined by Neal Ford in 2006, and expresses the idea that computer applications should be written in a mix of different programming languages, in order to take advantage of the fact that different languages are suitable for tackling different problems. Complex applications combine different types of problems, so picking the right language for each job may be more productive than trying to solve all aspects of the problem using a single language. This same concept can be applied to databases, that an application can communicate with different databases, using each for what it is best at to achieve an end goal, hence the term polyglot persistence.

Since there are numerous databases available to solve different problems, using a single database to satisfy all of a program's requirements can result in a non-performant, "jack of all trades, master of none" solution. Relational databases, for example, are good at enforcing relationships that exist between various data tables. To discover a relationship or to find data from different tables that belong to the same object, an SQL join operation can be used. This might work when the data is smaller in size, but becomes problematic when the data involved grows larger. A graph database might solve the problem of relationships in case of Big Data, but it might not solve the problem of database transactions, which may be provided better by relational database management systems. Instead, a NoSQL document database might be used to store unstructured data for that particular part of the problem. Thus different problems are solved by different database systems, all within the same application.

Some of such data storage technologies, but not limited to, could be

Martin Fowler writes in one of his articles, "One of the interesting consequences of this is that we are gearing up for a shift to polyglot persistence where any decent sized enterprise will have a variety of different data storage technologies for different kinds of data. There will still be large amounts of it managed in relational stores, but increasingly we'll be first asking how we want to manipulate the data and only then figuring out what technology is the best bet for it." [1]

Related Research Articles

<span class="mw-page-title-main">Database</span> Organized collection of data in computing

In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spans formal techniques and practical considerations, including data modeling, efficient data representation and storage, query languages, security and privacy of sensitive data, and distributed computing issues, including supporting concurrent access and fault tolerance.

<span class="mw-page-title-main">Object database</span> Type of database management system

An object database or object-oriented database is a database management system in which information is represented in the form of objects as used in object-oriented programming. Object databases are different from relational databases which are table-oriented. A third type, object–relational databases, is a hybrid of both approaches.

Structured Query Language (SQL), is a domain-specific language used in programming and designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS). It is particularly useful in handling structured data, i.e., data incorporating relations among entities and variables.

Object–relational mapping in computer science is a programming technique for converting data between a relational database and the heap of an object-oriented programming language. This creates, in effect, a virtual object database that can be used from within the programming language.

<span class="mw-page-title-main">Data model</span> Model that organizes elements of data and how they relate to one another and to real-world entities.

A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.

<span class="mw-page-title-main">IBM Db2</span> Relational model database server

Db2 is a family of data management products, including database servers, developed by IBM. It initially supported the relational model, but was extended to support object–relational features and non-relational structures like JSON and XML. The brand name was originally styled as DB/2, then DB2 until 2017 and finally changed to its present form.

Online analytical processing, or OLAP, is an approach to answer multi-dimensional analytical (MDA) queries swiftly in computing. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, report writing and data mining. Typical applications of OLAP include business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas, with new applications emerging, such as agriculture.

A navigational database is a type of database in which records or objects are found primarily by following references from other objects. The term was popularized by the title of Charles Bachman's 1973 Turing Award paper, The Programmer as Navigator. This paper emphasized the fact that the new disk-based database systems allowed the programmer to choose arbitrary navigational routes following relationships from record to record, contrasting this with the constraints of earlier magnetic-tape and punched card systems where data access was strictly sequential.

An XML database is a data persistence software system that allows data to be specified, and sometimes stored, in XML format. This data can be queried, transformed, exported and returned to a calling system. XML databases are a flavor of document-oriented databases which are in turn a category of NoSQL database.

The object–relational impedance mismatch is a set of conceptual and technical difficulties that are often encountered when organizations stored data in relational data stores and then being addressed and used by domain-driven object models, the default method of implementing business-centric objects in object-oriented programming languages. The problems arise not from a failure of addressing data as relational nor as domain objects, but as a result of the difficulty of implementing a data mapping between the data values of the two conceptually different logic models; both models are logical models that can be implemented differently depending upon the implementation technology utilized. These issues are not limited to applications, but exists across an enterprise, whenever data is stored in a relational manner then utilized as domain-driven object models, and vice versa. These difficulties are sometimes mitigated by use of a object-oriented data store, but that too has its own set of implementation difficulties.

Core Data is an object graph and persistence framework provided by Apple in the macOS and iOS operating systems. It was introduced in Mac OS X 10.4 Tiger and iOS with iPhone SDK 3.0. It allows data organized by the relational entity–attribute model to be serialized into XML, binary, or SQLite stores. The data can be manipulated using higher level objects representing entities and their relationships. Core Data manages the serialized version, providing object lifecycle and object graph management, including persistence. Core Data interfaces directly with SQLite, insulating the developer from the underlying SQL.

Entity–attribute–value model (EAV) is a data model to encode, in a space-efficient manner, entities where the number of attributes that can be used to describe them is potentially vast, but the number that will actually apply to a given entity is relatively modest. Such entities correspond to the mathematical notion of a sparse matrix.

<span class="mw-page-title-main">Database model</span> Type of data model

A database model is a type of data model that determines the logical structure of a database. It fundamentally determines in which manner data can be stored, organized and manipulated. The most popular example of a database model is the relational model, which uses a table-based format.

Entity Framework (EF) is an open source object–relational mapping (ORM) framework for ADO.NET. It was originally shipped as an integral part of .NET Framework, however starting with Entity Framework version 6.0 it has been delivered separately from the .NET Framework.

A document-oriented database, or document store, is a computer program and data storage system designed for storing, retrieving and managing document-oriented information, also known as semi-structured data.

A NoSQL database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Such databases have existed since the late 1960s, but the name "NoSQL" was only coined in the early 21st century, triggered by the needs of Web 2.0 companies. NoSQL databases are increasingly used in big data and real-time web applications. NoSQL systems are also sometimes called Not only SQL to emphasize that they may support SQL-like query languages or sit alongside SQL databases in polyglot-persistent architectures.

A graph database (GDB) is a database that uses graph structures for semantic queries with nodes, edges, and properties to represent and store data. A key concept of the system is the graph. The graph relates the data items in the store to a collection of nodes and edges, the edges representing the relationships between the nodes. The relationships allow data in the store to be linked together directly and, in many cases, retrieved with one operation. Graph databases hold the relationships between data as a priority. Querying relationships is fast because they are perpetually stored in the database. Relationships can be intuitively visualized using graph databases, making them useful for heavily inter-connected data.

<span class="mw-page-title-main">Shadow table</span> Abstract object in computer science

Shadow tables are objects in computer science used to improve the way machines, networks and programs handle information. More specifically, a shadow table is an object that is read and written by a processor and contains data similar to its primary table, which is the table it's "shadowing". Shadow tables usually contain data that is relevant to the operation and maintenance of its primary table, but not within the subset of data required for the primary table to exist. Shadow tables are related to the data type "trails" in data storage systems. Trails are very similar to shadow tables but instead of storing identically formatted information that is different, they store a history of modifications and functions operated on a table.

The following is provided as an overview of and topical guide to databases:

In the field of database design, a multi-model database is a database management system designed to support multiple data models against a single, integrated backend. In contrast, most database management systems are organized around a single data model that determines how data can be organized, stored, and manipulated. Document, graph, relational, and key–value models are examples of data models that may be supported by a multi-model database.

References

  1. "bliki: PolyglotPersistence". martinfowler.com. Retrieved 2020-07-09.