MultiValue database

A MultiValue database is a type of NoSQL and multidimensional database, typically considered synonymous with PICK, a database originally developed as the Pick operating system.

MultiValue databases include commercial products from Rocket Software, Revelation, InterSystems, Northgate Information Solutions, ONgroup, [1] and other companies. These databases differ from relational databases in that they support and encourage attributes that can hold a list of values, rather than requiring every attribute to be single-valued. They are often grouped with MUMPS in the category of post-relational databases, although the data model actually pre-dates the relational model. Unlike SQL-DBMS tools, most MultiValue databases can be accessed either with or without SQL.

History

Don Nelson designed the MultiValue data model in the early to mid-1960s. [2] Dick Pick, a developer at TRW, worked on the first implementation of this model for the US Army in 1965. Pick considered the software to be in the public domain because it was written for the military. This was but the first dispute regarding MultiValue databases that was addressed by the courts. [3]

Ken Simms wrote DataBASIC, sometimes known as S-BASIC, in the mid-1970s. It was based on Dartmouth BASIC, but had enhanced features for data management. Simms played a lot of Star Trek while developing the language, in order to have the language function to his satisfaction. [4]

Three of the implementations of MultiValue - PICK version R77, Microdata Reality [5] 3.x, and Prime Information 1.0 - were very similar. In spite of attempts to standardize, particularly by International Spectrum and the Spectrum Manufacturers Association, who designed a logo for all to use, [6] there are no standards across MultiValue implementations. Subsequently, these flavors diverged, although with some cross-over. These streams of MultiValue database development could be classified as one stemming from PICK R83, one from Microdata Reality, and one from Prime Information. [7] Because of the differences, some implementations have provisions for supporting several flavors of the languages. An attempt to document the similarities and differences can be found at the Post-Relational Database Reference (PRDB). [8]

Marketing groups and others in the industry over the years have classified MultiValue databases as pre-relational, post-relational, relational, and embedded, with detractors often classifying them as legacy. They could now be classified as NoSQL, with a data model that aligns well with JSON and XML and that permits access with or without the use of SQL.

One reasonable hypothesis for this data model lasting 50 years, [9] with new database implementations of the model appearing even in the 21st century, is that it provides inexpensive database solutions. Historically, with industry benchmarks tied to SQL transactions, this has been a difficult hypothesis to test, although there are many anecdotes of failed attempts to move the functionality of a MultiValue application into a relational database framework.

In spite of a history of more than 40 years of implementations, starting with TRW, many in the MultiValue industry have remained current: various MultiValue implementations now employ object-oriented versions of Data/BASIC and support AJAX frameworks. Because no one needs to use SQL with these databases (although some can), they fit under the NoSQL umbrella. In fact, MultiValue developers were the first to acquire nosql domain names, likely before other database products classified their offerings as NoSQL. MultiValue is a seasoned data model that has been continuously enhanced over the years, with several vendors competing in the MultiValue space.

Data model example

In a MultiValue database system:

Data is stored using two separate files: a "file" to store raw data and a "dictionary" to store the format for displaying the raw data.

For example, assume there's a file (table) called "PERSON". In this file, there is an attribute called "eMailAddress". The eMailAddress field can store a variable number of email address values in a single record. The list [joe@example.com, jdb@example.net, joe_bacde@example.org] can be stored and accessed via a single query when accessing the associated record.
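The file and dictionary pairing described above can be sketched in Python. This is only an illustration under stated assumptions, not how any particular MultiValue product stores data: the record ID "1001", the first and last name values, the dictionary layout, and the helper read_attribute are all hypothetical.

```python
# Hypothetical in-memory model of a MultiValue file and its dictionary.
# Every name below (PERSON, DICT_PERSON, read_attribute) is illustrative only.

# The data file: record ID -> list of attributes.
# Each attribute is itself a list, so it can hold one value or many.
PERSON = {
    "1001": [
        ["Joe"],    # attribute 1: first name (single value)
        ["Bacde"],  # attribute 2: last name (single value)
        ["joe@example.com", "jdb@example.net", "joe_bacde@example.org"],  # attribute 3
    ],
}

# The dictionary: attribute name -> position in the record and a display heading.
DICT_PERSON = {
    "FirstName": {"position": 1, "heading": "First Name"},
    "LastName": {"position": 2, "heading": "Last Name"},
    "eMailAddress": {"position": 3, "heading": "E-mail"},
}


def read_attribute(data_file, dictionary, record_id, name):
    """Return every value of the named attribute for one record."""
    position = dictionary[name]["position"]  # positions are 1-based
    return data_file[record_id][position - 1]


# A single read of the record yields all of its e-mail addresses.
emails = read_attribute(PERSON, DICT_PERSON, "1001", "eMailAddress")
print(emails)
```

The point of the sketch is that the list of addresses lives inside the record itself, so no second lookup is needed to retrieve it.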

Achieving the same one-to-many relationship in a traditional relational database system would require an additional table to store the variable number of email addresses associated with a single "PERSON" record. However, modern relational database systems support this multi-value data model too. For example, in PostgreSQL, a column can be an array of any base type.
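For comparison, the traditional relational layout can be sketched with Python's built-in sqlite3 module. SQLite stands in here only because it runs without a server; the table and column names (person, person_email) are assumptions made for the illustration.

```python
import sqlite3

# In-memory database, used purely for illustration.
conn = sqlite3.connect(":memory:")

# The one-to-many relationship needs a second table keyed by person.
conn.executescript(
    """
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE person_email (
        person_id INTEGER NOT NULL REFERENCES person(id),
        email     TEXT NOT NULL
    );
    """
)

conn.execute("INSERT INTO person (id, name) VALUES (1, 'Joe')")
conn.executemany(
    "INSERT INTO person_email (person_id, email) VALUES (?, ?)",
    [
        (1, "joe@example.com"),
        (1, "jdb@example.net"),
        (1, "joe_bacde@example.org"),
    ],
)

# Retrieving the addresses requires a join, and the person's name
# is repeated on every result row.
rows = conn.execute(
    "SELECT p.name, e.email "
    "FROM person AS p JOIN person_email AS e ON e.person_id = p.id "
    "WHERE p.id = ? ORDER BY e.rowid",
    (1,),
).fetchall()

for name, email in rows:
    print(name, email)
```

In the MultiValue layout the same three addresses sit in one attribute of one record, so neither the second table nor the join is needed.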

MultiValue DataBASIC

As with the Java programming language, a typical Data/BASIC compiler produces P-code, or bytecode, which then runs in a P-machine, with jBASE being a notable exception.[citation needed] There are as many different implementations (compilers) as there are MultiValue databases.

Like the PHP programming language, the Data/BASIC language does all the typecasting for the programmer.

MultiValue Query Language

Known as ENGLISH, ACCESS, AQL, UniQuery, Retrieve, CMQL, and by many other names over the years, corresponding to the different MultiValue implementations, the MultiValue query language differs from SQL in several respects. Each query is issued against a single dictionary within the schema, which could be understood as a virtual file or a portal to the database through which to view the data.

LIST PEOPLE LAST_NAME FIRST_NAME EMAIL_ADDRESSES WITH LAST_NAME LIKE "Van..."

The above statement would list all e-mail addresses for each person whose last name starts with "Van". A single entry would be output for each person, with multiple lines showing the multiple e-mail addresses (without repeating other data about the person).
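The output shape described above can be approximated in Python. This is a rough emulation only, not the behavior of any specific MultiValue query processor: the sample records, the column width, and the function list_people are hypothetical.

```python
# Hypothetical sample data; EMAIL_ADDRESSES is the multivalued attribute.
PEOPLE = [
    {"LAST_NAME": "Van Dyke", "FIRST_NAME": "Ann",
     "EMAIL_ADDRESSES": ["ann@example.com", "avd@example.net"]},
    {"LAST_NAME": "Smith", "FIRST_NAME": "Bo",
     "EMAIL_ADDRESSES": ["bo@example.com"]},
    {"LAST_NAME": "Van Horn", "FIRST_NAME": "Cy",
     "EMAIL_ADDRESSES": ["cy@example.org"]},
]

COLUMN_WIDTH = 12  # arbitrary width chosen for this sketch


def list_people(records, prefix):
    """Emulate: LIST PEOPLE LAST_NAME FIRST_NAME EMAIL_ADDRESSES WITH LAST_NAME LIKE "prefix..." """
    lines = []
    for record in records:
        if not record["LAST_NAME"].startswith(prefix):
            continue
        # An empty attribute still produces one line for the person.
        emails = record["EMAIL_ADDRESSES"] or [""]
        for index, email in enumerate(emails):
            # Names appear only on the first line of each entry;
            # continuation lines carry just the next e-mail address.
            last = record["LAST_NAME"] if index == 0 else ""
            first = record["FIRST_NAME"] if index == 0 else ""
            line = f"{last:<{COLUMN_WIDTH}}{first:<{COLUMN_WIDTH}}{email}"
            lines.append(line.rstrip())
    return lines


lines = list_people(PEOPLE, "Van")
for line in lines:
    print(line)
```

Each matching person yields one entry, and the second address for the first person appears on its own line with the name columns left blank.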

References

  1. "ONgroup". www.ongroup.com.
  2. Nelson, Don (1965). "General Information Retrieval Language and System (GIRLS)" (PDF).
  3. "Microdata Alumni". www.microdata-alumni.org.
  4. Sisk, Jonathan (1987). PICK BASIC: A Programmer's Guide. Tab Books.
  5. "Home". www.northgate-is.com.
  6. "MultiValue Symbol".
  7. Wolthuis, Dawn (2002). "MultiValue Family Tree" (PDF).
  8. "Post-Relational Database Reference".
  9. Nelson, Don (1964). "Generalized Information Retrieval Language and System (GIRLS)" (PDF).