Hierarchical database model

Last updated

A hierarchical database model is a data model in which the data is organized into a tree-like structure. The data are stored as records which is a collection of one or more fields. Each field contains a single value, and the collection of fields in a record defines its type. One type of field is the link, which connects a given record to associated records. Using links, records link to other records, and to other records, forming a tree. An example is a "customer" record that has links to that customer's "orders", which in turn link to "line_items".

Contents

The hierarchical database model mandates that each child record has only one parent, whereas each parent record can have zero or more child records. The network model extends the hierarchical by allowing multiple parents and children. In order to retrieve data from these databases, the whole tree needs to be traversed starting from the root node. Both models were well suited to data that was normally stored on tape drives, which had to move the tape from end to end in order to retrieve data.

When the relational database model emerged, one criticism of hierarchical database models was their close dependence on application-specific implementation. This limitation, along with the relational model's ease of use, contributed to the popularity of relational databases, despite their initially lower performance in comparison with the existing network and hierarchical models. [1]

History

The hierarchical structure was developed by IBM in the 1960s and used in early mainframe DBMS. Records' relationships form a treelike model. This structure is simple but inflexible because the relationship is confined to a one-to-many relationship. The IBM Information Management System (IMS) and RDM Mobile are examples of a hierarchical database system with multiple hierarchies over the same data.

The hierarchical data model lost traction as Codd's relational model became the de facto standard used by virtually all mainstream database management systems. A relational-database implementation of a hierarchical model was first discussed in published form in 1992 [2] (see also nested set model). Hierarchical data organization schemes resurfaced with the advent of XML in the late 1990s [3] (see also XML database). The hierarchical structure is used primarily today for storing geographic information and file systems.[ citation needed ]

Currently, hierarchical databases are still widely used especially in applications that require very high performance and availability such as banking, health care, and telecommunications. One of the most widely used commercial hierarchical databases is IMS. [4] Another example of the use of hierarchical databases is Windows Registry in the Microsoft Windows operating systems. [5]

Examples of hierarchical data represented as relational tables

An organization could store employee information in a table that contains attributes/columns such as employee number, first name, last name, and department number. The organization provides each employee with computer hardware as needed, but computer equipment may only be used by the employee to which it is assigned. The organization could store the computer hardware information in a separate table that includes each part's serial number, type, and the employee that uses it. The tables might look like this:

employee table
EmpNoFirst NameLast NameDept. Num
100AlmukhtarKhan10-L
101GauravSoni10-L
102SiddharthaSoni20-B
103SiddhantSoni20-B
computer table
Serial NumTypeUser EmpNo
3009734-4Computer100
3-23-283742Monitor100
2-22-723423Monitor100
232342Printer100

In this model, the employee data table represents the "parent" part of the hierarchy, while the computer table represents the "child" part of the hierarchy. In contrast to tree structures usually found in computer software algorithms, in this model the children point to the parents. As shown, each employee may possess several pieces of computer equipment, but each individual piece of computer equipment may have only one employee owner.

Consider the following structure:

EmpNoDesignationReportsTo
10Director
20Senior Manager10
30Typist20
40Programmer20

In this, the "child" is the same type as the "parent". The hierarchy stating EmpNo 10 is boss of 20, and 30 and 40 each report to 20 is represented by the "ReportsTo" column. In Relational database terms, the ReportsTo column is a foreign key referencing the EmpNo column. If the "child" data type were different, it would be in a different table, but there would still be a foreign key referencing the EmpNo column of the employees table.

This simple model is commonly known as the adjacency list model and was introduced by Dr. Edgar F. Codd after initial criticisms surfaced that the relational model could not model hierarchical data.[ citation needed ] However, the model is only a special case of a general adjacency list for a graph.

See also

Related Research Articles

<span class="mw-page-title-main">Database</span> Organized collection of data in computing

In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data. The DBMS additionally encompasses the core facilities provided to administer the database. The sum total of the database, the DBMS and the associated applications can be referred to as a database system. Often the term "database" is also used loosely to refer to any of the DBMS, the database system or an application associated with the database.

Database normalization is the process of structuring a relational database accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by British computer scientist Edgar F. Codd as part of his relational model.

A relational database (RDB) is a database based on the relational model of data, as proposed by E. F. Codd in 1970.

The relational model (RM) is an approach to managing data using a structure and language consistent with first-order predicate logic, first described in 1969 by English computer scientist Edgar F. Codd, where all data are represented in terms of tuples, grouped into relations. A database organized in terms of the relational model is a relational database.

Structured Query Language (SQL) is a domain-specific language used to manage data, especially in a relational database management system (RDBMS). It is particularly useful in handling structured data, i.e., data incorporating relations among entities and variables.

Business System 12, or simply BS12, was one of the first fully relational database management systems, designed and implemented by IBM's Bureau Service subsidiary at the company's international development centre in Uithoorn, Netherlands. Programming started in 1978 and the first version was delivered in 1982. It was never widely used and essentially disappeared soon after the division was shut down in 1985, possibly because IBM and other companies settled on SQL as the standard.

<span class="mw-page-title-main">IBM Db2</span> Relational model database server

Db2 is a family of data management products, including database servers, developed by IBM. It initially supported the relational model, but was extended to support object–relational features and non-relational structures like JSON and XML. The brand name was originally styled as DB2 until 2017, when it changed to its present form.

<span class="mw-page-title-main">Network model</span> Database model invented by Charles Bachman

In computing, the network model is a database model conceived as a flexible way of representing objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in which object types are nodes and relationship types are arcs, is not restricted to being a hierarchy or lattice.

First normal form (1NF) is a property of a relation in a relational database. A relation is in first normal form if and only if no attribute domain has relations as elements. Or more informally, that no table column can have tables as values. Database normalization is the process of representing a database in terms of relations in standard normal forms, where first normal is a minimal requirement. SQL-92 does not support creating or using table-valued columns, which means that using only the "traditional relational database features" most relational databases will be in first normal form by necessity. Database systems which do not require first normal form are often called NoSQL systems. Newer SQL standards like SQL:1999 have started to allow so called non-atomic types, which include composite types. Even newer versions like SQL:2016 allow JSON.

<span class="mw-page-title-main">IBM Information Management System</span> Joint hierarchical database made by IBM

The IBM Information Management System (IMS) is a joint hierarchical database and information management system that supports transaction processing. Development began in 1966 to keep track of the bill of materials for the Saturn V rocket of the Apollo program, and the first version on the IBM System/360 Model 65 was completed in 1967 as ICS/DL/I and officially installed in August 1968.

Codd's twelve rules are a set of thirteen rules proposed by Edgar F. Codd, a pioneer of the relational model for databases, designed to define what is required from a database management system in order for it to be considered relational, i.e., a relational database management system (RDBMS). They are sometimes referred to as "Codd's Twelve Commandments".

In a database, a table is a collection of related data organized in table format; consisting of columns and rows.

<span class="mw-page-title-main">Null (SQL)</span> Marker used in SQL databases to indicate a value does not exist

In SQL, null or NULL is a special marker used to indicate that a data value does not exist in the database. Introduced by the creator of the relational database model, E. F. Codd, SQL null serves to fulfill the requirement that all true relational database management systems (RDBMS) support a representation of "missing information and inapplicable information". Codd also introduced the use of the lowercase Greek omega (ω) symbol to represent null in database theory. In SQL, NULL is a reserved word used to identify this marker.

An entity–attribute–value model (EAV) is a data model optimized for the space-efficient storage of sparse—or ad-hoc—property or data values, intended for situations where runtime usage patterns are arbitrary, subject to user variation, or otherwise unforeseeable using a fixed design. The use-case targets applications which offer a large or rich system of defined property types, which are in turn appropriate to a wide set of entities, but where typically only a small, specific selection of these are instantiated for a given entity. Therefore, this type of data model relates to the mathematical notion of a sparse matrix. EAV is also known as object–attribute–value model, vertical database model, and open schema.

<span class="mw-page-title-main">Database model</span> Type of data model

A database model is a type of data model that determines the logical structure of a database. It fundamentally determines in which manner data can be stored, organized and manipulated. The most popular example of a database model is the relational model, which uses a table-based format.

The nested set model is a technique for representing nested set collections in relational databases.

A document-oriented database, or document store, is a computer program and data storage system designed for storing, retrieving and managing document-oriented information, also known as semi-structured data.

A hierarchical query is a type of SQL query that handles hierarchical model data. They are special cases of more general recursive fixpoint queries, which compute transitive closures.

A graph database (GDB) is a database that uses graph structures for semantic queries with nodes, edges, and properties to represent and store data. A key concept of the system is the graph. The graph relates the data items in the store to a collection of nodes and edges, the edges representing the relationships between the nodes. The relationships allow data in the store to be linked together directly and, in many cases, retrieved with one operation. Graph databases hold the relationships between data as a priority. Querying relationships is fast because they are perpetually stored in the database. Relationships can be intuitively visualized using graph databases, making them useful for heavily inter-connected data.

In database normalization, unnormalized form (UNF or 0NF), also known as an unnormalized relation or non-first normal form (N1NF or NF2), is a database data model (organization of data in a database) which does not meet any of the conditions of database normalization defined by the relational model. Database systems which support unnormalized data are sometimes called non-relational or NoSQL databases. In the relational model, unnormalized relations can be considered the starting point for a process of normalization.

References

  1. Silberschatz, Abraham; Korth, Henry F.; Sudarshan, S. Database System Concepts. 4th ed., McGraw-Hill, 2004, p. 11, 21.
  2. Michael J. Kamfonas/Recursive Hierarchies: The Relational Taboo! Archived 2008-11-08 at the Wayback Machine --The Relation Journal, October/November 1992
  3. "Web Application Development". IBM .
  4. IBM Information Management System
  5. "Structure of the Registry - Win32 apps".