Data hierarchy

Last updated

Data hierarchy refers to the systematic organization of data, often in a hierarchical form. Data organization involves characters, fields, records, files and so on. [1] [2] This concept is a starting point when trying to see what makes up data and whether data has a structure. For example, how does a person make sense of data such as 'employee', 'name', 'department', 'Marcy Smith', 'Sales Department' and so on, assuming that they are all related? One way to understand them is to see these terms as smaller or larger components in a hierarchy. One might say that Marcy Smith is one of the employees in the Sales Department, or an example of an employee in that Department. The data we want to capture about all our employees, and not just Marcy, is the name, ID number, address etc.

Contents

Purpose of the data hierarchy

"Data hierarchy" is a basic concept in data and database theory and helps to show the relationships between smaller and larger components in a database or data file. It is used to give a better sense of understanding about the components of data and how they are related.

It is particularly important in databases with referential integrity, third normal form, or perfect key. "Data hierarchy" is the result of proper arrangement of data without redundancy. Avoiding redundancy eventually leads to proper "data hierarchy" representing the relationship between data, and revealing its relational structure.

Components of the data hierarchy

The components of the data hierarchy are listed below.

A data field holds a single fact or attribute of an entity. Consider a date field, e.g. "19 September 2004". This can be treated as a single date field (e.g. birthdate), or three fields, namely, day of month, month and year.

A record is a collection of related fields. An Employee record may contain a name field(s), address fields, birthdate field and so on.

A file is a collection of related records. If there are 100 employees, then each employee would have a record (e.g. called Employee Personal Details record) and the collection of 100 such records would constitute a file (in this case, called Employee Personal Details file).

Files are integrated into a database. [3] This is done using a Database Management System. If there are other facets of employee data that we wish to capture, then other files such as Employee Training History file and Employee Work History file could be created as well.

Illustration of the data hierarchy

An illustration of the above description is shown in this diagram below:

Data Hierarchy diagram showing Employee database example by JeffTan.gif

The following terms are for better clarity. With reference to the example in the above diagram:

Data field label = Employee Name or EMP_NAME

Data field value = Jeffrey Tan

The above description is a view of data as understood by a user e.g. a person working in Human Resource Department.

The above structure can be seen in the hierarchical model, which is one way to organize data in a database. [2]

In terms of data storage, data fields are made of bytes and these in turn are made up of bits.

See also

Related Research Articles

<span class="mw-page-title-main">Database</span> Organized collection of data in computing

In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spans formal techniques and practical considerations, including data modeling, efficient data representation and storage, query languages, security and privacy of sensitive data, and distributed computing issues, including supporting concurrent access and fault tolerance.

Database normalization or database normalisation is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by British computer scientist Edgar F. Codd as part of his relational model.

<span class="mw-page-title-main">Data model</span> Model that organizes elements of data and how they relate to one another and to real-world entities.

A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.

First normal form (1NF) is a property of a relation in a relational database. A relation is in first normal form if and only if no attribute domain has relations as elements. Or more informally, that no table column can have tables as values. Database normalization is the process of representing a database in terms of relations in standard normal forms, where first normal is a minimal requirement. SQL-92 does not support creating or using table-valued columns, which means that using only the "traditional relational database features" most relational databases will be in first normal form by necessity. Database systems which do not require first normal form are often called no sql systems. Newer SQL standards like SQL:1999 have started to allow so called non-atomic types, which include composite types. Even newer versions like SQL:2016 allow json.

A hierarchical database model is a data model in which the data are organized into a tree-like structure. The data are stored as records which are connected to one another through links. A record is a collection of fields, with each field containing only one value. The type of a record defines which fields the record contains.

Adabas, a contraction of “adaptable database system," is a database package that was developed by Software AG to run on IBM mainframes. It was launched in 1971 as a non-relational database. As of 2019, Adabas is marketed for use on a wider range of platforms, including Linux, Unix, and Windows.

In computer science, a record is a basic data structure. Records in a database or spreadsheet are usually called "rows".

WinFS was the code name for a canceled data storage and management system project based on relational databases, developed by Microsoft and first demonstrated in 2003 as an advanced storage subsystem for the Microsoft Windows operating system, designed for persistence and management of structured, semi-structured and unstructured data.

<span class="mw-page-title-main">Flat-file database</span> Database stored as an ordinary unstructured file

A flat-file database is a database stored in a file called a flat file. Records follow a uniform format, and there are no structures for indexing or recognizing relationships between records. The file is simple. A flat file can be a plain text file, or a binary file. Relationships can be inferred from the data in the database, but the database format itself does not make those relationships explicit.

A navigational database is a type of database in which records or objects are found primarily by following references from other objects. The term was popularized by the title of Charles Bachman's 1973 Turing Award paper, The Programmer as Navigator. This paper emphasized the fact that the new disk-based database systems allowed the programmer to choose arbitrary navigational routes following relationships from record to record, contrasting this with the constraints of earlier magnetic-tape and punched card systems where data access was strictly sequential.

A data dictionary, or metadata repository, as defined in the IBM Dictionary of Computing, is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format". Oracle defines it as a collection of tables with metadata. The term can have one of several closely related meanings pertaining to databases and database management systems (DBMS):

<span class="mw-page-title-main">Entity–relationship model</span> Model or diagram describing interrelated things

An entity–relationship model describes interrelated things of interest in a specific domain of knowledge. A basic ER model is composed of entity types and specifies relationships that can exist between entities.

Database design is the organization of data according to a database model. The designer determines what data must be stored and how the data elements interrelate. With this information, they can begin to fit the data to the database model. Database management system manages the data accordingly.

In a relational database, a column is a set of data values of a particular type, one value for each row of the database. A column may contain text values, numbers, or even pointers to files in the operating system. Columns typically contain simple types, though some relational database systems allow columns to contain more complex data types, such as whole documents, images, or even video clips. A column can also be called an attribute.

A table is a collection of related data held in a table format within a database. It consists of columns and rows.

A database model is a type of data model that determines the logical structure of a database. It fundamentally determines in which manner data can be stored, organized and manipulated. The most popular example of a database model is the relational model, which uses a table-based format.

Apache Empire-db is a Java library that provides a high level object-oriented API for accessing relational database management systems (RDBMS) through JDBC. Apache Empire-db is open source and provided under the Apache License 2.0 from the Apache Software Foundation.

The following is provided as an overview of and topical guide to databases:

Rocket U2 is a suite of database management (DBMS) and supporting software now owned by Rocket Software. It includes two MultiValue database platforms: UniData and UniVerse. Both of these products are operating environments which run on current Unix, Linux and Windows operating systems. They are both derivatives of the Pick operating system. The family also includes developer and web-enabling technologies including SB/XA, U2 Web Development Environment (WebDE), UniObjects connectivity API and wIntegrate terminal emulation software.

<span class="mw-page-title-main">ERIL</span> Visual language for representing the data structure of a computer system

ERIL is a visual language for representing the data structure of a computer system. As its name suggests, ERIL is based on entity-relationship diagrams and class diagrams. ERIL combines the relational and object-oriented approaches to data modeling.

References

  1. Blaauw, Gerrit Anne; Brooks, Jr., Frederick Phillips; Buchholz, Werner (1962), "4: Natural Data Units" (PDF), in Buchholz, Werner (ed.), Planning a Computer System – Project Stretch, McGraw-Hill Book Company, Inc. / The Maple Press Company, York, PA., pp. 39–40, LCCN   61-10466, archived from the original (PDF) on 2017-04-03, retrieved 2017-04-03
  2. 1 2 Laudon, Kenneth C.; Laudon, Jane P. (2007). Management Information Systems - Managing the Digital Firm (9 ed.). Upper Saddle River, USA: Pearson Prentice Hall. pp. 226, 229. ISBN   978-0-13-157984-2.
  3. Marston, Tony. "The Relational Data Model - Normalisation and Effective Database Design". Archived from the original on 2012-01-17. Retrieved 2013-08-20.