Three-schema approach

Last updated
The notion of a three-schema model was first introduced in 1975 by the ANSI/X3/SPARC three level architecture, which determined three levels to model data. 4-2 ANSI-SPARC three level architecture.svg
The notion of a three-schema model was first introduced in 1975 by the ANSI/X3/SPARC three level architecture, which determined three levels to model data.

The three-schema approach, or three-schema concept, in software engineering is an approach to building information systems and systems information management that originated in the 1970s. It proposes three different views in systems development, with conceptual modelling being considered the key to achieving data integration. [2]

Software engineering is the application of engineering to the development of software in a systematic method.

Information management (IM) concerns a cycle of organizational activity: the acquisition of information from one or more sources, the custodianship and the distribution of that information to those who need it, and its ultimate disposition through archiving or deletion.

View model

A view model or viewpoints framework in systems engineering, software engineering, and enterprise engineering is a framework which defines a coherent set of views to be used in the construction of a system architecture, software architecture, or enterprise architecture. A view is a representation of a whole system from the perspective of a related set of concerns.

Contents

Overview

The three-schema approach provides for three types of schemas with schema techniques based on formal language descriptions: [3]

A 'conceptual schema' is a high-level description of a business's informational needs. It typically includes only the main concepts and the main relationships among them. Typically this is a first-cut model, with insufficient detail to build an actual database. This level describes the structure of the whole database for a group of users. The conceptual model is also known as the data model that can be used to describe the conceptual schema when a database system is implemented. It hides the internal details of physical storage and targets on describing entities, datatype, relationships and constraints.

At the center, the conceptual schema defines the ontology of the concepts as the users think of them and talk about them. The physical schema according to Sowa (2004) "describes the internal formats of the data stored in the database, and the external schema defines the view of the data presented to the application programs." [4] The framework attempted to permit multiple data models to be used for external schemata. [5]

In computer science and information science, an ontology encompasses a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities that substantiate one, many or all domains of discourse.

User (computing) person who uses a computer or network service

A user is a person who utilizes a computer or network service. Users of computer systems and software products generally lack the technical expertise required to fully understand how they work. Power users use advanced features of programs, though they are not necessarily capable of computer programming and system administration.

Data facts represented for handling

Data is a set of values of subjects with respect to qualitative or quantitative variables.

Over the years, the skill and interest in building information systems has grown tremendously. However, for the most part, the traditional approach to building systems has only focused on defining data from two distinct views, the "user view" and the "computer view". From the user view, which will be referred to as the “external schema,” the definition of data is in the context of reports and screens designed to aid individuals in doing their specific jobs. The required structure of data from a usage view changes with the business environment and the individual preferences of the user. From the computer view, which will be referred to as the "internal schema", data is defined in terms of file structures for storage and retrieval. The required structure of data for computer storage depends upon the specific computer technology employed and the need for efficient processing of data. [6]

Figure 1: Traditional View of Data A2 2 Traditional View of Data.svg
Figure 1: Traditional View of Data
Figure 2: Three schema approach A2 3 Three schema approach.svg
Figure 2: Three schema approach

These two traditional views of data have been defined by analysts over the years on an application by application basis as specific business needs were addressed, see Figure 1. Typically, the internal schema defined for an initial application cannot be readily used for subsequent applications, resulting in the creation of redundant and often inconsistent definition of the same data. Data was defined by the layout of physical records and processed sequentially in early information systems. The need for flexibility, however, led to the introduction of Database Management Systems (DBMSs), which allow for random access of logically connected pieces of data. The logical data structures within a DBMS are typically defined as either hierarchies, networks or relations. Although DBMSs have greatly improved the shareability of data, the use of a DBMS alone does not guarantee a consistent definition of data. Furthermore, most large companies have had to develop multiple databases which are often under the control of different DBMSs and still have the problems of redundancy and inconsistency. [6]

The recognition of this problem led the ANSI/X3/SPARC Study Group on Database Management Systems to conclude that in an ideal data management environment a third view of data is needed. This view, referred to as a "conceptual schema" is a single integrated definition of the data within an enterprise which is unbiased toward any single application of data and is independent of how the data is physically stored or accessed, see Figure 2. The primary objective of this conceptual schema is to provide a consistent definition of the meanings and interrelationship of data which can be used to integrate, share, and manage the integrity of data. [6]

History

Image of the six layers in the Zachman Framework. ZF What column Data example.svg
Image of the six layers in the Zachman Framework.

The notion of a three-schema model consisting of a conceptual model, an external model, and an internal or physical model was first introduced by the ANSI/X3/SPARC Standards Planning and Requirements Committee directed by Charles Bachman in 1975. The ANSI/X3/SPARC Report characterized DBMSs as having a two-schema organization. That is, DBMSs utilize an internal schema, which represents the structure of the data as viewed by the DBMS, and an external schema, which represents various structures of the data as viewed by the end user. The concept of a third schema (conceptual) was introduced in the report. The conceptual schema represents the basic underlying structure of data as viewed by the enterprise as a whole. [2]

The ANSI/SPARC report was intended as a basis for interoperable computer systems. All database vendors adopted the three-schema terminology, but they implemented it in incompatible ways. Over the next twenty years, various groups attempted to define standards for the conceptual schema and its mappings to databases and programming languages. Unfortunately, none of the vendors had a strong incentive to make their formats compatible with their competitors'. A few reports were produced, but no standards. [4]

As the practice of Data Administration has evolved and more graphical techniques have evolved, the term "schema" has given way to the term "model". The conceptual model represents the view of data that is negotiated between end users and database administrators covering those entities about which it is important to keep data, the meaning of the data, and the relationships of the data to each other. [2]

One further development is the IDEF1X information modeling methodology, which is based on the three-schema concept[ citation needed ]. Another is the Zachman Framework, proposed by John Zachman in 1987 and developed ever since in the field of Enterprise Architecture. In this framework, the three-schema model has evolved into a layer of six perspectives. In other Enterprise Architecture frameworks some kind of view model is incorporated.

See also

Related Research Articles

Database organized collection of data

A database is an organized collection of data, generally stored and accessed electronically from a computer system. Where databases are more complex they are often developed using formal design and modeling techniques.

SQL is a domain-specific language used in programming and designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS). It is particularly useful in handling structured data, i.e. data incorporating relations among entities and variables.

Object-relational database database management system

An object-relational database (ORD), or object-relational database management system (ORDBMS), is a database management system (DBMS) similar to a relational database, but with an object-oriented database model: objects, classes and inheritance are directly supported in database schemas and in the query language. In addition, just as with pure relational systems, it supports extension of the data model with custom data types and methods.

Data model abstract model for organizing data; abstract model that organizes elements of data and standardizes how they relate to one another and to properties of the real world entities

A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.

Data modeling (in software engineering) process of creating a data model for an information system by applying certain formal techniques

Data modeling in software engineering is the process of creating a data model for an information system by applying certain formal techniques.

Zachman Framework

The Zachman Framework is an enterprise ontology and is a fundamental structure for Enterprise Architecture which provides a formal and structured way of viewing and defining an enterprise. The ontology is a two dimensional classification schema that reflects the intersection between two historical classifications. The first are primitive interrogatives: What, How, When, Who, Where, and Why. The second is derived from the philosophical concept of reification, the transformation of an abstract idea into an instantiation. The Zachman Framework reification transformations are: Identification, Definition, Representation, Specification, Configuration and Instantiation.

A logical data model or logical schema is a data model of a specific problem domain expressed independently of a particular database management product or storage technology but in terms of data structures such as relational tables and columns, object-oriented classes, or XML tags. This is as opposed to a conceptual data model, which describes the semantics of an organization without reference to technology.

Physical schema

A physical data model is a representation of a data design as implemented, or intended to be implemented, in a database management system. In the lifecycle of a project it typically derives from a logical data model, though it may be reverse-engineered from a given database implementation. A complete physical data model will include all the database artifacts required to create relationships between tables or to achieve performance goals, such as indexes, constraint definitions, linking tables, partitioned tables or clusters. Analysts can usually use a physical data model to calculate storage estimates; it may include specific storage allocation details for a given database system.

A federated database system is a type of meta-database management system (DBMS), which transparently maps multiple autonomous database systems into a single federated database. The constituent databases are interconnected via a computer network and may be geographically decentralized. Since the constituent database systems remain autonomous, a federated database system is a contrastable alternative to the task of merging several disparate databases. A federated database, or virtual database, is a composite of all constituent databases in a federated database system. There is no actual data integration in the constituent disparate databases as a result of data federation.

Data independence is the type of data transparency that matters for a centralized DBMS. It refers to the immunity of user applications to changes made in the definition and organization of data. Application programs should not, ideally, be exposed to details of data representation and storage. The DBMS provides an abstract view of the data that hides such details.

The object-relational impedance mismatch is a set of conceptual and technical difficulties that are often encountered when a relational database management system (RDBMS) is being served by an application program written in an object-oriented programming language or style, particularly because objects or class definitions must be mapped to database tables defined by a relational schema.

Uniface (programming language) Fourth-generation programming language

Uniface is a development and deployment platform for enterprise applications that can run in a large range of runtime environments, including mobile, mainframe, web, Service-oriented architecture (SOA), Windows, Java EE and .NET. Uniface is a model-driven, Rapid Application Development (RAD) environment used to create mission-critical applications.

IDEF1X

Integration DEFinition for information modeling (IDEF1X) is a data modeling language for the development of semantic data models. IDEF1X is used to produce a graphical information model which represents the structure and semantics of information within an environment or system.

Data integration involves combining data residing in different sources and providing users with a unified view of them. This process becomes significant in a variety of situations, which include both commercial and scientific domains. Data integration appears with increasing frequency as the volume and the need to share existing data explodes. It has become the focus of extensive theoretical work, and numerous open problems remain unsolved. Data integration encourages collaboration between internal as well as external users.

Enterprise modelling

Enterprise modelling is the abstract representation, description and definition of the structure, processes, information and resources of an identifiable business, government body, or other large organization.

ANSI-SPARC Architecture

The ANSI-SPARC Architecture, where ANSI-SPARC stands for American National Standards Institute, Standards Planning And Requirements Committee, is an abstract design standard for a Database Management System (DBMS), first proposed in 1975.

Semantic data model Database model

Semantic data model(SDM) is a high-level semantics-based database description and structuring formalism for databases. This database model is designed to capture more of the meaning of an application environment than is possible with contemporary database models. An SDM specification describes a database in terms of the kinds of entities that exist in the application environment, the classifications and groupings of those entities, and the structural interconnections among them. SDM provides a collection of high-level modeling primitives to capture the semantics of an application environment. By accommodating derived information in a database structural specification, SDM allows the same information to be viewed in several ways; this makes it possible to directly accommodate the variety of needs and processing requirements typically present in database applications. The design of the present SDM is based on our experience in using a preliminary version of it. SDM is designed to enhance the effectiveness and usability of database systems. An SDM database description can serve as a formal specification and documentation tool for a database; it can provide a basis for supporting a variety of powerful user interface facilities, it can serve as a conceptual database model in the database design process; and, it can be used as the database model for a new kind of database management system.[5]

NIST Enterprise Architecture Model Reference model of enterprise architecture

NIST Enterprise Architecture Model is a late-1980s reference model for enterprise architecture. It defines an enterprise architecture by the interrelationship between an enterprise's business, information, and technology environments.

References

PD-icon.svg This article incorporates  public domain material from the National Institute of Standards and Technology website https://www.nist.gov .

  1. Matthew West and Julian Fowler (1999). High Quality Data Models. The European Process Industries STEP Technical Liaison Executive (EPISTLE).
  2. 1 2 3 "Strap Section 2 Approach" . Retrieved 30 September 2008.
  3. Loomis, Mary E.S. (1987). Data Base Book. Macmillan. p. 26. ISBN   9780023717604.
  4. 1 2 Sowa, John F. (2004). Ramadas, J.; Chunawala, S. (eds.). The Challenge of Knowledge Soup. Research Trends in Science, Technology and Mathematics Education. Homi Bhabha Centre, Mumbai.
  5. Ariav, Gad; Clifford, James (1986). New Directions for Database Systems: Revised Versions of the Papers. New York University Graduate School of Business Administration. Center for Research on Information Systems.
  6. 1 2 3 4 5 "Integration Definition for Information Modeling (IDEFIX)". 21 Dec 1993. Archived from the original on 3 December 2013.