Information model

An IDEF1X Diagram, an example of an Integration Definition for Information Modeling.

An information model in software engineering is a representation of concepts and the relationships, constraints, rules, and operations to specify data semantics for a chosen domain of discourse. Typically it specifies relations between kinds of things, but it may also include relations with individual things. It can provide a sharable, stable, and organized structure of information requirements or knowledge for the domain context. [1]

Overview

The term information model in general is used for models of individual things, such as facilities, buildings, process plants, etc. In those cases, the concept is specialised to facility information model, building information model, plant information model, etc. Such an information model is an integration of a model of the facility with the data and documents about the facility.

Within the field of software engineering and data modeling, an information model is usually an abstract, formal representation of entity types that may include their properties, relationships and the operations that can be performed on them. The entity types in the model may be kinds of real-world objects, such as devices in a network, or occurrences, or they may themselves be abstract, such as the entities used in a billing system. Typically, information models are used to model a constrained domain that can be described by a closed set of entity types, properties, relationships and operations.

An information model provides formalism to the description of a problem domain without constraining how that description is mapped to an actual implementation in software. There may be many mappings of the information model. Such mappings are called data models, irrespective of whether they are object models (e.g. using UML), entity relationship models or XML schemas.
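The distinction between an information model and its mappings can be sketched in Python. This is only an illustration under stated assumptions: the class names, the two entity types (`Network`, `Device`), the implicit `id` column and the type tables are all hypothetical and do not come from any standard. The point is that one implementation-neutral description is mapped to two different data models, a relational one and a JSON-Schema-like one.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Attribute:
    name: str
    kind: str  # abstract kind, e.g. "text" or "integer" (hypothetical vocabulary)


@dataclass(frozen=True)
class EntityType:
    name: str
    attributes: Tuple[Attribute, ...]


@dataclass(frozen=True)
class Relationship:
    name: str
    source: str  # entity type that refers to another
    target: str  # entity type being referred to


@dataclass(frozen=True)
class InformationModel:
    entity_types: Tuple[EntityType, ...]
    relationships: Tuple[Relationship, ...] = ()


# Each mapping decides for itself how abstract kinds are represented.
SQL_TYPES = {"text": "TEXT", "integer": "INTEGER"}
JSON_TYPES = {"text": "string", "integer": "integer"}


def to_sql_ddl(model: InformationModel) -> List[str]:
    """One possible mapping: a table per entity type, with a surrogate id column."""
    statements = []
    for entity in model.entity_types:
        columns = ["id INTEGER PRIMARY KEY"]  # assumption made by this mapping only
        columns += [f"{a.name} {SQL_TYPES[a.kind]}" for a in entity.attributes]
        columns += [
            f"{r.name}_id INTEGER REFERENCES {r.target}(id)"
            for r in model.relationships
            if r.source == entity.name
        ]
        statements.append(f"CREATE TABLE {entity.name} ({', '.join(columns)});")
    return statements


def to_json_schema(model: InformationModel) -> Dict[str, dict]:
    """Another possible mapping: an object schema per entity type."""
    schemas = {}
    for entity in model.entity_types:
        properties = {a.name: {"type": JSON_TYPES[a.kind]} for a in entity.attributes}
        for r in model.relationships:
            if r.source == entity.name:
                properties[r.name] = {"$ref": f"#/definitions/{r.target}"}
        schemas[entity.name] = {"type": "object", "properties": properties}
    return schemas


network_model = InformationModel(
    entity_types=(
        EntityType("Network", (Attribute("name", "text"),)),
        EntityType("Device", (Attribute("hostname", "text"), Attribute("port_count", "integer"))),
    ),
    relationships=(Relationship("network", source="Device", target="Network"),),
)

for statement in to_sql_ddl(network_model):
    print(statement)
print(to_json_schema(network_model)["Device"])
```

Neither mapping changes `network_model` itself; decisions such as surrogate keys or reference syntax belong to the data model, not to the information model.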

Information modeling languages

A sample ER diagram.
Database requirements for a CD collection in EXPRESS-G notation.

In 1976, an entity-relationship (ER) graphic notation was introduced by Peter Chen. He stressed that it was a "semantic" modelling technique and independent of any database modelling techniques such as Hierarchical, CODASYL, Relational etc. [2] Since then, languages for information models have continued to evolve. Some examples are the Integrated Definition Language 1 Extended (IDEF1X), the EXPRESS language and the Unified Modeling Language (UML). [1]

Research by contemporaries of Peter Chen such as J. R. Abrial (1974) and G. M. Nijssen (1976) led to today's Fact Oriented Modeling (FOM) languages, which are based on linguistic propositions rather than on "entities". FOM tools can be used to generate an ER model, which means that the modeler can avoid the time-consuming and error-prone practice of manual normalization. Object-Role Modeling language (ORM) and Fully Communication Oriented Information Modeling (FCO-IM) are both research results, based upon earlier research.

In the 1980s there were several approaches to extend Chen’s Entity Relationship Model. Also important in this decade is REMORA by Colette Rolland. [3]

The ICAM Definition (IDEF) Language was developed from the U.S. Air Force ICAM Program during the 1976 to 1982 timeframe. [4] The objective of the ICAM Program, according to Lee (1999), was to increase manufacturing productivity through the systematic application of computer technology. IDEF includes three different modeling methods: IDEF0, IDEF1, and IDEF2 for producing a functional model, an information model, and a dynamic model respectively. IDEF1X is an extended version of IDEF1. The language is in the public domain. It is a graphical representation and is designed using the ER approach and the relational theory. It is used to represent the “real world” in terms of entities, attributes, and relationships between entities. Normalization is enforced by KEY Structures and KEY Migration. The language identifies property groupings (Aggregation) to form complete entity definitions. [1]
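Key migration can be illustrated with a small Python sketch. It is a simplification written for this article, not IDEF1X tooling: the `Entity` class, the `migrate_key` function and the `Customer`, `Order` and `OrderLine` entities are hypothetical. It assumes the usual reading that the parent's primary key is copied into the child as a foreign key, landing in the child's primary key for an identifying relationship and among its non-key attributes otherwise.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    name: str
    primary_key: List[str]
    non_key: List[str] = field(default_factory=list)


def migrate_key(parent: Entity, child: Entity, identifying: bool) -> List[str]:
    """Copy the parent's primary key into the child and return the migrated names."""
    foreign_key = list(parent.primary_key)
    # Identifying relationship: the key becomes part of the child's identity.
    # Non-identifying relationship: the key is an ordinary (non-key) attribute.
    target = child.primary_key if identifying else child.non_key
    for attribute in foreign_key:
        if attribute not in target:
            target.append(attribute)
    return foreign_key


customer = Entity("Customer", primary_key=["customer_id"])
order = Entity("Order", primary_key=["order_id"])
order_line = Entity("OrderLine", primary_key=["line_no"])

migrate_key(customer, order, identifying=False)   # an order refers to its customer
migrate_key(order, order_line, identifying=True)  # a line cannot exist without its order

print(order)
print(order_line)
```

In this sketch `Order` gains `customer_id` as a non-key attribute, while `OrderLine` ends up with the composite key `line_no` plus `order_id`.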

EXPRESS was created as ISO 10303-11 for formally specifying the information requirements of product data models. It is part of a suite of standards informally known as the STandard for the Exchange of Product model data (STEP). It was first introduced in the early 1990s. [5] [6] The language, according to Lee (1999), is a textual representation. In addition, a graphical subset of EXPRESS called EXPRESS-G is available. EXPRESS is based on programming languages and the O-O paradigm. A number of languages have contributed to EXPRESS, in particular Ada, Algol, C, C++, Euler, Modula-2, Pascal, PL/1, and SQL. EXPRESS consists of language elements that allow an unambiguous object definition and specification of constraints on the objects defined. It uses the SCHEMA declaration to provide partitioning, and it supports specification of data properties, constraints, and operations. [1]

UML is a modeling language for specifying, visualizing, constructing, and documenting the artifacts, rather than processes, of software systems. It was conceived originally by Grady Booch, James Rumbaugh, and Ivar Jacobson. UML was approved by the Object Management Group (OMG) as a standard in 1997. The language, according to Lee (1999), is non-proprietary and is available to the public. It is a graphical representation. The language is based on the object-oriented paradigm. UML contains notations and rules and is designed to represent data requirements in terms of O-O diagrams. UML organizes a model in a number of views that present different aspects of a system. The contents of a view are described in diagrams that are graphs with model elements. A diagram contains model elements that represent common O-O concepts such as classes, objects, messages, and relationships among these concepts. [1]

IDEF1X, EXPRESS, and UML can all be used to create a conceptual model and, according to Lee (1999), each has its own characteristics. Although some may lead to a natural usage (e.g., implementation), one is not necessarily better than another. In practice, more than one language may be required to develop all information models when an application is complex. In fact, the modeling practice is often more important than the language chosen. [1]

Information models can also be expressed in formalized natural languages, such as Gellish. Gellish, which has natural-language variants such as Gellish Formal English and Gellish Formal Dutch (Gellish Formeel Nederlands), is an information representation language or modeling language that is defined in the Gellish smart Dictionary-Taxonomy, which has the form of a taxonomy/ontology. A Gellish database is suitable for storing not only information models but also knowledge models, requirements models, dictionaries, taxonomies and ontologies. Information models in Gellish English use Gellish Formal English expressions. For example, a geographic information model might consist of a number of Gellish Formal English expressions, such as:

- the Eiffel tower <is located in> Paris
- Paris <is classified as a> city

whereas information requirements and knowledge can be expressed for example as follows:

- tower <shall be located in a> geographical area
- city <is a kind of> geographical area

Such Gellish expressions use names of concepts (such as 'city') and relation types (such as 'is located in' and 'is classified as a') that should be selected from the Gellish Formal English Dictionary-Taxonomy (or from a domain-specific dictionary). The Gellish English Dictionary-Taxonomy enables the creation of semantically rich information models, because the dictionary contains definitions of more than 40,000 concepts, including more than 600 standard relation types. Thus, an information model in Gellish consists of a collection of Gellish expressions that use those phrases and dictionary concepts to express facts or make statements, queries and answers.
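The idea that an expression combines two concept names with a relation type taken from a dictionary can be sketched in Python. This is a toy reading of the angle-bracket notation used in the examples above, not the actual Gellish expression format or any Gellish tool. The `Expression` type, the `parse_expression` function and the four-entry `RELATION_TYPES` set are assumptions made for illustration; the real dictionary is far larger.

```python
import re
from typing import List, NamedTuple, Set


class Expression(NamedTuple):
    left: str
    relation: str
    right: str


# Matches "left <relation type> right" as written in the examples above.
EXPRESSION_PATTERN = re.compile(
    r"^\s*(?P<left>[^<>]+?)\s*<(?P<relation>[^<>]+)>\s*(?P<right>[^<>]+?)\s*$"
)

# Hypothetical stand-in for the relation types of a dictionary-taxonomy.
RELATION_TYPES: Set[str] = {
    "is located in",
    "is classified as a",
    "shall be located in a",
    "is a kind of",
}


def parse_expression(text: str, relation_types: Set[str] = RELATION_TYPES) -> Expression:
    """Split one expression into (left, relation, right) and check the relation type."""
    match = EXPRESSION_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not of the form 'left <relation> right': {text!r}")
    relation = match.group("relation").strip()
    if relation not in relation_types:
        raise ValueError(f"unknown relation type: {relation!r}")
    return Expression(match.group("left"), relation, match.group("right"))


def select(facts: List[Expression], relation: str) -> List[Expression]:
    """Return the expressions that use the given relation type."""
    return [fact for fact in facts if fact.relation == relation]


facts = [
    parse_expression("the Eiffel tower <is located in> Paris"),
    parse_expression("Paris <is classified as a> city"),
    parse_expression("city <is a kind of> geographical area"),
]

print(select(facts, "is located in"))
```

Rejecting relation types that are not in the dictionary is what keeps such a model interpretable by software: every expression reuses a phrase whose meaning is defined once.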

Standard sets of information models

The Distributed Management Task Force (DMTF) provides a standard set of information models for various enterprise domains under the general title of the Common Information Model (CIM). Specific information models are derived from CIM for particular management domains.

The TeleManagement Forum (TMF) has defined another such model, an advanced model for the telecommunication domain called the Shared Information/Data model (SID). It includes views from the business, service and resource domains within the telecommunication industry. The TMF has established a set of principles that an OSS integration should adopt, along with a set of models that provide standardized approaches.

The models interact with the information model (the Shared Information/Data Model, or SID) via a process model (the Business Process Framework, or eTOM) and a life cycle model.

Notes

  1. Y. Tina Lee (1999). "Information modeling from design to implementation". National Institute of Standards and Technology.
  2. Peter Chen (1976). "The Entity-Relationship Model - Towards a Unified View of Data". ACM Transactions on Database Systems, Vol. 1, No. 1, March 1976.
  3. "The history of conceptual modeling", at uni-klu.ac.at. Archived 2012-02-15 at the Wayback Machine.
  4. D. Appleton Company, Inc. (1985). "Integrated Information Support System: Information Modeling Manual, IDEF1 - Extended (IDEF1X)". ICAM Project Priority 6201, Subcontract #013-078846, USAF Prime Contract #F33615-80-C-5155, Wright-Patterson Air Force Base, Ohio, December 1985.
  5. ISO 10303-11:1994(E), Industrial Automation Systems and Integration - Product Data Representation and Exchange - Part 11: The EXPRESS Language Reference Manual.
  6. D. Schenck and P. Wilson (1994). Information Modeling the EXPRESS Way. Oxford University Press, New York, NY.
