An information model in software engineering is a representation of concepts and the relationships, constraints, rules, and operations to specify data semantics for a chosen domain of discourse. Typically it specifies relations between kinds of things, but may also include relations with individual things. It can provide sharable, stable, and organized structure of information requirements or knowledge for the domain context.
The term information model in general is used for models of individual things, such as facilities, buildings, process plants, etc. In those cases the concept is specialised to facility information model, building information model, plant information model, etc. Such an information model is an integration of a model of the facility with the data and documents about the facility.
Within the field of software engineering and data modeling an information model is usually an abstract, formal representation of entity types that may include their properties, relationships and the operations that can be performed on them. The entity types in the model may be kinds of real-world objects, such as devices in a network, or occurrences, or they may themselves be abstract, such as for the entities used in a billing system. Typically, they are used to model a constrained domain that can be described by a closed set of entity types, properties, relationships and operations.
An information model provides formalism to the description of a problem domain without constraining how that description is mapped to an actual implementation in software. There may be many mappings of the information model. Such mappings are called data models, irrespective of whether they are object models (e.g. using UML), entity relationship models or XML schemas.
In 1976, an entity-relationship (ER) graphic notation was introduced by Peter Chen. He stressed that it was a "semantic" modelling technique and independent of any database modelling techniques such as Hierarchical, CODASYL, Relational etc.Since then, languages for information models have continued to evolve. Some examples are the Integrated Definition Language 1 Extended (IDEF1X), the EXPRESS language and the Unified Modeling Language (UML).
Research by contemporaries of Peter Chen such as J.R.Abrial (1974) and G.M Nijssen (1976) led to today's Fact Oriented Modeling (FOM) languages which are based on linguistic propositions rather than on "entities". FOM tools can be used to generate an ER model which means that the modeler can avoid the time-consuming and error prone practice of manual normalization. Object-Role Modeling language (ORM) and Fully Communication Oriented Information Modeling (FCO-IM) are both research results, based upon earlier research.
In the 1980s there were several approaches to extend Chen’s Entity Relationship Model. Also important in this decade is REMORA by Colette Rolland.
The ICAM Definition (IDEF) Language was developed from the U.S. Air Force ICAM Program during the 1976 to 1982 timeframe.The objective of the ICAM Program, according to Lee (1999), was to increase manufacturing productivity through the systematic application of computer technology. IDEF includes three different modeling methods: IDEF0, IDEF1, and IDEF2 for producing a functional model, an information model, and a dynamic model respectively. IDEF1X is an extended version of IDEF1. The language is in the public domain. It is a graphical representation and is designed using the ER approach and the relational theory. It is used to represent the “real world” in terms of entities, attributes, and relationships between entities. Normalization is enforced by KEY Structures and KEY Migration. The language identifies property groupings (Aggregation) to form complete entity definitions.
EXPRESS was created as ISO 10303-11 for formally specifying information requirements of product data model. It is part of a suite of standards informally known as the STandard for the Exchange of Product model data (STEP). It was first introduced in the early 1990s.The language, according to Lee (1999), is a textual representation. In addition, a graphical subset of EXPRESS called EXPRESS-G is available. EXPRESS is based on programming languages and the O-O paradigm. A number of languages have contributed to EXPRESS. In particular, Ada, Algol, C, C++, Euler, Modula-2, Pascal, PL/1, and SQL. EXPRESS consists of language elements that allow an unambiguous object definition and specification of constraints on the objects defined. It uses SCHEMA declaration to provide partitioning and it supports specification of data properties, constraints, and operations.
UML is a modeling language for specifying, visualizing, constructing, and documenting the artifacts, rather than processes, of software systems. It was conceived originally by Grady Booch, James Rumbaugh, and Ivar Jacobson. UML was approved by the Object Management Group (OMG) as a standard in 1997. The language, according to Lee (1999), is non-proprietary and is available to the public. It is a graphical representation. The language is based on the objected-oriented paradigm. UML contains notations and rules and is designed to represent data requirements in terms of O-O diagrams. UML organizes a model in a number of views that present different aspects of a system. The contents of a view are described in diagrams that are graphs with model elements. A diagram contains model elements that represent common O-O concepts such as classes, objects, messages, and relationships among these concepts.
IDEF1X, EXPRESS, and UML all can be used to create a conceptual model and, according to Lee (1999), each has its own characteristics. Although some may lead to a natural usage (e.g., implementation), one is not necessarily better than another. In practice, it may require more than one language to develop all information models when an application is complex. In fact, the modeling practice is often more important than the language chosen.
Information models can also be expressed in formalized natural languages, such as Gellish. Gellish, which has natural language variants Gellish Formal English, Gellish Formal Dutch (Gellish Formeel Nederlands), etc. is an information representation language or modeling language that is defined in the Gellish smart Dictionary-Taxonomy, which has the form of a Taxonomy/Ontology. A Gellish Database is not only suitable to store information models, but also knowledge models, requirements models and dictionaries, taxonomies and ontologies. Information models in Gellish English use Gellish Formal English expressions. For example, a geographic information model might consist of a number of Gellish Formal English expressions, such as:
- the Eiffel tower <is located in> Paris - Paris <is classified as a> city
whereas information requirements and knowledge can be expressed for example as follows:
- tower <shall be located in a> geographical area - city <is a kind of> geographical area
Such Gellish expressions use names of concepts (such as 'city') and relation types (such as <is located in> and <is classified as a>) that should be selected from the Gellish Formal English Dictionary-Taxonomy (or of your own domain dictionary). The Gellish English Dictionary-Taxonomy enables the creation of semantically rich information models, because the dictionary contains definitions of more than 40000 concepts, including more than 600 standard relation types. Thus, an information model in Gellish consists of a collection of Gellish expressions that use those phrases and dictionary concepts to express facts or make statements, queries and answers.
The Distributed Management Task Force (DMTF) provides a standard set of information models for various enterprise domains under the general title of the Common Information Model (CIM). Specific information models are derived from CIM for particular management domains.
The TeleManagement Forum (TMF) has defined an advanced model for the Telecommunication domain (the Shared Information/Data model, or SID) as another. This includes views from the business, service and resource domains within the Telecommunication industry. The TMF has established a set of principles that an OSS integration should adopt, along with a set of models that provide standardized approaches.
The models interact with the information model (the Shared Information/Data Model, or SID), via a process model (the Business Process Framework (eTOM), or eTOM) and a life cycle model.
In computer science and information science, an ontology encompasses a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities that substantiate one, many or all domains of discourse.
A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.
The Meta-Object Facility (MOF) is an Object Management Group (OMG) standard for model-driven engineering. Its purpose is to provide a type system for entities in the CORBA architecture and a set of interfaces through which those types can be created and manipulated. The official reference page may be found at OMG's website.
A modeling language is any artificial language that can be used to express information or knowledge or systems in a structure that is defined by a consistent set of rules. The rules are used for interpretation of the meaning of components in the structure.
IDEF, initially an abbreviation of ICAM Definition and renamed in 1999 as Integration Definition, is a family of modeling languages in the field of systems and software engineering. They cover a wide range of uses from functional modeling to data, simulation, object-oriented analysis and design, and knowledge acquisition. These definition languages were developed under funding from U.S. Air Force and, although still most commonly used by them and other military and United States Department of Defense (DoD) agencies, are in the public domain.
Data modeling in software engineering is the process of creating a data model for an information system by applying certain formal techniques.
Object-role modeling (ORM) is used to model the semantics of a universe of discourse. ORM is often used for data modeling and software engineering.
IDEF0, a compound acronym, is a function modeling methodology for describing manufacturing functions, which offers a functional modeling language for the analysis, development, reengineering, and integration of information systems; business processes; or software engineering analysis.
Integration DEFinition for information modeling (IDEF1X) is a data modeling language for the development of semantic data models. IDEF1X is used to produce a graphical information model which represents the structure and semantics of information within an environment or system.
The ISO 15926 is a standard for data integration, sharing, exchange, and hand-over between computer systems.
Gellish is an ontology language for data storage and communication, designed and developed by Andries van Renssen since mid-1990s. It started out as an engineering modeling language but evolved into a universal and extendable conceptual data modeling language with general applications. Because it includes domain-specific terminology and definitions, it is also a semantic data modelling language and the Gellish modeling methodology is a member of the family of semantic modeling methodologies.
Enterprise modelling is the abstract representation, description and definition of the structure, processes, information and resources of an identifiable business, government body, or other large organization.
EXPRESS is a standard data modeling language for product data. EXPRESS is formalized in the ISO Standard for the Exchange of Product model STEP, and standardized as ISO 10303-11.
Data exchange is the process of taking data structured under a source schema and transforming it into data structured under a target schema, so that the target data is an accurate representation of the source data. Data exchange allows data to be shared between different computer programs.
Ontology-based data integration involves the use of ontology(s) to effectively combine data or information from multiple heterogeneous sources. It is one of the multiple data integration approaches and may be classified as Global-As-View (GAV). The effectiveness of ontology based data integration is closely tied to the consistency and expressivity of the ontology used in the integration process.
The Gellish English Dictionary-Taxonomy is an example of an open-source “smart” electronic dictionary, in which concepts are arranged in a subtype-supertype hierarchy, thus forming a taxonomy. The dictionary-taxonomy is machine readable. It is compliant with the guidelines of ISO 16354. Apart from the fact that it is an English (business-technical) dictionary, it also defines the semantics of Gellish English, which is a computer-interpretable structured subset of the natural English language for data storage and data exchange. The dictionary-taxonomy differs from conventional dictionaries because of several additional capabilities. Therefore it is called "smart." This means that it satisfies the following criteria:
The Semantics of Business Vocabulary and Business Rules (SBVR) is an adopted standard of the Object Management Group (OMG) intended to be the basis for formal and detailed natural language declarative description of a complex entity, such as a business. SBVR is intended to formalize complex compliance rules, such as operational rules for an enterprise, security policy, standard compliance, or regulatory compliance rules. Such formal vocabularies and rules can be interpreted and used by computer systems. SBVR is an integral part of the OMG's model-driven architecture (MDA).
Semantic data model(SDM) is a high-level semantics-based database description and structuring formalism for databases. This database model is designed to capture more of the meaning of an application environment than is possible with contemporary database models. An SDM specification describes a database in terms of the kinds of entities that exist in the application environment, the classifications and groupings of those entities, and the structural interconnections among them. SDM provides a collection of high-level modeling primitives to capture the semantics of an application environment. By accommodating derived information in a database structural specification, SDM allows the same information to be viewed in several ways; this makes it possible to directly accommodate the variety of needs and processing requirements typically present in database applications. The design of the present SDM is based on our experience in using a preliminary version of it. SDM is designed to enhance the effectiveness and usability of database systems. An SDM database description can serve as a formal specification and documentation tool for a database; it can provide a basis for supporting a variety of powerful user interface facilities, it can serve as a conceptual database model in the database design process; and, it can be used as the database model for a new kind of database management system.
IDEF5 is a software engineering method to develop and maintain usable, accurate domain ontologies. This standard is part of the IDEF family of modeling languages in the field of software engineering.
Ontology engineering in computer science, information science and systems engineering is a field which studies the methods and methodologies for building ontologies: formal representations of a set of concepts within a domain and the relationships between those concepts. A large-scale representation of abstract concepts such as actions, time, physical objects and beliefs would be an example of ontological engineering. Ontology engineering is one of the areas of applied ontology, and can be seen as an application of philosophical ontology. Core ideas and objectives of ontology engineering are also central in conceptual modeling.