Ontology engineering

Example of a constructed MBED Top Level Ontology based on the nominal set of views.

In computer science, information science and systems engineering, ontology engineering is a field that studies the methods and methodologies for building ontologies: formal representations that name and define the categories, properties and relations between the concepts, data and entities of a given domain of interest. In a broader sense, the field also includes constructing knowledge about the domain using formal ontology representations such as OWL/RDF. A large-scale representation of abstract concepts such as actions, time, physical objects and beliefs would be an example of ontological engineering. [2] Ontology engineering is one of the areas of applied ontology, and can be seen as an application of philosophical ontology. Core ideas and objectives of ontology engineering are also central to conceptual modeling.


Ontology engineering is a set of tasks related to the development of ontologies for a particular domain. It aims to make explicit the knowledge contained within software applications, and within enterprises and business procedures, for that domain. It also offers a direction towards solving the interoperability problems brought about by semantic obstacles, i.e. obstacles related to the definitions of business terms and software classes.

Automated processing of information that software agents cannot otherwise interpret can be improved by adding rich semantics to the corresponding resources, such as video files. One approach to the formal conceptualization of represented knowledge domains is the use of machine-interpretable ontologies, which provide structured data in, or based on, RDF, RDFS, and OWL. Ontology engineering is the design and creation of such ontologies, which can contain more than just a list of terms (a controlled vocabulary): they contain terminological, assertional, and relational axioms (TBox, ABox, and RBox, respectively) that define concepts (classes), individuals, and roles (properties). [3]

Ontology engineering is a relatively new field of study concerning the ontology development process, the ontology life cycle, the methods and methodologies for building ontologies, [4] [5] and the tool suites and languages that support them. A common way to provide the logical underpinning of ontologies is to formalize the axioms with description logics, which can then be translated to any serialization of RDF, such as RDF/XML or Turtle. Beyond the description logic axioms, ontologies might also contain Semantic Web Rule Language (SWRL) rules.

The concept definitions can be mapped to any kind of resource or resource segment in RDF, such as images, videos, and regions of interest, to annotate objects, persons, etc., and to interlink them with related resources across knowledge bases, ontologies, and Linked Open Data (LOD) datasets. This information, based on human experience and knowledge, is valuable to reasoners for the automated interpretation of sophisticated and ambiguous content, such as the visual content of multimedia resources. [6] Application areas of ontology-based reasoning include, but are not limited to, information retrieval, automated scene interpretation, and knowledge discovery.
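The division into terminological (TBox), relational (RBox) and assertional (ABox) axioms, and their serialization as Turtle, can be illustrated with a minimal sketch. It uses only the Python standard library and writes the triples out by hand; the `ex:` namespace, the class, property and individual names, and the helper `to_turtle` are all assumptions made for this illustration, not part of any cited ontology or tool.

```python
# Standard W3C namespaces plus one hypothetical example namespace ("ex").
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "ex": "http://example.org/video#",  # assumption: illustrative only
}

# TBox: terminological axioms that declare and relate concepts (classes).
TBOX = [
    ("ex:Video", "rdf:type", "owl:Class"),
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:Actor", "rdf:type", "owl:Class"),
    ("ex:Actor", "rdfs:subClassOf", "ex:Person"),
]

# RBox: relational axioms that declare roles (properties) and their hierarchy.
RBOX = [
    ("ex:depicts", "rdf:type", "owl:ObjectProperty"),
    ("ex:features", "rdf:type", "owl:ObjectProperty"),
    ("ex:features", "rdfs:subPropertyOf", "ex:depicts"),
]

# ABox: assertional axioms, i.e. facts about individuals.
ABOX = [
    ("ex:clip42", "rdf:type", "ex:Video"),
    ("ex:alice", "rdf:type", "ex:Actor"),
    ("ex:clip42", "ex:features", "ex:alice"),
]


def to_turtle(triples, prefixes):
    """Serialize prefixed-name triples as Turtle, one statement per line."""
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in prefixes.items()]
    lines.append("")  # blank line between the prefix block and the statements
    lines.extend(f"{s} {p} {o} ." for s, p, o in triples)
    return "\n".join(lines)


turtle = to_turtle(TBOX + RBOX + ABOX, PREFIXES)
print(turtle)
```

In practice an RDF library or an ontology editor would normally produce the serialization; the hand-written form is used here only to keep the three kinds of axioms visible side by side.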

Ontology languages

An ontology language is a formal language used to encode an ontology. A number of such languages exist, both proprietary and standards-based.

Ontology engineering in life sciences

The life sciences are flourishing with ontologies that biologists use to make sense of their experiments. [7] To support correct conclusions from experiments, an ontology has to be structured optimally against the knowledge base it represents. Its structure needs to be revised continuously so that it remains an accurate representation of the underlying domain.

An automated method has been introduced for engineering ontologies in the life sciences such as Gene Ontology (GO), [8] one of the most successful and widely used biomedical ontologies. [9] Based on information theory, it restructures ontologies so that the levels represent the desired specificity of the concepts. Similar information-theoretic approaches have also been used for optimal partitioning of Gene Ontology. [10] Given the mathematical nature of such engineering algorithms, these optimizations can be automated to produce a principled and scalable architecture for restructuring ontologies such as GO.
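One common information-theoretic way to quantify the specificity of a concept is its information content, the negative logarithm of how often the concept (or any of its descendants) is used in annotations. The sketch below illustrates that general idea only; it is not the algorithm of the cited work, and the toy hierarchy, the annotation counts and the function names are invented for the example.

```python
import math

# Hypothetical toy hierarchy (term -> list of parents); not real GO data.
PARENTS = {
    "biological_process": [],
    "metabolic_process": ["biological_process"],
    "cellular_process": ["biological_process"],
    "lipid_metabolism": ["metabolic_process"],
}

# Hypothetical number of annotations made directly to each term.
DIRECT_ANNOTATIONS = {
    "biological_process": 0,
    "metabolic_process": 10,
    "cellular_process": 50,
    "lipid_metabolism": 4,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """Map each annotated term to log2(total annotations / term annotations)."""
    # An annotation to a term also counts towards every ancestor of that term.
    totals = dict.fromkeys(PARENTS, 0)
    for term, count in DIRECT_ANNOTATIONS.items():
        for t in ancestors(term):
            totals[t] += count
    root_total = max(totals.values())
    return {t: math.log2(root_total / n) for t, n in totals.items() if n > 0}


ic = information_content()
for term, bits in sorted(ic.items(), key=lambda item: item[1]):
    print(f"{term}: {bits:.2f} bits")
```

Under these assumed counts the root carries no information and the rarely used leaf term carries the most, which is the sense in which a restructuring could aim for levels of comparable specificity.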

Open Biomedical Ontologies (OBO), a 2006 initiative of the U.S. National Center for Biomedical Ontology, provides a common 'foundry' for various ontology initiatives.

Related Research Articles

Knowledge representation and reasoning is the field of artificial intelligence (AI) dedicated to representing information about the world in a form that a computer system can use to solve complex tasks such as diagnosing a medical condition or having a dialog in a natural language. Knowledge representation incorporates findings from psychology about how humans solve problems and represent knowledge in order to design formalisms that will make complex systems easier to design and build. Knowledge representation and reasoning also incorporates findings from logic to automate various kinds of reasoning, such as the application of rules or the relations of sets and subsets.

The Semantic Web, sometimes known as Web 3.0, is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-readable.

In information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains of discourse. More simply, an ontology is a way of showing the properties of a subject area and how they are related, by defining a set of concepts and categories that represent the subject.

The DARPA Agent Markup Language (DAML) was the name of a US funding program at the US Defense Advanced Research Projects Agency (DARPA) started in 1999 by then-Program Manager James Hendler, and later run by Murray Burke, Mark Greaves and Michael Pagels. The program focused on the creation of machine-readable representations for the Web.

The Resource Description Framework (RDF) is a World Wide Web Consortium (W3C) standard originally designed as a data model for metadata. It has come to be used as a general method for description and exchange of graph data. RDF provides a variety of syntax notations and data serialization formats, with Turtle currently being the most widely used notation.

OIL (Ontology Inference Layer or Ontology Interchange Language) can be regarded as an ontology infrastructure for the Semantic Web. OIL is based on concepts developed in description logic (DL) and frame-based systems and is compatible with RDFS.

Description logics (DL) are a family of formal knowledge representation languages. Many DLs are more expressive than propositional logic but less expressive than first-order logic. In contrast to the latter, the core reasoning problems for DLs are (usually) decidable, and efficient decision procedures have been designed and implemented for these problems. There are general, spatial, temporal, spatiotemporal, and fuzzy description logics, and each description logic features a different balance between expressive power and reasoning complexity by supporting different sets of mathematical constructors.

The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies. Ontologies are a formal way to describe taxonomies and classification networks, essentially defining the structure of knowledge for various domains: the nouns representing classes of objects and the verbs representing relations between the objects.

In computer science and artificial intelligence, ontology languages are formal languages used to construct ontologies. They allow the encoding of knowledge about specific domains and often include reasoning rules that support the processing of that knowledge. Ontology languages are usually declarative languages, are almost always generalizations of frame languages, and are commonly based on either first-order logic or on description logic.

Semantic integration is the process of interrelating information from diverse sources, for example calendars and to do lists, email archives, presence information, documents of all sorts, contacts, search results, and advertising and marketing relevance derived from them. In this regard, semantics focuses on the organization of and action upon information by acting as an intermediary between heterogeneous data sources, which may conflict not only by structure but also context or value.

The ultimate goal of semantic technology is to help machines understand data. Well-known technologies for encoding semantics together with the data are RDF and OWL, which formally represent the meaning involved in information. For example, an ontology can describe concepts, relationships between things, and categories of things. Semantics embedded with the data offer significant advantages, such as reasoning over data and dealing with heterogeneous data sources.

Gellish is an ontology language for data storage and communication, designed and developed by Andries van Renssen since the mid-1990s. It started out as an engineering modeling language but evolved into a universal and extendable conceptual data modeling language with general applications. Because it includes domain-specific terminology and definitions, it is also a semantic data modeling language, and the Gellish modeling methodology is a member of the family of semantic modeling methodologies.

Frames are an artificial intelligence data structure used to divide knowledge into substructures by representing "stereotyped situations". They were proposed by Marvin Minsky in his 1974 article "A Framework for Representing Knowledge". Frames are the primary data structure used in artificial intelligence frame languages; they are stored as ontologies of sets.

The concept of the Social Semantic Web subsumes developments in which social interactions on the Web lead to the creation of explicit and semantically rich knowledge representations. The Social Semantic Web can be seen as a Web of collective knowledge systems, which are able to provide useful information based on human contributions and which get better as more people participate. The Social Semantic Web combines technologies, strategies and methodologies from the Semantic Web, social software and the Web 2.0.

A semantic reasoner, reasoning engine, rules engine, or simply a reasoner, is a piece of software able to infer logical consequences from a set of asserted facts or axioms. The notion of a semantic reasoner generalizes that of an inference engine, by providing a richer set of mechanisms to work with. The inference rules are commonly specified by means of an ontology language, and often a description logic language. Many reasoners use first-order predicate logic to perform reasoning; inference commonly proceeds by forward chaining and backward chaining. There are also examples of probabilistic reasoners, including non-axiomatic reasoning systems, and probabilistic logic networks.
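Forward chaining, mentioned above, can be sketched in a few lines: rules are applied to the known facts repeatedly until nothing new can be derived. The example below is a minimal illustration, not the design of any particular reasoner. It implements only two RDFS-style rules (transitivity of `subClassOf`, and propagation of `type` along `subClassOf`), and the function name and the sample facts are assumptions.

```python
def forward_chain(facts):
    """Return the facts plus everything derivable by two subclass rules.

    Rule 1: (A subClassOf B) and (B subClassOf C) give (A subClassOf C).
    Rule 2: (x type A) and (A subClassOf B) give (x type B).
    """
    inferred = set(facts)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # Join a triple whose object is a class with that class's
                # superclass statement.
                if p2 == "subClassOf" and s2 == o and p in ("subClassOf", "type"):
                    new.add((s, p, o2))
        if new <= inferred:  # fixed point: no rule produced a new triple
            return inferred
        inferred |= new


# Hypothetical asserted facts.
asserted = {
    ("Actor", "subClassOf", "Person"),
    ("Person", "subClassOf", "Agent"),
    ("alice", "type", "Actor"),
}

closure = forward_chain(asserted)
for triple in sorted(closure - asserted):
    print("inferred:", triple)
```

The loop terminates because the set of possible triples over a finite vocabulary is finite. A backward-chaining reasoner would instead start from a query, such as whether `alice` is an `Agent`, and search for facts that support it.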

Contemporary ontologies share many structural similarities, regardless of the ontology language in which they are expressed. Most ontologies describe individuals (instances), classes (concepts), attributes, and relations.

Knowledge extraction is the creation of knowledge from structured and unstructured sources. The resulting knowledge needs to be in a machine-readable and machine-interpretable format and must represent knowledge in a manner that facilitates inferencing. Although it is methodically similar to information extraction (NLP) and ETL, the main criterion is that the extraction result goes beyond the creation of structured information or the transformation into a relational schema. It requires either the reuse of existing formal knowledge or the generation of a schema based on the source data.

In information technology, a reasoning system is a software system that generates conclusions from available knowledge using logical techniques such as deduction and induction. Reasoning systems play an important role in the implementation of artificial intelligence and knowledge-based systems.

The Open Semantic Framework (OSF) is an integrated software stack using semantic technologies for knowledge management. It has a layered architecture that combines existing open source software with additional open source components developed specifically to provide a complete Web application framework. OSF is made available under the Apache 2 license.

In the Semantic Web and in knowledge representation, a metaclass is a class whose instances can themselves be classes. Similar to their role in programming languages, metaclasses in Semantic Web languages can have properties otherwise applicable only to individuals, while retaining the ability of a class to be classified in a concept hierarchy. This enables knowledge about instances of those metaclasses to be inferred by semantic reasoners using statements made in the metaclass. Metaclasses thus enhance the expressivity of knowledge representations in a way that can be intuitive for users. While classes are suitable for representing a population of individuals, metaclasses can, as one of their features, be used to represent the conceptual dimension of an ontology. Metaclasses are supported in the ontology language OWL and the data-modeling vocabulary RDFS.
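The programming-language analogy can be made concrete with Python, where a metaclass is a class whose instances are classes. The sketch below is an analogy only: the semantics of metaclasses in OWL and RDFS differ from Python's object model, and the names `Species`, `Eagle` and `conservation_status` are hypothetical.

```python
class Species(type):
    """Metaclass: every instance of Species is itself a class."""

    def describe(cls):
        # Defined on the metaclass, so it applies to the class treated as an
        # individual, not to the members of that class.
        return f"{cls.__name__} ({cls.conservation_status})"


class Animal:
    """Ordinary class used as a superclass in the concept hierarchy."""


class Eagle(Animal, metaclass=Species):
    # A statement about the species as a whole.
    conservation_status = "least concern"


harry = Eagle()  # an individual eagle

print(isinstance(Eagle, Species))  # Eagle is an instance of the metaclass
print(issubclass(Eagle, Animal))   # Eagle is still classified under Animal
print(isinstance(harry, Species))  # the individual is not a species
print(Eagle.describe())
```

`Eagle` plays two roles here: it is an instance of `Species` and still a subclass of `Animal`, while the individual `harry` is an `Eagle` but not a `Species`.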

References

This article incorporates public domain material from the National Institute of Standards and Technology.

  1. Peter Shames, Joseph Skipper. "Toward a Framework for Modeling Space Systems Architectures". Archived 2009-02-27 at the Wayback Machine. NASA, JPL.
  2. http://ontology.buffalo.edu/bfo/BeyondConcepts.pdf
  3. Sikos, L. F. (14 March 2016). "A Novel Approach to Multimedia Ontology Engineering for Automated Reasoning over Audiovisual LOD Datasets". Lecture Notes in Artificial Intelligence. Vol. 9621. Springer. pp. 1–13. arXiv:1608.08072. doi:10.1007/978-3-662-49381-6_1.
  4. Asunción Gómez-Pérez, Mariano Fernández-López, Oscar Corcho (2004). Ontological Engineering: With Examples from the Areas of Knowledge Management, E-commerce and the Semantic Web. Springer.
  5. De Nicola, A; Missikoff, M; Navigli, R (2009). "A software engineering approach to ontology building" (PDF). Information Systems. 34 (2): 258. CiteSeerX 10.1.1.149.7258. doi:10.1016/j.is.2008.07.002.
  6. Zarka, M; Ammar, AB; Alimi, AM (2015). "Fuzzy reasoning framework to improve semantic video interpretation". Multimedia Tools and Applications. 75 (10): 5719–5750. doi:10.1007/s11042-015-2537-1. S2CID 16505884.
  7. Malone, J; Holloway, E; Adamusiak, T; Kapushesky, M; Zheng, J; Kolesnikov, N; Zhukova, A; Brazma, A; Parkinson, H (2010). "Modeling sample variables with an Experimental Factor Ontology". Bioinformatics. 26 (8): 1112–1118. doi:10.1093/bioinformatics/btq099. PMC 2853691. PMID 20200009.
  8. Alterovitz, G; Xiang, M; Hill, DP; Lomax, J; Liu, J; Cherkassky, M; Dreyfuss, J; Mungall, C; et al. (2010). "Ontology engineering". Nature Biotechnology. 28 (2): 128–30. doi:10.1038/nbt0210-128. PMC 4829499. PMID 20139945.
  9. Botstein, David; Cherry, J. Michael; Ashburner, Michael; Ball, Catherine A.; Blake, Judith A.; Butler, Heather; Davis, Allan P.; Dolinski, Kara; et al. (2000). "Gene ontology: Tool for the unification of biology. The Gene Ontology Consortium" (PDF). Nature Genetics. 25 (1): 25–9. doi:10.1038/75556. PMC 3037419. PMID 10802651. Archived from the original (PDF) on 2011-05-26.
  10. Alterovitz, G.; Xiang, M.; Mohan, M.; Ramoni, M. F. (2007). "GO PaD: The Gene Ontology Partition Database". Nucleic Acids Research. 35 (Database issue): D322–7. doi:10.1093/nar/gkl799. PMC 1669720. PMID 17098937.
  11. Falbo, Ricardo (2014). "SABiO: Systematic Approach for Building Ontologies" (PDF). Proceedings of the 1st Joint Workshop ONTO.COM / ODISE on Ontologies in Conceptual Modeling and Information Systems Engineering Co-located with 8th International Conference on Formal Ontology in Information Systems, ONTO.COM/ODISE@FOIS 2014, Rio de Janeiro, Brazil, September 21, 2014. 1301, via CEUR-WS.org.