Semantic interoperability

Semantic interoperability is the ability of computer systems to exchange data with unambiguous, shared meaning. It is a requirement for machine-computable logic, inferencing, knowledge discovery, and data federation between information systems.[1]

Semantic interoperability is therefore concerned not just with the packaging of data (syntax), but with the simultaneous transmission of the meaning of the data (semantics). This is accomplished by adding data about the data (metadata), linking each data element to a controlled, shared vocabulary. The meaning of the data is transmitted with the data itself, in one self-describing "information package" that is independent of any information system. It is this shared vocabulary, and its associated links to an ontology, that provides the foundation and capability for machine interpretation, inference, and logic.
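
As an illustration, the following minimal sketch shows such a self-describing package in Python, using a JSON-LD-style @context to link each field to a shared vocabulary; the field names and vocabulary IRIs are hypothetical, not drawn from any actual standard.

```python
import json

# A self-describing "information package": the @context links each
# field name to a term in a shared, controlled vocabulary, so the
# meaning travels with the data, independent of any one system.
# All IRIs below are illustrative placeholders.
package = {
    "@context": {
        "patientId": "https://example.org/vocab/PatientIdentifier",
        "systolicBP": "https://example.org/vocab/SystolicBloodPressure",
        "unit": "https://example.org/vocab/UnitOfMeasure",
    },
    "patientId": "12345",
    "systolicBP": 120,
    "unit": "mmHg",
}

print(json.dumps(package, indent=2))
```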

Syntactic interoperability (see below) is a prerequisite for semantic interoperability. Syntactic interoperability refers to the packaging and transmission mechanisms for data. In healthcare, HL7 has been in use for over thirty years (predating the web), and uses the pipe character (|) as a data delimiter. The current internet standard for document markup is XML, which uses angle brackets as data delimiters. The delimiters convey no meaning other than to structure the data; without a data dictionary to translate the delimited contents, the data remains meaningless. While there have been many attempts at creating data dictionaries and information models to associate with these data-packaging mechanisms, none has proved practical to implement. This has only perpetuated the ongoing "babelization" of data and the inability to exchange data with meaning.
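
The point can be made concrete with a short sketch (the messages below are simplified and hypothetical): both serializations parse cleanly, but neither tells a receiving system what the fields mean without an agreed dictionary.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical HL7 v2-style segment: pipes delimit fields,
# but the delimiter itself says nothing about what each field means.
hl7_segment = "OBX|1|NM|GLUCOSE|105|mg/dL"
fields = hl7_segment.split("|")
print(fields)  # ['OBX', '1', 'NM', 'GLUCOSE', '105', 'mg/dL']

# The same data in XML: angle brackets give it a tree structure,
# but the tag names are still just labels without a shared dictionary.
xml_doc = "<observation><code>GLUCOSE</code><value>105</value><unit>mg/dL</unit></observation>"
root = ET.fromstring(xml_doc)
print({child.tag: child.text for child in root})
```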

Since the introduction of the Semantic Web concept by Tim Berners-Lee in 1999,[2] there has been growing interest in, and application of, W3C (World Wide Web Consortium) standards for providing web-scale semantic data exchange, federation, and inferencing capabilities.

Semantic as a function of syntactic interoperability

Syntactic interoperability, provided by, for instance, XML or the SQL standards, is a prerequisite to semantic interoperability. It involves a common data format and a common protocol to structure any data so that the manner of processing the information can be interpreted from the structure. It also allows the detection of syntactic errors, allowing receiving systems to request resending of any message that appears to be garbled or incomplete. No semantic communication is possible if the syntax is garbled or unable to represent the data. However, information represented in one syntax may in some cases be accurately translated into a different syntax. Where accurate translation of syntaxes is possible, systems using different syntaxes may also interoperate accurately. In some cases, the ability to accurately translate information among systems using different syntaxes may be limited to one direction, when the formalisms used have different levels of expressivity (ability to express information).
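
A brief sketch of the expressivity point, under hypothetical data: every flat key-value record can be translated accurately into XML, but the reverse translation can be lossy, so accurate translation is limited to one direction.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: XML can express repeated, ordered elements.
xml_doc = "<order><item>bolt</item><item>nut</item><total>3.50</total></order>"
root = ET.fromstring(xml_doc)

# Translating into a flat dict -- a less expressive formalism -- loses
# information: the second <item> overwrites the first.
flat = {child.tag: child.text for child in root}
print(flat)  # {'item': 'nut', 'total': '3.50'}

# Every flat dict can be translated accurately into XML, but not every
# XML document can be translated accurately into a flat dict, so
# accurate translation here is possible in only one direction.
```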

A single ontology containing representations of every term used in every application is generally considered impossible, because of the rapid creation of new terms and the assignment of new meanings to old terms. However, though it is impossible to anticipate every concept that a user may wish to represent in a computer, there is the possibility of finding some finite set of "primitive" concept representations that can be combined to create any of the more specific concepts that users may need for any given set of applications or ontologies. Having a foundation ontology (also called an upper ontology) that contains all those primitive elements would provide a sound basis for general semantic interoperability, and allow users to define any new terms they need by using the basic inventory of ontology elements, and still have those newly defined terms properly interpreted by any other computer system that can interpret the basic foundation ontology. Whether the number of such primitive concept representations is in fact finite, or will expand indefinitely, is a question under active investigation. If it is finite, then a stable foundation ontology suitable to support accurate and general semantic interoperability can evolve after some initial foundation ontology has been tested and used by a wide variety of users. At the present time, no foundation ontology has been adopted by a wide community, so such a stable foundation ontology is still in the future.
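
A toy sketch of the composition idea (all names hypothetical): a small inventory of primitive concept representations, and a more specific concept defined as a combination of them that any system interpreting the primitives could also interpret.

```python
# Toy inventory of "primitive" concept representations (all names hypothetical).
PRIMITIVES = {"Building", "Organization", "MedicalCare", "provides"}

# A more specific concept defined purely as a combination of primitives;
# any system that interprets the primitives can interpret "Hospital".
DEFINITIONS = {
    "Hospital": {"is_a": "Building", "provides": "MedicalCare"},
}

def uses_only_primitives(name):
    """Check that a newly defined concept bottoms out in the shared primitives."""
    definition = DEFINITIONS[name]
    return {definition["is_a"], definition["provides"]} <= PRIMITIVES

print(uses_only_primitives("Hospital"))  # True
```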

Words and meanings

One persistent misunderstanding that recurs in discussions of semantics is the confusion of words and meanings. The meanings of words change, sometimes rapidly, but a formal language such as that used in an ontology can encode the meanings (semantics) of concepts in a form that does not change. To determine the meaning of a particular word (or of a term in a database, for example), it is necessary to label each fixed concept representation in an ontology with the word(s) or term(s) that may refer to that concept. When multiple words refer to the same (fixed) concept in language, this is called synonymy; when one word is used to refer to more than one concept, that is called ambiguity.
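
A minimal sketch of this labelling (concept identifiers hypothetical): several words mapping to one concept is synonymy, and one word mapping to several concepts is ambiguity.

```python
# Words mapped to fixed concept identifiers in an ontology (hypothetical IDs).
LEXICON = {
    "car":        ["concept:Automobile"],
    "automobile": ["concept:Automobile"],          # synonymy: two words, one concept
    "bank":       ["concept:FinancialInstitution",
                   "concept:RiverBank"],           # ambiguity: one word, two concepts
}

def concepts_for(word):
    return LEXICON.get(word, [])

print(concepts_for("automobile"))  # ['concept:Automobile']
print(concepts_for("bank"))        # two candidate concepts: ambiguous
```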

Ambiguity and synonymy are among the factors that make computer understanding of language very difficult. For many human-readable terms, the use of words to refer to concepts (the meanings of the words used) is very sensitive to the context and purpose of use. The role of ontologies in supporting semantic interoperability is to provide a fixed set of concepts whose meanings and relations are stable and can be agreed to by users. The task of determining which terms refer to which concepts in which contexts (each database is a different context) is then separated from the task of creating the ontology, and must be taken up by the designer of a database, of a form for data entry, or of a program for language understanding. When the meaning of a word used in some interoperable context changes, then to preserve interoperability it is necessary to change the pointer to the ontology element(s) that specifies the meaning of that word.
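
Continuing the sketch above (identifiers again hypothetical), each database carries its own binding of local terms to ontology concepts, and interoperability is preserved after a meaning change by re-pointing the affected entry:

```python
# Per-context bindings: the same local term can point to different
# ontology concepts in different databases (hypothetical identifiers).
BINDINGS = {
    "claims_db":  {"provider": "concept:HealthcareProvider"},
    "network_db": {"provider": "concept:InternetServiceProvider"},
}

def resolve(context, term):
    return BINDINGS[context][term]

print(resolve("claims_db", "provider"))

# If the claims database later redefines "provider", interoperability is
# preserved by updating the pointer, not the ontology concept itself.
BINDINGS["claims_db"]["provider"] = "concept:InsurancePlanProvider"
print(resolve("claims_db", "provider"))
```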

Knowledge representation requirements and languages

A knowledge representation language may be sufficiently expressive to describe nuances of meaning in well-understood fields. There are at least five levels of complexity of such languages (see the semantic spectrum).

For general semi-structured data, one may use a general-purpose language such as XML.[3]
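
For instance, a short sketch using Python's standard library (element names hypothetical): the records below are irregular, one omitting a field and one nesting an extra element, and XML accommodates both without a rigid schema.

```python
import xml.etree.ElementTree as ET

# Semi-structured records: the second entry omits <email> and adds a
# nested <address>; XML tolerates this irregularity.
contacts = ET.Element("contacts")

a = ET.SubElement(contacts, "contact")
ET.SubElement(a, "name").text = "Ada"
ET.SubElement(a, "email").text = "ada@example.org"

b = ET.SubElement(contacts, "contact")
ET.SubElement(b, "name").text = "Bo"
addr = ET.SubElement(b, "address")
ET.SubElement(addr, "city").text = "Oslo"

print(ET.tostring(contacts, encoding="unicode"))
```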

Languages with the full power of first-order predicate logic may be required for many tasks.

Human languages are highly expressive, but are considered too ambiguous to allow the accurate interpretation desired, given the current level of human language technology. Semantically interoperable healthcare systems leverage data in a standardized way as they break down and share information: for example, two systems can recognize terminology, medication symbols, and other nuances while exchanging data automatically, without human intervention.

Prior agreement not required

Semantic interoperability may be distinguished from other forms of interoperability by considering whether the information transferred has, in its communicated form, all of the meaning required for the receiving system to interpret it correctly, even when the algorithms used by the receiving system are unknown to the sending system. Consider sending one number:

If that number is intended to be the sum of money owed by one company to another, it implies some action or lack of action on the part of both those who send it and those who receive it.

It may be correctly interpreted if sent in response to a specific request, and received at the time and in the form expected. This correct interpretation does not depend only on the number itself, which could represent almost any of millions of types of quantitative measurement; rather, it depends strictly on the circumstances of transmission. That is, the interpretation depends on both systems expecting that the algorithms in the other system use the number in exactly the same sense, and it depends further on the entire envelope of transmissions that preceded the transmission of the bare number.

By contrast, if the transmitting system does not know how the information will be used by other systems, it is necessary to have a shared agreement on how information with some specific meaning (out of many possible meanings) will appear in a communication. For a particular task, one solution is to standardize a form, such as a request for payment; that request would have to encode, in standardized fashion, all of the information needed to evaluate it: the agent owing the money; the agent owed the money; the nature of the action giving rise to the debt; the agents, goods, services, and other participants in that action; the time of the action; the amount owed and the currency in which the debt is reckoned; the time allowed for payment; the form of payment demanded; and other information. When two or more systems have agreed on how to interpret the information in such a request, they can achieve semantic interoperability for that specific type of transaction. For semantic interoperability generally, it is necessary to provide standardized ways to describe the meanings of many more things than just commercial transactions, and the number of concepts whose representation needs to be agreed upon is, at a minimum, several thousand.
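
A sketch of such a standardized form as a typed record; the field names and structure are illustrative only, not any actual standard:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class PaymentRequest:
    """Illustrative standardized 'request for payment' form: every field
    needed to interpret the single number `amount` travels with it."""
    debtor: str            # agent owing the money
    creditor: str          # agent owed the money
    cause: str             # action giving rise to the debt
    amount: float          # the bare number, now unambiguous
    currency: str          # currency in which the debt is reckoned
    action_date: date      # time of the action
    due_date: date         # time allowed for payment
    payment_form: str      # form of payment demanded

req = PaymentRequest(
    debtor="Acme Ltd", creditor="Widget Co", cause="delivery of 100 widgets",
    amount=5000.0, currency="EUR",
    action_date=date(2024, 1, 10), due_date=date(2024, 2, 10),
    payment_form="bank transfer",
)
print(req)
```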

Ontology research

How to achieve semantic interoperability for more than a few restricted scenarios is currently a matter of research and discussion. For the problem of General Semantic Interoperability, some form of foundation ontology ('upper ontology') is required that is sufficiently comprehensive to provide the definition of concepts for more specialized ontologies in multiple domains. Over the past decade, more than ten foundation ontologies have been developed, but none have as yet been adopted by a wide user base.

The need for a single comprehensive all-inclusive ontology to support Semantic Interoperability can be avoided by designing the common foundation ontology as a set of basic ("primitive") concepts that can be combined to create the logical descriptions of the meanings of terms used in local domain ontologies or local databases. This tactic is based on the principle that:

If:

(1) the meanings and usage of the primitive ontology elements in the foundation ontology are agreed on, and

(2) the ontology elements in the domain ontologies are constructed as logical combinations of the elements in the foundation ontology,

Then:

The intended meanings of the domain ontology elements can be computed automatically using an FOL (first-order logic) reasoner, by any system that accepts the meanings of the elements in the foundation ontology, and has both the foundation ontology and the logical specifications of the elements in the domain ontology. 

Therefore:

Any system wishing to interoperate accurately with another system need transmit only the data to be communicated, plus any logical descriptions of terms used in that data that were created locally and are not already in the common foundation ontology. 

This tactic limits the need for prior agreement on meanings to only those ontology elements in the common foundation ontology (FO). Based on several considerations, this may require fewer than 10,000 elements (types and relations). However, for ease of understanding and use, additional ontology elements with more detail and specificity can help users find the exact location in the FO where specific domain concepts can be found or added.
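
A toy sketch of this computation (a real system would use an FOL reasoner over logical definitions; all names here are hypothetical): a receiving system interprets an unfamiliar domain term by recursively expanding its definition into foundation ontology elements it already accepts.

```python
# Foundation ontology elements the receiving system already accepts
# (hypothetical names standing in for agreed primitives).
FOUNDATION = {"Person", "Institution", "teaches", "employs"}

# Domain ontology: each term defined as a combination of other elements.
DOMAIN = {
    "Teacher":    ["Person", "teaches"],
    "University": ["Institution", "employs", "Teacher"],
}

def expand(term):
    """Recursively expand a domain term into foundation elements only.
    A real system would hand the logical definitions to an FOL reasoner."""
    if term in FOUNDATION:
        return {term}
    out = set()
    for part in DOMAIN[term]:
        out |= expand(part)
    return out

print(expand("University"))  # {'Institution', 'employs', 'Person', 'teaches'}
```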

In practice, the FO, which focuses on representations of the primitive concepts, will likely be used together with a set of domain extension ontologies whose elements are specified using the FO elements. Such pre-existing extensions will reduce the cost of creating domain ontologies by providing existing elements with the intended meanings, and will reduce the chance of error by reusing elements that have already been tested. Domain extension ontologies may be logically inconsistent with each other, however, and this must be checked whenever different domain extensions are used in the same communication, as sketched below.
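
A toy sketch of that check (data hypothetical; a real check would run a reasoner over the full axiom sets): two domain extensions conflict if they assign disjoint definitions to a shared term.

```python
# Two hypothetical domain extensions that define the same term differently.
EXT_A = {"Virus": "LivingThing"}
EXT_B = {"Virus": "NonLivingThing"}

# Disjointness axioms from the foundation ontology (hypothetical).
DISJOINT = {("LivingThing", "NonLivingThing")}

def inconsistent(ext1, ext2):
    """Flag terms defined with mutually disjoint classes in two extensions."""
    for term in ext1.keys() & ext2.keys():
        pair = (ext1[term], ext2[term])
        if pair in DISJOINT or pair[::-1] in DISJOINT:
            return True
    return False

print(inconsistent(EXT_A, EXT_B))  # True: cannot use both in one communication
```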

Whether use of such a single foundation ontology can itself be avoided by sophisticated mapping techniques among independently developed ontologies is also under investigation.

Importance

The practical significance of semantic interoperability has been measured by several studies that estimate the cost (in lost efficiency) of its absence. One study,[4] focusing on the lost efficiency in the communication of healthcare information, estimated that US$77.8 billion per year could be saved by implementing an effective interoperability standard in that area. Other studies, of the construction industry[5] and of the automobile manufacturing supply chain,[6] estimate costs of over US$10 billion per year due to the lack of semantic interoperability in those industries. Taken together, these numbers can be extrapolated to indicate that well over US$100 billion per year is lost in the US alone for want of a widely used semantic interoperability standard.

There has not yet been a study of every policy field that might realize large cost savings from applying semantic interoperability standards, but fields such as eGovernment, health, and security are capable of profiting from it (see interoperability in general). The EU also set up the Semantic Interoperability Centre Europe in June 2007.

Semantic Interoperability for Internet of Things

Digital transformation holds huge benefits for enabling organizations to be more efficient, more flexible, and more nimble in responding to changes in business and operating conditions. It requires integrating heterogeneous data and services throughout an organization, and semantic interoperability addresses the resulting need for a shared understanding of meaning and context.

To support this, a cross-organization expert group involving ISO/IEC JTC 1, ETSI, oneM2M, and W3C is collaborating with AIOTI to accelerate the adoption of semantic technologies in the IoT. The group has published two joint white papers on semantic interoperability, “Semantic IoT Solutions – A Developer Perspective” and “Towards semantic interoperability standards based on ontologies”, following on the success of an earlier white paper, “Semantic Interoperability for the Web of Things”.


Source: https://www.w3.org/blog/2019/10/aioti-iso-iec-jtc1-etsi-onem2m-and-w3c-collaborate-on-two-joint-white-papers-on-semantic-interoperability-targeting-developers-and-standardization-engineers/

See also

Programming language

A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language.

Semantics

Semantics is the study of reference, meaning, or truth. The term can be used to refer to subfields of several distinct disciplines, including philosophy, linguistics and computer science.

Semantic Web

The Semantic Web, sometimes known as Web 3.0, is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-readable.

In information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains of discourse. More simply, an ontology is a way of showing the properties of a subject area and how they are related, by defining a set of concepts and categories that represent the subject.

Topic map

A topic map is a standard for the representation and interchange of knowledge, with an emphasis on the findability of information. Topic maps were originally developed in the late 1990s as a way to represent back-of-the-book index structures so that multiple indexes from different sources could be merged. However, the developers quickly realized that with a little additional generalization, they could create a meta-model with potentially far wider application. The ISO/IEC standard is formally known as ISO/IEC 13250:2003.

The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies. Ontologies are a formal way to describe taxonomies and classification networks, essentially defining the structure of knowledge for various domains: the nouns representing classes of objects and the verbs representing relations between the objects.

A modeling language is any artificial language that can be used to express data, information, knowledge, or systems in a structure that is defined by a consistent set of rules. The rules are used for interpretation of the meaning of components in the structure.

Conceptual semantics is a framework for semantic analysis developed mainly by Ray Jackendoff in 1976. Its aim is to provide a characterization of the conceptual elements by which a person understands words and sentences, and thus to provide an explanatory semantic representation. Explanatory in this sense refers to the ability of a given linguistic theory to describe how a component of language is acquired by a child.

The semantic spectrum is a series of increasingly precise or rather semantically expressive definitions for data elements in knowledge representations, especially for machine use.

SNOMED CT

SNOMED CT, or SNOMED Clinical Terms, is a systematically organized, computer-processable collection of medical terms providing codes, terms, synonyms, and definitions used in clinical documentation and reporting. SNOMED CT is considered to be the most comprehensive multilingual clinical healthcare terminology in the world. The primary purpose of SNOMED CT is to encode the meanings that are used in health information and to support the effective clinical recording of data with the aim of improving patient care. SNOMED CT provides the core general terminology for electronic health records. Its comprehensive coverage includes clinical findings, symptoms, diagnoses, procedures, body structures, organisms and other etiologies, substances, pharmaceuticals, devices, and specimens.

In information science, an upper ontology is an ontology which consists of very general terms that are common across all domains. An important function of an upper ontology is to support broad semantic interoperability among a large number of domain-specific ontologies by providing a common starting point for the formulation of definitions. Terms in the domain ontology are ranked under the terms in the upper ontology, e.g., the upper ontology classes are superclasses or supersets of all the classes in the domain ontologies.

Ontology alignment, or ontology matching, is the process of determining correspondences between concepts in ontologies. A set of correspondences is also called an alignment. The phrase takes on a slightly different meaning in computer science, cognitive science, and philosophy.

Ontology-based data integration involves the use of one or more ontologies to effectively combine data or information from multiple heterogeneous sources. It is one of the multiple data integration approaches and may be classified as Global-As-View (GAV). The effectiveness of ontology‑based data integration is closely tied to the consistency and expressivity of the ontology used in the integration process.

The terms schema matching and mapping are often used interchangeably for a database process. For this article, we differentiate the two as follows: schema matching is the process of identifying that two objects are semantically related, while mapping refers to the transformations between the objects. For example, for the two schemas DB1.Student and DB2.Grad-Student, possible matches would be DB1.Student ≈ DB2.Grad-Student and DB1.SSN = DB2.ID, and a possible transformation or mapping would be DB1.Marks to DB2.Grades.

Machine interpretation of documents and services in the Semantic Web environment is primarily enabled by (a) the capability to mark documents, document segments, and services with semantic tags and (b) the ability to establish contextual relations between the tags with a domain model, which is formally represented as an ontology. Human beings use natural languages to communicate an abstract view of the world. Natural language constructs are symbolic representations of human experience and are close to the conceptual model that Semantic Web technologies deal with. Thus, natural language constructs have naturally been used to represent the ontology elements. This makes it convenient to apply Semantic Web technologies in the domain of textual information. In contrast, multimedia documents are perceptual recordings of human experience. An attempt to use a conceptual model to interpret the perceptual records is severely impaired by the semantic gap that exists between the perceptual media features and the conceptual world. Notably, the concepts have their roots in the perceptual experience of human beings, and the apparent disconnect between the conceptual and the perceptual world is rather artificial. The key to semantic processing of multimedia data lies in harmonizing the seemingly isolated conceptual and perceptual worlds. Representation of domain knowledge needs to be extended to enable perceptual modeling, over and above the conceptual modeling that is supported. The perceptual model of a domain primarily comprises observable media properties of the concepts. Such perceptual models are useful for semantic interpretation of media documents, just as conceptual models help in the semantic interpretation of textual documents.

A concept search is an automated information retrieval method that is used to search electronically stored unstructured text for information that is conceptually similar to the information provided in a search query. In other words, the ideas expressed in the information retrieved in response to a concept search query are relevant to the ideas contained in the text of the query.

Ontology engineering

In computer science, information science and systems engineering, ontology engineering is a field which studies the methods and methodologies for building ontologies, which encompasses a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities. In a broader sense, this field also includes a knowledge construction of the domain using formal ontology representations such as OWL/RDF. A large-scale representation of abstract concepts such as actions, time, physical objects and beliefs would be an example of ontological engineering. Ontology engineering is one of the areas of applied ontology, and can be seen as an application of philosophical ontology. Core ideas and objectives of ontology engineering are also central in conceptual modeling.

Semantic primes or semantic primitives are a set of semantic concepts that are argued to be innately understood by all people but impossible to express in simpler terms. They represent words or phrases that are learned through practice but cannot be defined concretely. For example, although the meaning of "touching" is readily understood, a dictionary might define "touch" as "to make contact" and "contact" as "touching", providing no information if neither of these words is understood. The concept of universal semantic primes was largely introduced by Anna Wierzbicka's book, Semantics: Primes and Universals.

Knowledge extraction is the creation of knowledge from structured and unstructured sources. The resulting knowledge needs to be in a machine-readable and machine-interpretable format and must represent knowledge in a manner that facilitates inferencing. Although it is methodically similar to information extraction (NLP) and ETL, the main criterion is that the extraction result goes beyond the creation of structured information or the transformation into a relational schema. It requires either the reuse of existing formal knowledge or the generation of a schema based on the source data.

Semantic heterogeneity arises when database schemas or datasets for the same domain are developed by independent parties, resulting in differences in the meaning and interpretation of data values. Beyond structured data, the problem of semantic heterogeneity is compounded by the flexibility of semi-structured data and the various tagging methods applied to documents or unstructured data. Semantic heterogeneity is one of the more important sources of differences in heterogeneous datasets.

References

  1. NCOIC, "SCOPE", Network Centric Operations Industry Consortium, 2008.
  2. Berners-Lee, Tim; Fischetti, Mark (1999). Weaving the Web. HarperSanFrancisco, chapter 12. ISBN 978-0-06-251587-2.
  3. Heflin, Jeff; Hendler, James. "Semantic Interoperability on the Web" (on XML as a tool for semantic interoperability).
  4. Walker, Jan; Pan, Eric; Johnston, Douglas; Adler-Milstein, Julia; Bates, David W.; Middleton, Blackford. "The Value of Healthcare Information Exchange and Interoperability". Health Affairs, 19 January 2005.
  5. "08657 Final Rpt_8-2-04" (final report of a construction-industry interoperability cost study, August 2004).
  6. "Archived copy" (PDF). Archived from the original on 2012-06-16; retrieved 2017-07-13.