Semantic gap

The semantic gap characterizes the difference between two descriptions of an object by different linguistic representations, for instance languages or symbols. According to Andreas M. Hein, the semantic gap can be defined as "the difference in meaning between constructs formed within different representation systems".[1] In computer science, the concept is relevant whenever ordinary human activities, observations, and tasks are transferred into a computational representation.[2][3][1]

More precisely, the gap is the difference between an ambiguous formulation of contextual knowledge in a powerful language (e.g. natural language) and its sound, reproducible and computable representation in a formal language (e.g. a programming language). The semantics of an object depend on the context in which it is regarded. For practical applications this means that any formal representation of a real-world task requires translating the contextual expert knowledge of an application (high-level) into the elementary and reproducible operations of a computing machine (low-level). Since natural language allows the expression of tasks that are impossible to compute in a formal language, there is no way to automate this translation in general. Moreover, the examination of languages within the Chomsky hierarchy indicates that there is no formal, and consequently no automated, way of translating from one language into another above a certain level of expressive power.
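
As a hypothetical illustration (the function name, directory handling, and 30-day threshold are all invented choices for this sketch), consider how much unstated context the instruction "delete old files" leaves open, every bit of which a formal representation must fix explicitly:

    import os
    import time

    # Every contextual detail left open by the English phrasing must be
    # pinned down: which directory, what "old" means, files only or not.
    def delete_old_files(directory: str, max_age_days: float = 30.0) -> list[str]:
        cutoff = time.time() - max_age_days * 86400  # seconds per day
        removed = []
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed.append(path)
        return removed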

Theoretical background

The as yet unproven but commonly accepted Church–Turing thesis states that a Turing machine, and all equivalent formal systems such as the lambda calculus, can perform and represent any formal operation that a computing human can apply. However, the selection of adequate operations for a correct computation is not itself formally deducible; moreover, it depends on the computability of the underlying problem. Tasks such as the halting problem may be formulated comprehensively in natural language, but their computational representation will not terminate or will not provide a usable result, as Rice's theorem proves. The general limitations of rule-based deduction expressed by Gödel's incompleteness theorem indicate that the semantic gap can never be fully closed. These are general statements about the limits of computation at the highest level of abstraction, where the semantic gap manifests itself. There are, however, many subsets of problems that may be translated automatically, especially in the higher-numbered (more restricted) levels of the Chomsky hierarchy.
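
A minimal sketch of the diagonal argument behind this limit, written in Python for concreteness (the decider halts is assumed only in order to derive the contradiction; it cannot actually be implemented):

    # Sketch of the diagonal argument (illustrative only).
    def halts(program, argument) -> bool:
        # Assumed for the sake of contradiction: a total, always-correct
        # decider for termination. No such function can be written.
        raise NotImplementedError("no general halting decider exists")

    def paradox(program):
        # Do the opposite of whatever halts() predicts for paradox itself.
        if halts(program, program):
            while True:      # predicted to halt -> loop forever
                pass
        return "halted"      # predicted to loop -> halt immediately

    # paradox(paradox) halts exactly when halts(paradox, paradox) says it
    # does not, so no correct halts() can exist.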

Formal languages

Real-world tasks are formalized in programming languages, which are executed on computers based on the von Neumann architecture. Since programming languages are only convenient representations of the Turing machine, any program on a von Neumann computer has the same properties and limitations as the Turing machine or any equivalent representation. Consequently, every programming language, whether CPU-level machine code, assembler, or a high-level language, has exactly the expressive power of the underlying Turing machine. There is no semantic gap between them, since a program is translated from the high-level language to machine code by another program, e.g. a compiler, which itself runs on a Turing machine without any user interaction. The semantic gap actually opens between the selection of the rules and the representation of the task.
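
A minimal sketch of this user-free translation, using Python's standard dis module to lower a high-level function into elementary instructions:

    import dis

    # A high-level formulation of a computation ...
    def average(values):
        return sum(values) / len(values)

    # ... is lowered mechanically, without user interaction, into
    # elementary stack-machine instructions (instruction names vary by
    # Python version, e.g. CALL, BINARY_OP, RETURN_VALUE).
    dis.dis(average)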

Practical consequences

Selecting rules for a formal representation of a real-world application corresponds to writing a program. Writing programs is independent of the particular programming language and essentially requires translating the domain-specific knowledge of the user into the formal rules operating a Turing machine. It is this transfer from contextual knowledge into formal representation that cannot be automated, given the theoretical limitations of computation. Consequently, any mapping from a real-world application into a computer application requires a certain amount of technical background knowledge from the user; this is where the semantic gap manifests itself.

It is a fundamental task of software engineering to close the gap between application-specific knowledge and technically feasible formalization. For this purpose, domain-specific (high-level) knowledge must be transferred into an algorithm and its parameters (low-level), which requires a dialogue between user and developer. The aim is always software that allows users to represent their knowledge as parameters of an algorithm without knowing the details of the implementation, and to interpret the outcome of the algorithm without the aid of the developer. For this purpose, user interfaces play a key role in software design, while developers are supported by frameworks that help organize the integration of contextual information.
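
A minimal sketch of this division of labor (all names and the crude term-overlap scoring rule are hypothetical): domain judgments surface as parameters, while the algorithmic details stay hidden from the user:

    from dataclasses import dataclass

    @dataclass
    class RetrievalSettings:
        min_relevance: float = 0.5   # domain judgment: what counts as relevant
        max_results: int = 10        # domain judgment: how many hits to inspect

    def search(documents: list[str], query: str,
               settings: RetrievalSettings) -> list[str]:
        # Low-level detail the user never sees: a simple overlap score
        # between query terms and document words.
        terms = set(query.lower().split())

        def score(doc: str) -> float:
            words = set(doc.lower().split())
            return len(terms & words) / len(terms) if terms else 0.0

        ranked = sorted(documents, key=score, reverse=True)
        return [d for d in ranked
                if score(d) >= settings.min_relevance][:settings.max_results]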

Examples

Document retrieval

A simple example can be formulated as a series of increasingly difficult natural language queries to locate a target document that may or may not exist locally on a known computer system.

Example queries:

The progressive difficulty of these queries is represented by the increasing degree of abstraction from the types and semantics defined by the system architecture (directories and files on a known computer) to the types and semantics that occupy the realm of ordinary human discourse (subjects such as "humor" and entities such as "my grandmother"). Moreover, this disparity of realms is further complicated by leaky abstractions, as in the case of query 4), where the target document may exist but may not encapsulate the "metadata" in the manner expected by the user or by the designer of the query processing system.
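
A sketch of the contrast (both helper functions are invented for illustration): a query phrased in the system's own types maps directly onto elementary operations, while a query phrased in human concepts does not:

    import fnmatch
    import os

    # Computable: the query is phrased in types the system defines.
    def find_by_name(root: str, pattern: str) -> list[str]:
        hits = []
        for dirpath, _dirnames, filenames in os.walk(root):
            hits.extend(os.path.join(dirpath, name)
                        for name in fnmatch.filter(filenames, pattern))
        return hits

    # Not computable as stated: "humorous" is not a property the file
    # system defines; any implementation must substitute a human-chosen
    # proxy (tags, metadata, a trained classifier, ...).
    def find_humorous_documents(root: str) -> list[str]:
        raise NotImplementedError("requires a proxy for 'humor'")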

Image analysis

Image analysis is a typical domain in which a high degree of abstraction from low-level methods is required, and where the semantic gap immediately affects the user. If image content is to be identified in order to understand the meaning of an image, the only available independent information is the low-level pixel data. Textual annotations always depend on the knowledge, expressive capability, and specific language of the annotator and are therefore unreliable. To recognize the displayed scenes from the raw data of an image, the algorithms for selecting and manipulating pixels must be combined and parameterized adequately and finally linked with a natural-language description. Even a simple linguistic representation of shape or color, such as round or yellow, requires entirely different mathematical formalization methods, which are neither intuitive nor unique and sound.

Semantic gap in the context of image analysis
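
As a minimal sketch (the thresholds and the shape statistic are illustrative choices, not canonical definitions), "yellow" and "round" already demand unrelated formalizations:

    import numpy as np

    # "Yellow" becomes an inequality over channel intensities; the
    # thresholds here are arbitrary choices, not a canonical definition.
    def is_yellowish(rgb: np.ndarray) -> np.ndarray:
        r = rgb[..., 0].astype(int)
        g = rgb[..., 1].astype(int)
        b = rgb[..., 2].astype(int)
        return (r > 150) & (g > 150) & (b < 100)

    # "Round" becomes a shape statistic: 4*pi*area/perimeter**2 equals
    # 1.0 for a perfect circle and is smaller for every other shape.
    def circularity(area: float, perimeter: float) -> float:
        return 4.0 * np.pi * area / perimeter ** 2 if perimeter > 0 else 0.0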

Layered systems

In many layered systems, conflicts arise when concepts at a high level of abstraction need to be translated into lower-level, more concrete artifacts. This mismatch is often called the semantic gap.

Databases

Advocates of OODBMSs (object-oriented database management systems) sometimes claim that these databases help to reduce the semantic gap between the application domain (miniworld) and traditional RDBMS systems.[4] Proponents of the relational model, however, posit the exact opposite, because by definition object databases bind the recorded data to a single abstraction.
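
A minimal sketch of that mismatch, with a hypothetical Order class and schema: one application-level concept spreads across several relational tables, and the hand-written mapping between the two representations is where the gap reappears:

    import sqlite3

    # One application-level concept ...
    class Order:
        def __init__(self, order_id: int, items: list[str]):
            self.order_id = order_id
            self.items = items

    # ... spreads across several relational artifacts.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY)")
    db.execute("CREATE TABLE order_items (order_id INTEGER, item TEXT)")

    # The mapping code must be written and maintained by hand.
    def save(order: Order) -> None:
        db.execute("INSERT INTO orders VALUES (?)", (order.order_id,))
        db.executemany("INSERT INTO order_items VALUES (?, ?)",
                       [(order.order_id, item) for item in order.items])

    save(Order(1, ["keyboard", "mouse"]))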

See also

Computation
Natural language processing
Programming language
Semantic network
Semantic Web
Theory of computation
Turing completeness
Natural-language understanding
Computer science
Declarative programming
Modeling language
Logic in computer science
Natural-language programming
Natural-language user interface
Computer algebra
Data-intensive computing
Outline of natural-language processing
Glossary of artificial intelligence
Semantic parsing
Glossary of computer science

References

  1. Hein, A. M. (2010). "Identification and Bridging of Semantic Gaps in the Context of Multi-Domain Engineering". Abstracts of the 2010 Forum on Philosophy, Engineering & Technology. Colorado.
  2. Smeulders, A. W. M.; et al. (2000). "Content-Based Image Retrieval at the End of the Early Years". IEEE Trans Pattern Anal Mach Intell. 22 (12): 1349–80. doi:10.1109/34.895972. S2CID 2827898.
  3. Dorai, C.; Venkatesh, S. (2003). "Bridging the Semantic Gap with Computational Media Aesthetics". IEEE MultiMedia. 10 (2): 15–17. doi:10.1109/MMUL.2003.1195157. hdl:10536/DRO/DU:30044313. S2CID 206477548.
  4. Schlatter, M.; et al. (1994). "The Business Object Management System". IBM Systems Journal. 33 (2): 239–263. doi:10.1147/sj.332.0239.