Semantics (computer science)

In programming language theory, semantics is the field concerned with the rigorous mathematical study of the meaning of programming languages. It does so by evaluating the meaning of syntactically valid strings defined by a specific programming language and describing the computation involved; syntactically invalid strings have no meaning and give rise to no computation. Semantics describes the processes a computer follows when executing a program in that language. This can be done by describing the relationship between the input and output of a program, or by explaining how the program will execute on a certain platform, hence creating a model of computation.

Programming language theory (PLT) is a branch of computer science that deals with the design, implementation, analysis, characterization, and classification of programming languages and their individual features. It falls within the discipline of computer science, both depending on and affecting mathematics, software engineering, linguistics and even cognitive science. It is a well-recognized branch of computer science, and an active research area, with results published in numerous journals dedicated to PLT, as well as in general computer science and engineering publications.

In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed. A string is generally considered a data type and is often implemented as an array data structure of bytes that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence data types and structures.
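
As a small illustration (a sketch assumed for this article, in Haskell), a string can be viewed as a sequence of characters, each of which an encoding ultimately stores as one or more bytes; the numbers printed below are the characters' Unicode code points:

main :: IO ()
main = do
  let s = "héllo"            -- a string literal: a sequence of characters
  print (length s)           -- number of characters: 5
  print (map fromEnum s)     -- Unicode code points: [104,233,108,108,111]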

In computer science, and more specifically in computability theory and computational complexity theory, a model of computation is a model which describes how a set of outputs are computed given a set of inputs. This model describes how units of computations, memories, and communications are organized. The computational complexity of an algorithm can be measured given a model of computation. Using a model allows studying the performance of algorithms independently of the variations that are specific to particular implementations and specific technology.

Formal semantics, for instance, helps to write compilers, to better understand what a program is doing, and to prove, for example, that the following if statement

if 1 == 1 then S1 else S2

has the same effect as S1 alone.
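
To make this concrete, here is a minimal sketch (the toy language and all names are illustrative, not part of any standard) of a big-step evaluator for a small statement language, written in Haskell. Running the if statement above and running S1 alone produce the same final state:

import qualified Data.Map as Map

type State = Map.Map String Int

data Expr = Lit Int | Var String | Equal Expr Expr
data Stmt = Assign String Expr | Seq Stmt Stmt | If Expr Stmt Stmt | Skip

-- evaluate an expression in a state (variables default to 0)
evalE :: State -> Expr -> Int
evalE _  (Lit n)     = n
evalE st (Var x)     = Map.findWithDefault 0 x st
evalE st (Equal a b) = if evalE st a == evalE st b then 1 else 0

-- execute a statement, producing the final state
exec :: State -> Stmt -> State
exec st (Assign x e) = Map.insert x (evalE st e) st
exec st (Seq a b)    = exec (exec st a) b
exec st (If c t f)   = if evalE st c /= 0 then exec st t else exec st f
exec st Skip         = st

main :: IO ()
main = do
  let s1   = Assign "x" (Lit 1)
      s2   = Assign "x" (Lit 2)
      prog = If (Equal (Lit 1) (Lit 1)) s1 s2
  print (Map.toList (exec Map.empty prog))  -- [("x",1)]
  print (Map.toList (exec Map.empty s1))    -- also [("x",1)]: same effect as S1 alone

This evaluator is operational in flavour: it spells out the steps a machine-like model takes to reach a final state, and the equivalence can then be checked or proved against that model.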

Overview

The field of formal semantics draws on several areas of mathematics, including mathematical logic, set theory, and model theory:

Mathematical logic is a subfield of mathematics exploring the applications of formal logic to mathematics. It bears close connections to metamathematics, the foundations of mathematics, and theoretical computer science. The unifying themes in mathematical logic include the study of the expressive power of formal systems and the deductive power of formal proof systems.

Set theory is a branch of mathematical logic that studies sets, which informally are collections of objects. Although any type of object can be collected into a set, set theory is applied most often to objects that are relevant to mathematics. The language of set theory can be used to define nearly all mathematical objects.

In mathematics, model theory is the study of classes of mathematical structures from the perspective of mathematical logic. The objects of study are models of theories in a formal language: a theory is a set of sentences in a formal language, and a model of a theory is a structure that satisfies the sentences of that theory.

Formal semantics has close links with other areas of computer science such as programming language design, type theory, compilers and interpreters, program verification and model checking.

Computer science is the study of mathematical algorithms and processes that interact with data and that can be represented as data in the form of programs. It enables the use of algorithms to manipulate, store, and communicate digital information. A computer scientist studies the theory of computation and the practice of designing software systems.

In mathematics, logic, and computer science, a type theory is any of a class of formal systems, some of which can serve as alternatives to set theory as a foundation for all mathematics. In type theory, every "term" has a "type" and operations are restricted to terms of a certain type.
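
As a small illustration (the function name below is made up for this sketch), in a typed language every term carries a type, and operations applied to terms of the wrong type are rejected:

double :: Int -> Int          -- the term `double` has type Int -> Int
double n = n + n

main :: IO ()
main = print (double 21)      -- 42; applying `double` to a string is rejected by the type checker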

A compiler is a computer program that transforms computer code written in one programming language into another programming language. Compilers are a type of translator that supports digital devices, primarily computers. The name compiler is primarily used for programs that translate source code from a high-level programming language to a lower-level language to create an executable program.

Approaches

There are many approaches to formal semantics; these belong to three major classes: denotational semantics, operational semantics, and axiomatic semantics.

In computer science, denotational semantics is an approach to formalizing the meanings of programming languages by constructing mathematical objects that describe the meanings of expressions from the languages. Other approaches to providing a formal semantics of programming languages include axiomatic semantics and operational semantics.

Domain theory is a branch of mathematics that studies special kinds of partially ordered sets (posets) commonly called domains. Consequently, domain theory can be considered as a branch of order theory. The field has major applications in computer science, where it is used to specify denotational semantics, especially for functional programming languages. Domain theory formalizes the intuitive ideas of approximation and convergence in a very general way and has close relations to topology.
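
As a minimal sketch of the denotational idea (the toy expression language below is illustrative, not taken from this article), each expression is mapped to a mathematical object, here a function from variable environments to integers:

type Env = String -> Integer

data Expr = Lit Integer | Var String | Add Expr Expr | Mul Expr Expr

-- the semantic function: an Expr denotes a function from environments to integers
denote :: Expr -> (Env -> Integer)
denote (Lit n)   = \_   -> n
denote (Var x)   = \env -> env x
denote (Add a b) = \env -> denote a env + denote b env
denote (Mul a b) = \env -> denote a env * denote b env

main :: IO ()
main = do
  let env v = if v == "x" then 3 else 0       -- an environment mapping x to 3
  print (denote (Add (Var "x") (Lit 4)) env)  -- 7

Here the program text is never "run"; it simply denotes a mathematical function, which can then be reasoned about with ordinary mathematics.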

The distinctions between the three broad classes of approaches can sometimes be vague, but all known approaches to formal semantics use the above techniques, or some combination thereof.

Apart from the choice between denotational, operational, or axiomatic approaches, most variation in formal semantic systems arises from the choice of supporting mathematical formalism.

Variations

Some variations of formal semantics include the following:

Action semantics is a framework for the formal specification of semantics of programming languages invented by David Watt and Peter D. Mosses in the 1990s. It is a mixture of denotational, operational and algebraic semantics.

In computer science, algebraic semantics is a form of axiomatic semantics based on algebraic laws for describing and reasoning about program semantics in a formal manner.

Axiomatic semantics is an approach based on mathematical logic for proving the correctness of computer programs. It is closely related to Hoare logic.
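
As a small worked example of the axiomatic style (a sketch using the standard Hoare-triple notation, not anything specific to this article), the triple

{ x + 1 > 0 }  x := x + 1  { x > 0 }

asserts that if the precondition x + 1 > 0 holds before the assignment executes, then the postcondition x > 0 holds afterwards. It follows from the assignment axiom of Hoare logic, which obtains the precondition by substituting the assigned expression for the variable in the postcondition.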

Describing relationships

For a variety of reasons, one might wish to describe the relationships between different formal semantics, for example to show that two semantic descriptions of the same language agree.

It is also possible to relate multiple semantics through abstractions via the theory of abstract interpretation.
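
A minimal sketch of this idea (the sign abstraction below is a standard textbook example; all names are illustrative): concrete integers are abstracted to their signs, and the abstract operation over-approximates the concrete one, so computing concretely and then abstracting agrees with computing abstractly:

data Sign = Neg | Zero | Pos | AnySign deriving Show

alpha :: Integer -> Sign               -- abstraction: map a concrete value to its sign
alpha n
  | n < 0     = Neg
  | n == 0    = Zero
  | otherwise = Pos

addS :: Sign -> Sign -> Sign           -- abstract counterpart of concrete (+)
addS Zero s    = s
addS s    Zero = s
addS Pos  Pos  = Pos
addS Neg  Neg  = Neg
addS _    _    = AnySign               -- the sign of a mixed sum is unknown

main :: IO ()
main = do
  print (alpha (3 + 4))                -- Pos: compute concretely, then abstract
  print (addS (alpha 3) (alpha 4))     -- Pos: compute abstractly; the results agree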

History

Robert W. Floyd is credited with founding the field of programming language semantics in Floyd (1967).[1]

Related Research Articles

In mathematics, computer science, and linguistics, a formal language consists of words whose letters are taken from an alphabet and are well-formed according to a specific set of rules.

Semantics is the linguistic and philosophical study of meaning, in language, programming languages, formal logics, and semiotics. It is concerned with the relationship between signifiers—like words, phrases, signs, and symbols—and what they stand for in reality, their denotation.

In computer science, specifically software engineering and hardware engineering, formal methods are a particular kind of mathematically based technique for the specification, development and verification of software and hardware systems. The use of formal methods for software and hardware design is motivated by the expectation that, as in other engineering disciplines, performing appropriate mathematical analysis can contribute to the reliability and robustness of a design.

Lexical functional grammar (LFG) is a constraint-based grammar framework in theoretical linguistics. It posits two separate levels of syntactic structure, a phrase structure grammar representation of word order and constituency, and a representation of grammatical functions such as subject and object, similar to dependency grammar. The development of the theory was initiated by Joan Bresnan and Ronald Kaplan in the 1970s, in reaction to the theory of transformational grammar which was current in the late 1970s. It mainly focuses on syntax, including its relation with morphology and semantics. There has been little LFG work on phonology.

Head-driven phrase structure grammar (HPSG) is a highly lexicalized, constraint-based grammar developed by Carl Pollard and Ivan Sag. It is a type of phrase structure grammar, as opposed to a dependency grammar, and it is the immediate successor to generalized phrase structure grammar. HPSG draws from other fields such as computer science and uses Ferdinand de Saussure's notion of the sign. It uses a uniform formalism and is organized in a modular way which makes it attractive for natural language processing.

In the context of hardware and software systems, formal verification is the act of proving or disproving the correctness of intended algorithms underlying a system with respect to a certain formal specification or property, using formal methods of mathematics.

Operational semantics is a category of formal programming language semantics in which certain desired properties of a program, such as correctness, safety or security, are verified by constructing proofs from logical statements about its execution and procedures, rather than by attaching mathematical meanings to its terms. Operational semantics are classified in two categories: structural operational semantics formally describe how the individual steps of a computation take place in a computer-based system; by contrast, natural semantics describe how the overall results of executions are obtained. Other approaches to providing a formal semantics of programming languages include axiomatic semantics and denotational semantics.

Parsing, syntax analysis, or syntactic analysis is the process of analysing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term parsing comes from Latin pars (orationis), meaning part.

A formal system is used to infer theorems from axioms according to a set of rules. These rules, which are used to carry out the inference of theorems from axioms, are known as the logical calculus of the formal system. A formal system is essentially an "axiomatic system". In 1921, David Hilbert proposed to use such a system as the foundation for knowledge in mathematics. A formal system may represent a well-defined system of abstract thought. Spinoza's Ethics imitates the form of Euclid's Elements: Spinoza employed Euclidean elements such as "axioms" or "primitive truths" and rules of inference so that a calculus could be built using them.

In mathematics, semantics, and philosophy of language, the principle of compositionality is the principle that the meaning of a complex expression is determined by the meanings of its constituent expressions and the rules used to combine them. This principle is also called Frege's principle, because Gottlob Frege is widely credited for the first modern formulation of it. However, the idea appears already among Indian philosophers of grammar such as Yāska, and also in Plato's work such as in Theaetetus. Moreover, the principle was never explicitly stated by Frege, and it was arguably already assumed by George Boole decades before Frege's work.

In linguistics, construction grammar groups a number of models of grammar that all subscribe to the idea that knowledge of a language is based on a collection of "form and function pairings". The "function" side covers what is commonly understood as meaning, content, or intent; it usually extends over both conventional fields of semantics and pragmatics.

In computer science, the syntax of a computer language is the set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language. This applies both to programming languages, where the document represents source code, and markup languages, where the document represents data. The syntax of a language defines its surface form. Text-based computer languages are based on sequences of characters, while visual programming languages are based on the spatial layout and connections between symbols. Documents that are syntactically invalid are said to have a syntax error.

Glue semantics, or simply Glue, is a linguistic theory of semantic composition and the syntax-semantics interface which assumes that meaning composition is constrained by a set of instructions stated within a formal logic. These instructions, called meaning constructors, state how the meanings of the parts of a sentence can be combined to provide the meaning of the sentence.

In foundations of mathematics, philosophy of mathematics, and philosophy of logic, formalism is a theory that holds that statements of mathematics and logic can be considered to be statements about the consequences of certain string manipulation rules.

The categorical abstract machine (CAM) is a model of computation for programs that preserves the abilities of applicative, functional, or compositional style. It is based on the techniques of applicative computing.

References

  1. Knuth, Donald E. "Memorial Resolution: Robert W. Floyd (1936–2001)" (PDF). Stanford University Faculty Memorials. Stanford Historical Society.
