Expressive power (computer science)

Last updated August 27, 2023

In computer science, the expressive power (also called expressiveness or expressivity) of a language is the breadth of ideas that can be represented and communicated in that language. The more expressive a language is, the greater the variety and quantity of ideas it can be used to represent.

For example, the Web Ontology Language expression language profile (OWL2 EL) lacks ideas (such as negation) that can be expressed in OWL2 RL (rule language). OWL2 EL may therefore be said to have less expressive power than OWL2 RL. These restrictions allow for more efficient (polynomial time) reasoning in OWL2 EL than in OWL2 RL. So OWL2 EL trades some expressive power for more efficient reasoning (processing of the knowledge representation language).^[1]

Information description

The term expressive power may be used with a range of meaning. It may mean a measure of the ideas expressible in that language:^[2]

regardless of ease (theoretical expressivity)
concisely and readily (practical expressivity)

The first sense dominates in areas of mathematics and logic that deal with the formal description of languages and their meaning, such as formal language theory, mathematical logic and process algebra.^[2]

In informal discussions, the term often refers to the second sense, or to both. This is often the case when discussing programming languages.^[3]^{[ page needed ]} Efforts have been made to formalize these informal uses of the term.^[4]

The notion of expressive power is always relative to a particular kind of thing that the language in question can describe, and the term is normally used when comparing languages that describe the same kind of things, or at least comparable kinds of things.^[4]

The design of languages and formalisms involves a trade-off between expressive power and analyzability. The more a formalism can express, the harder it becomes to understand what instances of the formalism say. Decision problems become harder to answer or completely undecidable.^[5]

Examples

In formal language theory

Formal language theory mostly studies formalisms to describe sets of strings, such as context-free grammars and regular expressions. Each instance of a formalism, e.g. each grammar and each regular expression, describes a particular set of strings. In this context, the expressive power of a formalism is the set of sets of strings its instances describe, and comparing expressive power is a matter of comparing these sets.

An important yardstick for describing the relative expressive power of formalisms in this area is the Chomsky hierarchy. It says, for instance, that regular expressions, nondeterministic finite automata and regular grammars have equal expressive power, while that of context-free grammars is greater; what this means is that the sets of sets of strings described by the first three formalisms are equal, and a proper subset of the set of sets of strings described by context-free grammars.

In this area, the cost of expressive power is a central topic of study. It is known, for instance, that deciding whether two arbitrary regular expressions describe the same set of strings is hard, while doing the same for arbitrary context-free grammars is completely impossible. However, it can still be efficiently decided whether any given string is in the set.

For more expressive formalisms, this problem can be harder, or even undecidable. For a Turing complete formalism, such as arbitrary formal grammars, not only this problem, but every nontrivial property regarding the set of strings they describe is undecidable, a fact known as Rice's Theorem.

There are some results on conciseness as well; for instance, nondeterministic finite automata and regular grammars are more concise than regular expressions, in the sense that the latter can be translated to the former without a blowup in size (i.e. in O(1)), while the reverse is not possible.

Similar considerations apply to formalisms that describe not sets of strings, but sets of trees (e.g. XML schema languages), of graphs, or other structures.

In database theory

Database theory is concerned, among other things, with database queries, e.g. formulas that, given the contents of a database, specify certain information to be extracted from it. In the predominant relational database paradigm, the contents of a database are described as a finite set of finite mathematical relations; Boolean queries, that always yield true or false, are formulated in first-order logic.

It turns out that first-order logic is lacking in expressive power: it cannot express certain types of Boolean queries, e.g. queries involving transitive closure.^[6] However, adding expressive power must be done with care: it must still remain possible to evaluate queries with reasonable efficiency, which is not the case, e.g., for second-order logic. Consequently, a literature sprang up in which many query languages and language constructs were compared on the basis of expressive power and efficiency, e.g. various versions of Datalog.^[7]

Similar considerations apply for query languages on other types of data, e.g. XML query languages such as XQuery.

Related Research Articles

<span class="mw-page-title-main">Chomsky hierarchy</span> Hierarchy of classes of formal grammars

The Chomsky hierarchy in the fields of formal language theory, computer science, and linguistics, is a containment hierarchy of classes of formal grammars. A formal grammar describes how to form strings from a language's vocabulary that are valid according to the language's syntax. Linguist Noam Chomsky theorized that four different classes of formal grammars existed that could generate increasingly complex languages. Each class can also completely generate the language of all inferior classes.

In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules can be applied to a nonterminal symbol regardless of its context. In particular, in a context-free grammar, each production rule is of the form

In logic, mathematics, computer science, and linguistics, a formal language consists of words whose letters are taken from an alphabet and are well-formed according to a specific set of rules.

A regular expression is a sequence of characters that specifies a match pattern in text. Usually such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation. Regular expression techniques are developed in theoretical computer science and formal language theory.

In theoretical computer science and formal language theory, a regular language is a formal language that can be defined by a regular expression, in the strict sense in theoretical computer science.

In theoretical computer science and formal language theory, a regular grammar is a grammar that is right-regular or left-regular. While their exact definition varies from textbook to textbook, they all require that

<span class="mw-page-title-main">Theory of computation</span> Academic subfield of computer science

In theoretical computer science and mathematics, the theory of computation is the branch that deals with what problems can be solved on a model of computation, using an algorithm, how efficiently they can be solved or to what degree. The field is divided into three major branches: automata theory and formal languages, computability theory, and computational complexity theory, which are linked by the question: "What are the fundamental capabilities and limitations of computers?".

A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer algorithm.

A formal system is an abstract structure used for inferring theorems from axioms according to a set of rules. These rules, which are used for carrying out the inference of theorems from axioms, are the logical calculus of the formal system. A formal system is essentially an "axiomatic system".

Computability is the ability to solve a problem in an effective manner. It is a key topic of the field of computability theory within mathematical logic and the theory of computation within computer science. The computability of a problem is closely linked to the existence of an algorithm to solve the problem.

In computer science, a parsing expression grammar (PEG) is a type of analytic formal grammar, i.e. it describes a formal language in terms of a set of rules for recognizing strings in the language. The formalism was introduced by Bryan Ford in 2004 and is closely related to the family of top-down parsing languages introduced in the early 1970s. Syntactically, PEGs also look similar to context-free grammars (CFGs), but they have a different interpretation: the choice operator selects the first match in PEG, while it is ambiguous in CFG. This is closer to how string recognition tends to be done in practice, e.g. by a recursive descent parser.

In computer science, a Van Wijngaarden grammar is a formalism for defining formal languages. The name derives from the formalism invented by Adriaan van Wijngaarden for the purpose of defining the ALGOL 68 programming language. The resulting specification remains its most notable application.

In computer science, a linear grammar is a context-free grammar that has at most one nonterminal in the right-hand side of each of its productions.

Conjunctive grammars are a class of formal grammars studied in formal language theory. They extend the basic type of grammars, the context-free grammars, with a conjunction operation. Besides explicit conjunction, conjunctive grammars allow implicit disjunction represented by multiple rules for a single nonterminal symbol, which is the only logical connective expressible in context-free grammars. Conjunction can be used, in particular, to specify intersection of languages. A further extension of conjunctive grammars known as Boolean grammars additionally allows explicit negation.

Grammar induction is the process in machine learning of learning a formal grammar from a set of observations, thus constructing a model which accounts for the characteristics of the observed objects. More generally, grammatical inference is that branch of machine learning where the instance space consists of discrete combinatorial objects such as strings, trees and graphs.

In automata theory, the class of unrestricted grammars is the most general class of grammars in the Chomsky hierarchy. No restrictions are made on the productions of an unrestricted grammar, other than each of their left-hand sides being non-empty. This grammar class can generate arbitrary recursively enumerable languages.

In theoretical computer science and formal language theory, a regular tree grammar is a formal grammar that describes a set of directed trees, or terms. A regular word grammar can be seen as a special kind of regular tree grammar, describing a set of single-path trees.

An affix grammar is a kind of formal grammar; it is used to describe the syntax of languages, mainly computer languages, using an approach based on how natural language is typically described.

In formal language theory, a grammar describes how to form strings from a language's alphabet that are valid according to the language's syntax. A grammar does not describe the meaning of the strings or what can be done with them in whatever context—only their form. A formal grammar is defined as a set of production rules for such strings in a formal language.

In computational learning theory, induction of regular languages refers to the task of learning a formal description of a regular language from a given set of example strings. Although E. Mark Gold has shown that not every regular language can be learned this way, approaches have been investigated for a variety of subclasses. They are sketched in this article. For learning of more general grammars, see Grammar induction.

References

↑ Grau, Bernardo Cuenca; Horrocks, Ian; Motik, Boris; Parsia, Bijan; Patel-Schneider, Peter; Sattler, Ulrike (2008). "OWL 2: The next step for OWL". Web Semantics: Science, Services and Agents on the World Wide Web. 6 (4): 309–322. doi:10.1016/j.websem.2008.05.001. ISSN 1570-8268.
1 2 Farmer, William (2007). "Chiron: A multi-paradigm logic". In R. Matuszewski; A. Zalewska (eds.). From Insight to Proof: Festschrift in Honour of Andrzej Trybulec. Studies in Logic, Grammar and Rhetoric. pp. 1–19. ISBN 978-83-7431-128-1.
↑ Structure and Interpretation of Computer Programs , by Abelson and Sussman
1 2 Felleisen, Matthias (1991-12-01). "On the expressive power of programming languages". Science of Computer Programming. 17 (1): 35–75. doi: 10.1016/0167-6423(91)90036-W .
↑ Okhotin, Alexander (December 2005). "Unresolved systems of language equations: Expressive power and decision problems". Theoretical Computer Science. 349 (3): 283–308. doi: 10.1016/j.tcs.2005.07.038 .
↑ Serge Abiteboul, Richard B. Hull, Victor Vianu: Foundations of Databases. Addison-Wesley, 1995.
↑ Evgeny Dantsin, Thomas Eiter, Georg Gottlob, and Andrei Voronkov: Complexity and expressive power of logic programming. ACM Comput. Surv. 33(3): 374-425 (2001).

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[grau.2008-1] Grau, Bernardo Cuenca; Horrocks, Ian; Motik, Boris; Parsia, Bijan; Patel-Schneider, Peter; Sattler, Ulrike (2008). "OWL 2: The next step for OWL". Web Semantics: Science, Services and Agents on the World Wide Web. 6 (4): 309–322. doi:10.1016/j.websem.2008.05.001. ISSN 1570-8268.

[farmer:2007-2] 1 2 Farmer, William (2007). "Chiron: A multi-paradigm logic". In R. Matuszewski; A. Zalewska (eds.). From Insight to Proof: Festschrift in Honour of Andrzej Trybulec. Studies in Logic, Grammar and Rhetoric. pp. 1–19. ISBN 978-83-7431-128-1.

[3] Structure and Interpretation of Computer Programs , by Abelson and Sussman

[felleisen-4] 1 2 Felleisen, Matthias (1991-12-01). "On the expressive power of programming languages". Science of Computer Programming. 17 (1): 35–75. doi: 10.1016/0167-6423(91)90036-W .

[5] Okhotin, Alexander (December 2005). "Unresolved systems of language equations: Expressive power and decision problems". Theoretical Computer Science. 349 (3): 283–308. doi: 10.1016/j.tcs.2005.07.038 .

[6] Serge Abiteboul, Richard B. Hull, Victor Vianu: Foundations of Databases. Addison-Wesley, 1995.

[7] Evgeny Dantsin, Thomas Eiter, Georg Gottlob, and Andrei Voronkov: Complexity and expressive power of logic programming. ACM Comput. Surv. 33(3): 374-425 (2001).

[1]

[2]

[3]

[4]

[5]

[6]

[7]

v t e Computer science
Note: This template roughly follows the 2012 ACM Computing Classification System.
Hardware	Printed circuit board Peripheral Integrated circuit Very Large Scale Integration Systems on Chip (SoCs) Energy consumption (Green computing) Electronic design automation Hardware acceleration
Computer systems organization	Computer architecture Embedded system Real-time computing Dependability
Networks	Network architecture Network protocol Network components Network scheduler Network performance evaluation Network service
Software organization	Interpreter Middleware Virtual machine Operating system Software quality
Software notations and tools	Programming paradigm Programming language Compiler Domain-specific language Modeling language Software framework Integrated development environment Software configuration management Software library Software repository
Software development	Control variable Software development process Requirements analysis Software design Software construction Software deployment Software engineering Software maintenance Programming team Open-source model
Theory of computation	Model of computation Formal language Automata theory Computability theory Computational complexity theory Logic Semantics
Algorithms	Algorithm design Analysis of algorithms Algorithmic efficiency Randomized algorithm Computational geometry
Mathematics of computing	Discrete mathematics Probability Statistics Mathematical software Information theory Mathematical analysis Numerical analysis Theoretical computer science
Information systems	Database management system Information storage systems Enterprise information system Social information systems Geographic information system Decision support system Process control system Multimedia information system Data mining Digital library Computing platform Digital marketing World Wide Web Information retrieval
Security	Cryptography Formal methods Security services Intrusion detection system Hardware security Network security Information security Application security
Human–computer interaction	Interaction design Social computing Ubiquitous computing Visualization Accessibility
Concurrency	Concurrent computing Parallel computing Distributed computing Multithreading Multiprocessing
Artificial intelligence	Natural language processing Knowledge representation and reasoning Computer vision Automated planning and scheduling Search methodology Control method Philosophy of artificial intelligence Distributed artificial intelligence
Machine learning	Supervised learning Unsupervised learning Reinforcement learning Multi-task learning Cross-validation
Graphics	Animation Rendering Photograph manipulation Graphics processing unit Mixed reality Virtual reality Image compression Solid modeling
Applied computing	E-commerce Enterprise software Computational mathematics Computational physics Computational chemistry Computational biology Computational social science Computational engineering Computational healthcare Digital art Electronic publishing Cyberwarfare Electronic voting Video games Word processing Operations research Educational technology Document management
Category Outline WikiProject Commons