Substitution (logic)

Last updated

A substitution is a syntactic transformation on formal expressions. To apply a substitution to an expression means to consistently replace its variable, or placeholder, symbols with other expressions.

Contents

The resulting expression is called a substitution instance, or instance for short, of the original expression.

Propositional logic

Definition

Where ψ and φ represent formulas of propositional logic, ψ is a substitution instance of φ if and only if ψ may be obtained from φ by substituting formulas for propositional variables in φ, replacing each occurrence of the same variable by an occurrence of the same formula. For example:

ψ: (R → S) & (T → S)

is a substitution instance of

φ: P & Q

That is, ψ can be obtained by replacing P and Q in φ with (R → S) and (T → S) respectively. Similarly:

ψ: (A ↔ A) ↔ (A ↔ A)

is a substitution instance of:

φ: (A ↔ A)

since ψ can be obtained by replacing each A in φ with (A ↔ A).

In some deduction systems for propositional logic, a new expression (a proposition) may be entered on a line of a derivation if it is a substitution instance of a previous line of the derivation. [1] [ failed verification ] This is how new lines are introduced in some axiomatic systems. In systems that use rules of transformation, a rule may include the use of a substitution instance for the purpose of introducing certain variables into a derivation.

Tautologies

A propositional formula is a tautology if it is true under every valuation (or interpretation) of its predicate symbols. If Φ is a tautology, and Θ is a substitution instance of Φ, then Θ is again a tautology. This fact implies the soundness of the deduction rule described in the previous section.

First-order logic

In first-order logic, a substitution is a total mapping σ: VT from variables to terms; many, [2] :73 [3] :445 but not all [4] :250 authors additionally require σ(x) = x for all but finitely many variables x. The notation { x1  t1, …, xk  tk } [note 1] refers to a substitution mapping each variable xi to the corresponding term ti, for i=1,…,k, and every other variable to itself; the xi must be pairwise distinct. Most authors additionally require each term ti to be syntactically different from xi, to avoid infinitely many distinct notations for the same substitution. Applying that substitution to a term t is written in postfix notation as t { x1  t1, ..., xk  tk }; it means to (simultaneously) replace every occurrence of each xi in t by ti. [note 2] The result of applying a substitution σ to a term t is called an instance of that term t. For example, applying the substitution { x  z, z  h(a,y) } to the term

f(z, a, g(x), y)  yields
f(h(a,y), a, g(z), y).

The domaindom(σ) of a substitution σ is commonly defined as the set of variables actually replaced, i.e. dom(σ) = { xV | x }. A substitution is called a ground substitution if it maps all variables of its domain to ground, i.e. variable-free, terms. The substitution instance of a ground substitution is a ground term if all of t's variables are in σ's domain, i.e. if vars(t) ⊆ dom(σ). A substitution σ is called a linear substitution if is a linear term for some (and hence every) linear term t containing precisely the variables of σ's domain, i.e. with vars(t) = dom(σ). A substitution σ is called a flat substitution if is a variable for every variable x. A substitution σ is called a renaming substitution if it is a permutation on the set of all variables. Like every permutation, a renaming substitution σ always has an inverse substitution σ−1, such that tσσ−1 = t = −1σ for every term t. However, it is not possible to define an inverse for an arbitrary substitution.

For example, { x  2, y  3+4 } is a ground substitution, { x  x1, y  y2+4 } is non-ground and non-flat, but linear, { x  y2, y  y2+4 } is non-linear and non-flat, { x  y2, y  y2 } is flat, but non-linear, { x  x1, y  y2 } is both linear and flat, but not a renaming, since it maps both y and y2 to y2; each of these substitutions has the set {x,y} as its domain. An example for a renaming substitution is { x  x1, x1  y, y  y2, y2  x }, it has the inverse { x  y2, y2  y, y  x1, x1  x }. The flat substitution { x  z, y  z } cannot have an inverse, since e.g. (x+y) { x  z, y  z } = z+z, and the latter term cannot be transformed back to x+y, as the information about the origin a z stems from is lost. The ground substitution { x  2 } cannot have an inverse due to a similar loss of origin information e.g. in (x+2) { x  2 } = 2+2, even if replacing constants by variables was allowed by some fictitious kind of "generalized substitutions".

Two substitutions are considered equal if they map each variable to syntactically equal result terms, formally: σ = τ if = for each variable xV. The composition of two substitutions σ = { x1  t1, …, xk  tk } and τ = { y1  u1, …, yl  ul } is obtained by removing from the substitution { x1  t1τ, …, xk  tkτ, y1  u1, …, yl  ul } those pairs yi  ui for which yi ∈ { x1, …, xk }. The composition of σ and τ is denoted by στ. Composition is an associative operation, and is compatible with substitution application, i.e. (ρσ)τ = ρ(στ), and ()τ = t(στ), respectively, for every substitutions ρ, σ, τ, and every term t. The identity substitution, which maps every variable to itself, is the neutral element of substitution composition. A substitution σ is called idempotent if σσ = σ, and hence tσσ = for every term t. When xiti for all i, the substitution { x1  t1, …, xk  tk } is idempotent if and only if none of the variables xi occurs in any tj. Substitution composition is not commutative, that is, στ may be different from τσ, even if σ and τ are idempotent. [2] :73–74 [3] :445–446

For example, { x  2, y  3+4 } is equal to { y  3+4, x  2 }, but different from { x  2, y  7 }. The substitution { x  y+y } is idempotent, e.g. ((x+y) {xy+y}) {xy+y} = ((y+y)+y) {xy+y} = (y+y)+y, while the substitution { x  x+y } is non-idempotent, e.g. ((x+y) {xx+y}) {xx+y} = ((x+y)+y) {xx+y} = ((x+y)+y)+y. An example for non-commuting substitutions is { x  y } { y  z } = { x  z, y  z }, but { y  z} { x  y} = { x  y, y  z }.

Mathematics

In mathematics, there are two common uses of substitution: substitution of variables for constants (also called assignment for that variable), and the substitution property of equality , [5] also called Leibniz's Law. [6]

Considering mathematics as a formal language, a variable is a symbol from an alphabet, usually a letter like x, y, and z, which denotes a range of possible values. [7] If a variable is free in a given expression or formula, then it can be replaced with any of the values in its range. [8] Certain kinds of bound variables can be substituted too. For instance, parameters of an expression (like the coefficients of a polynomial), or the argument of a function. Moreover, variables being univerally quantified can be replaced with any of the values in its range, and the result will a true statement. (This is called Universal instantiation)

For a non-formalized language, that is, in most mathematical texts outside of mathematical logic, for an individual expression it is not always possible to identify which variables are free and bound. For example, in , depending on the context, the variable  can be free and bound, or vice-versa, but they cannot both be free. Determining which value is assumed to be free depends on context and semantics.

The substitution property of equality, or Leibniz's Law (though the latter term is usually reserved for philosophical contexts), generally states that, if two things are equal, then any property of one, must be a property of the other. It can be formally stated in logical notation as:For every and , and any well-formed formula (with a free variable x). For example: For all real numbers a and b, if a = b, then a ≥ 0 implies b ≥ 0 (here, is x ≥ 0). This is a property which is most often used in algebra, especially in solving systems of equations, but is apllied in nearly every area of math that uses equality. This, taken together with the reflexive property of equality, forms the axioms of equality in first-order logic. [9]

Substitution is related to, but not identical to, function composition; it is closely related to β-reduction in lambda calculus. In contrast to these notions, however, the accent in algebra is on the preservation of algebraic structure by the substitution operation, the fact that substitution gives a homomorphism for the structure at hand (in the case of polynomials, the ring structure).[ citation needed ]

Algebra

Substitution is a basic operation in algebra, in particular in computer algebra. [10] [11]

A common case of substitution involves polynomials, where substitution of a numerical value (or another expression) for the indeterminate of a univariate polynomial amounts to evaluating the polynomial at that value. Indeed, this operation occurs so frequently that the notation for polynomials is often adapted to it; instead of designating a polynomial by a name like P, as one would do for other mathematical objects, one could define

so that substitution for X can be designated by replacement inside "P(X)", say

or

Substitution can also be applied to other kinds of formal objects built from symbols, for instance elements of free groups. In order for substitution to be defined, one needs an algebraic structure with an appropriate universal property, that asserts the existence of unique homomorphisms that send indeterminates to specific values; the substitution then amounts to finding the image of an element under such a homomorphism.

See also

Notes

  1. Some authors use [ t1/x1, …, tk/xk ] to denote that substitution, e.g. M. Wirsing (1990). Jan van Leeuwen (ed.). Algebraic Specification. Handbook of Theoretical Computer Science. Vol. B. Elsevier. pp. 675–788., here: p. 682.
  2. From a term algebra point of view, the set T of terms is the free term algebra over the set V of variables, hence for each substitution mapping σ: VT there is a unique homomorphism σ: TT that agrees with σ on VT; the above-defined application of σ to a term t is then viewed as applying the function σ to the argument t.

Citations

  1. Hunter, Geoffrey (1996) [1971]. Metalogic: An Introduction to the Metatheory of Standard First-Order Logic. University of California Press (published 1973). p. 118. ISBN   9780520023567. OCLC   36312727. ( accessible to patrons with print disabilities )
  2. 1 2 David A. Duffy (1991). Principles of Automated Theorem Proving. Wiley.
  3. 1 2 Franz Baader, Wayne Snyder (2001). Alan Robinson and Andrei Voronkov (ed.). Unification Theory (PDF). Elsevier. pp. 439–526. Archived from the original (PDF) on 2015-06-08. Retrieved 2014-09-24.
  4. N. Dershowitz; J.-P. Jouannaud (1990). "Rewrite Systems". In Jan van Leeuwen (ed.). Formal Models and Semantics. Handbook of Theoretical Computer Science. Vol. B. Elsevier. pp. 243–320.
  5. Sobolev, S.K. (originator). " Equality axioms". Encyclopedia of Mathematics . Springer. ISBN   1402006098.
  6. Deutsch, Harry and Pawel Garbacz, "Relative Identity", The Stanford Encyclopedia of Philosophy (Fall 2024 Edition), Edward N. Zalta & Uri Nodelman (eds.), forthcoming URL: https://plato.stanford.edu/entries/identity-relative/#StanAccoIden
  7. Sobolev, S.K. (originator). "Individual variable". Encyclopedia of Mathematics . Springer. ISBN   1402006098 . Retrieved 2024-09-05.
  8. Sobolev, S.K. (originator). Free variable. Encyclopedia of Mathematics . Springer. ISBN   1402006098.
  9. Fitting, M., First-Order Logic and Automated Theorem Proving (Berlin/Heidelberg: Springer, 1990), pp. 198–200.
  10. Margret H. Hoft; Hartmut F.W. Hoft (6 November 2002). Computing with Mathematica. Elsevier. ISBN   978-0-08-048855-4.
  11. Andre Heck (6 December 2012). Introduction to Maple . Springer Science & Business Media. ISBN   978-1-4684-0484-5. substitution.

Related Research Articles

An axiom, postulate, or assumption is a statement that is taken to be true, to serve as a premise or starting point for further reasoning and arguments. The word comes from the Ancient Greek word ἀξίωμα (axíōma), meaning 'that which is thought worthy or fit' or 'that which commends itself as evident'.

In mathematics, an associative algebraA over a commutative ring K is a ring A together with a ring homomorphism from K into the center of A. This is thus an algebraic structure with an addition, a multiplication, and a scalar multiplication. The addition and multiplication operations together give A the structure of a ring; the addition and scalar multiplication operations together give A the structure of a module or vector space over K. In this article we will also use the term K-algebra to mean an associative algebra over K. A standard first example of a K-algebra is a ring of square matrices over a commutative ring K, with the usual matrix multiplication.

First-order logic—also called predicate logic, predicate calculus, quantificational logic—is a collection of formal systems used in mathematics, philosophy, linguistics, and computer science. First-order logic uses quantified variables over non-logical objects, and allows the use of sentences that contain variables. Rather than propositions such as "all men are mortal", in first-order logic one can have expressions in the form "for all x, if x is a man, then x is mortal"; where "for all x" is a quantifier, x is a variable, and "... is a man" and "... is mortal" are predicates. This distinguishes it from propositional logic, which does not use quantifiers or relations; in this sense, propositional logic is the foundation of first-order logic.

<span class="mw-page-title-main">Original proof of Gödel's completeness theorem</span>

The proof of Gödel's completeness theorem given by Kurt Gödel in his doctoral dissertation of 1929 is not easy to read today; it uses concepts and formalisms that are no longer used and terminology that is often obscure. The version given below attempts to represent all the steps in the proof and all the important ideas faithfully, while restating the proof in the modern language of mathematical logic. This outline should not be considered a rigorous proof of the theorem.

In mathematical logic, model theory is the study of the relationship between formal theories, and their models. The aspects investigated include the number and size of models of a theory, the relationship of different models to each other, and their interaction with the formal language itself. In particular, model theorists also investigate the sets that can be defined in a model of a theory, and the relationship of such definable sets to each other. As a separate discipline, model theory goes back to Alfred Tarski, who first used the term "Theory of Models" in publication in 1954. Since the 1970s, the subject has been shaped decisively by Saharon Shelah's stability theory.

<span class="mw-page-title-main">Normal distribution</span> Probability distribution

In probability theory and statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is The parameter is the mean or expectation of the distribution, while the parameter is the variance. The standard deviation of the distribution is (sigma). A random variable with a Gaussian distribution is said to be normally distributed, and is called a normal deviate.

In mathematics, a partition of unity of a topological space is a set of continuous functions from to the unit interval [0,1] such that for every point :

In logic and computer science, specifically automated reasoning, unification is an algorithmic process of solving equations between symbolic expressions, each of the form Left-hand side = Right-hand side. For example, using x,y,z as variables, and taking f to be an uninterpreted function, the singleton equation set { f(1,y) = f(x,2) } is a syntactic first-order unification problem that has the substitution { x ↦ 1, y ↦ 2 } as its only solution.

In mathematics, the discriminant of a polynomial is a quantity that depends on the coefficients and allows deducing some properties of the roots without computing them. More precisely, it is a polynomial function of the coefficients of the original polynomial. The discriminant is widely used in polynomial factoring, number theory, and algebraic geometry.

In mathematics, a Diophantine equation is an equation of the form P(x1, ..., xj, y1, ..., yk) = 0 (usually abbreviated P(x, y) = 0) where P(x, y) is a polynomial with integer coefficients, where x1, ..., xj indicate parameters and y1, ..., yk indicate unknowns.

In algebra and in particular in algebraic combinatorics, the ring of symmetric functions is a specific limit of the rings of symmetric polynomials in n indeterminates, as n goes to infinity. This ring serves as universal structure in which relations between symmetric polynomials can be expressed in a way independent of the number n of indeterminates. Among other things, this ring plays an important role in the representation theory of the symmetric group.

<span class="mw-page-title-main">Expression (mathematics)</span> Symbolic description of a mathematical object

In mathematics, an expression is a written arrangement of symbols following the context-dependent, syntactic conventions of mathematical notation. Symbols can denote numbers, variables, operations, and functions. Other symbols include punctuation marks and brackets, used for grouping where there is not a well-defined order of operations.

<span class="mw-page-title-main">Square (algebra)</span> Product of a number by itself

In mathematics, a square is the result of multiplying a number by itself. The verb "to square" is used to denote this operation. Squaring is the same as raising to the power 2, and is denoted by a superscript 2; for instance, the square of 3 may be written as 32, which is the number 9. In some cases when superscripts are not available, as for instance in programming languages or plain text files, the notations x^2 (caret) or x**2 may be used in place of x2. The adjective which corresponds to squaring is quadratic.

In computability theory, the μ-operator, minimization operator, or unbounded search operator searches for the least natural number with a given property. Adding the μ-operator to the primitive recursive functions makes it possible to define all computable functions.

In model theory and related areas of mathematics, a type is an object that describes how a element or finite collection of elements in a mathematical structure might behave. More precisely, it is a set of first-order formulas in a language L with free variables x1, x2,..., xn that are true of a set of n-tuples of an L-structure . Depending on the context, types can be complete or partial and they may use a fixed set of constants, A, from the structure . The question of which types represent actual elements of leads to the ideas of saturated models and omitting types.

In abstract algebraic logic, a branch of mathematical logic, the Leibniz operator is a tool used to classify deductive systems, which have a precise technical definition and capture a large number of logics. The Leibniz operator was introduced by Wim Blok and Don Pigozzi, two of the founders of the field, as a means to abstract the well-known Lindenbaum–Tarski process, that leads to the association of Boolean algebras to classical propositional calculus, and make it applicable to as wide a variety of sentential logics as possible. It is an operator that assigns to a given theory of a given sentential logic, perceived as a term algebra with a consequence operation on its universe, the largest congruence on the algebra that is compatible with the theory.

In number theory, the classical modular curve is an irreducible plane algebraic curve given by an equation

In mathematical physics, spacetime algebra (STA) is the application of Clifford algebra Cl1,3(R), or equivalently the geometric algebra G(M4) to physics. Spacetime algebra provides a "unified, coordinate-free formulation for all of relativistic physics, including the Dirac equation, Maxwell equation and General Relativity" and "reduces the mathematical divide between classical, quantum and relativistic physics."

An interpretation is an assignment of meaning to the symbols of a formal language. Many formal languages used in mathematics, logic, and theoretical computer science are defined in solely syntactic terms, and as such do not have any meaning until they are given some interpretation. The general study of interpretations of formal languages is called formal semantics.

Anti-unification is the process of constructing a generalization common to two given symbolic expressions. As in unification, several frameworks are distinguished depending on which expressions are allowed, and which expressions are considered equal. If variables representing functions are allowed in an expression, the process is called "higher-order anti-unification", otherwise "first-order anti-unification". If the generalization is required to have an instance literally equal to each input expression, the process is called "syntactical anti-unification", otherwise "E-anti-unification", or "anti-unification modulo theory".

References