Ogden's lemma

Last updated May 25, 2024

In the theory of formal languages, Ogden's lemma (named after William F. Ogden)^[1] is a generalization of the pumping lemma for context-free languages.

Despite Ogden's lemma being a strengthening of the pumping lemma, it is insufficient to fully characterize the class of context-free languages.^[2] This is in contrast to the Myhill-Nerode theorem, which unlike the pumping lemma for regular languages is a necessary and sufficient condition for regularity.

Statement

Ogden's lemma — If a language $L$ is generated by a context-free grammar, then there exists some $p\geq 1$ such that $\forall w\in L$ with length $\geq p$ , and any way of marking $\geq p$ positions of $w$ as "marked", there exists a nonterminal $A$ and a way to split $w$ into 5 segments $uxzyv$ , such that

S\Rightarrow ^{*}uAv\Rightarrow ^{*}uxAyv\Rightarrow ^{*}uxzyv

$z$ contains at least one marked position.

$xzy$ contains at most $p$ marked positions.

$u,x$ both contain marked positions, or $y,v$ both contain marked positions.

We will use underlines to indicate "marked" positions.

Special cases

Ogden's lemma is often stated in the following form, which can be obtained by "forgetting about" the grammar, and concentrating on the language itself: If a language $L$ is context-free, then there exists some number $p\geq 1$ (where $p$ may or may not be a pumping length) such that for any string $s$ of length at least $p$ in $L$ and every way of "marking" $p$ or more of the positions in $s$ , $s$ can be written as

s=uvwxy

with strings $u, v, w, x,$ and $y$ , such that

$vx$ has at least one marked position,
$vwx$ has at most $p$ marked positions, and
$uv^{n}wx^{n}y\in L$ for all $n\geq 0$ .

In the special case where every position is marked, Ogden's lemma is equivalent to the pumping lemma for context-free languages. Ogden's lemma can be used to show that certain languages are not context-free in cases where the pumping lemma is not sufficient. An example is the language $\{a^{i}b^{j}c^{k}d^{l}:i=0{\text{ or }}j=k=l\}$ .

Example applications

Non-context-freeness

The special case of Ogden's lemma is often sufficient to prove some languages are not context-free. For example, $\{a^{m}b^{n}c^{m}d^{n}|m,n\geq 1\}$ is a standard example of non-context-free language,^[3]

Proof

Suppose the language is generated by a context-free grammar, then let $p$ be the length required in Ogden's lemma, then consider the word $a^{p}{\underline {b^{p}c^{p}}}d^{p}$ in the language. Then the three conditions implied by Ogden's lemma cannot all be satisfied.

Similarly, one can prove the "copy twice" language $L=\{w^{2}|w\in \{a,b\}^{*}\}$ is not context-free, by using Ogden's lemma on $a^{2p}{\underline {b^{2p}}}a^{2p}b^{2p}$ .

And the given example last section $\{a^{i}b^{j}c^{k}d^{l}:i=0{\text{ or }}j=k=l\}$ is not context-free by using Ogden's lemma on $ab^{2p}{\underline {c^{2p}}}d^{2p}$ .

Inherent ambiguity

Ogden's lemma can be used to prove the inherent ambiguity of some languages, which is implied by the title of Ogden's paper.

Example: Let $L_{0}=\{a^{n}b^{m}c^{m}|m,n\geq 1\},L_{1}=\{a^{m}b^{m}c^{n}|m,n\geq 1\}$ . The language $L=L_{0}\cup L_{1}$ is inherently ambiguous. (Example from page 3 of Ogden's paper.)

Proof

Let $p$ be the pumping length needed for Ogden's lemma, and apply it to the sentence $a^{p!+p}{\underline {b^{p}c^{p}}}$ .

By routine checking of the conditions of Ogden's lemma, we find that the derivation is

S\Rightarrow ^{*}uAv\Rightarrow ^{*}uxAyv\Rightarrow ^{*}uxzyv

where $u=a^{p!+p}b^{p-s-k},x=b^{k},z=b^{s}c^{s'},y=c^{k},v=c^{p-s'-k}$ , satisfying $s+s'\geq 1$ and $k\geq 1$ and $p\geq s+s'+2k$ .

Thus, we obtain a derivation of $a^{p!+p}b^{p!+p}c^{p!+p}$ by interpolating the derivation with $p!/k$ copies of $A\Rightarrow ^{*}xAy$ . According to this derivation, an entire sub-sentence $x^{p!/k+1}zy^{p!/k+1}=b^{p!+k+s}c^{p!+k+s'}$ is the descendent of one node $A$ in the derivation tree.

Symmetrically, we can obtain another derivation of $a^{p!+p}b^{p!+p}c^{p!+p}$ , according to which there is an entire sub-sentence $a^{p!+k''+s''}b^{p!+k''+s'''}$ being the descendent of one node in the derivation tree.

Since $(p!+k+s)+(p!+k''+s''')>2p!+1>p!+p$ , the two sub-sentences have nonempty intersection, and since neither contains the other, the two derivation trees are different.

Similarly, $L^{*}$ is inherently ambiguous, and for any CFG of the language, letting $p$ be the constant for Ogden's lemma, we find that $(a^{p!+p}b^{p!+p}c^{p!+p})^{n}$ has at least $2^{n}$ different parses. Thus $L^{*}$ has an unbounded degree of inherent ambiguity.

Undecidability

The proof can be extended to show that deciding whether a CFG is inherently ambiguous is undecidable, by reduction to the Post correspondence problem. It can also show that deciding whether a CFG has an unbounded degree of inherent ambiguity is undecidable. (page 4 of Ogden's paper)

Construction

Given any Post correspondence problem over binary strings, we reduce it to a decision problem over a CFG.

Given any two lists of binary strings $\xi _{1},...,\xi _{n}$ and $\eta _{1},...,\eta _{n}$ , rewrite the binary alphabet to $\{d,e\}$ .

Let $L\left(\xi _{1},\cdots ,\xi _{n}\right)$ be the language over alphabet $\{d,e,f\}$ , generated by the CFG with rules $S\to \xi _{i}Sd^{i}e|f$ for every $i=1,...,n$ . Similarly define $L\left(\eta _{1},\cdots ,\eta _{n}\right)$ .

Now, by the same argument as above, the language $(L_{0}\cdot L\left(\xi _{1},\cdots ,\xi _{n}\right))\cup (L_{1}\cdot L\left(\eta _{1},\cdots ,\eta _{n}\right))$ is inherently ambiguous iff the Post correspondence problem has a solution.

And the language $((L_{0}\cdot L\left(\xi _{1},\cdots ,\xi _{n}\right))\cup (L_{1}\cdot L\left(\eta _{1},\cdots ,\eta _{n}\right)))^{*}$ has an unbounded degree of inherent ambiguity iff the Post correspondence problem has a solution.

Generalized condition

Bader and Moura have generalized the lemma^[4] to allow marking some positions that are not to be included in $vx$ . Their dependence of the parameters was later improved by Dömösi and Kudlek.^[5] If we denote the number of such excluded positions by $e$ , then the number $d$ of marked positions of which we want to include some in $vx$ must satisfy $d\geq p(e+1)$ , where $p$ is some constant that depends only on the language. The statement becomes that every $s$ can be written as

s=uvwxy

with strings $u, v, w, x,$ and $y$ , such that

$vx$ has at least one marked position and no excluded position,
$vwx$ has at most $p^{(e+1)}$ marked positions, and
$uv^{n}wx^{n}y\in L$ for all $n\geq 0$ .

Moreover, either each of $u,v,w$ has a marked position, or each of $w,x,y$ has a marked position.

Related Research Articles

In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules can be applied to a nonterminal symbol regardless of its context. In particular, in a context-free grammar, each production rule is of the form

In mathematics, the Hodge star operator or Hodge star is a linear map defined on the exterior algebra of a finite-dimensional oriented vector space endowed with a nondegenerate symmetric bilinear form. Applying the operator to an element of the algebra produces the Hodge dual of the element. This map was introduced by W. V. D. Hodge.

In mathematics, and especially in algebraic geometry, the intersection number generalizes the intuitive notion of counting the number of times two curves intersect to higher dimensions, multiple curves, and accounting properly for tangency. One needs a definition of intersection number in order to state results like Bézout's theorem.

In mathematics, the Poincaré lemma gives a sufficient condition for a closed differential form to be exact. Precisely, it states that every closed p-form on an open ball in Rⁿ is exact for p with 1 ≤ p ≤ n. The lemma was introduced by Henri Poincaré in 1886.

In classical mechanics, the Laplace–Runge–Lenz (LRL) vector is a vector used chiefly to describe the shape and orientation of the orbit of one astronomical body around another, such as a binary star or a planet revolving around a star. For two bodies interacting by Newtonian gravity, the LRL vector is a constant of motion, meaning that it is the same no matter where it is calculated on the orbit; equivalently, the LRL vector is said to be conserved. More generally, the LRL vector is conserved in all problems in which two bodies interact by a central force that varies as the inverse square of the distance between them; such problems are called Kepler problems.

In functional analysis, the Friedrichs extension is a canonical self-adjoint extension of a non-negative densely defined symmetric operator. It is named after the mathematician Kurt Friedrichs. This extension is particularly useful in situations where an operator may fail to be essentially self-adjoint or whose essential self-adjointness is difficult to show.

In computer science, in particular in formal language theory, the pumping lemma for context-free languages, also known as the Bar-Hillel lemma, is a lemma that gives a property shared by all context-free languages and generalizes the pumping lemma for regular languages.

In convex analysis and the calculus of variations, both branches of mathematics, a pseudoconvex function is a function that behaves like a convex function with respect to finding its local minima, but need not actually be convex. Informally, a differentiable function is pseudoconvex if it is increasing in any direction where it has a positive directional derivative. The property must hold in all of the function domain, and not only for nearby points.

<span class="mw-page-title-main">Classical group</span>

In mathematics, the classical groups are defined as the special linear groups over the reals $, the complex numbers and the quaternions together with special automorphism groups of symmetric or skew-symmetric bilinear forms and Hermitian or skew-Hermitian sesquilinear forms defined on real, complex and quaternionic finite-dimensional vector spaces. Of these, the complex classical Lie groups are four infinite families of Lie groups that together with the exceptional groups exhaust the classification of simple Lie groups. The compact classical groups are compact real forms of the complex classical groups. The finite analogues of the classical groups are the classical groups of Lie type . The term "classical group" was coined by Hermann Weyl, it being the title of his 1939 monograph The Classical Groups .$

In mathematics and mathematical physics, raising and lowering indices are operations on tensors which change their type. Raising and lowering indices are a form of index manipulation in tensor expressions.

<span class="mw-page-title-main">Peridynamics</span>

Peridynamics is a non-local formulation of continuum mechanics that is oriented toward deformations with discontinuities, especially fractures. Originally, bond-based peridynamic has been introduced, wherein, internal interaction forces between a material point and all the other ones with which it can interact, are modeled as a central forces field. This type of force fields can be imagined as a mesh of bonds connecting each point of the body with every other interacting point within a certain distance which depends on material property, called peridynamic horizon. Later, to overcome bond-based framework limitations for the material Poisson’s ratio, state-base peridynamics, has been formulated. Its characteristic feature is that the force exchanged between a point and another one is influenced by the deformation state of all other bonds relative to its interaction zone.

In mathematics, the spectral theory of ordinary differential equations is the part of spectral theory concerned with the determination of the spectrum and eigenfunction expansion associated with a linear ordinary differential equation. In his dissertation, Hermann Weyl generalized the classical Sturm–Liouville theory on a finite closed interval to second order differential operators with singularities at the endpoints of the interval, possibly semi-infinite or infinite. Unlike the classical case, the spectrum may no longer consist of just a countable set of eigenvalues, but may also contain a continuous part. In this case the eigenfunction expansion involves an integral over the continuous part with respect to a spectral measure, given by the Titchmarsh–Kodaira formula. The theory was put in its final simplified form for singular differential equations of even degree by Kodaira and others, using von Neumann's spectral theorem. It has had important applications in quantum mechanics, operator theory and harmonic analysis on semisimple Lie groups.

In fracture mechanics, the energy release rate, $, is the rate at which energy is transformed as a material undergoes fracture. Mathematically, the energy release rate is expressed as the decrease in total potential energy per increase in fracture surface area, and is thus expressed in terms of energy per unit area. Various energy balances can be constructed relating the energy released during fracture to the energy of the resulting new surface, as well as other dissipative processes such as plasticity and heat generation. The energy release rate is central to the field of fracture mechanics when solving problems and estimating material properties related to fracture and fatigue.$

In fluid dynamics, a cnoidal wave is a nonlinear and exact periodic wave solution of the Korteweg–de Vries equation. These solutions are in terms of the Jacobi elliptic function cn, which is why they are called cnoidal waves. They are used to describe surface gravity waves of fairly long wavelength, as compared to the water depth.

The uncertainty theory invented by Baoding Liu is a branch of mathematics based on normality, monotonicity, self-duality, countable subadditivity, and product measure axioms.

In mathematics, particularly linear algebra, the Schur–Horn theorem, named after Issai Schur and Alfred Horn, characterizes the diagonal of a Hermitian matrix with given eigenvalues. It has inspired investigations and substantial generalizations in the setting of symplectic geometry. A few important generalizations are Kostant's convexity theorem, Atiyah–Guillemin–Sternberg convexity theorem, Kirwan convexity theorem.

Parikh's theorem in theoretical computer science says that if one looks only at the number of occurrences of each terminal symbol in a context-free language, without regard to their order, then the language is indistinguishable from a regular language. It is useful for deciding that strings with a given number of terminals are not accepted by a context-free grammar. It was first proved by Rohit Parikh in 1961 and republished in 1966.

The invariant extended Kalman filter (IEKF) (not to be confused with the iterated extended Kalman filter) was first introduced as a version of the extended Kalman filter (EKF) for nonlinear systems possessing symmetries (or invariances), then generalized and recast as an adaptation to Lie groups of the linear Kalman filtering theory. Instead of using a linear correction term based on a linear output error, the IEKF uses a geometrically adapted correction term based on an invariant output error; in the same way the gain matrix is not updated from a linear state error, but from an invariant state error. The main benefit is that the gain and covariance equations have reduced dependence on the estimated value of the state. In some cases they converge to constant values on a much bigger set of trajectories than is the case for the EKF, which results in a better convergence of the estimation.

In the mathematical theory of probability, the voter model is an interacting particle system introduced by Richard A. Holley and Thomas M. Liggett in 1975.

<span class="mw-page-title-main">General Concept Lattice</span>

The General Concept Lattice (GCL) proposes a novel general construction of concept hierarchy from formal context, where the conventional Formal Concept Lattice based on Formal Concept Analysis (FCA) only serves as a substructure.

References

↑ Ogden, William (September 1968). "A helpful result for proving inherent ambiguity". Mathematical Systems Theory. 2 (3): 191–194. doi:10.1007/bf01694004. ISSN 0025-5661. S2CID 13197551.
↑ Kracht, Marcus (2004). Too Many Languages Satisfy Ogden’s Lemma (PDF). Proceedings of the 27th Pennsylvania Linguistics Colloquium. Philadelphia. pp. 115–121. Retrieved 16 May 2024.
↑ Hopcroft, John E. (1979). Introduction to automata theory, languages, and computation. Jeffrey D. Ullman. Reading, Mass.: Addison-Wesley. p. 128. ISBN 0-201-02988-X. OCLC 4549363.
↑ Bader, Christopher; Moura, Arnaldo (April 1982). "A Generalization of Ogden's Lemma". Applied Mathematics and Computation. 29 (2): 404–407. doi: 10.1145/322307.322315 . S2CID 33988796.
↑ Dömösi, Pál; Kudlek, Manfred (1999), "Strong iteration lemmata for regular, linear, context-free, and linear indexed languages", Fundamentals of Computation Theory, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 226–233, doi:10.1007/3-540-48321-7_18, ISBN 978-3-540-66412-3 , retrieved 2023-02-26

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Ogden, William (September 1968). "A helpful result for proving inherent ambiguity". Mathematical Systems Theory. 2 (3): 191–194. doi:10.1007/bf01694004. ISSN 0025-5661. S2CID 13197551.

[2] Kracht, Marcus (2004). Too Many Languages Satisfy Ogden’s Lemma (PDF). Proceedings of the 27th Pennsylvania Linguistics Colloquium. Philadelphia. pp. 115–121. Retrieved 16 May 2024.

[3] Hopcroft, John E. (1979). Introduction to automata theory, languages, and computation. Jeffrey D. Ullman. Reading, Mass.: Addison-Wesley. p. 128. ISBN 0-201-02988-X. OCLC 4549363.

[4] Bader, Christopher; Moura, Arnaldo (April 1982). "A Generalization of Ogden's Lemma". Applied Mathematics and Computation. 29 (2): 404–407. doi: 10.1145/322307.322315 . S2CID 33988796.

[5] Dömösi, Pál; Kudlek, Manfred (1999), "Strong iteration lemmata for regular, linear, context-free, and linear indexed languages", Fundamentals of Computation Theory, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 226–233, doi:10.1007/3-540-48321-7_18, ISBN 978-3-540-66412-3 , retrieved 2023-02-26

[1]

[2]

[3]

[4]

[5]