In computer science, a linear grammar is a context-free grammar that has at most one nonterminal in the right-hand side of each of its productions.
A linear language is a language generated by some linear grammar.
An example of a linear grammar is G with N = {S}, Σ = {a, b}, P with start symbol S and rules
It generates the language .
Two special types of linear grammars are the following:
Each of these can describe exactly the regular languages. A regular grammar is a grammar that is left-linear or right-linear.
Observe that by inserting new nonterminals, any linear grammar can be replaced by an equivalent one where some of the rules are left-linear and some are right-linear. For instance, the rules of G above can be replaced with
However, the requirement that all rules be left-linear (or all rules be right-linear) leads to a strict decrease in the expressive power of linear grammars.
All regular languages are linear; conversely, an example of a linear, non-regular language is { anbn }. as explained above. All linear languages are context-free; conversely, an example of a context-free, non-linear language is the Dyck language of well-balanced bracket pairs. Hence, the regular languages are a proper subset of the linear languages, which in turn are a proper subset of the context-free languages.
While regular languages are deterministic, there exist linear languages that are nondeterministic. For example, the language of even-length palindromes on the alphabet of 0 and 1 has the linear grammar S → 0S0 | 1S1 | ε. An arbitrary string of this language cannot be parsed without reading all its letters first which means that a pushdown automaton has to try alternative state transitions to accommodate for the different possible lengths of a semi-parsed string. [1] This language is nondeterministic. Since nondeterministic context-free languages cannot be accepted in linear time [ clarification needed ], linear languages cannot be accepted in linear time in the general case. Furthermore, it is undecidable whether a given context-free language is a linear context-free language. [2]
A language is linear iff it can be generated by a one-turn pushdown automaton – a pushdown automaton that, once it starts popping, never pushes again.
Linear languages are closed under union. Construction is the same as the construction for the union of context-free languages. Let be two linear languages, then is constructed by a linear grammar with , and playing the role of the linear grammars for .
If L is a linear language and M is a regular language, then the intersection is again a linear language; in other words, the linear languages are closed under intersection with regular sets.
Linear languages are closed under homomorphism and inverse homomorphism. [3]
As a corollary, linear languages form a full trio. Full trios in general are language families that enjoy a couple of other desirable mathematical properties.
Linear languages are not closed under intersection. For example, let , then their intersection is not only not linear, but also not context-free. See pumping lemma for context-free languages.
As a corollary, linear languages are not closed under complement (as intersection can be constructed by de Morgan's laws out of union and complement).
The Chomsky hierarchy in the fields of formal language theory, computer science, and linguistics, is a containment hierarchy of classes of formal grammars. A formal grammar describes how to form strings from a language's vocabulary that are valid according to the language's syntax. The linguist Noam Chomsky theorized that four different classes of formal grammars existed that could generate increasingly complex languages. Each class can also completely generate the language of all inferior classes.
A context-sensitive grammar (CSG) is a formal grammar in which the left-hand sides and right-hand sides of any production rules may be surrounded by a context of terminal and nonterminal symbols. Context-sensitive grammars are more general than context-free grammars, in the sense that there are languages that can be described by a CSG but not by a context-free grammar. Context-sensitive grammars are less general than unrestricted grammars. Thus, CSGs are positioned between context-free and unrestricted grammars in the Chomsky hierarchy.
In formal language theory, a context-sensitive language is a language that can be defined by a context-sensitive grammar. Context-sensitive is known as type-1 in the Chomsky hierarchy of formal languages.
In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules can be applied to a nonterminal symbol regardless of its context. In particular, in a context-free grammar, each production rule is of the form
In formal language theory, a context-free language (CFL), also called a Chomsky type-2 language, is a language generated by a context-free grammar (CFG).
In formal language theory, a context-free grammar, G, is said to be in Chomsky normal form if all of its production rules are of the form:
In the theory of computation, a branch of theoretical computer science, a pushdown automaton (PDA) is a type of automaton that employs a stack.
In theoretical computer science and formal language theory, a regular language is a formal language that can be defined by a regular expression, in the strict sense in theoretical computer science.
In theoretical computer science and formal language theory, a regular grammar is a grammar that is right-regular or left-regular. While their exact definition varies from textbook to textbook, they all require that
In computer science, an LL parser is a top-down parser for a restricted context-free language. It parses the input from Left to right, performing Leftmost derivation of the sentence.
In computer science, an ambiguous grammar is a context-free grammar for which there exists a string that can have more than one leftmost derivation or parse tree. Every non-empty context-free language admits an ambiguous grammar by introducing e.g. a duplicate rule. A language that only admits ambiguous grammars is called an inherently ambiguous language. Deterministic context-free grammars are always unambiguous, and are an important subclass of unambiguous grammars; there are non-deterministic unambiguous grammars, however.
In computer science, a parsing expression grammar (PEG) is a type of analytic formal grammar, i.e. it describes a formal language in terms of a set of rules for recognizing strings in the language. The formalism was introduced by Bryan Ford in 2004 and is closely related to the family of top-down parsing languages introduced in the early 1970s. Syntactically, PEGs also look similar to context-free grammars (CFGs), but they have a different interpretation: the choice operator selects the first match in PEG, while it is ambiguous in CFG. This is closer to how string recognition tends to be done in practice, e.g. by a recursive descent parser.
Conjunctive grammars are a class of formal grammars studied in formal language theory. They extend the basic type of grammars, the context-free grammars, with a conjunction operation. Besides explicit conjunction, conjunctive grammars allow implicit disjunction represented by multiple rules for a single nonterminal symbol, which is the only logical connective expressible in context-free grammars. Conjunction can be used, in particular, to specify intersection of languages. A further extension of conjunctive grammars known as Boolean grammars additionally allows explicit negation.
In automata theory, a deterministic pushdown automaton is a variation of the pushdown automaton. The class of deterministic pushdown automata accepts the deterministic context-free languages, a proper subset of context-free languages.
In formal language theory, deterministic context-free languages (DCFL) are a proper subset of context-free languages. They are the context-free languages that can be accepted by a deterministic pushdown automaton. DCFLs are always unambiguous, meaning that they admit an unambiguous grammar. There are non-deterministic unambiguous CFLs, so DCFLs form a proper subset of unambiguous CFLs.
Indexed grammars are a generalization of context-free grammars in that nonterminals are equipped with lists of flags, or index symbols. The language produced by an indexed grammar is called an indexed language.
An embedded pushdown automaton or EPDA is a computational model for parsing languages generated by tree-adjoining grammars (TAGs). It is similar to the context-free grammar-parsing pushdown automaton, but instead of using a plain stack to store symbols, it has a stack of iterated stacks that store symbols, giving TAGs a generative capacity between context-free and context-sensitive grammars, or a subset of mildly context-sensitive grammars. Embedded pushdown automata should not be confused with nested stack automata which have more computational power.
A formal grammar describes which strings from an alphabet of a formal language are valid according to the language's syntax. A grammar does not describe the meaning of the strings or what can be done with them in whatever context—only their form. A formal grammar is defined as a set of production rules for such strings in a formal language.
In computer science, more specifically in automata and formal language theory, nested words are a concept proposed by Alur and Madhusudan as a joint generalization of words, as traditionally used for modelling linearly ordered structures, and of ordered unranked trees, as traditionally used for modelling hierarchical structures. Finite-state acceptors for nested words, so-called nested word automata, then give a more expressive generalization of finite automata on words. The linear encodings of languages accepted by finite nested word automata gives the class of visibly pushdown languages. The latter language class lies properly between the regular languages and the deterministic context-free languages. Since their introduction in 2004, these concepts have triggered much research in that area.
In formal language theory, an LL grammar is a context-free grammar that can be parsed by an LL parser, which parses the input from Left to right, and constructs a Leftmost derivation of the sentence. A language that has an LL grammar is known as an LL language. These form subsets of deterministic context-free grammars (DCFGs) and deterministic context-free languages (DCFLs), respectively. One says that a given grammar or language "is an LL grammar/language" or simply "is LL" to indicate that it is in this class.