Simple precedence parser

Last updated October 22, 2022

In computer science, a simple precedence parser is a type of bottom-up parser for context-free grammars that can be used only with simple precedence grammars.

Implementation

Compute the Wirth–Weber precedence relationship table for a grammar with initial symbol S.
Initialize a stack with the starting marker $.
Append an ending marker $ to the string being parsed (Input).
Until Stack equals "$ S" and Input equals "$"
- Search the table for the relationship between Top(stack) and NextToken(Input)
- if the relationship is ⋖ or ≐
  - Shift:
  - Push(Stack, relationship)
  - Push(Stack, NextToken(Input))
  - RemoveNextToken(Input)
- if the relationship is ⋗
  - Reduce:
  - SearchProductionToReduce(Stack)
  - Remove the Pivot from the Stack
  - Search the table for the relationship between the symbol now on top of the stack and the nonterminal on the left side of the production
  - Push(Stack, relationship)
  - Push(Stack, Nonterminal)
- if the table has no relationship for the pair, reject the input (syntax error)

SearchProductionToReduce (Stack)

Find the topmost ⋖ in the stack; this and all the symbols above it are the Pivot.
Find the production of the grammar which has the Pivot as its right side.
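
As a rough illustration of this step, the following Python sketch assumes the stack is a list that alternates symbols and relationships (for instance "$", "⋖", "T", "≐", "*", "≐", "F") and that productions are given as (left side, right side) pairs. The function signature, the stack layout and the productions argument are assumptions made for this sketch, not part of any particular implementation.

```python
def search_production_to_reduce(stack, productions):
    """Return (index of the topmost ⋖, left side of the matching production).

    `stack` alternates symbols and relationships, starting with "$".
    `productions` is a list of (left side, right-side tuple) pairs.
    """
    if "⋖" not in stack:
        raise SyntaxError("no ⋖ on the stack, nothing to reduce")
    # Index of the topmost (last) ⋖ on the stack.
    pivot_start = len(stack) - 1 - stack[::-1].index("⋖")
    # The symbols of the Pivot: every second entry above that ⋖,
    # which skips the ≐ markers in between.
    pivot_symbols = tuple(stack[pivot_start + 1::2])
    for left, right in productions:
        if tuple(right) == pivot_symbols:
            return pivot_start, left
    raise SyntaxError(f"no production with right side {pivot_symbols}")


# Usage with two productions of an expression grammar (illustrative only).
productions = [("T", ("T", "*", "F")), ("T", ("F",))]
stack = ["$", "⋖", "T", "≐", "*", "≐", "F"]
print(search_production_to_reduce(stack, productions))  # prints (1, 'T')
```

A caller could then delete everything from the returned index upward and push the returned nonterminal with its new relationship.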

Example

Given the following grammar, which generates arithmetic expressions with the multiplication and addition operations:

E  --> E + T' | T'
T' --> T
T  --> T * F | F
F  --> ( E' ) | num
E' --> E

num is a terminal, and the lexer recognizes any integer as num; E represents an arithmetic expression, T is a term and F is a factor.
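
A lexer of this kind might look like the following Python sketch. The function name tokenize and the (kind, value) token layout are assumptions chosen for illustration; the source only states that integers are recognized as num.

```python
import re


def tokenize(text):
    """Split `text` into (kind, value) pairs; every integer becomes kind "num"."""
    tokens = []
    # A lexeme is either a run of ASCII digits or a single non-space character.
    for lexeme in re.findall(r"[0-9]+|\S", text):
        if lexeme[0] in "0123456789":
            tokens.append(("num", int(lexeme)))
        elif lexeme in "+*()":
            tokens.append((lexeme, None))
        else:
            raise ValueError(f"unexpected character {lexeme!r}")
    return tokens


# The kinds are what a parser for this grammar would consume.
print([kind for kind, _ in tokenize("2 * ( 1 + 3 )")])
```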

The parsing table for this grammar is:

	E	E'	T	T'	F	+	*	(	)	num	$
E						≐			⋗
E'									≐
T						⋗	≐		⋗		⋗
T'						⋗			⋗		⋗
F						⋗	⋗		⋗		⋗
+			⋖	≐	⋖			⋖		⋖
*					≐			⋖		⋖
(	⋖	≐	⋖	⋖	⋖			⋖		⋖
)						⋗	⋗		⋗		⋗
num						⋗	⋗		⋗		⋗
$	⋖		⋖	⋖	⋖			⋖		⋖

STACK                         PRECEDENCE  INPUT            ACTION
$                             ⋖           2 * ( 1 + 3 )$   SHIFT
$ ⋖ 2                         ⋗           * ( 1 + 3 )$     REDUCE (F -> num)
$ ⋖ F                         ⋗           * ( 1 + 3 )$     REDUCE (T -> F)
$ ⋖ T                         ≐           * ( 1 + 3 )$     SHIFT
$ ⋖ T ≐ *                     ⋖           ( 1 + 3 )$       SHIFT
$ ⋖ T ≐ * ⋖ (                 ⋖           1 + 3 )$         SHIFT
$ ⋖ T ≐ * ⋖ ( ⋖ 1             ⋗           + 3 )$           REDUCE 4× (F -> num) (T -> F) (T' -> T) (E -> T')
$ ⋖ T ≐ * ⋖ ( ⋖ E             ≐           + 3 )$           SHIFT
$ ⋖ T ≐ * ⋖ ( ⋖ E ≐ +         ⋖           3 )$             SHIFT
$ ⋖ T ≐ * ⋖ ( ⋖ E ≐ + ⋖ 3     ⋗           )$               REDUCE 3× (F -> num) (T -> F) (T' -> T)
$ ⋖ T ≐ * ⋖ ( ⋖ E ≐ + ≐ T'    ⋗           )$               REDUCE 2× (E -> E + T') (E' -> E)
$ ⋖ T ≐ * ⋖ ( ≐ E'            ≐           )$               SHIFT
$ ⋖ T ≐ * ⋖ ( ≐ E' ≐ )        ⋗           $                REDUCE (F -> ( E' ))
$ ⋖ T ≐ * ≐ F                 ⋗           $                REDUCE (T -> T * F)
$ ⋖ T                         ⋗           $                REDUCE 2× (T' -> T) (E -> T')
$ ⋖ E                                     $                ACCEPT
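
The trace above can be reproduced with a short table-driven program. The following Python sketch is one possible encoding under stated assumptions, not a reference implementation: the names TABLE, PRODUCTIONS and parse are invented here, the table is stored as nested dictionaries, and the input is assumed to be already tokenized with every integer replaced by num.

```python
LT, EQ, GT = "⋖", "≐", "⋗"

# Precedence table of the example: TABLE[left][right] is the relationship.
TABLE = {
    "E":   {"+": EQ, ")": GT},
    "E'":  {")": EQ},
    "T":   {"+": GT, "*": EQ, ")": GT, "$": GT},
    "T'":  {"+": GT, ")": GT, "$": GT},
    "F":   {"+": GT, "*": GT, ")": GT, "$": GT},
    "+":   {"T": LT, "T'": EQ, "F": LT, "(": LT, "num": LT},
    "*":   {"F": EQ, "(": LT, "num": LT},
    "(":   {"E": LT, "E'": EQ, "T": LT, "T'": LT, "F": LT, "(": LT, "num": LT},
    ")":   {"+": GT, "*": GT, ")": GT, "$": GT},
    "num": {"+": GT, "*": GT, ")": GT, "$": GT},
    "$":   {"E": LT, "T": LT, "T'": LT, "F": LT, "(": LT, "num": LT},
}

# Right side -> left side; every right side occurs once in this grammar.
PRODUCTIONS = {
    ("E", "+", "T'"): "E",
    ("T'",): "E",
    ("T",): "T'",
    ("T", "*", "F"): "T",
    ("F",): "T",
    ("(", "E'", ")"): "F",
    ("num",): "F",
    ("E",): "E'",
}


def parse(tokens, start_symbol="E"):
    """Return the reductions as (left side, right side) pairs, in order."""
    stack = ["$"]                      # alternates symbols and relationships
    tokens = list(tokens) + ["$"]      # append the ending marker
    pos = 0
    reductions = []
    # Accept when the stack holds only "$ S" and the input is just "$".
    while not (stack[0::2] == ["$", start_symbol] and tokens[pos] == "$"):
        top, nxt = stack[-1], tokens[pos]
        rel = TABLE.get(top, {}).get(nxt)
        if rel in (LT, EQ):            # shift
            stack += [rel, nxt]
            pos += 1
        elif rel == GT:                # reduce
            if LT not in stack:
                raise SyntaxError("no ⋖ on the stack, nothing to reduce")
            pivot_start = len(stack) - 1 - stack[::-1].index(LT)
            pivot = tuple(stack[pivot_start + 1::2])
            left = PRODUCTIONS.get(pivot)
            if left is None:
                raise SyntaxError(f"no production with right side {pivot}")
            del stack[pivot_start:]    # remove the Pivot
            new_rel = TABLE.get(stack[-1], {}).get(left)
            if new_rel is None:
                raise SyntaxError(f"no relationship between {stack[-1]!r} and {left!r}")
            stack += [new_rel, left]
            reductions.append((left, pivot))
        else:
            raise SyntaxError(f"no relationship between {top!r} and {nxt!r}")
    return reductions


# 2 * ( 1 + 3 ) after lexing; prints one line per reduction.
for left, right in parse(["num", "*", "(", "num", "+", "num", ")"]):
    print(left, "->", " ".join(right))
```

Storing the relationships on the stack next to the symbols mirrors the STACK column of the trace; an implementation could equally look them up again from the table during a reduction.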
