LR-attributed grammar

Last updated April 16, 2019

LR-attributed grammars are a special type of attribute grammars. They allow the attributes to be evaluated on LR parsing. As a result, attribute evaluation in LR-attributed grammars can be incorporated conveniently in bottom-up parsing. zyacc is based on LR-attributed grammars. They are a subset of the L-attributed grammars, where the attributes can be evaluated in one left-to-right traversal of the abstract syntax tree. They are a superset of the S-attributed grammars, which allow only synthesized attributes. In yacc, a common hack is to use global variables to simulate some kind of inherited attributes and thus LR-attribution.

An attribute grammar is a formal way to define attributes for the productions of a formal grammar, associating these attributes with values. The evaluation occurs in the nodes of the abstract syntax tree, when the language is processed by some parser or compiler.

In computer science, LR parsers are a type of bottom-up parser that efficiently read deterministic context-free languages, in guaranteed linear time. There are several variants of LR parsers: SLR parsers, LALR parsers, Canonical LR(1) parsers, Minimal LR(1) parsers, GLR parsers. LR parsers can be generated by a parser generator from a formal grammar defining the syntax of the language to be parsed. They are widely used for the processing of computer languages.

L-attributed grammars are a special type of attribute grammars. They allow the attributes to be evaluated in one depth-first left-to-right traversal of the abstract syntax tree. As a result, attribute evaluation in L-attributed grammars can be incorporated conveniently in top-down parsing.

External links

http://www.cs.binghamton.edu/~zdu/zyacc/doc/zyacc_4.html
Reinhard Wilhelm: LL- and LR-Attributed Grammars. Programmiersprachen und Programmentwicklung, 7. Fachtagung, veranstaltet vom Fachausschuß 2 der GI (1982), 151–164, Informatik-Fachberichte volume 53.
J. van Katwijk: A preprocessor for YACC or A poor man's approach to parsing attributed grammars. Sigplan Notices18:10 (1983), 12–15.

Reinhard Wilhelm is a German computer scientist.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

Related Research Articles

In computer science, an LALR parser or Look-Ahead LR parser is a simplified version of a canonical LR parser, to parse a text according to a set of production rules specified by a formal grammar for a computer language.

Yacc is a computer program for the Unix operating system developed by Stephen C. Johnson. It is a Look Ahead Left-to-Right (LALR) parser generator, generating a parser, the part of a compiler that tries to make syntactic sense of the source code, specifically a LALR parser, based on an analytic grammar written in a notation similar to Backus–Naur Form (BNF). Yacc is supplied as a standard utility on BSD and AT&T Unix. GNU-based Linux distributions include Bison, a forward-compatible Yacc replacement.

GNU Bison, commonly known as Bison, is a parser generator that is part of the GNU Project. Bison reads a specification of a context-free language, warns about any parsing ambiguities, and generates a parser which reads sequences of tokens and decides whether the sequence conforms to the syntax specified by the grammar. The generated parsers are portable: they do not require any specific compilers. Bison by default generates LALR(1) parsers but it can also generate canonical LR, IELR(1) and GLR parsers.

In computer science, a Simple LR or SLR parser is a type of LR parser with small parse tables and a relatively simple parser generator algorithm. As with other types of LR(1) parser, an SLR parser is quite efficient at finding the single correct bottom-up parse in a single left-to-right scan over the input stream, without guesswork or backtracking. The parser is mechanically generated from a formal grammar for the language.

In computer science, a canonical LR parser or LR(1) parser is an LR(k) parser for k=1, i.e. with a single lookahead terminal. The special attribute of this parser is that any LR(k) grammar with k>1 can be transformed into an LR(1) grammar. However, back-substitutions are required to reduce k and as back-substitutions increase, the grammar can quickly become large, repetitive and hard to understand. LR(k) can handle all deterministic context-free languages.. In the past this LR(k) parser has been avoided because of its huge memory requirements in favor of less powerful alternatives such as the LALR and the LL(1) parser. Recently, however, a "minimal LR(1) parser" whose space requirements are close to LALR parsers, is being offered by several parser generators.

Lex is a computer program that generates lexical analyzers.

Parsing, syntax analysis, or syntactic analysis is the process of analysing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term parsing comes from Latin pars (orationis), meaning part.

S-attributed grammars are a class of attribute grammars characterized by having no inherited attributes, but only synthesized attributes. Inherited attributes, which must be passed down from parent nodes to children nodes of the abstract syntax tree during the semantic analysis of the parsing process, are a problem for bottom-up parsing because in bottom-up parsing, the parent nodes of the abstract syntax tree are created after creation of all of their children. Attribute evaluation in S-attributed grammars can be incorporated conveniently in both top-down parsing and bottom-up parsing.

In computer science, the syntax of a computer language is the set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language. This applies both to programming languages, where the document represents source code, and markup languages, where the document represents data. The syntax of a language defines its surface form. Text-based computer languages are based on sequences of characters, while visual programming languages are based on the spatial layout and connections between symbols. Documents that are syntactically invalid are said to have a syntax error.

In computer science, scannerless parsing refers to performing both tokenization and parsing in a single step, rather than breaking it up into a pipeline of a lexer followed by a parser, executing concurrently. It also refers to the associated grammar: using a single formalism to express both the lexical grammar and phrase level grammar used to parse a language.

In computer programming, the lexer hack describes a common solution to the problems in parsing ANSI C, due to the reference grammar being context-sensitive. In C, classifying a sequence of characters as a variable name or a type name requires contextual information of the phrase structure, which prevents one from having a context-free lexer.

A GLR parser is an extension of an LR parser algorithm to handle nondeterministic and ambiguous grammars. The theoretical foundation was provided in a 1974 paper by Bernard Lang. It describes a systematic way to produce such algorithms, and provides uniform results regarding correctness proofs, complexity with respect to grammar classes, and optimization techniques. The first actual implementation of GLR was described in a 1984 paper by Masaru Tomita, it has also been referred to as a "parallel parser". Tomita presented five stages in his original work, though in practice it is the second stage that is recognized as the GLR parser.

In computer science, SYNTAX is a system used to generate lexical and syntactic analyzers (parsers) for all kinds of context-free grammars (CFGs) as well as some classes of contextual grammars. It has been developed at INRIA (France) for several decades, mostly by Pierre Boullier, but has become free software since 2007 only. SYNTAX is distributed under the CeCILL license.

In computing, a compiler is a computer program that transforms source code written in a programming language or computer language, into another computer language. The most common reason for transforming source code is to create an executable program.

A lookahead left-to-right (LALR) parser generator is a software tool that reads a BNF grammar and creates an LALR parser which is capable of parsing files written in the computer language defined by the BNF grammar. LALR parsers are desirable because they are very fast and small in comparison to other types of parsers.

Irony is a parser generator framework for language implementation on the .NET platform. Unlike most existing yacc/lex-style solutions, it does not employ code generation of a scanner/parser from grammars written in an external DSL. The grammars for the target language are coded directly in C# instead. The framework implements a LALR(1) parser.

PLY is a parsing tool written purely in Python. It is basically a re-implementation of Lex and Yacc originally in C-language. It was written by David M. Beazley. Unlike Lex and Yacc in C which uses LALR parsing technique, PLY uses LR parsing which can incorporate large grammars easily. PLY also has extensive debugging and error reporting facilities.