S/SL programming language

The Syntax/Semantic Language (S/SL) is an executable high-level specification language for recursive descent parsers, semantic analyzers and code generators, developed by James Cordy, Ric Holt and David Wortman at the University of Toronto in 1980. [1]

In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language elements, be easier to use, or may automate significant areas of computing systems, making the process of developing a program simpler and more understandable than when using a lower-level language. The amount of abstraction provided defines how "high-level" a programming language is.

A specification language is a formal language in computer science used during systems analysis, requirements analysis, and systems design to describe a system at a much higher level than a programming language, which is used to produce the executable code for a system.

In computer science, a recursive descent parser is a kind of top-down parser built from a set of mutually recursive procedures where each such procedure implements one of the nonterminals of the grammar. Thus the structure of the resulting program closely mirrors that of the grammar it recognizes.
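
As an illustration of this correspondence, here is a minimal sketch in C (not taken from any S/SL material); the toy expression grammar and all function names are hypothetical:

    /* Hypothetical grammar, one C function per nonterminal:
     *   expr   -> term { ('+' | '-') term }
     *   term   -> factor { ('*' | '/') factor }
     *   factor -> DIGIT | '(' expr ')'                                  */
    #include <stdio.h>
    #include <ctype.h>

    static const char *p;            /* next unread input character        */
    static void expr(void);          /* forward declaration for mutual recursion */

    static void factor(void) {
        if (isdigit((unsigned char)*p)) {
            p++;                                   /* consume one digit   */
        } else if (*p == '(') {
            p++;
            expr();                                /* recursion mirrors the grammar */
            if (*p == ')') p++; else printf("expected ')'\n");
        } else {
            printf("syntax error at '%c'\n", *p);
        }
    }

    static void term(void) {
        factor();
        while (*p == '*' || *p == '/') { p++; factor(); }
    }

    static void expr(void) {
        term();
        while (*p == '+' || *p == '-') { p++; term(); }
    }

    int main(void) {
        p = "1+(2*3)-4";
        expr();
        printf(*p == '\0' ? "accepted\n" : "rejected\n");
        return 0;
    }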

S/SL is a small programming language that supports cheap recursion. An S/SL program defines the names (and values) of its input, output, and error tokens; a set of semantic mechanisms, class-like interfaces whose operations are really escapes to routines in a host programming language but allow clean abstraction in the pseudocode; and a pseudocode program that defines the syntax of the input language by the stream of tokens it accepts. Alternation, control flow and one-symbol look-ahead constructs are part of the language.
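
To make these constructs concrete, the following hedged sketch shows, in plain C rather than S/SL notation, roughly what a hand-coded equivalent of one pseudocode rule might look like: named input and output tokens, a one-symbol look-ahead choice, and a "semantic mechanism" call that escapes into a host-language routine. All names and the stub scanner are hypothetical.

    #include <stdio.h>

    enum input_token  { tIdent, tAssign, tNumber, tSemicolon, tEndOfFile };
    enum output_token { oPushConst, oStoreVar };
    enum error_code   { eMissingSemicolon, eNotAnAssignment };

    static enum input_token next = tIdent;        /* one symbol of look-ahead */

    static void accept_token(void) { /* advance the scanner (stub) */ }
    static void emit(enum output_token t) { printf("emit output token %d\n", t); }
    static void error(enum error_code e)  { printf("error %d\n", e); }

    /* Semantic mechanism: an interface whose operations are really
     * routines in the host language (here, a trivial symbol-table stub). */
    static void symEnterReference(void) { /* record use of the identifier */ }

    /* One pseudocode rule ("Assignment") rendered as a C procedure:
     * alternation chosen on the look-ahead token, output tokens emitted,
     * errors reported, and an escape to the semantic mechanism. */
    static void Assignment(void) {
        switch (next) {                           /* alternation via look-ahead */
        case tIdent:
            symEnterReference();                  /* escape to host routine     */
            accept_token();
            if (next == tAssign)  accept_token();
            if (next == tNumber)  { emit(oPushConst); accept_token(); }
            emit(oStoreVar);
            if (next == tSemicolon) accept_token(); else error(eMissingSemicolon);
            break;
        default:
            error(eNotAnAssignment);
        }
    }

    int main(void) { Assignment(); return 0; }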

A programming language is a formal language, which comprises a set of instructions that produce various kinds of output. Programming languages are used in computer programming to implement algorithms.

Recursion occurs when a thing is defined in terms of itself or of its type. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in mathematics and computer science, where a function being defined is applied within its own definition. While this apparently defines an infinite number of instances, it is often done in such a way that no loop or infinite chain of references can occur.
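
A one-line example (a sketch in C, not tied to S/SL) of a function applied within its own definition, with a base case that keeps the chain of self-references finite:

    #include <stdio.h>

    /* A function defined in terms of itself; the n == 0 base case
     * guarantees the chain of self-references terminates. */
    static unsigned long factorial(unsigned int n) {
        return n == 0 ? 1UL : n * factorial(n - 1);
    }

    int main(void) { printf("%lu\n", factorial(5)); return 0; }  /* prints 120 */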

Pseudocode is an informal high-level description of the operating principle of a computer program or other algorithm.

The S/SL processor compiles this pseudocode into a table of byte codes that is interpreted by the S/SL table-walker (interpreter). The pseudocode language processes the input language in LL(1) recursive descent style, but extensions allow it to handle any LR(k) language relatively easily. [2] S/SL is designed to provide excellent syntax error recovery and repair. It is more powerful and transparent than Yacc, but can be slower.
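
The following is a hedged sketch, in C, of the table-walking idea only: a small array of byte codes stands in for the compiled pseudocode, and a loop dispatches on each one. The opcodes and table layout are invented for illustration and are not S/SL's actual table format.

    #include <stdio.h>

    enum op { OP_ACCEPT_TOKEN, OP_EMIT_TOKEN, OP_RETURN };

    int main(void) {
        /* table[] plays the role of the compiled pseudocode */
        const int table[] = {
            OP_ACCEPT_TOKEN, 1,     /* accept input token #1          */
            OP_EMIT_TOKEN,   7,     /* emit output token #7           */
            OP_RETURN
        };
        int pc = 0;                 /* table-walker's program counter */
        for (;;) {
            switch (table[pc++]) {
            case OP_ACCEPT_TOKEN: printf("accept input token %d\n", table[pc++]); break;
            case OP_EMIT_TOKEN:   printf("emit output token %d\n",  table[pc++]); break;
            case OP_RETURN:       return 0;
            default:              return 1;   /* unknown op: stop     */
            }
        }
    }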

A compiler is a computer program that translates computer code written in one programming language into another language. The name compiler is primarily used for programs that translate source code from a high-level programming language to a lower-level language to create an executable program.

In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them to have been previously compiled into a machine-language program. An interpreter generally uses one of the following strategies for program execution:

  1. Parse the source code and perform its behavior directly;
  2. Translate source code into some efficient intermediate representation and immediately execute this;
  3. Explicitly execute stored precompiled code made by a compiler which is part of the interpreter system.

In computer science, LR parsers are a type of bottom-up parser that analyses deterministic context-free languages in linear time. There are several variants of LR parsers: SLR parsers, LALR parsers, Canonical LR(1) parsers, Minimal LR(1) parsers, GLR parsers. LR parsers can be generated by a parser generator from a formal grammar defining the syntax of the language to be parsed. They are widely used for the processing of computer languages.

S/SL's "semantic mechanisms" extend its capabilities to all phases of compiling, and it has been used to implement all phases of compilation, including scanners, parsers, semantic analyzers, code generators and virtual machine interpreters in multi-pass language processors. [3]

In computer science, lexical analysis, lexing or tokenization is the process of converting a sequence of characters into a sequence of tokens. A program that performs lexical analysis may be termed a lexer, tokenizer, or scanner, though scanner is also a term for the first stage of a lexer. A lexer is generally combined with a parser, which together analyze the syntax of programming languages, web pages, and so forth.
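
For illustration, a minimal tokenizer sketch in C that converts a character string into a sequence of (kind, text) tokens; the token kinds and the sample input are hypothetical:

    #include <stdio.h>
    #include <ctype.h>

    int main(void) {
        const char *src = "count = count + 12;";
        const char *p = src;
        while (*p) {
            if (isspace((unsigned char)*p)) { p++; continue; }
            if (isalpha((unsigned char)*p)) {              /* identifier */
                const char *start = p;
                while (isalnum((unsigned char)*p)) p++;
                printf("IDENT  %.*s\n", (int)(p - start), start);
            } else if (isdigit((unsigned char)*p)) {       /* number */
                const char *start = p;
                while (isdigit((unsigned char)*p)) p++;
                printf("NUMBER %.*s\n", (int)(p - start), start);
            } else {                                       /* single-character operator */
                printf("OP     %c\n", *p++);
            }
        }
        return 0;
    }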

Semantic analysis or context-sensitive analysis is a process in compiler construction, usually performed after parsing, that gathers necessary semantic information from the source code. It usually includes type checking, or checking that a variable is declared before use; such requirements cannot be expressed in extended Backus–Naur form and thus are not easily detected during parsing.
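
A sketch in C of one such context-sensitive check, declared-before-use, using a toy symbol table; the names and data are invented for illustration:

    #include <stdio.h>
    #include <string.h>

    #define MAX_SYMS 16

    static const char *declared[MAX_SYMS];
    static int n_declared = 0;

    static void declare(const char *name) {
        if (n_declared < MAX_SYMS) declared[n_declared++] = name;
    }

    static int is_declared(const char *name) {
        for (int i = 0; i < n_declared; i++)
            if (strcmp(declared[i], name) == 0) return 1;
        return 0;
    }

    int main(void) {
        declare("x");                       /* e.g. from "var x"       */
        const char *uses[] = { "x", "y" };  /* identifiers later used  */
        for (int i = 0; i < 2; i++)
            if (!is_declared(uses[i]))
                printf("error: '%s' used before declaration\n", uses[i]);
        return 0;
    }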

In computing, code generation is the process by which a compiler's code generator converts some intermediate representation of source code into a form that can be readily executed by a machine.
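
As a small illustration, the following C sketch walks a tiny invented intermediate representation (a postfix list for "2 + 3") and emits instructions for a hypothetical stack machine:

    #include <stdio.h>

    enum ir_kind { IR_CONST, IR_ADD };
    struct ir { enum ir_kind kind; int value; };

    int main(void) {
        const struct ir program[] = {
            { IR_CONST, 2 }, { IR_CONST, 3 }, { IR_ADD, 0 }
        };
        for (int i = 0; i < 3; i++) {
            switch (program[i].kind) {
            case IR_CONST: printf("PUSH %d\n", program[i].value); break;
            case IR_ADD:   printf("ADD\n");                       break;
            }
        }
        return 0;
    }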

S/SL has been used to implement production commercial compilers for languages such as PL/I, Euclid, Turing, Ada, and COBOL, as well as interpreters, command processors, and domain-specific languages of many kinds. It is the primary technology used in IBM's ILE/400 COBOL compiler, [4] and the ZMailer mail transfer agent uses S/SL [5] for defining both its mail router processing language and its RFC 822 email address validation.

PL/I is a procedural, imperative computer programming language developed and published by IBM. It is designed for scientific, engineering, business and system programming. It has been used by academic, commercial and industrial organizations since it was introduced in the 1960s, and is still used.

Euclid is an imperative programming language for writing verifiable programs. It was designed by Butler Lampson and associates at the Xerox PARC lab in the mid-1970s. The implementation was led by Ric Holt at the University of Toronto and James Cordy was the principal programmer for the first implementation of the compiler. It was originally designed for the Motorola 6809 microprocessor. It was considered innovative for the time; the compiler development team had a $2 million budget over 2 years and was commissioned by the Defense Advanced Research Projects Agency of the U.S. Department of Defense and the Canadian Department of National Defence. It was used for a few years at I. P. Sharp Associates, MITRE Corporation, SRI International and various other international institutes for research in systems programming and secure software systems.

Turing is a Pascal-like programming language developed in 1982 by Ric Holt and James Cordy, then of the University of Toronto, in Toronto, Ontario, Canada. Turing is a descendant of Euclid, Pascal and SP/k that features a clean syntax and precise machine-independent semantics.

Related Research Articles

GNU Bison, commonly known as Bison, is a parser generator that is part of the GNU Project. Bison reads a specification of a context-free language, warns about any parsing ambiguities, and generates a parser which reads sequences of tokens and decides whether the sequence conforms to the syntax specified by the grammar. The generated parsers are portable: they do not require any specific compilers. Bison by default generates LALR(1) parsers but it can also generate canonical LR, IELR(1) and GLR parsers.

In computer science, Backus–Naur form or Backus normal form (BNF) is a notation technique for context-free grammars, often used to describe the syntax of languages used in computing, such as computer programming languages, document formats, instruction sets and communication protocols. They are applied wherever exact descriptions of languages are needed: for instance, in official language specifications, in manuals, and in textbooks on programming language theory.

In computer science, a compiler-compiler or compiler generator is a programming tool that creates a parser, interpreter, or compiler from some form of formal description of a programming language and machine.

Lex is a computer program that generates lexical analyzers.

Parsing, syntax analysis, or syntactic analysis is the process of analysing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term parsing comes from Latin pars (orationis), meaning part (of speech).

James Reginald Cordy is a Canadian computer scientist and educator who is a Professor in the School of Computing at Queen's University. As a researcher he is currently active in the fields of source code analysis and manipulation, software reverse and re-engineering, and pattern analysis and machine intelligence. He has a long record of previous work in programming languages, compiler technology, and software architecture.

In computer science, a parsing expression grammar, or PEG, is a type of analytic formal grammar, i.e. it describes a formal language in terms of a set of rules for recognizing strings in the language. The formalism was introduced by Bryan Ford in 2004 and is closely related to the family of top-down parsing languages introduced in the early 1970s. Syntactically, PEGs also look similar to context-free grammars (CFGs), but they have a different interpretation: the choice operator selects the first match in PEG, while it is ambiguous in CFG. This is closer to how string recognition tends to be done in practice, e.g. by a recursive descent parser.

XPL is a programming language based on PL/I, a portable one-pass compiler written in its own language, and a parser generator tool for easily implementing similar compilers for other languages. XPL was designed in 1967 as a way to teach compiler design principles and as a starting point for students to build compilers for their own languages.

In computer science, the syntax of a computer language is the set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language. This applies both to programming languages, where the document represents source code, and to markup languages, where the document represents data.

META II is a domain-specific programming language for writing compilers. It was created in 1963-1964 by Dewey Val Schorre at UCLA. META II uses what Schorre called syntax equations. Its operation is simply explained as:

Each syntax equation is translated into a recursive subroutine which tests the input string for a particular phrase structure, and deletes it if found.

In computer science, SYNTAX is a system used to generate lexical and syntactic analyzers (parsers) for all kinds of context-free grammars (CFGs) as well as some classes of contextual grammars. It has been developed at INRIA (France) over several decades, mostly by Pierre Boullier, but has only been free software since 2007. SYNTAX is distributed under the CeCILL license.

In computing, a compiler is a computer program that transforms source code written in a programming language or computer language, into another computer language. The most common reason for transforming source code is to create an executable program.

A shift-reduce parser is a class of efficient, table-driven bottom-up parsing methods for computer languages and other notations formally defined by a grammar. The parsing methods most commonly used for parsing programming languages, LR parsing and its variations, are shift-reduce methods. The precedence parsers used before the invention of LR parsing are also shift-reduce methods. All shift-reduce parsers have similar outward effects, in the incremental order in which they build a parse tree or call specific output actions.

References

  1. J. R. Cordy, R. C. Holt and D. B. Wortman, "S/SL: Syntax/Semantic Language - Introduction and Specification", Technical Report CSRG-118, Computer Systems Research Group, University of Toronto, Sept. 1980
  2. Barnard, D.T.; Cordy, J.R. (1988). "SL Parses the LR Languages". Computer Languages. 13 (2): 65–74. doi:10.1016/0096-0551(88)90010-0.
  3. Holt, Richard C.; Cordy, James R.; Wortman, David B. (1982). "An Introduction to S/SL: Syntax/Semantic Language". ACM Transactions on Programming Languages and Systems. 4 (2): 149–178. doi:10.1145/357162.357164.
  4. Ian H. Carmichael and Stephen Perelgut. "S/SL revisited". Proc. CASCON'95, Conference of the Centre for Advanced Studies on Collaborative Research, Toronto, Canada, November 1995. http://portal.acm.org/citation.cfm?id=781915.781926
  5. ZMailer the Manual, http://www.zmailer.org/zman/zmanual.shtml