M-expression

Last updated
John McCarthy John McCarthy Stanford.jpg
John McCarthy

In computer programming, M-expressions (or meta-expressions) were an early proposed syntax for the Lisp programming language, inspired by contemporary languages such as Fortran and ALGOL. The notation was never implemented into the language and, as such, it was never finalized. [1]

Contents

Compared to S-expressions, M-expressions introduce function notation, infix operators (including a defun operator), and shorthands for cond and list into the language. [2]

Background

John McCarthy published the first paper on Lisp in 1960 while a research fellow at the Massachusetts Institute of Technology. In it he described a language of symbolic expressions (S-expressions) that could represent complex structures as lists. Then he defined a set of primitive operations on the S-expressions, and a language of meta-expressions (M-expressions) that could be used to define more complex operations. Finally, he showed how the meta-language itself could be represented with S-expressions, resulting in a system that was potentially self-hosting. [3] The draft version of this paper is known as "AI Memo 8". [4]

Example M-expressions (LISP 1.5, 1965) [2]
Expression typeMathematical notationM-expressionModern Lisp S-expression
List value[1;2;3](quote(123))
Function applicationf[x;y](fxy)
Function definition1=label[square;λ[[x];times[x;x]]]}}(definesquare(lambda(x)(*xx)))
Conditional expression[lessp[x;0] → minus[x]; T → x](cond((<x0)(-x))(tx))

McCarthy had planned to develop an automatic Lisp compiler (LISP 2) using M-expressions as the language syntax and S-expressions to describe the compiler's internal processes. Stephen B. Russell read the paper and suggested to him that S-expressions were a more convenient syntax. Although McCarthy disapproved of the idea, Russell and colleague Daniel J. Edwards hand-coded an interpreter program that could execute S-expressions. [2] This program was adopted by McCarthy's research group, establishing S-expressions as the dominant form of Lisp.

McCarthy reflected on the fate of M-expressions in 1979:

The project of defining M-expressions precisely and compiling them or at least translating them into S-expressions was neither finalized nor explicitly abandoned. It just receded into the indefinite future, and a new generation of programmers appeared who preferred internal notation to any FORTRAN-like or ALGOL-like notation that could be devised. [5]

Implementations

A form of sugared M-expressions has been implemented in the Wolfram language of Wolfram Mathematica since 1988:

Example Wolfram snippets
Expression typeSugared syntax (InputForm)Function form (FullForm)
List value{1,2,3}List[1,2,3]
Function applicationf[x, y]f[x, y]
Function definitionpuresquare=#*#&Set[square,Function[Times[Slot[1],Slot[1]]]]
namedsquare=xx*xSet[square, Function[x, Times[x, x]]]
patternsquare[x_]:=x*xSetDelayed[square[Pattern[x, Blank[]]], Times[x, x]]
Conditional expression [note 1]
If[x<0,-x,x]Piecewise[{{-x,x<0}},x]

For LISP

MLisp was a contemporary (1968–1973) project to implement an M-expression-like frontend for Lisp. A few extra features like hygienic macros, pattern matching, and backtracking were incorporated. It eventually evolved into an abandoned LISP70 draft. M-LISP (MetaLISP) from 1989 was another attempt to blend M-expressions with Scheme. [7]

A parser for the "AI Memo 8" M-expression is available in Common Lisp, but the author intends it as a case against M-expressions due to its perceived inability to cope with macros. [8]

For K

The K (programming language) also includes M-Expressions, in addition to the more terse notation in the APL-tradition.

fibs:{[n]if[less[n;3];:iota[n]]fibrec:{[list]if[equal[n;count[list]];:list]a:list[minus[count[list];1]]b:list[minus[count[list];2]]:_f[join[list;plus[a;b]]]}:fibrec[(0;1)]}

Further development

A CGOL (1977) was implemented in MacLisp and follows a similar goal of introducing Algol-like syntax with infix operators. [7] It is known to work on Armed Bear Common Lisp. [9]

A more recent (circa 2003) variant is the I-expression, which use indentation to indicate parentheses implicitly, and are thus in some ways intermediate between S-expressions and M-expressions. I-expressions were introduced in Scheme Request For Implementation 49 as an auxiliary syntax for Scheme, but they have not been widely adopted. [10]

A further development is the "sweet" t-expression, which has infix operators without precedence. Like I-expressions, t-expressions are only a simple transformation away from S-expressions, so that theoretically they can be used on any Lisp dialect and not interfere with features like macros. [11]

Additional syntax-related include Apple's Dylan (Algol-like tokens) and Clojure's addition of other literal syntaxes. [7]

Notes

  1. Conditionals take more to explain, as the general conditional system in the language relies on pattern matching and rewriting. [6]


Related Research Articles

<span class="mw-page-title-main">Lisp (programming language)</span> Programming language family

Lisp is a family of programming languages with a long history and a distinctive, fully parenthesized prefix notation. Originally specified in 1960, Lisp is the third-oldest high-level programming language still in common use, after Fortran and COBOL. Lisp has changed since its early days, and many dialects have existed over its history. Today, the best-known general-purpose Lisp dialects are Common Lisp, Scheme, Racket, and Clojure.

<span class="mw-page-title-main">Macro (computer science)</span> Rule for substituting a set input with a set output

In computer programming, a macro is a rule or pattern that specifies how a certain input should be mapped to a replacement output. Applying a macro to an input is known as macro expansion. The input and output may be a sequence of lexical tokens or characters, or a syntax tree. Character macros are supported in software applications to make it easy to invoke common command sequences. Token and tree macros are supported in some programming languages to enable code reuse or to extend the language, sometimes for domain-specific languages.

Polish notation (PN), also known as normal Polish notation (NPN), Łukasiewicz notation, Warsaw notation, Polish prefix notation or simply prefix notation, is a mathematical notation in which operators precede their operands, in contrast to the more common infix notation, in which operators are placed between operands, as well as reverse Polish notation (RPN), in which operators follow their operands. It does not need any parentheses as long as each operator has a fixed number of operands. The description "Polish" refers to the nationality of logician Jan Łukasiewicz, who invented Polish notation in 1924.

Rebol is a cross-platform data exchange language and a multi-paradigm dynamic programming language designed by Carl Sassenrath for network communications and distributed computing. It introduces the concept of dialecting: small, optimized, domain-specific languages for code and data, which is also the most notable property of the language according to its designer Carl Sassenrath:

Although it can be used for programming, writing functions, and performing processes, its greatest strength is the ability to easily create domain-specific languages or dialects

<span class="mw-page-title-main">Scheme (programming language)</span> Dialect of Lisp

Scheme is a dialect of the Lisp family of programming languages. Scheme was created during the 1970s at the MIT Computer Science and Artificial Intelligence Laboratory and released by its developers, Guy L. Steele and Gerald Jay Sussman, via a series of memos now known as the Lambda Papers. It was the first dialect of Lisp to choose lexical scope and the first to require implementations to perform tail-call optimization, giving stronger support for functional programming and associated techniques such as recursive algorithms. It was also one of the first programming languages to support first-class continuations. It had a significant influence on the effort that led to the development of Common Lisp.

Maclisp is a programming language, a dialect of the language Lisp. It originated at the Massachusetts Institute of Technology's (MIT) Project MAC in the late 1960s and was based on Lisp 1.5. Richard Greenblatt was the main developer of the original codebase for the PDP-6; Jon L. White was responsible for its later maintenance and development. The name Maclisp began being used in the early 1970s to distinguish it from other forks of PDP-6 Lisp, notably BBN Lisp.

<span class="mw-page-title-main">S-expression</span> Data serialization format

In computer programming, an S-expression is an expression in a like-named notation for nested list (tree-structured) data. S-expressions were invented for and popularized by the programming language Lisp, which uses them for source code as well as data.

In computer science, Backus–Naur form is a notation used to describe the syntax of programming languages or other formal languages. It was developed by John Backus and Peter Naur. BNF can be described as a metasyntax notation for context-free grammars. Backus–Naur form is applied wherever exact descriptions of languages are needed, such as in official language specifications, in manuals, and in textbooks on programming language theory. BNF can be used to describe document formats, instruction sets, and communication protocols.

In computer science, a compiler-compiler or compiler generator is a programming tool that creates a parser, interpreter, or compiler from some form of formal description of a programming language and machine.

In computer science, a preprocessor is a program that processes its input data to produce output that is used as input in another program. The output is said to be a preprocessed form of the input data, which is often used by some subsequent programs like compilers. The amount and kind of processing done depends on the nature of the preprocessor; some preprocessors are only capable of performing relatively simple textual substitutions and macro expansions, while others have the power of full-fledged programming languages.

In computer science, hygienic macros are macros whose expansion is guaranteed not to cause the accidental capture of identifiers. They are a feature of programming languages such as Scheme, Dylan, Rust, Nim, and Julia. The general problem of accidental capture was well known in the Lisp community before the introduction of hygienic macros. Macro writers would use language features that would generate unique identifiers or use obfuscated identifiers to avoid the problem. Hygienic macros are a programmatic solution to the capture problem that is integrated into the macro expander. The term "hygiene" was coined in Kohlbecker et al.'s 1986 paper that introduced hygienic macro expansion, inspired by terminology used in mathematics.

A computer programming language is said to adhere to the off-side rule of syntax if blocks in that language are expressed by their indentation. The term was coined by Peter Landin, possibly as a pun on the offside rule in association football. This is contrasted with free-form languages, notably curly-bracket programming languages, where indentation has no computational meaning and indent style is only a matter of coding conventions and formatting. Off-side-rule languages are also described as having significant indentation.

In a given programming language design, a first-class citizen is an entity which supports all the operations generally available to other entities. These operations typically include being passed as an argument, returned from a function, and assigned to a variable.

MLISP is a variant of Lisp with an Algol-like syntax based on M-Expressions, which were the function syntax in the original description of Lisp by John McCarthy. McCarthy's M-expressions were never implemented in an exact form.

<span class="mw-page-title-main">Racket (programming language)</span> Lisp dialect

Racket is a general-purpose, multi-paradigm programming language and a multi-platform distribution that includes the Racket language, compiler, large standard library, IDE, development tools, and a set of additional languages including Typed Racket, Swindle, FrTime, Lazy Racket, R5RS & R6RS Scheme, Scribble, Datalog, Racklog, Algol 60 and several teaching languages.

In computer programming, homoiconicity is a property of some programming languages. A language is homoiconic if a program written in it can be manipulated as data using the language. The program's internal representation can thus be inferred just by reading the program itself. This property is often summarized by saying that the language treats code as data.

CGOL is an alternative syntax featuring an extensible algebraic notation for the Lisp programming language. It was designed for MACLISP by Vaughan Pratt and subsequently ported to Common Lisp.

<span class="mw-page-title-main">History of compiler construction</span>

In computing, a compiler is a computer program that transforms source code written in a programming language or computer language, into another computer language. The most common reason for transforming source code is to create an executable program.

The history of the programming language Scheme begins with the development of earlier members of the Lisp family of languages during the second half of the twentieth century. During the design and development period of Scheme, language designers Guy L. Steele and Gerald Jay Sussman released an influential series of Massachusetts Institute of Technology (MIT) AI Memos known as the Lambda Papers (1975–1980). This resulted in the growth of popularity in the language and the era of standardization from 1990 onward. Much of the history of Scheme has been documented by the developers themselves.

In the programming language Lisp, the reader or read function is the parser which converts the textual form of Lisp objects to the corresponding internal object structure.

References

  1. "The implementation of LISP". www-formal.stanford.edu. Retrieved 2020-03-29.
  2. 1 2 3 "LISP 1.5 Programmer's Manual" (PDF). Community.computerhistory.org. 1965. Archived from the original (PDF) on 2006-02-11. Retrieved 2013-09-02.
  3. McCarthy, John (April 1960) "Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I"
  4. McCarthy, John (March 1959). "Recursive Functions of Symbolic Expressions and Their Computation by Machine (AI Memo 8)".
  5. "The implementation of LISP". Formal.stanford.edu. 1979-02-12. Retrieved 2013-08-24.
  6. Mathematica as a Rewrite Language.
  7. 1 2 3 Lee, Xah. "LISP Infix Syntax Survey".
  8. "A Parser for M-Expressions". Let's newbies play with them, and realize how impractical they are. Note for example, that we cannot use macros anymore because their syntax would need to be known by the M-expression parser.
  9. CGOL on ABCL Development of the Armed Bear Common Lisp implementation blog.
  10. Möller, Egil (2003). "SRFI 49: Indentation-sensitive syntax". srfi.schemers.org.
  11. Wheeler, DA (2013). "SRFI 110: Sweet-expressions (t-expressions)". srfi.schemers.org.