Stack-oriented programming

Last updated

Stack-oriented programming is a programming paradigm that relies on a stack (or multiple stacks) to manipulate data and/or pass parameters. Several programming languages fit this description, notably Forth, RPL, and PostScript. Stack-oriented programming languages operate on one or more stacks, each of which may serve a different purpose. Programming constructs in other programming languages need to be modified for use in a stack-oriented system. [1] Most stack-oriented languages operate in postfix or Reverse Polish notation. Any arguments or parameters for a command are stated before that command. For example, postfix notation would be written 2, 3, multiply instead of multiply, 2, 3 (prefix or Polish notation), or 2 multiply 3 (infix notation). The programming languages Forth, Factor, RPL, PostScript, BibTeX style design language [2] and many assembly languages fit this paradigm.

Contents

Stack-based algorithms consider data, by utilising one piece of data from atop the stack, and returning data back atop the stack. The need for stack manipulation operators, allow for the stack to manipulate data. To emphasise the effect of a statement, a comment is used showing the top of the stack before and after the statement. This is known as the stack effect diagram. PostScript stacks consider separate stacks for additional purposes. This considers variables, dictionaries, procedures, anatomy of some typical procedures, control and flow. The analysis of the language model allows expressions and programs to be interpreted simply and theoretically.

Stack-based algorithms

PostScript is an example of a postfix stack-based language. An expression example in this language is 2 3 mul('mul' being the command for the multiplication operation). Calculating the expression involves understanding how stack-orientation works.

Stack-orientation can be presented as the following conveyor belt analogy. at the end of a conveyor belt (the input), plates marked 2, 3, and mul are placed in sequence. The plate at the end of the conveyor (2) can be taken, however other plates cannot be accessed until the plate at the end is removed. The plates can only be stored in a stack, and can only be added or removed from atop the stack, not from the middle or bottom. Blank plates (and a marker) can be supplied and plates can be permanently discarded.

Human stack.svg

Take plate 2 and put it on the stack, then take plate 3 and put it on the stack. Next, take the mul plate. This is an instruction to perform. Then, take the top two plates off the stack, multiply their labels (2 and 3), and write the result (6) on a new plate. Discard the two old plates (2 and 3) and the plate mul, and put the new plate on the stack. With no more plates remaining on the conveyor, the result of the calculation (6) is shown on the plate atop the stack.

This is a very simple calculation. What if a more complex calculation is needed, such as (2 + 3) × 11 + 1? If it is first written in postfix form, that is, 2 3 add 11 mul 1 add, the calculation can be performed in exactly the same manner and achieve the correct result. The steps of the calculation are shown in the table below. Each column shows an input element (the plate at the end of the conveyor), and the contents of the stack after processing that input.

Input23add11mul1add
Stack23
2
511
5
551
55
56

After processing all the input, the stack contains 56, which is the answer.

From this, the following can be concluded: a stack-based programming language has only one way to handle data, by taking one piece of data from atop the stack, termed popping, and putting data back atop the stack, termed pushing. Any expression that can be written conventionally, or in another programming language, can be written in postfix (or prefix) form and thus be amenable to being interpreted by a stack-oriented language.

Stack manipulation

Since the stack is the key means to manipulate data in a stack-oriented language, such languages often provide some sort of stack manipulation operators. Commonly provided are dup, to duplicate the element atop the stack, exch (or swap), to exchange elements atop the stack (the first becomes second and the second becomes first), roll, to cyclically permute elements in the stack or on part of the stack, pop (or drop), to discard the element atop the stack (push is implicit), and others. These become key in studying procedures.

Stack effect diagrams

As an aid to understanding the effect of statement, a short comment is used showing the top of the stack before and after the statement. The top of the stack is rightmost if there are multiple items. This notation is commonly used in the Forth language, where comments are enclosed in parentheses.

( before -- after )

For example, the basic Forth stack operators are described:

dup( a -- a a )drop( a -- )swap( a b -- b a )over( a b -- a b a )rot( a b c -- b c a )

And the fib function below is described:

fib( n -- n' )

It is equivalent to preconditions and postconditions in Hoare logic. Both comments may also be referenced as assertions, though not necessarily in context of Stack-based languages.

PostScript stacks

PostScript and some other stack languages have other separate stacks for other purposes.

Variables and dictionaries

The evaluation of different expressions has already been analysed. The implementation of variables is important for any programming language, but for stack-oriented languages it is of special concern, as there is only one way to interact with data.

The way variables are implemented in stack-oriented languages such as PostScript usually involves a separate, specialized stack which holds dictionaries of key–value pairs. To create a variable, a key (the variable name) must be created first, with which a value is then associated. In PostScript, a name data object is prefixed with a /, so /x is a name data object which can be associated with, for example, the number 42. The define command is def, so

/x 42 def

associates with the name x with the number 42 in the dictionary atop the stack. A difference exists between /x and x – the former is a data object representing a name, x stands for what is defined under /x.

Procedures

A procedure in a stack-based programming language is treated as a data object in its own right. In PostScript, procedures are denoted between { and }.

For example, in PostScript syntax,

{ dup mul }

represents an anonymous procedure to duplicate what is on the top of the stack and then multiply the result – a squaring procedure.

Since procedures are treated as simple data objects, names with procedures can be defined. When they are retrieved, they are executed directly.

Dictionaries provide a means of controlling scoping, as well as storing of definitions.

Since data objects are stored in the top-most dictionary, an unexpected ability arises naturally: when looking up a definition from a dictionary, the topmost dictionary is checked, then the next, and so on. If a procedure is defined that has the same name as another already defined in a different dictionary, the local one will be called.

Anatomy of some typical procedures

Procedures often take arguments. They are handled by the procedure in a very specific way, different from that of other programming languages.

To examine a Fibonacci number program in PostScript:

/fib{dupdup1eqexch0eqornot{dup1subfibexch2subfibadd}if}def

A recursive definition is used on the stack. The Fibonacci number function takes one argument. First, it is tested for being 1 or 0.

Decomposing each of the program's key steps, reflecting the stack, assuming calculation of fib(4) :

                stack: 4    dup                 stack: 4 4    dup                 stack: 4 4 4    1 eq                 stack: 4 4 false    exch                 stack: 4 false 4    0 eq                 stack: 4 false false    or                 stack: 4 false    not                 stack: 4 true

Since the expression evaluates to true, the inner procedure is evaluated.

                stack: 4    dup                 stack: 4 4    1 sub                 stack: 4 3    fib
(recursive call here)
                stack: 4 F(3)    exch                 stack: F(3) 4    2 sub                 stack: F(3) 2    fib
(recursive call here)
                stack: F(3) F(2)    add                 stack: F(3)+F(2)

which is the expected result.

This procedure does not use named variables, purely the stack. Named variables can be created by using the /a exch def construct. For example, {/n exch def n n mul}

is a squaring procedure with a named variable n. Assuming that /sq {/n exch def n n mul} def and 3 sq is called, the procedure sq is analysed in the following way:

               stack: 3 /n    exch                stack: /n 3    def                stack: empty (it has been defined)    n                stack: 3    n                stack: 3 3    mul                stack: 9

which is the expected result.

Control and flow

As there exist anonymous procedures, flow control can arise naturally. Three pieces of data are required for an if-then-else statement: a condition, a procedure to be done if the condition is true, and one to be done if the condition is false. In PostScript for example,

23gt{(2 is greater than three)=}{(2 is not greater than three)=}ifelse

performs the near equivalent in C:

if(2>3){printf("2 is greater than three\n");}else{printf("2 is not greater than three\n");}

Looping and other constructs are similar.

Analysis of the language model

The simple model provided in a stack-oriented language allows expressions and programs to be interpreted simply and theoretically evaluated much faster, since no syntax analysis need be done, only lexical analysis. The way such programs are written facilitates being interpreted by machines, which is why PostScript suits printers well for its use. However, the slightly artificial way of writing PostScript programs can form an initial barrier to understanding stack-oriented languages such as PostScript.

While the ability to shadow by overriding inbuilt and other definitions can make programs hard to debug, and irresponsible use of this feature can cause unpredictable behaviour, it can simplify some functions greatly. For example, in PostScript use, the showpage operator can be overridden with a custom one that applies a certain style to the page, instead of having to define a custom operator or to repeat code to generate the style.

See also

Related Research Articles

<span class="mw-page-title-main">PostScript</span> File format and programming language

PostScript is a page description language and dynamically typed, stack-based programming language. It is most commonly used in the electronic publishing and desktop publishing realm, but as a Turing complete programming language, it can be used for many other purposes as well. PostScript was created at Adobe Systems by John Warnock, Charles Geschke, Doug Brotz, Ed Taft and Bill Paxton from 1982 to 1984. The most recent version, PostScript 3, was released in 1997.

Polish notation (PN), also known as normal Polish notation (NPN), Łukasiewicz notation, Warsaw notation, Polish prefix notation or simply prefix notation, is a mathematical notation in which operators precede their operands, in contrast to the more common infix notation, in which operators are placed between operands, as well as reverse Polish notation (RPN), in which operators follow their operands. It does not need any parentheses as long as each operator has a fixed number of operands. The description "Polish" refers to the nationality of logician Jan Łukasiewicz, who invented Polish notation in 1924.

Reverse Polish notation (RPN), also known as reverse Łukasiewicz notation, Polish postfix notation or simply postfix notation, is a mathematical notation in which operators follow their operands, in contrast to prefix or Polish notation (PN), in which operators precede their operands. The notation does not need any parentheses for as long as each operator has a fixed number of operands.

<span class="mw-page-title-main">Scheme (programming language)</span> Dialect of Lisp

Scheme is a dialect of the Lisp family of programming languages. Scheme was created during the 1970s at the MIT Computer Science and Artificial Intelligence Laboratory and released by its developers, Guy L. Steele and Gerald Jay Sussman, via a series of memos now known as the Lambda Papers. It was the first dialect of Lisp to choose lexical scope and the first to require implementations to perform tail-call optimization, giving stronger support for functional programming and associated techniques such as recursive algorithms. It was also one of the first programming languages to support first-class continuations. It had a significant influence on the effort that led to the development of Common Lisp.

<span class="mw-page-title-main">Lua (programming language)</span> Lightweight programming language

Lua is a lightweight, high-level, multi-paradigm programming language designed primarily for embedded use in applications. Lua is cross-platform, since the interpreter of compiled bytecode is written in ANSI C, and Lua has a relatively simple C API to embed it into applications.

In computer programming, the scope of a name binding is the part of a program where the name binding is valid; that is, where the name can be used to refer to the entity. In other parts of the program, the name may refer to a different entity, or to nothing at all. Scope helps prevent name collisions by allowing the same name to refer to different objects – as long as the names have separate scopes. The scope of a name binding is also known as the visibility of an entity, particularly in older or more technical literature—this is in relation to the referenced entity, not the referencing name.

In programming languages, a closure, also lexical closure or function closure, is a technique for implementing lexically scoped name binding in a language with first-class functions. Operationally, a closure is a record storing a function together with an environment. The environment is a mapping associating each free variable of the function with the value or reference to which the name was bound when the closure was created. Unlike a plain function, a closure allows the function to access those captured variables through the closure's copies of their values or references, even when the function is invoked outside their scope.

In computer science, imperative programming is a programming paradigm of software that uses statements that change a program's state. In much the same way that the imperative mood in natural languages expresses commands, an imperative program consists of commands for the computer to perform. Imperative programming focuses on describing how a program operates step by step, rather than on high-level descriptions of its expected results.

<span class="mw-page-title-main">Stack (abstract data type)</span> Abstract data type

In computer science, a stack is an abstract data type that serves as a collection of elements with two main operations:

<span class="mw-page-title-main">Oberon-2</span> Programming language

Oberon-2 is an extension of the original Oberon programming language that adds limited reflection and object-oriented programming facilities, open arrays as pointer base types, read-only field export, and reintroduces the FOR loop from Modula-2.

<span class="mw-page-title-main">Apache Groovy</span> Programming language

Apache Groovy is a Java-syntax-compatible object-oriented programming language for the Java platform. It is both a static and dynamic language with features similar to those of Python, Ruby, and Smalltalk. It can be used as both a programming language and a scripting language for the Java Platform, is compiled to Java virtual machine (JVM) bytecode, and interoperates seamlessly with other Java code and libraries. Groovy uses a curly-bracket syntax similar to Java's. Groovy supports closures, multiline strings, and expressions embedded in strings. Much of Groovy's power lies in its AST transformations, triggered through annotations.

In computer science, computer engineering and programming language implementations, a stack machine is a computer processor or a virtual machine in which the primary interaction is moving short-lived temporary values to and from a push down stack. In the case of a hardware processor, a hardware stack is used. The use of a stack significantly reduces the required number of processor registers. Stack machines extend push-down automata with additional load/store operations or multiple stacks and hence are Turing-complete.

<span class="mw-page-title-main">Factor (programming language)</span> Stack-oriented programming language

Factor is a stack-oriented programming language created by Slava Pestov. Factor is dynamically typed and has automatic memory management, as well as powerful metaprogramming features. The language has a single implementation featuring a self-hosted optimizing compiler and an interactive development environment. The Factor distribution includes a large standard library.

Cilk, Cilk++, Cilk Plus and OpenCilk are general-purpose programming languages designed for multithreaded parallel computing. They are based on the C and C++ programming languages, which they extend with constructs to express parallel loops and the fork–join idiom.

In computer science, corecursion is a type of operation that is dual to recursion. Whereas recursion works analytically, starting on data further from a base case and breaking it down into smaller data and repeating until one reaches a base case, corecursion works synthetically, starting from a base case and building it up, iteratively producing data further removed from a base case. Put simply, corecursive algorithms use the data that they themselves produce, bit by bit, as they become available, and needed, to produce further bits of data. A similar but distinct concept is generative recursion, which may lack a definite "direction" inherent in corecursion and recursion.

Tacit programming, also called point-free style, is a programming paradigm in which function definitions do not identify the arguments on which they operate. Instead the definitions merely compose other functions, among which are combinators that manipulate the arguments. Tacit programming is of theoretical interest, because the strict use of composition results in programs that are well adapted for equational reasoning. It is also the natural style of certain programming languages, including APL and its derivatives, and concatenative languages such as Forth. The lack of argument naming gives point-free style a reputation of being unnecessarily obscure, hence the epithet "pointless style".

A property, in some object-oriented programming languages, is a special sort of class member, intermediate in functionality between a field and a method. The syntax for reading and writing of properties is like for fields, but property reads and writes are (usually) translated to 'getter' and 'setter' method calls. The field-like syntax is easier to read and write than many method calls, yet the interposition of method calls "under the hood" allows for data validation, active updating, or implementation of what may be called "read-only fields".

In computer programming, a function, subprogram, procedure, method, routine or subroutine is a callable unit that has a well-defined behavior and can be invoked by other software units to exhibit that behavior.

Tcl is a high-level, general-purpose, interpreted, dynamic programming language. It was designed with the goal of being very simple but powerful. Tcl casts everything into the mold of a command, even programming constructs like variable assignment and procedure definition. Tcl supports multiple programming paradigms, including object-oriented, imperative, functional, and procedural styles.

<span class="mw-page-title-main">Nim (programming language)</span> Programming language

Nim is a general-purpose, multi-paradigm, statically typed, compiled high-level systems programming language, designed and developed by a team around Andreas Rumpf. Nim is designed to be "efficient, expressive, and elegant", supporting metaprogramming, functional, message passing, procedural, and object-oriented programming styles by providing several features such as compile time code generation, algebraic data types, a foreign function interface (FFI) with C, C++, Objective-C, and JavaScript, and supporting compiling to those same languages as intermediate representations.

References

  1. Luerweg, T. (2015). Stack based programming paradigms. Concepts of Programming Languages–CoPL’15, 33.
  2. Oren Patashnik, Designing BibTeX styles (PDF)[ dead link ]