Befunge

Last updated
Befunge
Developer Chris Pressey
First appeared1993 (1993)
Website catseye.tc/node/Befunge-93.html
Influenced by
Forth, FALSE

Befunge is a two-dimensional stack-based, reflective, esoteric programming language. [1] It differs from conventional languages in that programs are arranged on a two-dimensional grid. "Arrow" instructions direct the control flow to the left, right, up or down, and loops are constructed by sending the control flow in a cycle. It has been described as "a cross between Forth and Lemmings". [2]

Contents

Befunge was created by Chris Pressey in 1993 for the Amiga. The language was designed to be as hard to compile as possible, featuring self-modifying code and a multi-dimensional playfield. Despite this, several compilers have been written for the language. The original Befunge-93 specification limited programs to an 80x25 grid, and while not Turing-complete, subsequent extensions like Funge-98 expanded the concept to achieve Turing completeness.

The name "Befunge" originated from a typing error in an online discussion. While it was designed to be difficult to compile, compilers such as bef2c and Betty have managed to implement the language using various techniques. Befunge programs are characterized by their use of arrows to change control flow, and they can produce outputs like random number sequences or classic "Hello, World!" messages.

History

The language was originally created by Chris Pressey [3] in 1993 for the Amiga, as an attempt to devise a language which is as hard to compile as possible. Note that the p command allows for self-modifying code. Nevertheless, a number of compilers have subsequently been written. A number of extensions to the original "Befunge-93" specification also exist, most notably Funge-98, which extends the concept to an arbitrary number of dimensions and can be multithreaded, with multiple instruction pointers operating simultaneously on the same space. Befunge-extensions and variants are called Fungeoids or just Funges.

The Befunge-93 specification restricts each valid program to a grid of 80 instructions horizontally by 25 instructions vertically. Program execution which exceeds these limits "wraps around" to a corresponding point on the other side of the grid; a Befunge program is in this manner topologically equivalent to a torus. Since a Befunge-93 program can only have a single stack and its storage array is bounded, the Befunge-93 language is not Turing-complete (however, it has been shown that Befunge-93 is Turing Complete with unbounded stack word size). [4] The later Funge-98 specification provides Turing completeness by removing the size restrictions on the program; rather than wrapping around at a fixed limit, the movement of a Funge-98 instruction pointer follows a model dubbed "Lahey-space" after its originator, Chris Lahey. In this model, the grid behaves like a torus of finite size with respect to wrapping, while still allowing itself to be extended indefinitely.

Etymology

The word Befunge is derived from a typing error in an online discussion, where the word 'before' was intended.[ citation needed ]

Compilation

As stated, the design goal for Befunge was to create a language which was difficult to compile. This was attempted with the implementation of self-modifying code (the 'p' instruction can write new instructions into the playfield) and a multi-dimensional playfield (the same instruction can be executed in four different directions).

Nevertheless, these obstacles have been overcome, to some degree, and Befunge compilers have been written using appropriate techniques.

The bef2c compiler included with the standard Befunge-93 distribution uses threaded code: each instruction is compiled to a snippet of C code, and control flows through the snippets just as it does in a Befunge interpreter (that is, conditionally on the value of some 'direction' register). This does not result in a significant advantage over a good interpreter. Note that the bef2c compiler is not correct since it does not handle either 'p' or string mode, but it would not be impossible to make it do so (although the C language might not be well-suited for this).

The etty compiler, for example, treats every possible straight line of instructions as a subprogram, and if a 'p' instruction alters that subprogram, that subprogram is recompiled. This variation on just-in-time compilation results in a much better advantage over an interpreter, since many instructions can be executed in native code without making intervening decisions on the 'direction' register.

Sample Befunge-93 code

The technique of using arrows to change control flow is demonstrated in the random number generator program below. The Befunge instruction pointer begins in the upper left corner and will travel to the right if not redirected. Following the arrows around, the ? instructions send the instruction pointer in random cardinal directions until the pointer hits a digit, pushing it to the stack. Then the arrows navigate to the . to output the digit from the stack and return the pointer to the first directional randomiser. There is no @ to terminate this program, so it produces an endless stream of random numbers from 1 to 9.

v>>>>>v12345^?^>??^v?v6789>>>>v^.<

The following code is an example of the classic "Hello World!" program. First the letters "olleH" are pushed onto the stack as ASCII numbers. These are then popped from the stack in LIFO order and output as text characters to give "Hello". A space is character number 32 in ASCII, which here is constructed by multiplying 4 and 8, before being output as text. The remaining code then outputs "World!" in a similar way, followed by ASCII character 10 (a line feed character, moving the output cursor to a new line).

>vv,,,,,"Hello"<>48*,vv,,,,,,"World!"<>25*,@

The following code is a slightly more complicated version. It adds the ASCII character 10 (a line feed character) to the stack, and then pushes "!dlrow ,olleH" to the stack. Again, LIFO ordering means that "H" is now the top of the stack and will be the first printed, "e" is second, and so on. To print the characters, the program enters a loop that first duplicates the top value on the stack (so now the stack would look like "\n!dlrow ,olleHH"). Then the "_" operation will pop the duplicated value, and go right if it's a zero, left otherwise. (This assumes a compliant interpreter that "returns" 0 when popping an empty stack.) When it goes left, it pops and prints the top value as an ASCII character. It then duplicates the next character and loops back to the "_" test, continuing to print the rest of the stack until it is empty and so the next value popped is 0, at which point "@" ends the program.

>25*"!dlrow ,olleH":vv:,_@>^

Befunge-93 instruction list

0-9Push this number onto the stack.
+Addition: Pop a and b, then push a+b
-Subtraction: Pop a and b, then push b-a
*Multiplication: Pop a and b, then push a*b
/Integer division: Pop a and b, then push b/a, rounded towards 0.
%Modulo: Pop a and b, then push the remainder of the integer division of b/a.
!Logical NOT: Pop a value. If the value is zero, push 1; otherwise, push zero.
`Greater than: Pop a and b, then push 1 if b>a, otherwise zero.
>Start moving right
<Start moving left
^Start moving up
vStart moving down
?Start moving in a random cardinal direction
_Pop a value; move right if value=0, left otherwise
|Pop a value; move down if value=0, up otherwise
"Start string mode: push each character's ASCII value all the way up to the next "
:Duplicate value on top of the stack
\Swap two values on top of the stack
$Pop value from the stack and discard it
.Pop value and output as an integer followed by a space
,Pop value and output as ASCII character
#Bridge: Skip next cell
pA "put" call (a way to store a value for later use). Pop y, x, and v, then change the character at (x,y) in the program to the character with ASCII value v
gA "get" call (a way to retrieve data in storage). Pop y and x, then push ASCII value of the character at that position in the program
&Ask user for a number and push it
~Ask user for a character and push its ASCII value
@End program
(space) No-op. Does nothing

Most one-dimensional programming languages require some syntactic distinction between comment text and source code although that distinction may be as trivial as Brainfuck's rule that any character not in the set +-[]<>,. is a comment. Languages like Lisp and Python treat strings as comments in contexts where the values are not used. Similarly, in Befunge, there is no comment syntax: to embed documentation in the code, the programmer simply routes the control flow around the "comment" area, so that the text in that area is never executed.

See also

Related Research Articles

Brainfuck is an esoteric programming language created in 1993 by Urban Müller.

C is a general-purpose computer programming language. It was created in the 1970s by Dennis Ritchie, and remains very widely used and influential. By design, C's features cleanly reflect the capabilities of the targeted CPUs. It has found lasting use in operating systems, device drivers, and protocol stacks, but its use in application software has been decreasing. C is commonly used on computer architectures that range from the largest supercomputers to the smallest microcontrollers and embedded systems.

Forth is a procedural, stack-oriented programming language and interactive environment designed by Charles H. "Chuck" Moore and first used by other programmers in 1970. Although not an acronym, the language's name in its early years was often spelled in all capital letters as FORTH. The FORTH-79 and FORTH-83 implementations, which were not written by Moore, became de facto standards, and an official standardization of the language was published in 1994 as ANS Forth. A wide range of Forth derivatives existed before and after ANS Forth. The free software Gforth implementation is actively maintained, as are several commercially supported systems.

A "Hello, World!" program is generally a simple computer program which outputs to the screen a message similar to "Hello, World!" while ignoring any user input. A small piece of code in most general-purpose programming languages, this program is used to illustrate a language's basic syntax. A "Hello, World!" program is often the first written by a student of a new programming language, but such a program can also be used as a sanity check to ensure that the computer software intended to compile or run source code is correctly installed, and that its operator understands how to use it.

<span class="mw-page-title-main">INTERCAL</span> Esoteric programming language

The Compiler Language With No Pronounceable Acronym (INTERCAL) is an esoteric programming language that was created as a parody by Don Woods and James M. Lyon, two Princeton University students, in 1972. It satirizes aspects of the various programming languages at the time, as well as the proliferation of proposed language constructs and notations in the 1960s.

<span class="mw-page-title-main">Programming language</span> Language for communicating instructions to a machine

A programming language is a system of notation for writing computer programs.

In computing, a segmentation fault or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricted area of memory. On standard x86 computers, this is a form of general protection fault. The operating system kernel will, in response, usually perform some corrective action, generally passing the fault on to the offending process by sending the process a signal. Processes can in some cases install a custom signal handler, allowing them to recover on their own, but otherwise the OS default signal handler is used, generally causing abnormal termination of the process, and sometimes a core dump.

An esoteric programming language is a programming language designed to test the boundaries of computer programming language design, as a proof of concept, as software art, as a hacking interface to another language, or as a joke. The use of the word esoteric distinguishes them from languages that working developers use to write software. The creators of most esolangs do not intend them to be used for mainstream programming, although some esoteric features, such as visuospatial syntax, have inspired practical applications in the arts. Such languages are often popular among hackers and hobbyists.

<span class="mw-page-title-main">Interpreter (computing)</span> Program that executes source code without a separate compilation step

In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program. An interpreter generally uses one of the following strategies for program execution:

  1. Parse the source code and perform its behavior directly;
  2. Translate source code into some efficient intermediate representation or object code and immediately execute that;
  3. Explicitly execute stored precompiled bytecode made by a compiler and matched with the interpreter Virtual Machine.

In computer science, an abstract machine is a theoretical model that allows for a detailed and precise analysis of how a computer system functions. It is similar to a mathematical function in that it receives inputs and produces outputs based on predefined rules. Abstract machines vary from literal machines in that they are expected to perform correctly and independently of hardware. Abstract machines are "machines" because they allow step-by-step execution of programmes; they are "abstract" because they ignore many aspects of actual (hardware) machines. A typical abstract machine consists of a definition in terms of input, output, and the set of allowable operations used to turn the former into the latter. They can be used for purely theoretical reasons as well as models for real-world computer systems. In the theory of computation, abstract machines are often used in thought experiments regarding computability or to analyse the complexity of algorithms. This use of abstract machines is fundamental to the field of computational complexity theory, such as finite state machines, Mealy machines, push-down automata, and Turing machines.

In computer science, imperative programming is a programming paradigm of software that uses statements that change a program's state. In much the same way that the imperative mood in natural languages expresses commands, an imperative program consists of commands for the computer to perform. Imperative programming focuses on describing how a program operates step by step, rather than on high-level descriptions of its expected results.

<span class="mw-page-title-main">Whitespace (programming language)</span> Esoteric programming language

Whitespace is an esoteric programming language developed by Edwin Brady and Chris Morris at the University of Durham. It was released on 1 April 2003. Its name is a reference to whitespace characters. Unlike most programming languages, which ignore or assign little meaning to most whitespace characters, the Whitespace interpreter ignores any non-whitespace characters. Only spaces, tabs and linefeeds have meaning.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

<span class="mw-page-title-main">Malbolge</span> 1998 esoteric programming language

Malbolge is a public domain esoteric programming language invented by Ben Olmstead in 1998, named after the eighth circle of hell in Dante's Inferno, the Malebolge. It was specifically designed to be almost impossible to use, via a counter-intuitive 'crazy operation', base-three arithmetic, and self-altering code. It builds on the difficulty of earlier challenging esoteric languages, but exaggerates this aspect to an extreme degree, playing on the entangled histories of computer science and encryption. Despite this design, it is possible to write useful Malbolge programs.

In computer science, computer engineering and programming language implementations, a stack machine is a computer processor or a virtual machine in which the primary interaction is moving short-lived temporary values to and from a push down stack. In the case of a hardware processor, a hardware stack is used. The use of a stack significantly reduces the required number of processor registers. Stack machines extend push-down automata with additional load/store operations or multiple stacks and hence are Turing-complete.

In computer science, a calling convention is an implementation-level (low-level) scheme for how subroutines or functions receive parameters from their caller and how they return a result. When some code calls a function, design choices have been taken for where and how parameters are passed to that function, and where and how results are returned from that function, with these transfers typically done via certain registers or within a stack frame on the call stack. There are design choices for how the tasks of preparing for a function call and restoring the environment after the function has completed are divided between the caller and the callee. Some calling convention specifies the way every function should get called. The correct calling convention should be used for every function call, to allow the correct and reliable execution of the whole program using these functions.

IP Pascal is an implementation of the Pascal programming language using the IP portability platform, a multiple machine, operating system and language implementation system. It implements the language "Pascaline", and has passed the Pascal Validation Suite.

<span class="mw-page-title-main">Control table</span> Data structures that control the execution order of computer commands

Control tables are tables that control the control flow or play a major part in program control. There are no rigid rules about the structure or content of a control table—its qualifying attribute is its ability to direct control flow in some way through "execution" by a processor or interpreter. The design of such tables is sometimes referred to as table-driven design. In some cases, control tables can be specific implementations of finite-state-machine-based automata-based programming. If there are several hierarchical levels of control table they may behave in a manner equivalent to UML state machines

In computer programming, a function or subroutine is a sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed.

Leet is an esoteric programming language based loosely on Brainfuck and named for the resemblance of its source code to the symbolic language "L33t 5p34k". L33t was designed by Stephen McGreal and Alex Mole to be as confusing as possible. It is Turing-complete and has the possibility for self-modifying code. Software written in the language can make network connections and may therefore be used to write malware.

References

  1. "Befunge - Esolang".
  2. "The Befunge FAQ v.4". 1997-11-04. Archived from the original on 2001-04-17. Retrieved 2014-01-23.
  3. Ais523 (2008-12-18). "Chris Pressey". Esolang. Retrieved 2014-01-23.
  4. Oerjan (2014-01-18). "Talk:Befunge". Esolang. Retrieved 2014-01-23.