Abstract machine

Last updated

In computer science, an abstract machine is a theoretical model that allows for a detailed and precise analysis of how a computer system functions. [1] It is similar to a mathematical function in that it receives inputs and produces outputs based on predefined rules. Abstract machines vary from literal machines in that they are expected to perform correctly and independently of hardware. [2] Abstract machines are "machines" because they allow step-by-step execution of programs; they are "abstract" because they ignore many aspects of actual (hardware) machines. [3] A typical abstract machine consists of a definition in terms of input, output, and the set of allowable operations used to turn the former into the latter. They can be used for purely theoretical reasons as well as models for real-world computer systems. [2] In the theory of computation, abstract machines are often used in thought experiments regarding computability or to analyse the complexity of algorithms. [3] This use of abstract machines is fundamental to the field of computational complexity theory, such as finite state machines, Mealy machines, push-down automata, and Turing machines. [4]

Contents

Classification

Abstract machines are typically categorized into two types based on the quantity of operations they can execute simultaneously at any given moment: deterministic abstract machines and non-deterministic abstract machines. [2] A deterministic abstract machine is a system in which a particular beginning state or condition always yields the same outputs. There is no randomness or variation in how inputs are transformed into outputs. [5] In contrast, a non-deterministic abstract machine can provide various outputs for the same input on different executions. [2] Unlike a deterministic algorithm, which gives the same result for the same input regardless of the number of iterations, a non-deterministic algorithm takes various paths to arrive to different outputs. [6] Non-deterministic algorithms are helpful for obtaining approximate answers when deriving a precise solution using a deterministic approach is difficult or costly. [7]

A run of a Turing machine 2-state 3-symbol Turing Machine.png
A run of a Turing machine

Turing machines, for example, are some of the most fundamental abstract machines in computer science. [2] These machines conduct operations on a tape (a string of symbols) of any length. Their instructions provide for both modifying the symbols and changing the symbol that the machine’s pointer is currently at. For example, a rudimentary Turing machine could have a single command, "convert symbol to 1 then move right", and this machine would only produce a string of 1s. [8] This basic Turing machine is deterministic; however, nondeterministic Turing machines that can execute several actions given the same input may also be built. [2]

Implementation

Any implementation of an abstract machine in the case of physical implementation (in hardware) uses some kind of physical device (mechanical or electronic) to execute the instructions of a programming language. An abstract machine, however, can also be implemented in software or firmware at levels between the abstract machine and underlying physical device. [9]

Programming language implementation

An abstract machine is, intuitively, just an abstraction of the idea of a physical computer. [13] For actual execution, algorithms must be properly formalised using the constructs offered by a programming language. This implies that the algorithms to be executed must be expressed using programming language instructions. [3] The syntax of a programming language enables the construction of programs using a finite set of constructs known as instructions. Most abstract machines share a program store and a state, which often includes a stack and registers. [9] [14] In digital computers, the stack is simply a memory unit with an address register that can count only positive integers (after an initial value is loaded into it). The address register for the stack is known as a stack pointer because its value always refers to the top item on the stack. [15] The program consists of a series of instructions, with a stack pointer indicating the next instruction to be performed. When the instruction is completed, a stack pointer is advanced. This fundamental control mechanism of an abstract machine is also known as its execution loop. [3] Thus, an abstract machine for a programming language is any collection of data structures and algorithms capable of storing and running programs written in the programming language. It bridges the gap between the high level of a programming language and the low level of an actual machine by providing an intermediate language step for compilation. An abstract machine's instructions are adapted to the unique operations necessary to implement operations of a certain source language or set of source languages. [9]

Imperative languages

In the late 1950s, the Association for Computing Machinery (ACM) and other allied organisations developed many proposals for Universal Computer Oriented Language (UNCOL), such as Conway's machine. The UNCOL concept is good, but it has not been widely used due to the poor performance of the generated code. In many areas of computing, its performance will continue to be an issue despite the development of the Java Virtual Machine in the late 1990s. Algol Object Code (1964), P4-machine (1976), UCSD P-machine (1977), and Forth (1970) are some successful abstract machines of this kind. [3]

Object-oriented languages

Abstract machines for object-oriented programming languages are often stack-based and have special access instructions for object fields and methods. In these machines, memory management is often implicit performed by a garbage collector (memory recovery feature built into programming languages). [16] Smalltalk-80 (1980), Self (1989), and Java (1994) are examples of this implementation. [3]

String processing languages

A string processing language is a computer language that focuses on processing strings rather than numbers. There have been string processing languages in the form of command shells, programming tools, macro processors, and scripting languages for decades. [17] Using a suitable abstract machine has two benefits: increased execution speed and enhanced portability. Snobol4 and ML/I are two notable instances of early string processing languages that use an abstract machine to gain machine independence. [3]

Functional programming languages

Pictorial representation of a Krivine machine Machine de Krivine.jpg
Pictorial representation of a Krivine machine

The early abstract machines for functional languages, including the SECD machine (1964) and Cardelli's Functional Abstract Machine (1983), defined strict evaluation, also known as eager or call-by-value evaluation, [3] in which function arguments are evaluated before the call and precisely once. Recently, the majority of research has been on lazy (or call-by-need) evaluation, [18] such as the G-machine (1984), Krivine machine (1985), and Three Instruction Machine (1986), in which function arguments are evaluated only if necessary and at most once. One reason is because effective implementation of strict evaluation is now well-understood, therefore the necessity for an abstract machine has diminished. [3]

Logical languages

Predicate calculus (first order logic) is the foundation of logic programming languages. The most well-known logic programming language is Prolog. The rules in Prolog are written in a uniform format known as universally quantified 'Horn clauses', which means to begin the calculation that attempts to discover a proof of the objective. The Warren Abstract Machine WAM (1983), [3] which has become the de facto standard in Prolog program compilation, has been the focus of most study. It provides special purpose instructions such as data unification instructions and control flow instructions to support backtracking (searching algorithm). [19]

Structure

A generic abstract machine is made up of a memory and an interpreter. The memory is used to store data and programs, while the interpreter is the component that executes the instructions included in programs. [9]

The structure of an abstract machine The structure of an abstract machine.png
The structure of an abstract machine

The interpreter must carry out the operations that are unique to the language it is interpreting. However, given the variety of languages, it is conceivable to identify categories of operations and an "execution mechanism" shared by all interpreters. The interpreter's operations and accompanying data structures are divided into the following categories: [9] [20]

  1. Operations for processing primitive data:
  2. Operations and data structures for controlling the sequence of execution of operations;
  3. Operations and data structures for controlling data transfers;
  4. Operations and data structures for memory management.

Processing primitive data

An abstract machine must contain operations for manipulating primitive data types such as strings and integers. [9] For example, integers are nearly universally considered a basic data type for both physical abstract machines and the abstract machines used by many programming languages. The machine carries out the arithmetic operations necessary, such as addition and multiplication, within a single time step. [21]

Sequence control

Operations and structures for "sequence control" allow controlling the execution flow of program instructions. When certain conditions are met, it is necessary to change the typical sequential execution of a program. [9] Therefore, the interpreter employs data structures (such as those used to store the address of the next instruction to execute) that are modified by operations distinct from those used for data manipulation (for example, operations to update the address of the next instruction to execute). [22]

Controlling data transfers

Data transfer operations are used to control how operands and data are transported from memory to the interpreter and vice versa. These operations deal with the store and the retrieval order of operands from the store. [9]

Memory management

Memory management is concerned with the operations performed in memory to allocate data and applications. In the abstract machine, data and programmes can be held indefinitely, or in the case of programming languages, memory can be allocated or deallocated using a more complex mechanism. [9]

Hierarchies

A hierarchy of abstract machines A hierarchy of abstract machines.png
A hierarchy of abstract machines

Abstract machine hierarchies are often employed, in which each machine uses the functionality of the level immediately below and adds additional functionality of its own to meet the level immediately above. A hardware computer, constructed with physical electronic devices, can be added at the most basic level. Above this level, the abstract microprogrammed machine level may be introduced. The abstract machine supplied by the operating system, which is implemented by a program written in machine language, is located immediately above (or directly above the hardware if the firmware level is not there). On the one hand, the operating system extends the capability of the physical machine by providing higher-level primitives that are not available on the physical machine (for example, primitives that act on files). The host machine is formed by the abstract machine given by the operating system, on which a high-level programming language is implemented using an intermediary machine, such as the Java Virtual machine and its byte code language. The level given by the abstract machine for the high-level language (for example, Java) is not usually the final level of hierarchy. At this point, one or more applications that deliver additional services together may be introduced. A "web machine" level, for example, can be added to implement the functionalities necessary to handle Web communications (communications protocols or HTML code presentation). The "Web Service" level is located above this, and it provides the functionalities necessary to make web services communicate, both in terms of interaction protocols and the behaviour of the processes involved. At this level, entirely new languages that specify the behaviour of so-called "business processes" based on Web services may be developed (an example is the Business Process Execution Language). Finally, a specialised application can be found at the highest level (for example, E-commerce) which has very specific and limited functionality. [9]

See also

Related Research Articles

<span class="mw-page-title-main">Algorithm</span> Sequence of operations for a task

In mathematics and computer science, an algorithm is a finite sequence of mathematically rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing calculations and data processing. More advanced algorithms can use conditionals to divert the code execution through various routes and deduce valid inferences.

<span class="mw-page-title-main">Computer program</span> Instructions a computer can execute

A computer program is a sequence or set of instructions in a programming language for a computer to execute. It is one component of software, which also includes documentation and other intangible components.

<span class="mw-page-title-main">Programming language</span> Language for communicating instructions to a machine

A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their syntax (form) and semantics (meaning), usually defined by a formal language. Languages usually provide features such as a type system, variables, and mechanisms for error handling. An implementation of a programming language is required in order to execute programs, namely an interpreter or a compiler. An interpreter directly executes the source code, while a compiler produces an executable program.

Prolog is a logic programming language that has its origins in artificial intelligence, automated theorem proving and computational linguistics.

<span class="mw-page-title-main">Turing machine</span> Computation model defining an abstract machine

A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer algorithm.

In computability theory, a system of data-manipulation rules is said to be Turing-complete or computationally universal if it can be used to simulate any Turing machine. This means that this system is able to recognize or decode other data-manipulation rule sets. Turing completeness is used as a way to express the power of such a data-manipulation rule set. Virtually all programming languages today are Turing-complete.

<span class="mw-page-title-main">Virtual machine</span> Software that emulates an entire computer

In computing, a virtual machine (VM) is the virtualization or emulation of a computer system. Virtual machines are based on computer architectures and provide the functionality of a physical computer. Their implementations may involve specialized hardware, software, or a combination of the two. Virtual machines differ and are organized by their function, shown here:

In computer science, threaded code is a programming technique where the code has a form that essentially consists entirely of calls to subroutines. It is often used in compilers, which may generate code in that form or be implemented in that form themselves. The code may be processed by an interpreter or it may simply be a sequence of machine code call instructions.

In computer science, an instruction set architecture (ISA) is an abstract model that generally defines how software controls the CPU in a computer or a family of computers. A device or program that executes instructions described by that ISA, such as a central processing unit (CPU), is called an implementation of that ISA.

<span class="mw-page-title-main">Interpreter (computing)</span> Program that executes source code without a separate compilation step

In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program. An interpreter generally uses one of the following strategies for program execution:

  1. Parse the source code and perform its behavior directly;
  2. Translate source code into some efficient intermediate representation or object code and immediately execute that;
  3. Explicitly execute stored precompiled bytecode made by a compiler and matched with the interpreter's virtual machine.

In computer science, algorithmic efficiency is a property of an algorithm which relates to the amount of computational resources used by the algorithm. Algorithmic efficiency can be thought of as analogous to engineering productivity for a repeating or continuous process.

<span class="mw-page-title-main">Stack (abstract data type)</span> Abstract data type

In computer science, a stack is an abstract data type that serves as a collection of elements with two main operations:

Computability is the ability to solve a problem in an effective manner. It is a key topic of the field of computability theory within mathematical logic and the theory of computation within computer science. The computability of a problem is closely linked to the existence of an algorithm to solve the problem.

In computer science, computer engineering and programming language implementations, a stack machine is a computer processor or a virtual machine in which the primary interaction is moving short-lived temporary values to and from a push down stack. In the case of a hardware processor, a hardware stack is used. The use of a stack significantly reduces the required number of processor registers. Stack machines extend push-down automata with additional load/store operations or multiple stacks and hence are Turing-complete.

In computer science, and more specifically in computability theory and computational complexity theory, a model of computation is a model which describes how an output of a mathematical function is computed given an input. A model describes how units of computations, memories, and communications are organized. The computational complexity of an algorithm can be measured given a model of computation. Using a model allows studying the performance of algorithms independently of the variations that are specific to particular implementations and specific technology.

<span class="mw-page-title-main">Hardware acceleration</span> Specialized computer hardware

Hardware acceleration is the use of computer hardware designed to perform specific functions more efficiently when compared to software running on a general-purpose central processing unit (CPU). Any transformation of data that can be calculated in software running on a generic CPU can also be calculated in custom-made hardware, or in some mix of both.

<span class="mw-page-title-main">Emulator</span> System allowing a device to imitate another

In computing, an emulator is hardware or software that enables one computer system to behave like another computer system. An emulator typically enables the host system to run software or use peripheral devices designed for the guest system. Emulation refers to the ability of a computer program in an electronic device to emulate another program or device.

<span class="mw-page-title-main">Input/output</span> Communication between an information processing system and the outside world

In computing, input/output is the communication between an information processing system, such as a computer, and the outside world, such as another computer system, peripherals, or a human operator. Inputs are the signals or data received by the system and outputs are the signals or data sent from it. The term can also be used as part of an action; to "perform I/O" is to perform an input or output operation.

In computer programming, a function is a callable unit of software logic that has a well-defined interface and behavior and can be invoked multiple times.

This glossary of computer science is a list of definitions of terms and concepts used in computer science, its sub-disciplines, and related fields, including terms relevant to software, data science, and computer programming.

References

  1. Weisstein, Eric W. "Abstract Machine". mathworld.wolfram.com. Retrieved 2022-05-16.
  2. 1 2 3 4 5 6 "What is an Abstract Machine?". EasyTechJunkie. Retrieved 2022-05-16.
  3. 1 2 3 4 5 6 7 8 9 10 Diehl, Stephan; Hartel, Pieter; Sestoft, Peter (May 2000). "Abstract machines for programming language implementation". Future Generation Computer Systems. 16 (7): 739–751. doi:10.1016/S0167-739X(99)00088-6.
  4. "9.1.1: Finite-State Machine Overview". Engineering LibreTexts. 2021-04-29. Retrieved 2022-05-31.
  5. "What is Deterministic System? - Definition from Techopedia". Techopedia.com. 29 August 2019. Retrieved 2022-05-30.
  6. Stearns, Richard E. (January 2003). "Deterministic versus nondeterministic time and lower bound problems". Journal of the ACM. 50 (1): 91–95. doi:10.1145/602382.602409. ISSN   0004-5411. S2CID   2194820.
  7. Armoni, Michal; Gal-Ezer, Judith (December 2007). "Non-determinism: An abstract concept in computer science studies". Computer Science Education. 17 (4): 243–262. Bibcode:2007CSEd...17..243A. doi:10.1080/08993400701442885. ISSN   0899-3408. S2CID   41928460.
  8. Gill, John (December 1977). "Computational Complexity of Probabilistic Turing Machines". SIAM Journal on Computing. 6 (4): 675–695. doi:10.1137/0206049. ISSN   0097-5397.
  9. 1 2 3 4 5 6 7 8 9 10 11 12 13 Gabbrielli, Maurizio; Martini, Simone (2010), "Abstract Machines", Programming Languages: Principles and Paradigms, London: Springer London, pp. 1–25, doi:10.1007/978-1-84882-914-5_1, ISBN   978-1-84882-913-8 , retrieved 2022-05-16
  10. Bair, Ray; Chien, Andrew; Cook, Jeanine; Donofrio, Dave; Grider, Gary; Kuehn, Jeff; Moore, Shirley; Shalf, John; Vetter, Jeff (2018-02-01). Hardware Evaluation: Abstract Machine Models and Proxy Architectures for Exascale Computing (Technical report). U.S. Department of Energy Office of Scientific and Technical Information. doi:10.2172/1733300. OSTI   1733300.
  11. "abstract machine from FOLDOC". foldoc.org. Retrieved 2021-08-07.
  12. Gee, J.; Melvin, S. W.; Patt, Y. N. (1986). "The implementation of Prolog via VAX 8600 microcode". Proceedings of the 19th annual workshop on Microprogramming. New York, New York, USA: ACM Press. pp. 68–74. doi:10.1145/19551.19538. ISBN   081860736X. S2CID   3846072.
  13. "abstract machine". Oxford Reference. Retrieved 2022-05-16.
  14. García-Martín, Julio Manuel; Sutil-Martin, Miguel (August 15, 1999). "The Abstract Machine: A Pattern for Designing Abstract Machines" (PDF). Proceedings of Pattern Languages of Programs '99.
  15. upscfever.com (2017-01-25). "Computer Organization and Architecture (Stack Organization) - UPSC FEVER". upscfever.com. Retrieved 2022-05-31.
  16. "What is Object-Oriented Programming (OOP)?". SearchAppArchitecture. Retrieved 2022-05-31.
  17. "Design considerations for string processing languages", A Study in String Processing Languages, Lecture Notes in Computer Science, vol. 205, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 17–37, 1985, doi:10.1007/3-540-16041-8_2, ISBN   978-3-540-16041-0 , retrieved 2022-05-31
  18. Hackett, Jennifer; Hutton, Graham (2019-07-26). "Call-by-need is clairvoyant call-by-value". Proceedings of the ACM on Programming Languages. 3 (ICFP): 1–23. doi: 10.1145/3341718 . ISSN   2475-1421. S2CID   195782686.
  19. "Prolog | An Introduction". GeeksforGeeks. 2018-05-26. Retrieved 2022-05-31.
  20. Accattoli, Beniamino; Barenbaum, Pablo; Mazza, Damiano (2014-11-26). "Distilling abstract machines". ACM SIGPLAN Notices. 49 (9): 363–376. doi:10.1145/2692915.2628154. ISSN   0362-1340. S2CID   234775413.
  21. baeldung (2018-01-11). "Introduction to Java Primitives | Baeldung". www.baeldung.com. Retrieved 2022-05-31.
  22. Kuchana, Partha (2004), "Interpreter", Software Architecture Design Patterns in Java, Auerbach Publications, doi:10.1201/9780203496213, ISBN   978-0-8493-2142-9

Further reading

Jan van Leeuwen, ed. "Handbook of Theoretical Computer Science. Volume A: Algorithms and Complexity, The MIT PRESS/Elsevier, 1990. ISBN   0-444-88071-2 (volume A). QA 76.H279 1990