Weird machine

In computer security, a weird machine is a computational artifact where additional code execution can happen outside the original specification of the program.[1] It is closely related to the concept of weird instructions, which are the building blocks of an exploit based on crafted input data.[2]

The concept of the weird machine is a theoretical framework for understanding the existence of exploits for security vulnerabilities. Exploits exist empirically, but they were not studied from a theoretical perspective before the framework of weird machines emerged.

Theory

From a theoretical perspective, the emergence of weird machines becomes clear when one considers software as a way to restrict the number of reachable states and state transitions of a computer: the general-purpose CPU is, through software, specialized to simulate a finite-state machine (with a potentially very large state space). Many states the CPU could be in are excluded, and certain state transitions are ruled out, for example those that would violate the software's security requirements. When the system is somehow moved into a state that "makes no sense" when viewed from the perspective of the intended finite-state machine (through memory corruption, a hardware fault, or a programming mistake), the software keeps transforming the broken state into new broken states, triggered by further user input. A new computational device arises: the weird machine, which can reach states of the CPU that the programmer never anticipated, and which does so in reaction to inputs.
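
As a concrete illustration, consider a minimal C sketch (hypothetical code, not drawn from the cited papers) in which the intended machine has only two states, unauthenticated and authenticated, and the only sanctioned transition is a successful password check. A single out-of-bounds write moves the program into the authenticated state without that transition ever firing:

```c
#include <stdio.h>
#include <string.h>

struct session {
    char name[8];        /* fixed-size buffer                        */
    int  authenticated;  /* adjacent state variable, initially 0     */
};

int main(void) {
    struct session s = { "", 0 };

    /* Intended transition: only a password check may set the flag.
     * Unintended transition: strcpy() writes 12 bytes into an 8-byte
     * field, spilling into the flag.  (Hardened builds with
     * _FORTIFY_SOURCE may detect and abort this write.)              */
    strcpy(s.name, "AAAAAAAAAAA");

    if (s.authenticated)
        puts("reached a state the intended machine excluded");
    return 0;
}
```

From the intended finite-state machine's perspective, the final state is unreachable; from the CPU's perspective, it is just another state. That gap is where the weird machine lives.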

Applications

The functionality of the weird machine is invoked through unexpected inputs.

While expected, valid input activates the normal, intended functionality of a computer program, input that the developer did not anticipate may activate unintended functionality. The weird machine consists of this unintended functionality, which an exploit programs with carefully selected inputs.

In a classical attack exploiting a stack buffer overflow, the input given to a vulnerable program is crafted and delivered so that it ends up being executed as program code. If the data areas of program memory have been protected so that they cannot be executed directly, however, the input may instead take the form of pointers into pieces of existing program code, which are then executed in an unexpected order to produce the functionality of the exploit. In the context of return-oriented programming, these snippets of code used by the exploit are referred to as gadgets.
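
The following sketch illustrates the shape of such an input. The gadget addresses and their semantics are hypothetical, since a real chain depends entirely on the code present in the binary being attacked:

```c
#include <stdint.h>
#include <stdio.h>

/* A return-oriented "program": no machine code of its own, only
 * pointers into code already in the target.  Each gadget ends in a
 * ret, so the CPU's normal return mechanism chains them together.
 * All addresses below are invented for illustration.                */
const uintptr_t rop_chain[] = {
    0x080484b6,  /* gadget: pop eax; ret  -- load a syscall number   */
    0x0000000b,  /*   value popped into eax (execve on 32-bit Linux) */
    0x08048502,  /* gadget: pop ebx; ret  -- load the first argument */
    0x0804a040,  /*   address of a "/bin/sh" string in the binary    */
    0x080484f1,  /* gadget: int 0x80      -- enter the kernel        */
};

int main(void) {
    /* In an attack, this array would overwrite the saved return
     * address and the stack words above it; here we only print it.  */
    for (size_t i = 0; i < sizeof rop_chain / sizeof rop_chain[0]; i++)
        printf("stack word %zu: 0x%08lx\n", i,
               (unsigned long)rop_chain[i]);
    return 0;
}
```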

Through the interpretation of data as code, weird machine functionality that is by definition outside the original program specification can be reached even with proof-carrying code (PCC), which has been formally proven to function in a certain specific way.[3] The disparity is essentially caused by a disconnect between the formal abstract model of a computer program and its real-world instance, which can be influenced by events that the original abstraction does not capture, such as memory errors or power outages.

Weird machine behaviors can be observed even at the level of the hardware instruction set. For instance, it has been shown that computation can be performed on x86 using only MOV instructions.[4]
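
The construction in the paper works at the x86 instruction level; the following C analogue (illustrative only, not Dolan's construction) conveys the core trick of replacing logic and comparisons with memory moves through computed addresses:

```c
/* Logic without arithmetic or branch operators: truth tables live in
 * memory, and "computing" is just an indexed load -- the C equivalent
 * of x86 "mov reg, [base + index]".                                  */
#include <stdio.h>

int main(void) {
    int and_table[2][2] = { {0, 0}, {0, 1} };
    int xor_table[2][2] = { {0, 1}, {1, 0} };

    int a = 1, b = 0;
    int x = and_table[a][b];   /* one move: load from a computed address */
    int y = xor_table[a][b];

    printf("a AND b = %d, a XOR b = %d\n", x, y);
    return 0;
}
```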

Mitigation

Mitigations against weird machine functionality fall into two central categories: validating input within the software, and protecting against problems that arise from the platform on which the program runs, such as memory errors. Input validation aims to limit the scope and forms of unexpected input, e.g. through whitelists of allowed inputs, so that the software itself does not end up in an unexpected state by interpreting the data internally. Equally importantly, secure programming practices such as guarding against buffer overflows make it less likely that input data is interpreted in unintended ways by lower layers, such as the hardware on which the program executes.
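
A minimal sketch of allowlist-style validation (the command set is hypothetical) shows the idea: input is matched against a finite set of expected values before any further interpretation, so malformed data never reaches the state-transforming code:

```c
#include <stdio.h>
#include <string.h>

static const char *allowed[] = { "start", "stop", "status" };

/* Everything not explicitly listed is rejected. */
int is_allowed(const char *input) {
    for (size_t i = 0; i < sizeof allowed / sizeof allowed[0]; i++)
        if (strcmp(input, allowed[i]) == 0)
            return 1;
    return 0;
}

int main(void) {
    char buf[32];
    if (fgets(buf, sizeof buf, stdin)) {
        buf[strcspn(buf, "\n")] = '\0';   /* trim the trailing newline */
        puts(is_allowed(buf) ? "accepted" : "rejected");
    }
    return 0;
}
```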

Related Research Articles

Buffer overflow – anomaly in computer security and programming

In programming and information security, a buffer overflow or buffer overrun is an anomaly whereby a program writes data to a buffer beyond the buffer's allocated memory, overwriting adjacent memory locations.

Process (computing) – particular execution of a computer program

In computing, a process is the instance of a computer program that is being executed by one or many threads. There are many different process models, some of which are lightweight, but almost all processes are rooted in an operating system (OS) process which comprises the program code, assigned system resources, physical and logical access permissions, and data structures to initiate, control and coordinate execution activity. Depending on the OS, a process may be made up of multiple threads of execution that execute instructions concurrently.

In computer science, an abstract machine is a theoretical model that allows for a detailed and precise analysis of how a computer system functions. It is similar to a mathematical function in that it receives inputs and produces outputs based on predefined rules. Abstract machines differ from literal machines in that they are expected to perform correctly and independently of hardware. Abstract machines are "machines" because they allow step-by-step execution of programs; they are "abstract" because they ignore many aspects of actual (hardware) machines. A typical abstract machine consists of a definition in terms of input, output, and the set of allowable operations used to turn the former into the latter. They can be used for purely theoretical reasons as well as models for real-world computer systems. In the theory of computation, abstract machines are often used in thought experiments regarding computability or to analyse the complexity of algorithms. This use of abstract machines is fundamental to the field of computational complexity theory; examples include finite-state machines, Mealy machines, push-down automata, and Turing machines.

In computer science, self-modifying code is code that alters its own instructions while it is executing – usually to reduce the instruction path length and improve performance or simply to reduce otherwise repetitively similar code, thus simplifying maintenance. The term is usually only applied to code where the self-modification is intentional, not in situations where code accidentally modifies itself due to an error such as a buffer overflow.

Crash (computing) – when a computer program stops functioning properly and self-terminates

In computing, a crash, or system crash, occurs when a computer program such as a software application or an operating system stops functioning properly and exits. On some operating systems or individual applications, a crash reporting service will report the crash and any details relating to it, usually to the developer(s) of the application. If the program is a critical part of the operating system, the entire system may crash or hang, often resulting in a kernel panic or fatal system error.

In computer security, a side-channel attack is any attack based on extra information that can be gathered because of the fundamental way a computer protocol or algorithm is implemented, rather than flaws in the design of the protocol or algorithm itself or minor, but potentially devastating, mistakes or oversights in the implementation. Timing information, power consumption, electromagnetic leaks, and sound are examples of extra information which could be exploited to facilitate side-channel attacks.

In programming and software development, fuzzing or fuzz testing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program. The program is then monitored for exceptions such as crashes, failing built-in code assertions, or potential memory leaks. Typically, fuzzers are used to test programs that take structured inputs. This structure is specified, e.g., in a file format or protocol and distinguishes valid from invalid input. An effective fuzzer generates semi-valid inputs that are "valid enough" in that they are not directly rejected by the parser, but do create unexpected behaviors deeper in the program and are "invalid enough" to expose corner cases that have not been properly dealt with.
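
A toy fuzzer can be sketched in a few lines of C (the buggy parse function here is invented for illustration; real fuzzers such as AFL add coverage feedback and input mutation on top of this idea):

```c
#include <stdio.h>
#include <stdlib.h>

/* Deliberately buggy example target: trusts an embedded length field. */
static int parse(const unsigned char *d, size_t n) {
    if (n < 1) return -1;
    unsigned char claimed = d[0];          /* length the input claims  */
    unsigned sum = 0;
    for (size_t i = 1; i <= claimed; i++)  /* bug: never checked vs n  */
        sum += d[i];                       /* may read past the input  */
    return (int)sum;
}

int main(void) {
    unsigned char buf[64];
    srand(12345);                          /* fixed seed: reproducible */
    for (int iter = 0; iter < 100000; iter++) {
        size_t len = (size_t)(rand() % (int)sizeof buf);
        for (size_t i = 0; i < len; i++)
            buf[i] = (unsigned char)rand();
        parse(buf, len);   /* a sanitizer or crash here exposes the bug */
    }
    return 0;
}
```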

In computer security, arbitrary code execution (ACE) is an attacker's ability to run any commands or code of the attacker's choice on a target machine or in a target process. An arbitrary code execution vulnerability is a security flaw in software or hardware allowing arbitrary code execution. A program that is designed to exploit such a vulnerability is called an arbitrary code execution exploit. The ability to trigger arbitrary code execution over a network is often referred to as remote code execution (RCE).

Hardware acceleration – specialized computer hardware

Hardware acceleration is the use of computer hardware designed to perform specific functions more efficiently when compared to software running on a general-purpose central processing unit (CPU). Any transformation of data that can be calculated in software running on a generic CPU can also be calculated in custom-made hardware, or in some mix of both.

Integer overflow – computer arithmetic error

In computer programming, an integer overflow occurs when an arithmetic operation attempts to create a numeric value that is outside of the range that can be represented with a given number of digits – either higher than the maximum or lower than the minimum representable value.
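
A minimal C sketch shows the security-relevant case of an allocation-size computation that wraps (the element count is assumed to be attacker-supplied):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t count = 0x40000001;  /* attacker-supplied element count   */
    uint32_t size  = count * 4;   /* wraps to 4 in 32-bit arithmetic   */
    printf("requested %u elements, computed buffer size %u bytes\n",
           count, size);
    /* Safe form: reject count > UINT32_MAX / 4 before multiplying,
     * otherwise a later loop writes far past the 4-byte allocation.   */
    return 0;
}
```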

In computer science, stream processing is a programming paradigm which views streams, or sequences of events in time, as the central input and output objects of computation. Stream processing encompasses dataflow programming, reactive programming, and distributed data processing. Stream processing systems aim to expose parallel processing for data streams and rely on streaming algorithms for efficient implementation. The software stack for these systems includes components such as programming models and query languages, for expressing computation; stream management systems, for distribution and scheduling; and hardware components for acceleration including floating-point units, graphics processing units, and field-programmable gate arrays.

In computer security, executable-space protection marks memory regions as non-executable, such that an attempt to execute machine code in these regions will cause an exception. It makes use of hardware features such as the NX bit, or in some cases software emulation of those features. Software emulation of an NX bit usually imposes a measurable overhead, while a hardware-supplied NX bit imposes none.
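
From the program's side, the same policy is visible through the POSIX mmap interface: data pages are mapped without PROT_EXEC, so jumping into them faults rather than executing attacker-supplied bytes (a sketch, assuming Linux/BSD-style flags):

```c
#include <sys/mman.h>
#include <string.h>
#include <stdio.h>

int main(void) {
    size_t len = 4096;
    /* Writable but not executable -- the NX policy for data pages.   */
    unsigned char *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) { perror("mmap"); return 1; }

    memset(page, 0xC3, len);        /* fill with x86 "ret" opcodes     */
    /* ((void (*)(void))page)();    -- would raise SIGSEGV under NX    */

    munmap(page, len);
    return 0;
}
```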

Stress testing is a software testing activity that determines the robustness of software by testing beyond the limits of normal operation. Stress testing is particularly important for "mission critical" software, but is used for all types of software. Stress tests commonly put a greater emphasis on robustness, availability, and error handling under a heavy load, than on what would be considered correct behavior under normal circumstances.

In computer security, a NOP slide, NOP sled or NOP ramp is a sequence of NOP (no-operation) instructions meant to "slide" the CPU's instruction execution flow to its final, desired destination whenever the program branches to a memory address anywhere on the slide.
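
A sketch of the classic payload layout (the shellcode bytes and the guessed address below are placeholders):

```c
#include <string.h>
#include <stdio.h>

unsigned char payload[256];

/* Layout: [ NOP slide ... ][ shellcode ][ guessed return address ]
 * Any landing point inside the slide falls through to the shellcode. */
void build_payload(const unsigned char *shellcode, size_t sc_len) {
    memset(payload, 0x90, sizeof payload);   /* 0x90 = x86 NOP opcode  */
    memcpy(payload + sizeof payload - 4 - sc_len, shellcode, sc_len);
    /* Last 4 bytes: an address guessed to point anywhere in the slide */
    memcpy(payload + sizeof payload - 4, "\x10\xf0\xff\xbf", 4);
}

int main(void) {
    const unsigned char dummy[] = { 0xcc, 0xcc };  /* stand-in bytes   */
    build_payload(dummy, sizeof dummy);
    printf("slide length: %zu bytes\n",
           sizeof payload - 4 - sizeof dummy);
    return 0;
}
```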

In software, a stack buffer overflow or stack buffer overrun occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, and in cases where the overflow is triggered by mistake, it will often cause the program to crash or operate incorrectly. Stack buffer overflow is a type of the more general programming malfunction known as buffer overflow. Overfilling a buffer on the stack is more likely to derail program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls.
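
The textbook vulnerable pattern looks like this (illustrative only; modern compilers insert stack protectors that detect the corruption):

```c
#include <string.h>

void greet(const char *name) {
    char buf[16];
    strcpy(buf, name);   /* no length check: input longer than 15 chars
                          * overwrites the saved frame pointer and
                          * return address above buf on the stack      */
    /* ... use buf ... */
}

int main(int argc, char **argv) {
    if (argc > 1)
        greet(argv[1]);  /* attacker controls argv[1]                  */
    return 0;
}
```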

Return-oriented programming (ROP) is a computer security exploit technique that allows an attacker to execute code in the presence of security defenses such as executable space protection and code signing.

JIT spraying is a class of computer security exploit that circumvents the protection of address space layout randomization and data execution prevention by exploiting the behavior of just-in-time compilation. It has been used to exploit the PDF format and Adobe Flash.

Sigreturn-oriented programming (SROP) is a computer security exploit technique that allows an attacker to execute code in the presence of security measures such as non-executable memory and code signing. It was presented for the first time at the 35th IEEE Symposium on Security and Privacy in 2014, where it won the best student paper award. This technique employs the same basic assumptions behind the return-oriented programming (ROP) technique: an attacker controlling the call stack, for example through a stack buffer overflow, is able to influence the control flow of the program through simple instruction sequences called gadgets. The attack works by pushing a forged sigcontext structure on the call stack, overwriting the original return address with the location of a gadget that allows the attacker to call the sigreturn system call. Often just a single gadget is needed to successfully put this attack into effect. This gadget may reside at a fixed location, making this attack simple and effective, with a setup generally simpler and more portable than the one needed by the plain return-oriented programming technique.

Spectre (security vulnerability) – processor security vulnerability

Spectre is one of the two original transient execution CPU vulnerabilities, which involve microarchitectural timing side-channel attacks. These affect modern microprocessors that perform branch prediction and other forms of speculation. On most processors, the speculative execution resulting from a branch misprediction may leave observable side effects that may reveal private data to attackers. For example, if the pattern of memory accesses performed by such speculative execution depends on private data, the resulting state of the data cache constitutes a side channel through which an attacker may be able to extract information about the private data using a timing attack.

This glossary of computer science is a list of definitions of terms and concepts used in computer science, its sub-disciplines, and related fields, including terms relevant to software, data science, and computer programming.

References

  1. Bratus, Sergey; Locasto, Michael E.; Patterson, Meredith L.; Sassaman, Len; Shubina, Anna (December 2011). "Exploit Programming – From Buffer Overflows to "Weird Machines" and Theory of Computation" (PDF). ;login:. 36 (6): 13–21.
  2. Bratus, Sergey; Darley, Trey; Locasto, Michael E.; Patterson, Meredith L.; Shapiro, Rebecca; Shubina, Anna (January 2014). "Beyond Planted Bugs in "Trusting Trust": The Input-Processing Frontier". IEEE Security & Privacy. 12: 83–87. doi:10.1109/MSP.2014.1. S2CID 8355623.
  3. Vanegue, Julien (2014). "The Weird Machines in Proof-Carrying Code" (PDF). 2014 IEEE Security and Privacy Workshops. IEEE. pp. 209–213. doi:10.1109/SPW.2014.37. ISBN 978-1-4799-5103-1. S2CID 9913043.
  4. Dolan, Stephen. "mov is Turing-complete" (PDF).