Stack buffer overflow

Last updated

In software, a stack buffer overflow or stack buffer overrun occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. [1] [2] Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, and in cases where the overflow was triggered by mistake, will often cause the program to crash or operate incorrectly. Stack buffer overflow is a type of the more general programming malfunction known as buffer overflow (or buffer overrun). [1] Overfilling a buffer on the stack is more likely to derail program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls.

Contents

A stack buffer overflow can be caused deliberately as part of an attack known as stack smashing. If the affected program is running with special privileges, or accepts data from untrusted network hosts (e.g. a webserver) then the bug is a potential security vulnerability. If the stack buffer is filled with data supplied from an untrusted user then that user can corrupt the stack in such a way as to inject executable code into the running program and take control of the process. This is one of the oldest and more reliable methods for attackers to gain unauthorized access to a computer. [3] [4] [5]

Exploiting stack buffer overflows

The canonical method for exploiting a stack-based buffer overflow is to overwrite the function return address with a pointer to attacker-controlled data (usually on the stack itself). [3] [6] This is illustrated with strcpy() in the following example:

#include<string.h>voidfoo(char*bar){charc[12];strcpy(c,bar);// no bounds checking}intmain(intargc,char**argv){foo(argv[1]);return0;}

This code takes an argument from the command line and copies it to a local stack variable c. This works fine for command-line arguments smaller than 12 characters (as can be seen in figure B below). Any arguments larger than 11 characters long will result in corruption of the stack. (The maximum number of characters that is safe is one less than the size of the buffer here because in the C programming language, strings are terminated by a null byte character. A twelve-character input thus requires thirteen bytes to store, the input followed by the sentinel zero byte. The zero byte then ends up overwriting a memory location that's one byte beyond the end of the buffer.)

The program stack in foo() with various inputs:

A. - Before data is copied. Stack Overflow 2.png
A. - Before data is copied.
B. - "hello" is the first command line argument. Stack Overflow 3.png
B. - "hello" is the first command line argument.
C. - "AAAAAAAAAAAAAAAAAAAA\x08\x35\xC0\x80" is the first command line argument. Stack Overflow 4.png
C. - "AAAAAAAAAAAAAAAAAAAA\x08\x35\xC0\x80" is the first command line argument.

In figure C above, when an argument larger than 11 bytes is supplied on the command line foo() overwrites local stack data, the saved frame pointer, and most importantly, the return address. When foo() returns it pops the return address off the stack and jumps to that address (i.e. starts executing instructions from that address). Thus, the attacker has overwritten the return address with a pointer to the stack buffer char c[12], which now contains attacker-supplied data. In an actual stack buffer overflow exploit the string of "A"'s would instead be shellcode suitable to the platform and desired function. If this program had special privileges (e.g. the SUID bit set to run as the superuser), then the attacker could use this vulnerability to gain superuser privileges on the affected machine. [3]

The attacker can also modify internal variable values to exploit some bugs. With this example:

#include<string.h>#include<stdio.h>voidfoo(char*bar){floatMy_Float=10.5;// Addr = 0x0023FF4Ccharc[28];// Addr = 0x0023FF30// Will print 10.500000printf("My Float value = %f\n",My_Float);/* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~       Memory map:       @ : c allocated memory       # : My_Float allocated memory           *c                      *My_Float       0x0023FF30                  0x0023FF4C           |                           |           @@@@@@@@@@@@@@@@@@@@@@@@@@@@#####      foo("my string is too long !!!!! XXXXX");   memcpy will put 0x1010C042 (little endian) in My_Float value.   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/memcpy(c,bar,strlen(bar));// no bounds checking...// Will print 96.031372printf("My Float value = %f\n",My_Float);}intmain(intargc,char**argv){foo("my string is too long !!!!! \x10\x10\xc0\x42");return0;}

There are typically two methods that are used to alter the stored address in the stack - direct and indirect. Attackers started developing indirect attacks, which have fewer dependencies, in order to bypass protection measures that were made to reduce direct attacks. [7]

A number of platforms have subtle differences in their implementation of the call stack that can affect the way a stack buffer overflow exploit will work. Some machine architectures store the top-level return address of the call stack in a register. This means that any overwritten return address will not be used until a later unwinding of the call stack. Another example of a machine-specific detail that can affect the choice of exploitation techniques is the fact that most RISC-style machine architectures will not allow unaligned access to memory. [8] Combined with a fixed length for machine opcodes, this machine limitation can make the technique of jumping to the stack almost impossible to implement (with the one exception being when the program actually contains the unlikely code to explicitly jump to the stack register). [9] [10]

Stacks that grow up

Within the topic of stack buffer overflows, an often-discussed-but-rarely-seen architecture is one in which the stack grows in the opposite direction. This change in architecture is frequently suggested as a solution to the stack buffer overflow problem because any overflow of a stack buffer that occurs within the same stack frame cannot overwrite the return pointer. However, any overflow that occurs in a buffer from a previous stack frame will still overwrite a return pointer and allow for malicious exploitation of the bug. [11] For instance, in the example above, the return pointer for foo will not be overwritten because the overflow actually occurs within the stack frame for memcpy. However, because the buffer that overflows during the call to memcpy resides in a previous stack frame, the return pointer for memcpy will have a numerically higher memory address than the buffer. This means that instead of the return pointer for foo being overwritten, the return pointer for memcpy will be overwritten. At most, this means that growing the stack in the opposite direction will change some details of how stack buffer overflows are exploitable, but it will not reduce significantly the number of exploitable bugs.[ citation needed ]

Protection schemes

Over the years, a number of control-flow integrity schemes have been developed to inhibit malicious stack buffer overflow exploitation. These may usually be classified into three categories:

Stack canaries

Stack canaries, named for their analogy to a canary in a coal mine, are used to detect a stack buffer overflow before execution of malicious code can occur. This method works by placing a small integer, the value of which is randomly chosen at program start, in memory just before the stack return pointer. Most buffer overflows overwrite memory from lower to higher memory addresses, so in order to overwrite the return pointer (and thus take control of the process) the canary value must also be overwritten. This value is checked to make sure it has not changed before a routine uses the return pointer on the stack. [2] This technique can greatly increase the difficulty of exploiting a stack buffer overflow because it forces the attacker to gain control of the instruction pointer by some non-traditional means such as corrupting other important variables on the stack. [2]

Nonexecutable stack

Another approach to preventing stack buffer overflow exploitation is to enforce a memory policy on the stack memory region that disallows execution from the stack (W^X, "Write XOR Execute"). This means that in order to execute shellcode from the stack an attacker must either find a way to disable the execution protection from memory, or find a way to put their shellcode payload in a non-protected region of memory. This method is becoming more popular now that hardware support for the no-execute flag is available in most desktop processors.

While this method prevents the canonical stack smashing exploit, stack overflows can be exploited in other ways. First, it is common to find ways to store shellcode in unprotected memory regions like the heap, and so very little need change in the way of exploitation. [12]

Another attack is the so-called return to libc method for shellcode creation. In this attack the malicious payload will load the stack not with shellcode, but with a proper call stack so that execution is vectored to a chain of standard library calls, usually with the effect of disabling memory execute protections and allowing shellcode to run as normal. [13] This works because the execution never actually vectors to the stack itself.

A variant of return-to-libc is return-oriented programming (ROP), which sets up a series of return addresses, each of which executes a small sequence of cherry-picked machine instructions within the existing program code or system libraries, sequence which ends with a return. These so-called gadgets each accomplish some simple register manipulation or similar execution before returning, and stringing them together achieves the attacker's ends. It is even possible to use "returnless" return-oriented programming by exploiting instructions or groups of instructions that behave much like a return instruction. [14]

Randomization

Instead of separating the code from the data, another mitigation technique is to introduce randomization to the memory space of the executing program. Since the attacker needs to determine where executable code that can be used resides, either an executable payload is provided (with an executable stack) or one is constructed using code reuse such as in ret2libc or return-oriented programming (ROP). Randomizing the memory layout will, as a concept, prevent the attacker from knowing where any code is. However, implementations typically will not randomize everything; usually the executable itself is loaded at a fixed address and hence even when ASLR (address space layout randomization) is combined with a non-executable stack the attacker can use this fixed region of memory. Therefore, all programs should be compiled with PIE (position-independent executables) such that even this region of memory is randomized. The entropy of the randomization is different from implementation to implementation and a low enough entropy can in itself be a problem in terms of brute forcing the memory space that is randomized.

Bypass countermeasures

The previous mitigations make the steps of the exploitation harder. But it is still possible to exploit a stack buffer overflow if some vulnerabilities are presents or if some conditions are met.

Stack canary bypass

Information leak with format string vulnerability exploitation

An attacker is able to exploit the format string vulnerability for revealing the memory locations in the vulnerable program. [15]

Non executable stack bypass

When Data Execution Prevention is enabled to forbid any execute access to the stack, the attacker can still use the overwritten return address (the instruction pointer) to point to data in a code segment (.text on Linux) or every other executable section of the program. The goal is to reuse existing code. [16]

Rop chain

Consists to overwrite the return pointer a bit before a return instruction (ret in x86) of the program. The instructions between the new return pointer and the return instruction will be executed and the return instruction will return to the payload controlled by the exploiter. [16] [ clarification needed ]

Jop chain

Jump Oriented Programming is a technique that uses jump instructions to reuse code instead of ret instruction. [17]

Randomization bypass

A limitation of ASLR realization on 64-bit systems is that it is vulnerable to memory disclosure and information leakage attacks. The attacker can launch the ROP by revealing a single function address using information leakage attack. The following section describes the similar existing strategy for breaking down the ASLR protection. [18]

Notable examples

See also

Related Research Articles

<span class="mw-page-title-main">Buffer overflow</span> Anomaly in computer security and programming

In programming and information security, a buffer overflow or buffer overrun is an anomaly whereby a program writes data to a buffer beyond the buffer's allocated memory, overwriting adjacent memory locations.

In computing, a segmentation fault or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricted area of memory. On standard x86 computers, this is a form of general protection fault. The operating system kernel will, in response, usually perform some corrective action, generally passing the fault on to the offending process by sending the process a signal. Processes can in some cases install a custom signal handler, allowing them to recover on their own, but otherwise the OS default signal handler is used, generally causing abnormal termination of the process, and sometimes a core dump.

In hacking, a shellcode is a small piece of code used as the payload in the exploitation of a software vulnerability. It is called "shellcode" because it typically starts a command shell from which the attacker can control the compromised machine, but any piece of code that performs a similar task can be called shellcode. Because the function of a payload is not limited to merely spawning a shell, some have suggested that the name shellcode is insufficient. However, attempts at replacing the term have not gained wide acceptance. Shellcode is commonly written in machine code.

A heap overflow, heap overrun, or heap smashing is a type of buffer overflow that occurs in the heap data area. Heap overflows are exploitable in a different manner to that of stack-based overflows. Memory on the heap is dynamically allocated at runtime and typically contains program data. Exploitation is performed by corrupting this data in specific ways to cause the application to overwrite internal structures such as linked list pointers. The canonical heap overflow technique overwrites dynamic memory allocation linkage and uses the resulting pointer exchange to overwrite a program function pointer.

Uncontrolled format string is a type of code injection vulnerability discovered around 1989 that can be used in security exploits. Originally thought harmless, format string exploits can be used to crash a program or to execute harmful code. The problem stems from the use of unchecked user input as the format string parameter in certain C functions that perform formatting, such as printf . A malicious user may use the %s and %x format tokens, among others, to print data from the call stack or possibly other locations in memory. One may also write arbitrary data to arbitrary locations using the %n format token, which commands printf and similar functions to write the number of bytes formatted to an address stored on the stack.

In computer security, hardening is usually the process of securing a system by reducing its surface of vulnerability, which is larger when a system performs more functions; in principle a single-function system is more secure than a multipurpose one. Reducing available ways of attack typically includes changing default passwords, the removal of unnecessary software, unnecessary usernames or logins, and the disabling or removal of unnecessary services.

Buffer overflow protection is any of various techniques used during software development to enhance the security of executable programs by detecting buffer overflows on stack-allocated variables, and preventing them from causing program misbehavior or from becoming serious security vulnerabilities. A stack buffer overflow occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, which could lead to program crashes, incorrect operation, or security issues.

A "return-to-libc" attack is a computer security attack usually starting with a buffer overflow in which a subroutine return address on a call stack is replaced by an address of a subroutine that is already present in the process executable memory, bypassing the no-execute bit feature and ridding the attacker of the need to inject their own code. The first example of this attack in the wild was contributed by Alexander Peslyak on the Bugtraq mailing list in 1997.

Code injection is the exploitation of a computer bug that is caused by processing invalid data. The injection is used by an attacker to introduce code into a vulnerable computer program and change the course of execution. The result of successful code injection can be disastrous, for example, by allowing computer viruses or computer worms to propagate.

In computing, an uninitialized variable is a variable that is declared but is not set to a definite known value before it is used. It will have some value, but not a predictable one. As such, it is a programming error and a common source of bugs in software.

In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This type of stack is also known as an execution stack, program stack, control stack, run-time stack, or machine stack, and is often shortened to simply "the stack". Although maintenance of the call stack is important for the proper functioning of most software, the details are normally hidden and automatic in high-level programming languages. Many computer instruction sets provide special instructions for manipulating stacks.

<i>Hacking: The Art of Exploitation</i> 2003 book by Jon "Smibbs" Erickson

Hacking: The Art of Exploitation (ISBN 1-59327-007-0) is a book by Jon "Smibbs" Erickson about computer security and network security. It was published by No Starch Press in 2003, with a second edition in 2008. All of the examples in the book were developed, compiled, and tested on Gentoo Linux. The book also comes with a CD that contains a Linux environment with all the tools and examples used in the book.

setjmp.h is a header defined in the C standard library to provide "non-local jumps": control flow that deviates from the usual subroutine call and return sequence. The complementary functions setjmp and longjmp provide this functionality.

In computer security, a NOP slide, NOP sled or NOP ramp is a sequence of NOP (no-operation) instructions meant to "slide" the CPU's instruction execution flow to its final, desired destination whenever the program branches to a memory address anywhere on the slide.

Secure coding is the practice of developing computer software in such a way that guards against the accidental introduction of security vulnerabilities. Defects, bugs and logic flaws are consistently the primary cause of commonly exploited software vulnerabilities. Through the analysis of thousands of reported vulnerabilities, security professionals have discovered that most vulnerabilities stem from a relatively small number of common software programming errors. By identifying the insecure coding practices that lead to these errors and educating developers on secure alternatives, organizations can take proactive steps to help significantly reduce or eliminate vulnerabilities in software before deployment.

Return-oriented programming (ROP) is a computer security exploit technique that allows an attacker to execute code in the presence of security defenses such as executable space protection and code signing.

JIT spraying is a class of computer security exploit that circumvents the protection of address space layout randomization and data execution prevention by exploiting the behavior of just-in-time compilation. It has been used to exploit the PDF format and Adobe Flash.

Blind return oriented programming (BROP) is an exploit technique which can successfully create an exploit even if the attacker does not possess the target binary. BROP attacks shown by Bittau et al. have defeated address space layout randomization (ASLR) and stack canaries on 64-bit systems.

Control-flow integrity (CFI) is a general term for computer security techniques that prevent a wide variety of malware attacks from redirecting the flow of execution of a program.

Sigreturn-oriented programming (SROP) is a computer security exploit technique that allows an attacker to execute code in presence of security measures such as non-executable memory and code signing. It was presented for the first time at the 35th IEEE Symposium on Security and Privacy in 2014 where it won the best student paper award. This technique employs the same basic assumptions behind the return-oriented programming (ROP) technique: an attacker controlling the call stack, for example through a stack buffer overflow, is able to influence the control flow of the program through simple instruction sequences called gadgets. The attack works by pushing a forged sigcontext structure on the call stack, overwriting the original return address with the location of a gadget that allows the attacker to call the sigreturn system call. Often just a single gadget is needed to successfully put this attack into effect. This gadget may reside at a fixed location, making this attack simple and effective, with a setup generally simpler and more portable than the one needed by the plain return-oriented programming technique.

References

  1. 1 2 Fithen, William L.; Seacord, Robert (2007-03-27). "VT-MB. Violation of Memory Bounds". US CERT.
  2. 1 2 3 Dowd, Mark; McDonald, John; Schuh, Justin (November 2006). The Art Of Software Security Assessment. Addison Wesley. pp. 169–196. ISBN   0-321-44442-6.
  3. 1 2 3 Levy, Elias (1996-11-08). "Smashing The Stack for Fun and Profit". Phrack . 7 (49): 14.
  4. Pincus, J.; Baker, B. (July–August 2004). "Beyond Stack Smashing: Recent Advances in Exploiting Buffer Overruns" (PDF). IEEE Security and Privacy Magazine. 2 (4): 20–27. doi:10.1109/MSP.2004.36. S2CID   6647392.
  5. Burebista. "Stack Overflows" (PDF). Archived from the original (PDF) on September 28, 2007.[ dead link ]
  6. Bertrand, Louis (2002). "OpenBSD: Fix the Bugs, Secure the System". MUSESS '02: McMaster University Software Engineering Symposium. Archived from the original on 2007-09-30.
  7. Kuperman, Benjamin A.; Brodley, Carla E.; Ozdoganoglu, Hilmi; Vijaykumar, T. N.; Jalote, Ankit (November 2005). "Detection and prevention of stack buffer overflow attacks". Communications of the ACM. 48 (11): 50–56. doi:10.1145/1096000.1096004. ISSN   0001-0782. S2CID   120462.
  8. pr1. "Exploiting SPARC Buffer Overflow vulnerabilities".{{cite web}}: CS1 maint: numeric names: authors list (link)
  9. Curious (2005-01-08). "Reverse engineering - PowerPC Cracking on Mac OS X with GDB". Phrack . 11 (63): 16.
  10. Sovarel, Ana Nora; Evans, David; Paul, Nathanael. Where's the FEEB? The Effectiveness of Instruction Set Randomization (Report).
  11. Zhodiac (2001-12-28). "HP-UX (PA-RISC 1.1) Overflows". Phrack . 11 (58): 11.
  12. Foster, James C.; Osipov, Vitaly; Bhalla, Nish; Heinen, Niels (2005). Buffer Overflow Attacks: Detect, Exploit, Prevent (PDF). United States of America: Syngress Publishing, Inc. ISBN   1-932266-67-4.
  13. Nergal (2001-12-28). "The advanced return-into-lib(c) exploits: PaX case study". Phrack . 11 (58): 4.
  14. Checkoway, S.; Davi, L.; Dmitrienko, A.; Sadeghi, A. R.; Shacham, H.; Winandy, M. (October 2010). "Return-Oriented Programming without Returns". Proceedings of the 17th ACM conference on Computer and communications security - CCS '10. pp. 559–572. doi:10.1145/1866307.1866370. ISBN   978-1-4503-0245-6. S2CID   207182734.
  15. Butt, Muhammad Arif; Ajmal, Zarafshan; Khan, Zafar Iqbal; Idrees, Muhammad; Javed, Yasir (January 2022). "An In-Depth Survey of Bypassing Buffer Overflow Mitigation Techniques". Applied Sciences. 12 (26): 6702. doi: 10.3390/app12136702 . ISSN   2076-3417.
  16. 1 2 Butt, Muhammad Arif; Ajmal, Zarafshan; Khan, Zafar Iqbal; Idrees, Muhammad; Javed, Yasir (January 2022). "An In-Depth Survey of Bypassing Buffer Overflow Mitigation Techniques". Applied Sciences. 12 (13): 12–13. doi: 10.3390/app12136702 . ISSN   2076-3417.
  17. Sécurité matérielle des systèmes (in French). 2022-09-03.
  18. Butt, Muhammad Arif; Ajmal, Zarafshan; Khan, Zafar Iqbal; Idrees, Muhammad; Javed, Yasir (January 2022). "An In-Depth Survey of Bypassing Buffer Overflow Mitigation Techniques". Applied Sciences. 12 (16): 6702. doi: 10.3390/app12136702 . ISSN   2076-3417.
  19. "Twilight Hack - WiiBrew". wiibrew.org. Retrieved 2018-01-18.
  20. "Smash Stack - WiiBrew". wiibrew.org. Retrieved 2018-01-18.