Buffer overflow protection

Buffer overflow protection is any of various techniques used during software development to enhance the security of executable programs by detecting buffer overflows on stack-allocated variables, and preventing them from causing program misbehavior or from becoming serious security vulnerabilities. A stack buffer overflow occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, which could lead to program crashes, incorrect operation, or security issues.

Typically, buffer overflow protection modifies the organization of stack-allocated data so it includes a canary value that, when destroyed by a stack buffer overflow, shows that a buffer preceding it in memory has been overflowed. By verifying the canary value, execution of the affected program can be terminated, preventing it from misbehaving or from allowing an attacker to take control over it. Other buffer overflow protection techniques include bounds checking, which checks accesses to each allocated block of memory so they cannot go beyond the actually allocated space, and tagging, which ensures that memory allocated for storing data cannot contain executable code.

Overfilling a buffer allocated on the stack is more likely to influence program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls. However, similar implementation-specific protections also exist against heap-based overflows.

There are several implementations of buffer overflow protection, including those for the GNU Compiler Collection, LLVM, Microsoft Visual Studio, and other compilers.

Overview

A stack buffer overflow occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, and in cases where the overflow was triggered by mistake, will often cause the program to crash or operate incorrectly. Stack buffer overflow is a type of the more general programming malfunction known as buffer overflow (or buffer overrun). Overfilling a buffer on the stack is more likely to derail program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls. [1]

Stack buffer overflow can be caused deliberately as part of an attack known as stack smashing. If the affected program is running with special privileges, or if it accepts data from untrusted network hosts (for example, a public webserver), then the bug is a potential security vulnerability that allows an attacker to inject executable code into the running program and take control of the process. This is one of the oldest and most reliable methods for attackers to gain unauthorized access to a computer. [2]

Typically, buffer overflow protection modifies the organization of data in the stack frame of a function call to include a "canary" value that, when destroyed, shows that a buffer preceding it in memory has been overflowed. This provides the benefit of preventing an entire class of attacks. According to some researchers, [3] the performance impact of these techniques is negligible.

Stack-smashing protection is unable to protect against certain forms of attack. For example, it cannot protect against buffer overflows in the heap. Nor can it safely alter the layout of data within a structure, because structures are expected to be the same between modules, especially with shared libraries. Any data in a structure after a buffer therefore cannot be protected with canaries, so programmers must be very careful about how they organize their variables and use their structures.
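The structure limitation can be illustrated with a small simulation. The sketch below is not taken from any compiler; the names `record`, `guarded`, `GUARD_VALUE` and `record_fill_unchecked` are hypothetical. It models a structure whose buffer is followed by another field, with a canary placed after the whole structure, and shows that an overflow confined to the structure changes the field without disturbing the canary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure: a buffer followed by a security-relevant field. */
struct record {
    char     name[8];
    uint32_t is_admin;
};

/* Hypothetical frame model: the canary can only sit after the whole structure. */
struct guarded {
    struct record rec;
    uint64_t      canary;
};

#define GUARD_VALUE UINT64_C(0x5eedc0de0badf00d) /* arbitrary illustrative value */

/* Deliberately careless copy that treats the record as raw bytes.
   It is clamped to the record so that the simulation itself stays well defined. */
void record_fill_unchecked(struct record *r, const void *src, size_t len) {
    if (len > sizeof *r) {
        len = sizeof *r;
    }
    memcpy(r, src, len);
}

/* Returns 1 if the canary after the structure is still intact. */
int guarded_canary_ok(const struct guarded *g) {
    return g->canary == GUARD_VALUE;
}
```

Writing 12 bytes into an 8-byte `name` overruns into `is_admin`, yet `guarded_canary_ok` still reports success, because the overflow never reaches the canary.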

Canaries

Canaries, also called canary words or stack cookies, are known values that are placed between a buffer and control data on the stack to monitor buffer overflows. When the buffer overflows, the first data to be corrupted will usually be the canary, so a failed verification of the canary data signals an overflow, which can then be handled, for example, by invalidating the corrupted data. A canary value should not be confused with a sentinel value.
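As a rough illustration of this placement, the following sketch models a stack frame as a C structure instead of touching the real stack. The names `sim_frame`, `SIM_CANARY` and the helper functions are hypothetical and chosen only for this example. The buffer comes first, the canary next, and the control data last, so a linear overflow has to cross the canary before it reaches the return address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a protected frame: buffer, then canary, then control data. */
struct sim_frame {
    char     buffer[16];
    uint64_t canary;
    uint64_t return_address;
};

#define SIM_CANARY UINT64_C(0x0123456789abcdef) /* arbitrary illustrative value */

/* What a protected function prologue would do: place the canary. */
void sim_frame_enter(struct sim_frame *f, uint64_t ret) {
    memset(f->buffer, 0, sizeof f->buffer);
    f->canary = SIM_CANARY;
    f->return_address = ret;
}

/* Deliberately unchecked write starting at the buffer. It is clamped to the
   simulated frame so that the example never touches memory outside it. */
void sim_unchecked_write(struct sim_frame *f, const void *src, size_t len) {
    if (len > sizeof *f) {
        len = sizeof *f;
    }
    memcpy(f, src, len);
}

/* What a protected function epilogue would do: verify the canary.
   Returns 1 if intact, 0 if an overflow was detected. */
int sim_frame_check(const struct sim_frame *f) {
    return f->canary == SIM_CANARY;
}
```

A write of 8 bytes leaves the check passing; a write of 32 bytes overwrites both the canary and the return address, and the check fails before the corrupted return address would be used.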

The terminology is a reference to the historic practice of using canaries in coal mines, since they would be affected by toxic gases earlier than the miners, thus providing a biological warning system. Canaries are alternately known as stack cookies, which is meant to evoke the image of a "broken cookie" when the value is corrupted.

There are three types of canaries in use: terminator, random, and random XOR. Current versions of StackGuard support all three, while ProPolice supports terminator and random canaries.

Terminator canaries

Terminator canaries use the observation that most buffer overflow attacks are based on certain string operations which end at string terminators. Accordingly, the canaries are built of null terminators, CR, LF, and FF. As a result, the attacker must write a null character before writing the return address to avoid altering the canary. This prevents attacks using strcpy() and other methods that return upon copying a null character, while the undesirable result is that the canary is known. Even with the protection, an attacker could potentially overwrite the canary with its known value and the control information with mismatched values, thus passing the canary check code, which is executed shortly before the specific processor's return-from-call instruction.
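The effect can be sketched with a simulated frame held in a byte array. Everything here is an assumption made for illustration: the frame layout, the helper names, and the canary bytes 0x00, 0x0D, 0x0A, 0xFF (taking "FF" to mean the byte 0xFF). A copy that stops at a null character cannot write past the canary's first byte, whereas a length-based copy can reproduce the known canary and still change the return address.

```c
#include <stddef.h>
#include <string.h>

enum {
    BUF_LEN    = 8,
    CANARY_LEN = 4,
    RET_LEN    = 8,
    FRAME_LEN  = BUF_LEN + CANARY_LEN + RET_LEN
};

/* Assumed terminator canary: NUL, CR, LF, 0xFF. */
static const unsigned char TERMINATOR_CANARY[CANARY_LEN] = { 0x00, 0x0D, 0x0A, 0xFF };

/* Lay out buffer | canary | stand-in return address (bytes of 0x11). */
void term_frame_init(unsigned char frame[FRAME_LEN]) {
    memset(frame, 0, BUF_LEN);
    memcpy(frame + BUF_LEN, TERMINATOR_CANARY, CANARY_LEN);
    memset(frame + BUF_LEN + CANARY_LEN, 0x11, RET_LEN);
}

/* strcpy-like copy: stops after copying a NUL. Clamped to the simulated frame.
   Returns the number of bytes written. */
size_t term_strcpy_like(unsigned char frame[FRAME_LEN], const unsigned char *src) {
    size_t i = 0;
    while (i < FRAME_LEN) {
        unsigned char c = src[i];
        frame[i++] = c;
        if (c == 0) {
            break;
        }
    }
    return i;
}

/* Returns 1 if the canary bytes are unchanged. */
int term_canary_intact(const unsigned char frame[FRAME_LEN]) {
    return memcmp(frame + BUF_LEN, TERMINATOR_CANARY, CANARY_LEN) == 0;
}

/* Returns 1 if the stand-in return address is unchanged. */
int term_ret_intact(const unsigned char frame[FRAME_LEN]) {
    for (size_t i = 0; i < RET_LEN; i++) {
        if (frame[BUF_LEN + CANARY_LEN + i] != 0x11) {
            return 0;
        }
    }
    return 1;
}
```

With a payload that embeds the canary bytes, `term_strcpy_like` stops at the embedded null and leaves the return address alone, while a plain `memcpy` of the same payload keeps the canary intact and replaces the return address, which is the weakness of a known canary described above.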

Random canaries

Random canaries are randomly generated, usually from an entropy-gathering daemon, in order to prevent an attacker from knowing their value. Usually, it is not logically possible or plausible to read the canary for exploiting; the canary is a secure value known only by those who need to know it, which in this case is the buffer overflow protection code.

Normally, a random canary is generated at program initialization and stored in a global variable. This variable is usually padded by unmapped pages, so that attempting to read it using any tricks that exploit bugs to read from RAM causes a segmentation fault, terminating the program. It may still be possible to read the canary if the attacker knows where it is or can get the program to read from the stack.
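A minimal sketch of that initialization step follows, assuming a Unix-like system that provides `/dev/urandom`. The names `g_canary`, `canary_init`, `canary_store` and `canary_ok` are hypothetical, and the padding with unmapped pages is not modelled.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical process-wide canary, filled once at start-up. */
static uint64_t g_canary;

/* Read the canary from the system entropy source.
   Returns 0 on success and -1 if the source could not be read. */
int canary_init(void) {
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL) {
        return -1;
    }
    size_t n = fread(&g_canary, sizeof g_canary, 1, f);
    fclose(f);
    return n == 1 ? 0 : -1;
}

/* Prologue step: copy the global canary into a slot in the frame. */
void canary_store(uint64_t *slot) {
    *slot = g_canary;
}

/* Epilogue step: returns 1 if the slot still matches the global canary. */
int canary_ok(const uint64_t *slot) {
    return *slot == g_canary;
}
```

Because the value is drawn at run time, it differs between program runs, so an attacker cannot hard-code it into a payload the way a terminator canary can be.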

Random XOR canaries

Random XOR canaries are random canaries that are XOR-scrambled using all or part of the control data. In this way, once the canary or the control data is clobbered, the canary value is wrong.

Random XOR canaries have the same vulnerabilities as random canaries, except that the "read from stack" method of getting the canary is a bit more complicated. The attacker must get the canary, the algorithm, and the control data in order to re-generate the original canary needed to spoof the protection.

In addition, random XOR canaries can protect against a certain type of attack involving overflowing a buffer in a structure into a pointer to change the pointer to point at a piece of control data. Because of the XOR encoding, the canary will be wrong if the control data or return value is changed. Because of the pointer, the control data or return value can be changed without overflowing over the canary.
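The XOR scheme can be sketched as follows. This is an illustrative model, not StackGuard's actual code; `xor_frame`, `xor_frame_enter` and `xor_frame_check` are hypothetical names, and the secret is passed in explicitly to keep the example self-contained.

```c
#include <stdint.h>

/* Hypothetical frame model: the stored canary depends on the control data. */
struct xor_frame {
    uint64_t canary_slot;     /* holds secret XOR return_address */
    uint64_t return_address;
};

/* Prologue step: scramble the random secret with the return address. */
void xor_frame_enter(struct xor_frame *f, uint64_t secret, uint64_t ret) {
    f->return_address = ret;
    f->canary_slot = secret ^ ret;
}

/* Epilogue step: unscramble with the current return address and compare.
   Returns 1 if both the canary and the return address are unchanged. */
int xor_frame_check(const struct xor_frame *f, uint64_t secret) {
    return (f->canary_slot ^ f->return_address) == secret;
}
```

If a corrupted pointer is used to change `return_address` directly, `canary_slot` is never written, yet the check still fails because the stored value no longer unscrambles to the secret.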

Although these canaries protect the control data from being altered by clobbered pointers, they do not protect any other data or the pointers themselves. Function pointers especially are a problem here, as they can be overflowed into and can execute shellcode when called.

Bounds checking

Bounds checking is a compiler-based technique that adds run-time bounds information for each allocated block of memory, and checks all pointers against those at run-time. For C and C++, bounds checking can be performed at pointer calculation time [4] or at dereference time. [5] [6] [7]

Implementations of this approach use either a central repository, which describes each allocated block of memory, [4] [5] [6] or fat pointers, [7] which contain both the pointer and additional data, describing the region that they point to.
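The fat-pointer variant can be sketched in a few lines. This is a simplified model with hypothetical names (`fat_ptr`, `fat_make`, `fat_add`, `fat_store`, `fat_load`), not the representation used by any particular compiler. The pointer carries the base and size of its region, pointer arithmetic is left unchecked, and the check happens at dereference time.

```c
#include <stddef.h>

/* Hypothetical fat pointer: the address plus a description of its region. */
struct fat_ptr {
    char  *base;   /* start of the allocated block */
    size_t size;   /* size of the block in bytes */
    size_t offset; /* current position relative to base */
};

/* Create a fat pointer to the start of a block. */
struct fat_ptr fat_make(char *base, size_t size) {
    struct fat_ptr p = { base, size, 0 };
    return p;
}

/* Pointer arithmetic: no check here, the check is deferred to the access. */
struct fat_ptr fat_add(struct fat_ptr p, size_t delta) {
    p.offset += delta;
    return p;
}

/* Checked store: returns 0 on success, -1 if the access is out of bounds. */
int fat_store(struct fat_ptr p, char value) {
    if (p.offset >= p.size) {
        return -1;
    }
    p.base[p.offset] = value;
    return 0;
}

/* Checked load: returns 0 on success, -1 if the access is out of bounds. */
int fat_load(struct fat_ptr p, char *out) {
    if (p.offset >= p.size) {
        return -1;
    }
    *out = p.base[p.offset];
    return 0;
}
```

A central-repository design would instead keep `base` and `size` in a lookup table keyed by address, leaving the pointer itself at its ordinary size.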

Tagging

Tagging [8] is a compiler-based or hardware-based (requiring a tagged architecture) technique for tagging the type of a piece of data in memory, used mainly for type checking. By marking certain areas of memory as non-executable, it effectively prevents memory allocated to store data from containing executable code. Also, certain areas of memory can be marked as non-allocated, preventing buffer overflows.

Historically, tagging has been used for implementing high-level programming languages; [9] with appropriate support from the operating system, tagging can also be used to detect buffer overflows. [10] An example is the NX bit hardware feature, supported by Intel, AMD and ARM processors.

Implementations

GNU Compiler Collection (GCC)

Stack-smashing protection was first implemented by StackGuard in 1997, and published at the 1998 USENIX Security Symposium. [11] StackGuard was introduced as a set of patches to the Intel x86 backend of GCC 2.7. StackGuard was maintained for the Immunix Linux distribution from 1998 to 2003, and was extended with implementations for terminator, random and random XOR canaries. StackGuard was suggested for inclusion in GCC 3.x in the proceedings of the 2003 GCC Developers Summit, [12] but this was never achieved.

From 2001 to 2005, IBM developed GCC patches for stack-smashing protection, known as ProPolice. [13] It improved on the idea of StackGuard by placing buffers after local pointers and function arguments in the stack frame. This helped avoid the corruption of pointers, preventing access to arbitrary memory locations.

However, Red Hat engineers identified problems with ProPolice, and in 2005 re-implemented stack-smashing protection for inclusion in GCC 4.1. [14] [15] This work introduced the -fstack-protector flag, which protects only some vulnerable functions, and the -fstack-protector-all flag, which protects all functions whether they need it or not. [16]

In 2012, Google engineers implemented the -fstack-protector-strong flag to strike a better balance between security and performance. [17] This flag protects more kinds of vulnerable functions than -fstack-protector does, but not every function, providing better performance than -fstack-protector-all. It is available in GCC since its version 4.9. [18]
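The difference between the three flags can be made concrete with a few small functions. The functions below are hypothetical, and the comments reflect the selection rules as described in the GCC manual (character buffers of at least 8 bytes by default or calls to alloca for -fstack-protector; any local array or reference to a local frame address for -fstack-protector-strong). Actual instrumentation may vary with compiler version and optimization, so this is a sketch of the intent, not a guarantee.

```c
#include <stddef.h>
#include <string.h>

/* No arrays and no address-taken locals: going by the GCC manual,
   only -fstack-protector-all would add a canary here. */
int add_one(int x) {
    return x + 1;
}

/* A 4-byte local array: below the default 8-byte threshold of
   -fstack-protector, but a local array definition, so
   -fstack-protector-strong should cover it. */
int sum_small(const unsigned char *src, size_t n) {
    unsigned char tmp[4] = { 0, 0, 0, 0 };
    if (n > sizeof tmp) {
        n = sizeof tmp;
    }
    memcpy(tmp, src, n);
    return tmp[0] + tmp[1] + tmp[2] + tmp[3];
}

/* A 64-byte character buffer: all three flags should cover it. */
size_t copy_name(const char *src) {
    char name[64];
    strncpy(name, src, sizeof name - 1);
    name[sizeof name - 1] = '\0';
    return strlen(name);
}
```

Compiling this file with, for example, `gcc -S -fstack-protector-strong` and looking for references to `__stack_chk_fail` in the generated assembly is one way to see which functions received a canary check on a given toolchain.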

All Fedora packages are compiled with -fstack-protector since Fedora Core 5, and -fstack-protector-strong since Fedora 20. [19] [20] Most packages in Ubuntu are compiled with -fstack-protector since 6.10. [21] Every Arch Linux package is compiled with -fstack-protector since 2011. [22] All Arch Linux packages built since 4 May 2014 use -fstack-protector-strong. [23] Stack protection is only used for some packages in Debian, [24] and only for the FreeBSD base system since 8.0. [25] Stack protection is standard in certain operating systems, including OpenBSD, [26] Hardened Gentoo [27] and DragonFly BSD.

StackGuard and ProPolice cannot protect against overflows in automatically allocated structures that overflow into function pointers. ProPolice at least will rearrange the allocation order to get such structures allocated before function pointers. A separate mechanism for pointer protection was proposed in PointGuard [28] and is available on Microsoft Windows. [29]

Microsoft Visual Studio

The compiler suite from Microsoft has implemented buffer overflow protection since version 2003 through the /GS command-line switch, which has been enabled by default since version 2005. [30] Using /GS- disables the protection.

IBM Compiler

Stack-smashing protection can be turned on by the compiler flag -qstackprotect. [31]

Clang/LLVM

Clang supports the same -fstack-protector options as GCC [32] and a stronger "safe stack" (-fsanitize=safe-stack) system with similarly low performance impact. [33] Clang also has three buffer overflow detectors, namely AddressSanitizer (-fsanitize=address), [6] UBSan (-fsanitize=bounds), [34] and the unofficial SafeCode (last updated for LLVM 3.0). [35]

These systems have different tradeoffs in terms of performance penalty, memory overhead, and classes of detected bugs. Stack protection is standard in certain operating systems, including OpenBSD. [36]

Intel Compiler

Intel's C and C++ compiler supports stack-smashing protection with options similar to those provided by GCC and Microsoft Visual Studio. [37]

Fail-Safe C

Fail-Safe C [7] is an open-source memory-safe ANSI C compiler that performs bounds checking based on fat pointers and object-oriented memory access. [38]

StackGhost (hardware-based)

Invented by Mike Frantzen, StackGhost is a simple tweak to the register window spill/fill routines which makes buffer overflows much more difficult to exploit. It uses a unique hardware feature of the Sun Microsystems SPARC architecture (deferred on-stack in-frame register window spill/fill) to detect modifications of return pointers (a common way for an exploit to hijack execution paths) transparently, automatically protecting all applications without requiring binary or source modifications. The performance impact is negligible, less than one percent. The resulting gdb issues were resolved by Mark Kettenis two years later, allowing the feature to be enabled. Following this, the StackGhost code was integrated (and optimized) into OpenBSD/SPARC.

References

  1. Fithen, William L.; Seacord, Robert (2007-03-27). "VT-MB. Violation of Memory Bounds". US CERT.
  2. Levy, Elias (1996-11-08). "Smashing The Stack for Fun and Profit". Phrack. 7 (49): 14.
  3. "Buffer Overflows: Attacks and Defenses for the Vulnerability of the Decade*" (PDF). Archived from the original (PDF) on 2013-03-09.
  4. "Bounds Checking for C". Doc.ic.ac.uk. Archived from the original on 2016-03-26. Retrieved 2014-04-27.
  5. "SAFECode: Secure Virtual Architecture". Sva.cs.illinois.edu. 2009-08-12. Retrieved 2014-04-27.
  6. "google/sanitizers". 19 June 2021.
  7. "Fail-Safe C: Top Page". Staff.aist.go.jp. 2013-05-07. Archived from the original on 2016-07-07. Retrieved 2014-04-27.
  8. "Tuesday, April 05, 2005" (PDF). Feustel.us. Archived from the original (PDF) on June 23, 2016. Retrieved 2016-09-17.
  9. Steenkiste, Peter; Hennessy, John (1987). "Tags and type checking in LISP: hardware and software approaches". ACM Sigops Operating Systems Review. 21 (4). ACM: 50–59. doi:10.1145/36204.36183.
  10. "ClearPath Enterprise Servers MCP Security Overview" (PDF). Public.support.unisys.com. Archived from the original (PDF) on 2013-01-24. Retrieved 2014-04-27.
  11. "Papers - 7th USENIX Security Symposium, 1998". Usenix.org. 2002-04-12. Retrieved 2014-04-27.
  12. "Proceedings of the GCC Developers Summit" (PDF). May 2003. Archived from the original on 2004-07-15. Retrieved 2016-09-17.
  13. "GCC extension for protecting applications from stack-smashing attacks". Research.ibm.com. Retrieved 2014-04-27.
  14. "GCC 4.1 Release Series — Changes, New Features, and Fixes - GNU Project - Free Software Foundation (FSF)". Gcc.gnu.org. Retrieved 2014-04-27.
  15. "Richard Henderson - [rfc] reimplementation of ibm stack-smashing protector". Gcc.gnu.org. Retrieved 2014-04-27.
  16. "Optimize Options - Using the GNU Compiler Collection (GCC)". Gcc.gnu.org. Retrieved 2014-04-27.
  17. "Han Shen - [PATCH] Add a new option "-fstack-protector-strong" (patch / doc inside)". Gcc.gnu.org. 2012-06-14. Retrieved 2014-04-27.
  18. Edge, Jake (February 5, 2014). ""Strong" stack protection for GCC". Linux Weekly News. Retrieved 28 November 2014. It has made its way into GCC 4.9
  19. "Security Features". FedoraProject. 2013-12-11. Retrieved 2014-04-27.
  20. "#1128 (switching from "-fstack-protector" to "-fstack-protector-strong" in Fedora 20) – FESCo". Fedorahosted.org. Retrieved 2014-04-27.
  21. "Security/Features - Ubuntu Wiki". Wiki.ubuntu.com. Retrieved 2014-04-27.
  22. "FS#18864 : Consider enabling GCC's stack-smashing protection (ProPolice, SSP) for all packages". Bugs.archlinux.org. Retrieved 2014-04-27.
  23. "svntogit/packages.git - Git clone of the 'packages' repository". Archived from the original on July 18, 2014.
  24. "Debian Security Hardening Statistics". Outflux.net. Archived from the original on 2014-04-28. Retrieved 2014-04-27.
  25. "FreeBSD 8.0-RELEASE Release Notes". Freebsd.org. 2013-11-13. Retrieved 2014-04-27.
  26. "OpenBSD's gcc-local(1) manual page". gcc comes with the ProPolice stack protection extension, which is enabled by default.
  27. "Hardened/Toolchain - Gentoo Wiki". 2016-07-31. The Gentoo hardened GCC switches on the stack protector by default unless explicitly requested not to.
  28. "12th USENIX Security Symposium — Technical Paper".
  29. "MSDN Blogs – Get the latest information, insights, announcements, and news from Microsoft experts and developers in the MSDN blogs". 6 August 2021.
  30. "/GS (Buffer Security Check) (C++)". msdn.microsoft.com. Retrieved 2014-04-27.
  31. "qstackprotect". Publib.boulder.ibm.com. Retrieved 2014-04-27.
  32. "Clang mailing list". Clang.llvm.org. 28 April 2017. Retrieved 2022-11-16.
  33. "SafeStack — Clang 17.0.0git documentation". clang.llvm.org.
  34. "Clang Compiler User's Manual — Clang 3.5 documentation". Clang.llvm.org. Retrieved 2014-04-27.
  35. "SAFECode". Safecode.cs.illinois.edu. Retrieved 2014-04-27.
  36. "OpenBSD's clang-local(1) manual page". clang comes with stack protection enabled by default, equivalent to the -fstack-protector-strong option on other systems.
  37. "User and Reference Guide for the Intel C++ Compiler 15.0: fstack-security-check, GS". software.intel.com. Retrieved 2015-02-13.
  38. "thesis.dvi" (PDF). Staff.aist.go.jp. Retrieved 2016-09-17.