Memory safety

Last updated

Memory safety is the state of being protected from various software bugs and security vulnerabilities when dealing with memory access, such as buffer overflows and dangling pointers. [1] For example, Java is said to be memory-safe because its runtime error detection checks array bounds and pointer dereferences. [1] In contrast, C and C++ allow arbitrary pointer arithmetic with pointers implemented as direct memory addresses with no provision for bounds checking, [2] and thus are potentially memory-unsafe. [3]

History

Memory errors were first considered in the context of resource management (computing) and time-sharing systems, in an effort to avoid problems such as fork bombs. [4] Developments were mostly theoretical until the Morris worm, which exploited a buffer overflow in fingerd. [5] The field of computer security developed quickly thereafter, escalating with multitudes of new attacks such as the return-to-libc attack and defense techniques such as the non-executable stack [6] and address space layout randomization. Randomization prevents most buffer overflow attacks and requires the attacker to use heap spraying or other application-dependent methods to obtain addresses, although its adoption has been slow. [5] However, deployments of the technology are typically limited to randomizing libraries and the location of the stack.

Impact

In 2019, a Microsoft security engineer reported that 70% of all security vulnerabilities were caused by memory safety issues. [7] In 2020, a team at Google similarly reported that 70% of all "severe security bugs" in Chromium were caused by memory safety problems. Many other high-profile vulnerabilities and exploits in critical software have ultimately stemmed from a lack of memory safety, including Heartbleed [8] and a long-standing privilege escalation bug in sudo. [9] The pervasiveness and severity of vulnerabilities and exploits arising from memory safety issues have led several security researchers to describe identifying memory safety issues as "shooting fish in a barrel". [10]

Approaches

Most[ citation needed ] modern high-level programming languages are memory-safe by default, though not completely since they only check their own code and not the system they interact with. Automatic memory management in the form of garbage collection is the most common technique for preventing some of the memory safety problems, since it prevents common memory safety errors like use-after-free for all data allocated within the language runtime. [11] When combined with automatic bounds checking on all array accesses and no support for raw pointer arithmetic, garbage collected languages provide strong memory safety guarantees (though the guarantees may be weaker for low-level operations explicitly marked unsafe, such as use of a foreign function interface). However, the performance overhead of garbage collection makes these languages unsuitable for certain performance-critical applications. [1]

For languages that use manual memory management, memory safety is not usually guaranteed by the runtime. Instead, memory safety properties must either be guaranteed by the compiler via static program analysis and automated theorem proving or carefully managed by the programmer at runtime. [11] For example, the Rust programming language implements a borrow checker to ensure memory safety, [12] while C and C++ provide no memory safety guarantees. The substantial amount of software written in C and C++ has motivated the development of external static analysis tools like Coverity, which offers static memory analysis for C. [13]

DieHard, [14] its redesign DieHarder, [15] and the Allinea Distributed Debugging Tool are special heap allocators that allocate objects in their own random virtual memory page, allowing invalid reads and writes to be stopped and debugged at the exact instruction that causes them. Protection relies upon hardware memory protection and thus overhead is typically not substantial, although it can grow significantly if the program makes heavy use of allocation. [16] Randomization provides only probabilistic protection against memory errors, but can often be easily implemented in existing software by relinking the binary.

The memcheck tool of Valgrind uses an instruction set simulator and runs the compiled program in a memory-checking virtual machine, providing guaranteed detection of a subset of runtime memory errors. However, it typically slows the program down by a factor of 40, [17] and furthermore must be explicitly informed of custom memory allocators. [18] [19]

With access to the source code, libraries exist that collect and track legitimate values for pointers ("metadata") and check each pointer access against the metadata for validity, such as the Boehm garbage collector. [20] In general, memory safety can be safely assured using tracing garbage collection and the insertion of runtime checks on every memory access; this approach has overhead, but less than that of Valgrind. All garbage-collected languages take this approach. [1] For C and C++, many tools exist that perform a compile-time transformation of the code to do memory safety checks at runtime, such as CheckPointer [21] and AddressSanitizer which imposes an average slowdown factor of 2. [22]

BoundWarden is a new spatial memory enforcement approach that utilizes a combination of compile-time transformation and runtime concurrent monitoring techniques. [23]

Types of memory errors

Many different types of memory errors can occur: [24] [25]

Related Research Articles

<span class="mw-page-title-main">Buffer overflow</span> Anomaly in computer security and programming

In programming and information security, a buffer overflow or buffer overrun is an anomaly whereby a program writes data to a buffer beyond the buffer's allocated memory, overwriting adjacent memory locations.

C is a general-purpose computer programming language. It was created in the 1970s by Dennis Ritchie, and remains very widely used and influential. By design, C's features cleanly reflect the capabilities of the targeted CPUs. It has found lasting use in operating systems, device drivers, and protocol stacks, but its use in application software has been decreasing. C is commonly used on computer architectures that range from the largest supercomputers to the smallest microcontrollers and embedded systems.

<span class="mw-page-title-main">Garbage collection (computer science)</span> Form of automatic memory management

In computer science, garbage collection (GC) is a form of automatic memory management. The garbage collector attempts to reclaim memory which was allocated by the program, but is no longer referenced; such memory is called garbage. Garbage collection was invented by American computer scientist John McCarthy around 1959 to simplify manual memory management in Lisp.

The Cyclone programming language was intended to be a safe dialect of the C language. It avoids buffer overflows and other vulnerabilities that are possible in C programs by design, without losing the power and convenience of C as a tool for system programming. It is no longer supported by its original developers, with the reference tooling not supporting 64-bit platforms. The Rust language is mentioned by the original developers for having integrated many of the same ideas Cyclone had.

In computing, a segmentation fault or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricted area of memory. On standard x86 computers, this is a form of general protection fault. The operating system kernel will, in response, usually perform some corrective action, generally passing the fault on to the offending process by sending the process a signal. Processes can in some cases install a custom signal handler, allowing them to recover on their own, but otherwise the OS default signal handler is used, generally causing abnormal termination of the process, and sometimes a core dump.

<span class="mw-page-title-main">Memory management</span> Computer memory management methodology

Memory management is a form of resource management applied to computer memory. The essential requirement of memory management is to provide ways to dynamically allocate portions of memory to programs at their request, and free it for reuse when no longer needed. This is critical to any advanced computer system where more than a single process might be underway at any time.

A heap overflow, heap overrun, or heap smashing is a type of buffer overflow that occurs in the heap data area. Heap overflows are exploitable in a different manner to that of stack-based overflows. Memory on the heap is dynamically allocated at runtime and typically contains program data. Exploitation is performed by corrupting this data in specific ways to cause the application to overwrite internal structures such as linked list pointers. The canonical heap overflow technique overwrites dynamic memory allocation linkage and uses the resulting pointer exchange to overwrite a program function pointer.

C dynamic memory allocation refers to performing manual memory management for dynamic memory allocation in the C programming language via a group of functions in the C standard library, namely malloc, realloc, calloc, aligned_alloc and free.

<span class="mw-page-title-main">Valgrind</span> Programming tool for profiling, memory debugging and memory leak detection

Valgrind is a programming tool for memory debugging, memory leak detection, and profiling.

Memory corruption occurs in a computer program when the contents of a memory location are modified due to programmatic behavior that exceeds the intention of the original programmer or program/language constructs; this is termed as violation of memory safety. The most likely causes of memory corruption are programming errors. When the corrupted memory contents are used later in that program, it leads either to program crash or to strange and bizarre program behavior. Nearly 10% of application crashes on Windows systems are due to heap corruption.

<span class="mw-page-title-main">Pointer (computer programming)</span> Object which stores memory addresses in a computer program

In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer references a location in memory, and obtaining the value stored at that location is known as dereferencing the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying computer architecture.

Buffer overflow protection is any of various techniques used during software development to enhance the security of executable programs by detecting buffer overflows on stack-allocated variables, and preventing them from causing program misbehavior or from becoming serious security vulnerabilities. A stack buffer overflow occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, which could lead to program crashes, incorrect operation, or security issues.

Insure++ is a memory debugger computer program, used by software developers to detect various errors in programs written in C and C++. It is made by Parasoft, and is functionally similar to other memory debuggers, such as Purify, Valgrind and Dr Memory.

<span class="mw-page-title-main">Dangling pointer</span> Pointer that does not point to a valid object

Dangling pointers and wild pointers in computer programming are pointers that do not point to a valid object of the appropriate type. These are special cases of memory safety violations. More generally, dangling references and wild references are references that do not resolve to a valid destination.

In computing, a null pointer or null reference is a value saved for indicating that the pointer or reference does not refer to a valid object. Programs routinely use null pointers to represent conditions such as the end of a list of unknown length or the failure to perform some action; this use of null pointers can be compared to nullable types and to the Nothing value in an option type.

In computer science, region-based memory management is a type of memory management in which each allocated object is assigned to a region. A region, also called a zone, arena, area, or memory context, is a collection of allocated objects that can be efficiently reallocated or deallocated all at once. Like stack allocation, regions facilitate allocation and deallocation of memory with low overhead; but they are more flexible, allowing objects to live longer than the stack frame in which they were allocated. In typical implementations, all objects in a region are allocated in a single contiguous range of memory addresses, similarly to how stack frames are typically allocated.

Verve is a research operating system developed by Microsoft Research. Verve is verified end-to-end for type safety and memory safety.

<span class="mw-page-title-main">DynamoRIO</span> Software framework

DynamoRIO is a BSD-licensed dynamic binary instrumentation framework for the development of dynamic program analysis tools. DynamoRIO targets user space applications under the Android, Linux, and Windows operating systems running on the AArch32, IA-32, and x86-64 instruction set architectures.

A code sanitizer is a programming tool that detects bugs in the form of undefined or suspicious behavior by a compiler inserting instrumentation code at runtime. The class of tools was first introduced by Google's AddressSanitizer of 2012, which uses directly mapped shadow memory to detect memory corruption such as buffer overflows or accesses to a dangling pointer (use-after-free).

<span class="mw-page-title-main">Zig (programming language)</span> A general-purpose programming language, toolchain to build Zig/C/C++ code

Zig is an imperative, general-purpose, statically typed, compiled system programming language designed by Andrew Kelley. It is intended to be a successor to the C programming language, with the goals of being even smaller and simpler to program in while also offering modern features, new optimizations and a variety of safety mechanisms while not as demanding of runtime safety as seen in other languages. It is distinct from languages like Go, Rust and Carbon, which have similar goals but also target the C++ space.

References

  1. 1 2 3 4 Dhurjati, Dinakar; Kowshik, Sumant; Adve, Vikram; Lattner, Chris (1 January 2003). "Memory safety without runtime checks or garbage collection" (PDF). Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems. ACM. pp. 69–80. doi:10.1145/780732.780743. ISBN   1581136471. S2CID   1459540 . Retrieved 13 March 2017.
  2. Koenig, Andrew. "How C Makes It Hard To Check Array Bounds". Dr. Dobb's. Retrieved 13 March 2017.
  3. Akritidis, Periklis (June 2011). "Practical memory safety for C" (PDF). Technical Report - University of Cambridge. Computer Laboratory. University of Cambridge, Computer Laboratory. ISSN   1476-2986. UCAM-CL-TR-798. Retrieved 13 March 2017.
  4. Anderson, James P. "Computer Security Planning Study" (PDF). 2. Electronic Systems Center. ESD-TR-73-51.{{cite journal}}: Cite journal requires |journal= (help)
  5. 1 2 van der Veen, Victor; dutt-Sharma, Nitish; Cavallaro, Lorenzo; Bos, Herbert (2012). "Memory Errors: The Past, the Present, and the Future" (PDF). Research in Attacks, Intrusions, and Defenses. Lecture Notes in Computer Science. Vol. 7462. pp. 86–106. doi:10.1007/978-3-642-33338-5_5. ISBN   978-3-642-33337-8 . Retrieved 13 March 2017.
  6. Wojtczuk, Rafal. "Defeating Solar Designer's Non-executable Stack Patch". insecure.org. Retrieved 13 March 2017.
  7. "Microsoft: 70 percent of all security bugs are memory safety issues". ZDNET. Retrieved 21 September 2022.
  8. "CVE-2014-0160". Common Vulnerabilities and Exposures. Mitre. Archived from the original on 24 January 2018. Retrieved 8 February 2018.
  9. Goodin, Dan (4 February 2020). "Serious flaw that lurked in sudo for 9 years hands over root privileges". Ars Technica.
  10. "Fish in a Barrel". fishinabarrel.github.io. Retrieved 21 September 2022.
  11. 1 2 Crichton, Will. "CS 242: Memory safety". stanford-cs242.github.io. Retrieved 22 September 2022.
  12. "References". The Rustonomicon. Rust.org. Retrieved 13 March 2017.
  13. Bessey, Al; Engler, Dawson; Block, Ken; Chelf, Ben; Chou, Andy; Fulton, Bryan; Hallem, Seth; Henri-Gros, Charles; Kamsky, Asya; McPeak, Scott (1 February 2010). "A few billion lines of code later". Communications of the ACM. 53 (2): 66–75. doi:10.1145/1646353.1646374. S2CID   2611544.
  14. Berger, Emery D.; Zorn, Benjamin G. (1 January 2006). "DieHard: Probabilistic memory safety for unsafe languages" (PDF). Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM. pp. 158–168. doi:10.1145/1133981.1134000. ISBN   1595933204. S2CID   8984358 . Retrieved 14 March 2017.
  15. Novark, Gene; Berger, Emery D. (1 January 2010). "DieHarder: Securing the heap" (PDF). Proceedings of the 17th ACM conference on Computer and communications security. ACM. pp. 573–584. doi:10.1145/1866307.1866371. ISBN   9781450302456. S2CID   7880497 . Retrieved 14 March 2017.
  16. "Memory Debugging in Allinea DDT". Archived from the original on 2015-02-03.
  17. Gyllenhaal, John. "Using Valgrind's Memcheck Tool to Find Memory Errors and Leaks". computing.llnl.gov. Archived from the original on 7 November 2018. Retrieved 13 March 2017.
  18. "Memcheck: a memory error detector". Valgrind User Manual. valgrind.org. Retrieved 13 March 2017.
  19. Kreinin, Yossi. "Why custom allocators/pools are hard". Proper Fixation. Retrieved 13 March 2017.
  20. "Using the Garbage Collector as Leak Detector". www.hboehm.info. Retrieved 14 March 2017.
  21. "Semantic Designs: CheckPointer compared to other safety checking tools". www.semanticdesigns.com. Semantic Designs, Inc.
  22. "AddressSanitizerPerformanceNumbers". GitHub .
  23. Dhumbumroong, Smith (2020). "BoundWarden: Thread-enforced spatial memory safety through compile-time transformations". Science of Computer Programming. 198: 102519. doi:10.1016/j.scico.2020.102519. S2CID   224925197.
  24. Gv, Naveen. "How to Avoid, Find (and Fix) Memory Errors in your C/C++ Code". Cprogramming.com. Retrieved 13 March 2017.
  25. "CWE-633: Weaknesses that Affect Memory". Community Weakness Enumeration. MITRE. Retrieved 13 March 2017.
  26. "CWE-762: Mismatched Memory Management Routines". Community Weakness Enumeration. MITRE. Retrieved 13 March 2017.