Segmentation fault

Last updated

In computing, a segmentation fault (often shortened to segfault) or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricted area of memory (a memory access violation). On standard x86 computers, this is a form of general protection fault. The operating system kernel will, in response, usually perform some corrective action, generally passing the fault on to the offending process by sending the process a signal. Processes can in some cases install a custom signal handler, allowing them to recover on their own, [1] but otherwise the OS default signal handler is used, generally causing abnormal termination of the process (a program crash), and sometimes a core dump.

Contents

Segmentation faults are a common class of error in programs written in languages like C that provide low-level memory access and few to no safety checks. They arise primarily due to errors in use of pointers for virtual memory addressing, particularly illegal access. Another type of memory access error is a bus error, which also has various causes, but is today much rarer; these occur primarily due to incorrect physical memory addressing, or due to misaligned memory access – these are memory references that the hardware cannot address, rather than references that a process is not allowed to address.

Many programming languages have mechanisms designed to avoid segmentation faults and improve memory safety. For example, Rust employs an ownership-based [2] model to ensure memory safety. [3] Other languages, such as Lisp and Java, employ garbage collection, [4] which avoids certain classes of memory errors that could lead to segmentation faults. [5]

Overview

Example of human generated signal FreeBSD kernel panic.png
Example of human generated signal
Segmentation fault affecting Krita in KDE desktop environment KDE Crash Handler screenshot.png
Segmentation fault affecting Krita in KDE desktop environment
A null pointer dereference on Windows 8 Windows null ptr dereference.png
A null pointer dereference on Windows 8

A segmentation fault occurs when a program attempts to access a memory location that it is not allowed to access, or attempts to access a memory location in a way that is not allowed (for example, attempting to write to a read-only location, or to overwrite part of the operating system).

The term "segmentation" has various uses in computing; in the context of "segmentation fault", it refers to the address space of a program. [6] With memory protection, only the program's own address space is readable, and of this, only the stack and the read/write portion of the data segment of a program are writable, while read-only data allocated in the const segment and the code segment are not writable. Thus attempting to read outside of the program's address space, or writing to a read-only segment of the address space, results in a segmentation fault, hence the name.

On systems using hardware memory segmentation to provide virtual memory, a segmentation fault occurs when the hardware detects an attempt to refer to a non-existent segment, or to refer to a location outside the bounds of a segment, or to refer to a location in a fashion not allowed by the permissions granted for that segment. On systems using only paging, an invalid page fault generally leads to a segmentation fault, and segmentation faults and page faults are both faults raised by the virtual memory management system. Segmentation faults can also occur independently of page faults: illegal access to a valid page is a segmentation fault, but not an invalid page fault, and segmentation faults can occur in the middle of a page (hence no page fault), for example in a buffer overflow that stays within a page but illegally overwrites memory.

At the hardware level, the fault is initially raised by the memory management unit (MMU) on illegal access (if the referenced memory exists), as part of its memory protection feature, or an invalid page fault (if the referenced memory does not exist). If the problem is not an invalid logical address but instead an invalid physical address, a bus error is raised instead, though these are not always distinguished.

At the operating system level, this fault is caught and a signal is passed on to the offending process, activating the process's handler for that signal. Different operating systems have different signal names to indicate that a segmentation fault has occurred. On Unix-like operating systems, a signal called SIGSEGV (abbreviated from segmentation violation) is sent to the offending process. On Microsoft Windows, the offending process receives a STATUS_ACCESS_VIOLATION exception.

Causes

The conditions under which segmentation violations occur and how they manifest themselves are specific to hardware and the operating system: different hardware raises different faults for given conditions, and different operating systems convert these to different signals that are passed on to processes. The proximate cause is a memory access violation, while the underlying cause is generally a software bug of some sort. Determining the root causedebugging the bug – can be simple in some cases, where the program will consistently cause a segmentation fault (e.g., dereferencing a null pointer), while in other cases the bug can be difficult to reproduce and depend on memory allocation on each run (e.g., dereferencing a dangling pointer).

The following are some typical causes of a segmentation fault:

These in turn are often caused by programming errors that result in invalid memory access:

In C code, segmentation faults most often occur because of errors in pointer use, particularly in C dynamic memory allocation. Dereferencing a null pointer, which results in undefined behavior, will usually cause a segmentation fault. This is because a null pointer cannot be a valid memory address. On the other hand, wild pointers and dangling pointers point to memory that may or may not exist, and may or may not be readable or writable, and thus can result in transient bugs. For example:

char*p1=NULL;// Null pointerchar*p2;// Wild pointer: not initialized at all.char*p3=malloc(10*sizeof(char));// Initialized pointer to allocated memory// (assuming malloc did not fail)free(p3);// p3 is now a dangling pointer, as memory has been freed

Dereferencing any of these variables could cause a segmentation fault: dereferencing the null pointer generally will cause a segfault, while reading from the wild pointer may instead result in random data but no segfault, and reading from the dangling pointer may result in valid data for a while, and then random data as it is overwritten.

Handling

The default action for a segmentation fault or bus error is abnormal termination of the process that triggered it. A core file may be generated to aid debugging, and other platform-dependent actions may also be performed. For example, Linux systems using the grsecurity patch may log SIGSEGV signals in order to monitor for possible intrusion attempts using buffer overflows.

On some systems, like Linux and Windows, it is possible for the program itself to handle a segmentation fault. [7] Depending on the architecture and operating system, the running program can not only handle the event but may extract some information about its state like getting a stack trace, processor register values, the line of the source code when it was triggered, memory address that was invalidly accessed [8] and whether the action was a read or a write. [9]

Although a segmentation fault generally means that the program has a bug that needs fixing, it is also possible to intentionally cause such failure for the purposes of testing, debugging and also to emulate platforms where direct access to memory is needed. On the latter case, the system must be able to allow the program to run even after the fault occurs. In this case, when the system allows, it is possible to handle the event and increment the processor program counter to "jump" over the failing instruction to continue the execution. [10]

Examples

Segmentation fault on an EMV keypad Card reader segfault.jpg
Segmentation fault on an EMV keypad

Writing to read-only memory

Writing to read-only memory raises a segmentation fault. At the level of code errors, this occurs when the program writes to part of its own code segment or the read-only portion of the data segment, as these are loaded by the OS into read-only memory.

Here is an example of ANSI C code that will generally cause a segmentation fault on platforms with memory protection. It attempts to modify a string literal, which is undefined behavior according to the ANSI C standard. Most compilers will not catch this at compile time, and instead compile this to executable code that will crash:

intmain(void){char*s="hello world";*s='H';}

When the program containing this code is compiled, the string "hello world" is placed in the rodata section of the program executable file: the read-only section of the data segment. When loaded, the operating system places it with other strings and constant data in a read-only segment of memory. When executed, a variable, s, is set to point to the string's location, and an attempt is made to write an H character through the variable into the memory, causing a segmentation fault. Compiling such a program with a compiler that does not check for the assignment of read-only locations at compile time, and running it on a Unix-like operating system produces the following runtime error:

$ gccsegfault.c-g-osegfault $ ./segfault Segmentation fault

Backtrace of the core file from GDB:

ProgramreceivedsignalSIGSEGV,Segmentationfault.0x1c0005c2inmain()atsegfault.c:66*s='H';

This code can be corrected by using an array instead of a character pointer, as this allocates memory on stack and initializes it to the value of the string literal:

chars[]="hello world";s[0]='H';// equivalently, *s = 'H';

Even though string literals should not be modified (this has undefined behavior in the C standard), in C they are of static char [] type, [11] [12] [13] so there is no implicit conversion in the original code (which points a char * at that array), while in C++ they are of static const char [] type, and thus there is an implicit conversion, so compilers will generally catch this particular error.

Null pointer dereference

In C and C-like languages, null pointers are used to mean "pointer to no object" and as an error indicator, and dereferencing a null pointer (a read or write through a null pointer) is a very common program error. The C standard does not say that the null pointer is the same as the pointer to memory address  0, though that may be the case in practice. Most operating systems map the null pointer's address such that accessing it causes a segmentation fault. This behavior is not guaranteed by the C standard. Dereferencing a null pointer is undefined behavior in C, and a conforming implementation is allowed to assume that any pointer that is dereferenced is not null.

int*ptr=NULL;printf("%d",*ptr);

This sample code creates a null pointer, and then tries to access its value (read the value). Doing so causes a segmentation fault at runtime on many operating systems.

Dereferencing a null pointer and then assigning to it (writing a value to a non-existent target) also usually causes a segmentation fault:

int*ptr=NULL;*ptr=1;

The following code includes a null pointer dereference, but when compiled will often not result in a segmentation fault, as the value is unused and thus the dereference will often be optimized away by dead code elimination:

int*ptr=NULL;*ptr;

Buffer overflow

The following code accesses the character array s beyond its upper boundary. Depending on the compiler and the processor, this may result in a segmentation fault.

chars[]="hello world";charc=s[20];

Stack overflow

Another example is recursion without a base case:

intmain(void){returnmain();}

which causes the stack to overflow which results in a segmentation fault. [14] Infinite recursion may not necessarily result in a stack overflow depending on the language, optimizations performed by the compiler and the exact structure of a code. In this case, the behavior of unreachable code (the return statement) is undefined, so the compiler can eliminate it and use a tail call optimization that might result in no stack usage. Other optimizations could include translating the recursion into iteration, which given the structure of the example function would result in the program running forever, while probably not overflowing its stack.

See also

Related Research Articles

<span class="mw-page-title-main">Buffer overflow</span> Anomaly in computer security and programming

In programming and information security, a buffer overflow or buffer overrun is an anomaly whereby a program writes data to a buffer beyond the buffer's allocated memory, overwriting adjacent memory locations.

The Cyclone programming language was intended to be a safe dialect of the C language. It avoids buffer overflows and other vulnerabilities that are possible in C programs by design, without losing the power and convenience of C as a tool for system programming. It is no longer supported by its original developers, with the reference tooling not supporting 64-bit platforms. The Rust language is mentioned by the original developers for having integrated many of the same ideas Cyclone had.

<span class="mw-page-title-main">Crash (computing)</span> When a computer program stops functioning properly and self-terminates

In computing, a crash, or system crash, occurs when a computer program such as a software application or an operating system stops functioning properly and exits. On some operating systems or individual applications, a crash reporting service will report the crash and any details relating to it, usually to the developer(s) of the application. If the program is a critical part of the operating system, the entire system may crash or hang, often resulting in a kernel panic or fatal system error.

In computing, a bus error is a fault raised by hardware, notifying an operating system (OS) that a process is trying to access memory that the CPU cannot physically address: an invalid address for the address bus, hence the name. In modern use on most architectures these are much rarer than segmentation faults, which occur primarily due to memory access violations: problems in the logical address or permissions.

C dynamic memory allocation refers to performing manual memory management for dynamic memory allocation in the C programming language via a group of functions in the C standard library, namely malloc, realloc, calloc, aligned_alloc and free.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

<span class="mw-page-title-main">Pointer (computer programming)</span> Object which stores memory addresses in a computer program

In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer references a location in memory, and obtaining the value stored at that location is known as dereferencing the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying computer architecture.

<span class="mw-page-title-main">General protection fault</span>

A general protection fault (GPF) in the x86 instruction set architectures (ISAs) is a fault initiated by ISA-defined protection mechanisms in response to an access violation caused by some running code, either in the kernel or a user program. The mechanism is first described in Intel manuals and datasheets for the Intel 80286 CPU, which was introduced in 1983; it is also described in section 9.8.13 in the Intel 80386 programmer's reference manual from 1986. A general protection fault is implemented as an interrupt. Some operating systems may also classify some exceptions not related to access violations, such as illegal opcode exceptions, as general protection faults, even though they have nothing to do with memory protection. If a CPU detects a protection violation, it stops executing the code and sends a GPF interrupt. In most cases, the operating system removes the failing process from the execution queue, signals the user, and continues executing other processes. If, however, the operating system fails to catch the general protection fault, i.e. another protection violation occurs before the operating system returns from the previous GPF interrupt, the CPU signals a double fault, stopping the operating system. If yet another failure occurs, the CPU is unable to recover; since 80286, the CPU enters a special halt state called "Shutdown", which can only be exited through a hardware reset. The IBM PC AT, the first PC-compatible system to contain an 80286, has hardware that detects the Shutdown state and automatically resets the CPU when it occurs. All descendants of the PC AT do the same, so in a PC, a triple fault causes an immediate system reset.

Buffer overflow protection is any of various techniques used during software development to enhance the security of executable programs by detecting buffer overflows on stack-allocated variables, and preventing them from causing program misbehavior or from becoming serious security vulnerabilities. A stack buffer overflow occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, which could lead to program crashes, incorrect operation, or security issues.

Signals are standardized messages sent to a running program to trigger specific behavior, such as quitting or error handling. They are a limited form of inter-process communication (IPC), typically used in Unix, Unix-like, and other POSIX-compliant operating systems.

<span class="mw-page-title-main">Dangling pointer</span> Pointer that does not point to a valid object

Dangling pointers and wild pointers in computer programming are pointers that do not point to a valid object of the appropriate type. These are special cases of memory safety violations. More generally, dangling references and wild references are references that do not resolve to a valid destination.

In computing, an uninitialized variable is a variable that is declared but is not set to a definite known value before it is used. It will have some value, but not a predictable one. As such, it is a programming error and a common source of bugs in software.

In computing, a page fault is an exception that the memory management unit (MMU) raises when a process accesses a memory page without proper preparations. Accessing the page requires a mapping to be added to the process's virtual address space. Besides, the actual page contents may need to be loaded from a backing store, such as a disk. The MMU detects the page fault, but the operating system's kernel handles the exception by making the required page accessible in the physical memory or denying an illegal memory access.

In computing, a null pointer or null reference is a value saved for indicating that the pointer or reference does not refer to a valid object. Programs routinely use null pointers to represent conditions such as the end of a list of unknown length or the failure to perform some action; this use of null pointers can be compared to nullable types and to the Nothing value in an option type.

In computing, a data segment is a portion of an object file or the corresponding address space of a program that contains initialized static variables, that is, global variables and static local variables. The size of this segment is determined by the size of the values in the program's source code, and does not change at run time.

In software, a stack buffer overflow or stack buffer overrun occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, and in cases where the overflow was triggered by mistake, will often cause the program to crash or operate incorrectly. Stack buffer overflow is a type of the more general programming malfunction known as buffer overflow. Overfilling a buffer on the stack is more likely to derail program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls.

Memory safety is the state of being protected from various software bugs and security vulnerabilities when dealing with memory access, such as buffer overflows and dangling pointers. For example, Java is said to be memory-safe because its runtime error detection checks array bounds and pointer dereferences. In contrast, C and C++ allow arbitrary pointer arithmetic with pointers implemented as direct memory addresses with no provision for bounds checking, and thus are potentially memory-unsafe.

Secure coding is the practice of developing computer software in such a way that guards against the accidental introduction of security vulnerabilities. Defects, bugs and logic flaws are consistently the primary cause of commonly exploited software vulnerabilities. Through the analysis of thousands of reported vulnerabilities, security professionals have discovered that most vulnerabilities stem from a relatively small number of common software programming errors. By identifying the insecure coding practices that lead to these errors and educating developers on secure alternatives, organizations can take proactive steps to help significantly reduce or eliminate vulnerabilities in software before deployment.

<span class="mw-page-title-main">LLDB (debugger)</span> Software debugger

The LLDB Debugger (LLDB) is the debugger component of the LLVM project. It is built as a set of reusable components which extensively use existing libraries from LLVM, such as the Clang expression parser and LLVM disassembler. LLDB is free and open-source software under the University of Illinois/NCSA Open Source License, a BSD-style permissive software license. Since v9.0.0, it was relicensed to the Apache License 2.0 with LLVM Exceptions.

A code sanitizer is a programming tool that detects bugs in the form of undefined or suspicious behavior by a compiler inserting instrumentation code at runtime. The class of tools was first introduced by Google's AddressSanitizer of 2012, which uses directly mapped shadow memory to detect memory corruption such as buffer overflows or accesses to a dangling pointer (use-after-free).

References

  1. Expert C programming: deep C secrets By Peter Van der Linden, page 188
  2. "The Rust Programming Language - Ownership".
  3. "Fearless Concurrency with Rust - The Rust Programming Language Blog".
  4. McCarthy, John (April 1960). "Recursive functions of symbolic expressions and their computation by machine, Part I". Communications of the ACM . 4 (3): 184–195. doi: 10.1145/367177.367199 . S2CID   1489409 . Retrieved 2018-09-22.
  5. Dhurjati, Dinakar; Kowshik, Sumant; Adve, Vikram; Lattner, Chris (1 January 2003). "Memory safety without runtime checks or garbage collection" (PDF). Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems. Vol. 38. ACM. pp. 69–80. doi:10.1145/780732.780743. ISBN   1581136471. S2CID   1459540 . Retrieved 2018-09-22.
  6. "Debugging Segmentation Faults and Pointer Problems - Cprogramming.com". www.cprogramming.com. Retrieved 2021-02-03.
  7. "Cleanly recovering from Segfaults under Windows and Linux (32-bit, x86)" . Retrieved 2020-08-23.
  8. "Implementation of the SIGSEGV/SIGABRT handler which prints the debug stack trace". GitHub . Retrieved 2020-08-23.
  9. "How to identify read or write operations of page fault when using sigaction handler on SIGSEGV?(LINUX)" . Retrieved 2020-08-23.
  10. "LINUX – WRITING FAULT HANDLERS". 12 November 2017. Retrieved 2020-08-23.
  11. "6.1.4 String literals". ISO/IEC 9899:1990 - Programming languages -- C.
  12. "6.4.5 String literals". ISO/IEC 9899:1999 - Programming languages -- C.
  13. "6.4.5 String literals". ISO/IEC 9899:2011 - Programming languages -- C.
  14. "What is the difference between a segmentation fault and a stack overflow?". Stack Overflow. Retrieved 2023-11-11.