Crash (computing)

Last updated January 04, 2025

In computing, a crash, or system crash, occurs when a computer program such as a software application or an operating system stops functioning properly and exits. On some operating systems or in individual applications, a crash reporting service reports the crash and any details relating to it (or gives the user the option to do so), usually to the developer(s) of the application. If the program is a critical part of the operating system, the entire system may crash or hang, often resulting in a kernel panic or fatal system error. On Windows, this can result in a Blue Screen.
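As a rough illustration of what a crash reporting facility does, the following Python sketch builds a hook for unhandled exceptions that writes the details to a local file. The function name make_crash_hook and the report format are assumptions made for this example; it does not reflect how any particular operating system's crash reporter works, and a real service would normally ask the user before sending anything.

```python
import sys
import traceback
from datetime import datetime, timezone


def make_crash_hook(report_path):
    """Return an exception hook that writes a crash report to report_path.

    report_path is a hypothetical local file chosen by the caller.
    """
    def hook(exc_type, exc_value, exc_tb):
        lines = traceback.format_exception(exc_type, exc_value, exc_tb)
        with open(report_path, "w", encoding="utf-8") as fh:
            fh.write("crash time: " + datetime.now(timezone.utc).isoformat() + "\n")
            fh.write("exception: " + exc_type.__name__ + "\n")
            fh.writelines(lines)
        # Still print the usual traceback to standard error.
        sys.__excepthook__(exc_type, exc_value, exc_tb)

    return hook
```

A program would install the hook once at startup with sys.excepthook = make_crash_hook("crash_report.txt"); any exception that is never handled then leaves a report behind before the interpreter exits.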

Most crashes are the result of a software bug. Typical causes include accessing invalid memory addresses,^[a] incorrect address values in the program counter, buffer overflows, overwriting a portion of the affected program code due to an earlier bug, executing invalid machine instructions (an illegal or unauthorized opcode), or triggering an unhandled exception. The original software bug that started this chain of events is typically considered to be the cause of the crash, and it is found through the process of debugging. The original bug can be far removed from the code that actually triggered the crash.
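The distance between a bug and the crash it causes can be seen in a small Python sketch. All function names here are hypothetical: load_names contains the bug (it stores None for an empty field), while the exception is raised later in longest_name, which is where a traceback would point.

```python
import traceback


def load_names(raw):
    # Bug: an empty field is stored as None instead of being skipped or rejected.
    return [field.strip() or None for field in raw.split(",")]


def longest_name(names):
    longest = 0
    for name in names:
        # Crash site: len(None) raises TypeError, far from the line that stored None.
        longest = max(longest, len(name))
    return longest


def crash_location(raw):
    """Return the name of the function in which the exception was raised, or None."""
    try:
        longest_name(load_names(raw))
    except TypeError as exc:
        return traceback.extract_tb(exc.__traceback__)[-1].name
    return None
```

Calling crash_location("alice, ,bob") reports longest_name, even though the faulty line is in load_names; debugging has to work backwards from the crash site to the original bug.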

In early personal computers, attempting to write data to hardware addresses outside the system's main memory could cause hardware damage. Some crashes are exploitable and let a malicious program or hacker execute arbitrary code, allowing the replication of viruses or the acquisition of data which would normally be inaccessible.

Application crashes

An application typically crashes when it performs an operation that is not allowed by the operating system. The operating system then triggers an exception or signal in the application. Unix applications traditionally responded to the signal by dumping core. Most Windows and Unix GUI applications respond by displaying a dialogue box with the option to attach a debugger if one is installed. Some applications attempt to recover from the error and continue running instead of exiting.

An application can also contain code to crash^[b] after detecting a severe error.

Typical errors that result in application crashes include:

- attempting to read or write memory that is not allocated for reading or writing by that application (e.g., segmentation fault, x86-specific general protection fault)
- attempting to execute privileged or invalid instructions
- attempting to perform I/O operations on hardware devices that it does not have permission to access
- passing invalid arguments to system calls
- attempting to access other system resources that the application does not have permission to access
- attempting to execute machine instructions with bad arguments (depending on CPU architecture): divide by zero, operations on denormal numbers or NaN (not a number) values, memory access to unaligned addresses, etc.
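The first error in the list, and the signal the operating system delivers in response, can be observed from a parent process. The Python sketch below is a minimal illustration assuming a POSIX system: the child interpreter reads from address 0 through ctypes, and the parent decodes the negative return code that the subprocess module uses for "killed by a signal". The name run_and_describe is made up for this example, and the outcome on non-POSIX platforms may differ.

```python
import signal
import subprocess
import sys

# Child program: reads from address 0, which is not mapped for the process.
CHILD_CODE = "import ctypes; ctypes.string_at(0)"


def run_and_describe(code):
    """Run code in a child interpreter and describe how the child ended."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if proc.returncode < 0:
        # On POSIX, a return code of -N means the child was killed by signal N.
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exited with status " + str(proc.returncode)
```

Running run_and_describe(CHILD_CODE) on a typical Unix-like system reports that the child was killed by a signal rather than exiting normally, while the parent itself keeps running.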

Crash to desktop

A "crash to desktop" is said to occur when a program (commonly a video game) unexpectedly quits, abruptly taking the user back to the desktop. Usually, the term is applied only to crashes where no error is displayed, so all the user sees as a result of the crash is the desktop. Often there is no apparent action that causes a crash to desktop. During normal operation, the program may freeze for a short period of time and then close by itself. It may also display a black screen and repeatedly play the last few seconds of sound (depending on the size of the audio buffer) that was being played before it crashes to desktop. At other times, the crash may appear to be triggered by a certain action, such as loading an area.

Crash to desktop bugs are considered particularly problematic for users. Since they frequently display no error message, it can be very difficult to track down the source of the problem, especially if the times they occur and the actions taking place right before the crash do not appear to have any pattern or common ground. One way to track down the source of the problem for games is to run them in windowed mode. Windows Vista has a feature that can help track down the cause of a crash to desktop when it occurs in any program. Windows XP included a similar feature as well.

Some computer programs, such as StepMania and BBC's Bamzooki, also crash to desktop if in full-screen mode, but display the error in a separate window once the user has returned to the desktop.

Web server crashes

The software running the web server behind a website may crash, rendering it inaccessible entirely or providing only an error message instead of normal content.

For example, if a site's scripts (written in a language such as PHP) use an SQL database (such as MySQL) and that database server crashes, then PHP will display a connection error.
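The same situation can be sketched in Python rather than PHP. The function below is a simplified assumption of how a page handler might behave: it only checks whether the database server accepts TCP connections (a real application would go through its database driver) and returns an error status instead of normal content when the connection fails. The name render_page and the choice of status 503 are illustrative.

```python
import socket


def render_page(db_host, db_port, timeout=1.0):
    """Return (status, body) for a page that depends on the database server."""
    try:
        # Stand-in for opening a database connection.
        with socket.create_connection((db_host, db_port), timeout=timeout):
            pass
    except OSError as exc:
        # Covers refused connections, timeouts, unreachable hosts, and so on.
        return 503, "Database connection error: " + exc.__class__.__name__
    return 200, "normal page content"
```

If the database process has crashed, nothing is listening on its port, so the connection attempt fails and the visitor sees only the error message.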

Operating system crashes

An operating system crash commonly happens when a hardware exception occurs that cannot be handled. Operating system crashes can also occur when internal sanity-checking logic within the operating system detects that the operating system has lost its internal self-consistency.
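The idea of internal sanity checking can be illustrated with a toy example in Python. Everything here is hypothetical: the "free list" of page numbers stands in for a kernel data structure, and raising KernelPanic stands in for halting the system, which is what a real kernel would do once it could no longer trust its own state.

```python
class KernelPanic(Exception):
    """Stands in for halting the machine in this toy model."""


def check_free_list(free_pages, total_pages):
    """Verify the invariants of a toy page allocator's free list.

    Returns the number of free pages if the list is self-consistent.
    """
    seen = set()
    for page in free_pages:
        if not 0 <= page < total_pages:
            raise KernelPanic(f"free list holds nonexistent page {page}")
        if page in seen:
            raise KernelPanic(f"page {page} is on the free list twice")
        seen.add(page)
    return len(seen)
```

A check of this kind does not repair anything; it only detects that an earlier bug or hardware fault has already corrupted the structure, so that the system stops before the damage spreads.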

Modern multi-tasking operating systems, such as Linux and macOS, usually remain unharmed when an application program crashes.

Some operating systems, e.g., z/OS, have facilities for reliability, availability and serviceability (RAS), and the OS can recover from the crash of a critical component, whether due to hardware failure, e.g., an uncorrectable ECC error, or to software failure, e.g., a reference to an unassigned page.

Abnormal end

An abnormal end, or ABEND, is an abnormal termination of software, or a program crash. Errors or crashes on the Novell NetWare network operating system are usually called ABENDs. Communities of NetWare administrators sprang up around the Internet, such as abend.org.

This usage derives from the ABEND macro on IBM OS/360, ..., z/OS operating systems. The term is usually capitalized, but may appear as "abend". Some common ABEND codes are System ABEND 0C7 (data exception) and System ABEND 0CB (division by zero).^[1]^[2]^[3] Abends can be "soft" (allowing automatic recovery) or "hard" (terminating the activity).^[4] The term is jocularly claimed to be derived from the German word "Abend" meaning "evening".^[5]

Security and privacy implications of crashes

Depending on the application, a crash report may contain the user's sensitive and private information.^[6] Moreover, many software bugs which cause crashes are also exploitable for arbitrary code execution and other types of privilege escalation.^[7]^[8] For example, a stack buffer overflow can overwrite the return address of a subroutine with an invalid value, which will cause, e.g., a segmentation fault when the subroutine returns. However, if an exploit overwrites the return address with a valid value, the code at that address will be executed.
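The difference between the two outcomes can be shown with a toy model in Python. This is only a simulation under stated assumptions: the "stack frame" is a byte array holding an 8-byte buffer followed by an 8-byte saved return address, the CODE table stands in for the addresses that actually contain executable code, and all names and addresses are made up. Real exploits and real memory layouts are considerably more involved.

```python
BUF_SIZE = 8  # bytes reserved for the local buffer

# Toy "code addresses": only these hold executable code in this model.
CODE = {
    0x1000: lambda: "returned to caller",
    0x2000: lambda: "attacker-chosen code ran",
}


class SegmentationFault(Exception):
    """Simulated fault for a jump to an address that holds no code."""


def call_with_input(data, return_address=0x1000):
    # Frame layout: [ 8-byte buffer ][ 8-byte saved return address ]
    frame = bytearray(BUF_SIZE) + return_address.to_bytes(8, "little")
    # Bug: the copy trusts len(data) instead of limiting it to BUF_SIZE.
    frame[0:len(data)] = data
    # "Return": jump to whatever address is now stored after the buffer.
    target = int.from_bytes(frame[BUF_SIZE:BUF_SIZE + 8], "little")
    if target not in CODE:
        raise SegmentationFault(hex(target))
    return CODE[target]()
```

With a short input the function returns normally. Sixteen bytes of filler overwrite the saved address with a value that is not in CODE, and the simulated return faults. An input of eight filler bytes followed by (0x2000).to_bytes(8, "little") replaces the saved address with a valid one, and the code at that address runs instead.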

Crash reproduction

When crashes are collected in the field using a crash reporter, the next step for developers is to reproduce them locally. Several techniques exist for this: STAR uses symbolic execution,^[9] and EvoCrash performs evolutionary search.^[10]

Notes

[a] Types of invalid addresses include:
- Invalid real address
- Invalid segment number
- Invalid page number
- Address not on correct boundary (alignment error)
[b] In OS/360 and successors the application normally uses an ABEND macro with a user completion code.

References

[1] "ABEND". OS Release 21 – System/360 Operating System – Supervisor Services and Macro Instructions (PDF) (Eighth ed.). IBM. September 1974. pp. 97–99. GC28-6646-7. Retrieved 8 July 2023.
[2] "0Cx – z/OS MVS System Codes". IBM.
[3] List of ABEND codes. madisoncollege.edu. Archived 16 September 2018 at the Wayback Machine.
[4] Parziale, Lydia (2008). z/VM and Linux Operations for z/OS System Programmers. IBM Redbooks. p. 352. ISBN 9780738431598.
[5] "Abend". dictionary.die.net. Archived 29 September 2011 at the Wayback Machine.
[6] Satvat, Kiavash; Saxena, Nitesh (2018). "Crashing Privacy: An Autopsy of a Web Browser's Leaked Crash Reports". arXiv: 1808.01718 [cs.CR].
[7] "Analyze Crashes to Find Security Vulnerabilities in Your Apps". Msdn.microsoft.com. 26 April 2007. Archived from the original on 11 December 2011. Retrieved 26 June 2014.
[8] "Jesse Ruderman » Memory safety bugs in C++ code". Squarefree.com. 1 November 2006. Archived from the original on 11 December 2013. Retrieved 26 June 2014.
[9] Chen, Ning; Kim, Sunghun (2015). "STAR: Stack Trace Based Automatic Crash Reproduction via Symbolic Execution". IEEE Transactions on Software Engineering. 41 (2): 198–220. doi:10.1109/TSE.2014.2363469. ISSN 0098-5589. S2CID 6299263.
[10] Soltani, Mozhan; Panichella, Annibale; van Deursen, Arie (2017). "A Guided Genetic Algorithm for Automated Crash Reproduction". 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE). pp. 209–220. doi:10.1109/ICSE.2017.27. ISBN 978-1-5386-3868-2. S2CID 199514177. Archived from the original on 25 January 2022. Retrieved 21 December 2020.

External links

Picking Up The Pieces After A Computer Crash

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
