Core dump


In computing, a core dump, [a] memory dump, crash dump, storage dump, system dump, or ABEND dump [1] consists of the recorded state of the working memory of a computer program at a specific time, generally when the program has crashed or otherwise terminated abnormally. [2] In practice, other key pieces of program state are usually dumped at the same time, including the processor registers, which may include the program counter and stack pointer, memory management information, and other processor and operating system flags and information. A snapshot dump (or snap dump) is a memory dump requested by the computer operator or by the running program, after which the program is able to continue. Core dumps are often used to assist in diagnosing and debugging errors in computer programs.


On many operating systems, a fatal exception in a program automatically triggers a core dump. By extension, the phrase "to dump core" has come to mean, in many cases, any fatal error, regardless of whether a record of the program memory exists. The term "core dump", "memory dump", or just "dump" has also become jargon to indicate any output of a large amount of raw data for further examination or other purposes. [3] [4]
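
For illustration, the following minimal C program deliberately dereferences a null pointer; on a typical Unix-like system with core dumps enabled (for example, with an unlimited core-file size limit), the resulting SIGSEGV terminates the process and the operating system writes a core dump that a debugger can later open alongside the executable. This is only a sketch of the general mechanism; the name and location of the dump file vary by system.

    #include <stdio.h>

    /* Deliberately trigger a segmentation fault. On a Unix-like system
     * with core dumps enabled, the operating system terminates the
     * process on SIGSEGV and records its memory in a core dump. */
    int main(void)
    {
        int *p = NULL;      /* invalid pointer */
        printf("about to crash\n");
        *p = 42;            /* dereferencing NULL raises SIGSEGV */
        return 0;           /* never reached */
    }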

Background

The name comes from magnetic-core memory, [5] [6] the principal form of random-access memory from the 1950s to the 1970s. The name has remained long after magnetic-core technology became obsolete.

The earliest core dumps were paper printouts [7] of the contents of memory, typically arranged in columns of octal or hexadecimal numbers (a "hex dump"), sometimes accompanied by their interpretations as machine language instructions, text strings, or decimal or floating-point numbers (cf. disassembler).
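
The layout of such printouts is straightforward to reproduce; the following C sketch, given purely as an illustration, prints a block of memory as an offset column followed by up to sixteen bytes per line in hexadecimal, in the general style of a hex dump:

    #include <stdio.h>
    #include <stddef.h>

    /* Print a region of memory as a simple hex dump: an offset column
     * followed by up to 16 bytes per line in hexadecimal. */
    static void hex_dump(const void *data, size_t len)
    {
        const unsigned char *p = data;
        for (size_t i = 0; i < len; i += 16) {
            printf("%08zx ", i);                    /* offset column */
            for (size_t j = i; j < i + 16 && j < len; j++)
                printf(" %02x", p[j]);              /* one byte in hex */
            putchar('\n');
        }
    }

    int main(void)
    {
        const char sample[] = "Core dump example data";
        hex_dump(sample, sizeof sample);
        return 0;
    }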

As memory sizes increased and post-mortem analysis utilities were developed, dumps were written to magnetic media like tape or disk.

Instead of only displaying the contents of the applicable memory, modern operating systems typically generate a file containing an image of the memory belonging to the crashed process, or the memory images of parts of the address space related to that process, along with other information such as the values of processor registers, program counter, system flags, and other information useful in determining the root cause of the crash. These files can be viewed as text, printed, or analyzed with specialized tools such as elfdump on Unix and Unix-like systems, objdump and kdump on Linux, IPCS (Interactive Problem Control System) on IBM z/OS, [8] DVF (Dump Viewing Facility) on IBM z/VM, [9] WinDbg on Microsoft Windows, Valgrind, or other debuggers.

In some operating systems [b] an application or operator may request a snapshot of selected storage blocks, rather than all of the storage used by the application or operating system.

Uses

Core dumps can serve as useful debugging aids in several situations. On early standalone or batch-processing systems, core dumps allowed a user to debug a program without monopolizing the (very expensive) computing facility for debugging; a printout could also be more convenient than debugging using front panel switches and lights.

On shared computers, whether time-sharing, batch processing, or server systems, core dumps allow off-line debugging of the operating system, so that the system can go back into operation immediately.

Core dumps allow a user to save a crash for later or off-site analysis, or comparison with other crashes. For embedded computers, it may be impractical to support debugging on the computer itself, so analysis of a dump may take place on a different computer. Some operating systems such as early versions of Unix did not support attaching debuggers to running processes, so core dumps were necessary to run a debugger on a process's memory contents.

Core dumps can be used to capture data freed during dynamic memory allocation and may thus be used to retrieve information from a program that is no longer running. In the absence of an interactive debugger, the core dump may be used by an assiduous programmer to determine the error from direct examination.

Snap dumps are sometimes a convenient way for applications to record quick and dirty debugging output.

Analysis

A core dump generally represents the complete contents of the dumped regions of the address space of the dumped process. Depending on the operating system, the dump may contain few or no data structures to aid interpretation of the memory regions. In these systems, successful interpretation requires that the program or user trying to interpret the dump understands the structure of the program's memory use.

A debugger can use a symbol table, if one exists, to help the programmer interpret dumps, identifying variables symbolically and displaying source code; if the symbol table is not available, less interpretation of the dump is possible, but there might still be enough information available to determine the cause of the problem. There are also special-purpose tools called dump analyzers to analyze dumps. One popular tool, available on many operating systems, is the GNU binutils' objdump.

On modern Unix-like operating systems, administrators and programmers can read core dump files using the GNU Binutils Binary File Descriptor library (BFD), and the GNU Debugger (gdb) and objdump that use this library. This library will supply the raw data for a given address in a memory region from a core dump; it does not know anything about variables or data structures in that memory region, so the application using the library to read the core dump will have to determine the addresses of variables and determine the layout of data structures itself, for example by using the symbol table for the program undergoing debugging.
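
As an illustration of what such raw access involves, the following C sketch (limited to 64-bit ELF core files, with error handling abbreviated) scans a core file's program headers for a PT_LOAD segment covering a requested virtual address and reads the bytes from the corresponding file offset. Interpreting those bytes as variables or data structures is still left entirely to the caller, exactly as described above.

    #include <elf.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Read "len" bytes at virtual address "vaddr" from a 64-bit ELF core
     * file by locating the PT_LOAD segment that maps that address.
     * Returns 0 on success, -1 if the address is not present in the dump. */
    static int read_core_bytes(FILE *core, unsigned long vaddr,
                               unsigned char *buf, size_t len)
    {
        Elf64_Ehdr ehdr;
        if (fread(&ehdr, sizeof ehdr, 1, core) != 1)
            return -1;
        for (int i = 0; i < ehdr.e_phnum; i++) {
            Elf64_Phdr phdr;
            fseek(core, (long)(ehdr.e_phoff + (Elf64_Off)i * ehdr.e_phentsize),
                  SEEK_SET);
            if (fread(&phdr, sizeof phdr, 1, core) != 1)
                return -1;
            /* Only loadable segments describe memory saved in the dump. */
            if (phdr.p_type == PT_LOAD && vaddr >= phdr.p_vaddr &&
                vaddr + len <= phdr.p_vaddr + phdr.p_filesz) {
                fseek(core, (long)(phdr.p_offset + (vaddr - phdr.p_vaddr)),
                      SEEK_SET);
                return fread(buf, 1, len, core) == len ? 0 : -1;
            }
        }
        return -1;   /* address not contained in any dumped segment */
    }

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s <core-file> <hex-address>\n", argv[0]);
            return 1;
        }
        FILE *core = fopen(argv[1], "rb");
        if (core == NULL) { perror("fopen"); return 1; }
        unsigned char buf[16];
        if (read_core_bytes(core, strtoul(argv[2], NULL, 16), buf, sizeof buf) == 0) {
            for (size_t i = 0; i < sizeof buf; i++)
                printf("%02x ", buf[i]);            /* raw bytes, uninterpreted */
            putchar('\n');
        } else {
            fprintf(stderr, "address not found in dump\n");
        }
        fclose(core);
        return 0;
    }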

Analysts of crash dumps from Linux systems can use kdump or the Linux Kernel Crash Dump (LKCD). [10]

Core dumps can save the context (state) of a process at a given point in time for returning to it later. Systems can be made highly available by transferring core between processors, sometimes via core dump files themselves.

Core can also be dumped onto a remote host over a network (which is a security risk). [11]

Users of IBM mainframes running z/OS can browse SVC and transaction dumps using the Interactive Problem Control System (IPCS), a full-screen dump reader that was originally introduced in OS/VS2 (MVS), supports user-written scripts in REXX, and supports point-and-shoot browsing [c] of dumps.

Core-dump files

Format

In older and simpler operating systems, each process had a contiguous address-space, so a dump file was sometimes simply a file with the sequence of bytes, digits, [d] characters [d] or words. On other early machines a dump file contained discrete records, each containing a storage address and the associated contents. On early machines, the dump was often written by a stand-alone dump program rather than by the application or the operating system.

The IBSYS monitor for the IBM 7090 included a System Core-Storage Dump Program [12] that supported post-mortem and snap dumps.

On the IBM System/360, the standard operating systems wrote formatted ABEND and SNAP dumps, with the addresses, registers, storage contents, etc., all converted into printable forms. Later releases added the ability to write unformatted [e] dumps, called at that time core image dumps (also known as SVC dumps).

In modern operating systems, a process address space may contain gaps, and it may share pages with other processes or files, so more elaborate representations are used; they may also include other information about the state of the program at the time of the dump.

In Unix-like systems, core dumps generally use the standard executable image format:

  • a.out in older versions of Unix,
  • ELF in modern Linux, System V, Solaris, and BSD systems,
  • Mach-O in macOS, etc.

Naming

OS/360 and successors

  • In OS/360 and successors, a job may assign arbitrary data set names (DSNs) to the ddnames SYSABEND and SYSUDUMP for a formatted ABEND dump and to arbitrary ddnames for SNAP dumps, or define those ddnames as SYSOUT. [f]
  • The Damage Assessment and Repair (DAR) facility added an automatic unformatted [h] storage dump to the dataset SYS1.DUMP [i] at the time of failure as well as a console dump requested by the operator.
  • The newer transaction dump is very similar to the older SVC dump.
  • The Interactive Problem Control System (IPCS), added to OS/VS2 by Selectable Unit (SU) 57 [14] [15] and part of every subsequent MVS release, can be used to interactively analyze storage dumps on DASD. IPCS understands the format and relationships of system control blocks, and can produce a formatted display for analysis. The current versions of IPCS allow inspection of active address spaces [16] [j] without first taking a storage dump.

Unix-like

  • Since Solaris 8, system utility coreadm allows the name and location of core files to be configured.
  • Dumps of user processes are traditionally created as core. On Linux (since versions 2.4.21 and 2.6 of the Linux kernel mainline), a different name can be specified via procfs using the /proc/sys/kernel/core_pattern configuration file; the specified name can also be a template that contains tags substituted by, for example, the executable filename, the process ID, or the reason for the dump. [17] A minimal sketch of reading this setting follows this list.
  • System-wide dumps on modern Unix-like systems often appear as vmcore or vmcore.incomplete.
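
As a minimal, Linux-specific illustration (assuming procfs is mounted at /proc; changing the value normally requires root privileges), the current core_pattern template can be read like any other file:

    #include <stdio.h>

    /* Print the current core-file naming template on Linux.
     * Assumes procfs is mounted at /proc; writing a new value to the
     * same file normally requires root privileges. */
    int main(void)
    {
        char pattern[256];
        FILE *f = fopen("/proc/sys/kernel/core_pattern", "r");
        if (f == NULL) { perror("core_pattern"); return 1; }
        if (fgets(pattern, sizeof pattern, f) != NULL)
            printf("core_pattern: %s", pattern);
        fclose(f);
        return 0;
    }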

Others

Windows memory dumps

Microsoft Windows supports two memory dump formats, described below.

Kernel-mode dumps

There are five types of kernel-mode dumps: [18]

  • Complete memory dump – contains the full physical memory of the target system.
  • Kernel memory dump – contains all the memory in use by the kernel at the time of the crash.
  • Small memory dump – contains various information such as the stop code, parameters, the list of loaded device drivers, etc.
  • Automatic memory dump (Windows 8 and later) – same as a kernel memory dump, but if the paging file is both system-managed and too small to capture the kernel memory dump, the system automatically increases the paging file to at least the size of RAM for four weeks, then reduces it to the smaller size. [19]
  • Active memory dump (Windows 10 and later) – contains most of the memory in use by the kernel and user-mode applications.

Windows kernel-mode dumps are analyzed with Debugging Tools for Windows, a set of utilities that includes WinDbg and DumpChk. [20] [21] [22]

User-mode memory dumps

A user-mode memory dump, also known as a minidump, [23] is a memory dump of a single process. It contains selected data records: full or partial (filtered) process memory; the list of threads with their call stacks and state (such as registers or the TEB); information about handles to kernel objects; and the list of loaded and unloaded libraries. The full set of options is defined by the MINIDUMP_TYPE enumeration. [24]
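
Applications can request such a dump of themselves through the dbghelp library's MiniDumpWriteDump function; the following C sketch, intended only as an illustration, writes a minidump of the calling process to a file (in a real crash handler the exception-information parameter would normally be supplied, and the dump type chosen from the MINIDUMP_TYPE flags):

    #include <windows.h>
    #include <dbghelp.h>        /* link with Dbghelp.lib */

    /* Write a minidump of the current process to crash.dmp.
     * MiniDumpNormal produces a small dump; other MINIDUMP_TYPE flags
     * (for example, MiniDumpWithFullMemory) include more data. */
    int main(void)
    {
        HANDLE file = CreateFileA("crash.dmp", GENERIC_WRITE, 0, NULL,
                                  CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (file == INVALID_HANDLE_VALUE)
            return 1;
        BOOL ok = MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(),
                                    file, MiniDumpNormal, NULL, NULL, NULL);
        CloseHandle(file);
        return ok ? 0 : 1;
    }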

Space missions

The NASA Voyager program was probably the first to routinely use the core dump feature in the deep space segment. The core dump feature is a mandatory telemetry feature for the deep space segment, as it has been proven to minimize system diagnostic costs.[citation needed] The Voyager craft uses routine core dumps to spot memory damage from cosmic ray events.

Space mission core dump systems are mostly based on existing toolkits for the target CPU or subsystem. However, over the duration of a mission, the core dump subsystem may be substantially modified or enhanced for the specific needs of the mission.

See also


References

  1. "AIX 7.1 information".[ permanent dead link ]
  2. core(5) : Process core file   Solaris 11.4 File Formats Reference Manual
  3. Cory Janssen (25 October 2012). "What is a Database Dump? - Definition from Techopedia". Techopedia.com. Archived from the original on 20 August 2015. Retrieved 29 June 2015.
  4. "How to configure a computer to capture a complete memory dump". sophos.com. 12 July 2010. Archived from the original on 1 July 2015. Retrieved 29 June 2015.
  5. Oxford English Dictionary, s.v. 'core'
  6. Brian Kernighan. UNIX: a history and a memoir. ISBN   9781695978553.
  7. "storage dump definition". Archived from the original on 2013-05-11. Retrieved 2013-04-03.
  8. Rogers, Paul; Carey, David (August 2005). z/OS Diagnostic Data Collection and Analysis (PDF). IBM Corporation. pp. 77–93. ISBN   0738493996. Archived (PDF) from the original on December 21, 2018. Retrieved Jan 29, 2021.
  9. IBM Corporation (October 2008). z/VM and Linux Operations for z/OS System Programmers (PDF). p. 24. Retrieved Jan 25, 2022.
  10. Venkateswaran, Sreekrishnan (2008). Essential Linux device drivers. Prentice Hall open source software development series. Prentice Hall. p. 623. ISBN   978-0-13-239655-4. Archived from the original on 2014-06-26. Retrieved 2010-07-15. Until the advent of kdump, Linux Kernel Crash Dump (LKCD) was the popular mechanism to obtain and analyze dumps.
  11. Fedora Documentation Project (2010). Fedora 13 Security Guide. Fultus Corporation. p. 63. ISBN   978-1-59682-214-6. Archived from the original on 2014-06-26. Retrieved 2010-09-29. Remote memory dump services, like netdump, transmit the contents of memory over the network unencrypted.
  12. "System Core-Storage Dump Program" (PDF). IBM 7090/7094 IBSYS Operating System - Version 13 - System Monitor (IBSYS) (PDF). Systems Reference Library (Eighth ed.). IBM. December 30, 1966. pp. 18–20. C28-6248-7. Retrieved May 10, 2024.
  13. "Setting the name-pattern for dump data sets" (PDF). z/OS 2.5 MVS System Commands (PDF). March 25, 2022. pp. 474–475. SA38-0666-50. Retrieved April 6, 2022.
  14. OS/VS2 MVS Interactive Problem Control System (IPCS) System Information - SUID 5752-857 (PDF) (First ed.). IBM. March 1978. GC34-2004-0. Retrieved June 29, 2023.
  15. OS/VS2 MVS Interactive Problem Control System User's Guide and Reference - SUID 5752-857 (PDF) (Second ed.). IBM. October 1979. GC34-2006-1. Retrieved June 29, 2023.
  16. "SETDEF subcommand - set defaults" (PDF). z/OS 2.5 - MVS Interactive Problem Control System (IPCS) Commands (PDF). IBM. 2023-05-12. p.  239. SA23-1382-50. Retrieved April 6, 2022. ACTIVE, MAIN, or STORAGE specifies the central storage for the address space in which IPCS is currently running and allows you to access that active storage as the dump source. You can access private storage and any common storage accessible by an unauthorized program.
  17. "core(5) – Linux manual page". man7.org. 2015-12-05. Archived from the original on 2013-09-20. Retrieved 2016-04-17.
  18. "Varieties of Kernel-Mode Dump Files". Microsoft. Archived from the original on 22 February 2018. Retrieved 22 February 2018.
  19. "Automatic Memory Dump". Microsoft. 28 November 2017. Archived from the original on 17 March 2018. Retrieved 16 March 2018.
  20. "Getting Started with WinDbg (Kernel-Mode)". Archived from the original on 14 March 2016. Retrieved 30 September 2014.
  21. "Get started with Windows debugging" . Retrieved 14 December 2024.{{cite web}}: CS1 maint: url-status (link)
  22. "Tools included in Debugging Tools for Windows" . Retrieved 14 December 2024.{{cite web}}: CS1 maint: url-status (link)
  23. "Minidump Files". Archived from the original on 27 October 2014. Retrieved 30 September 2014.
  24. "MINIDUMP_TYPE enumeration". Archived from the original on 11 January 2015. Retrieved 30 September 2014.

Notes

  1. The term core is obsolete on contemporary hardware, but is used on many systems for historical reasons.
  2. E.g., z/OS
  3. That is, you can position the cursor at a word or doubleword containing an address and request a display of the storage at that address.
  4. Some older machines were decimal.
  5. In the sense that the records were binary rather than formatted for printing.
  6. SYStem OUTput files (SYSOUT) files are temporary files owned by the SPOOL software.
  7. Initially the batch utility IMDPRDMP; currently the TSO command and ISPF panel repertoire for Interactive Problem Control System (IPCS).
  8. IBM provided tools for extracting and formatting data from an unformatted dump; those tools [g] often made it easier to deal with an unformatted dump than a formatted dump.
  9. Since then, IBM added the ability to have up to a hundred dump datasets named SYS1.DUMPnn (nn from 00 to 99). z/OS supports multiple system dump data sets with arbitrary dsname patterns under installation and operator [13] control.
  10. With read authority to facility class BLSACTV.ADDRSPAC, IPCS can view any address space.
