Kdump (Linux)

Last updated

kdump is a feature of the Linux kernel that creates crash dumps in the event of a kernel crash. When triggered, kdump exports a memory image (also known as vmcore) that can be analyzed for the purposes of debugging and determining the cause of a crash. The dumped image of main memory, exported as an Executable and Linkable Format (ELF) object, can be accessed either directly through /proc/vmcore during the handling of a kernel crash, or it can be automatically saved to a locally accessible file system, to a raw device, or to a remote system accessible over network. [1] [2]

Contents

Internals

In a "dual kernel" layout, kdump uses kexec to boot another kernel and obtain a memory dump. Kdump.svg
In a "dual kernel" layout, kdump uses kexec to boot another kernel and obtain a memory dump.

In the event of a kernel crash, kdump preserves system consistency by booting another Linux kernel, which is known as the dump-capture kernel, and using it to export and save a memory dump. As a result, the system boots into a clean and reliable environment instead of relying on an already crashed kernel that may cause various issues, such as causing file system corruption while writing a memory dump file. To implement this "dual kernel" layout, kdump uses kexec for "warm" booting into the dump-capture kernel immediately after the kernel crash, using kexec's ability to boot "over" the currently running kernel while avoiding the execution of a bootloader and hardware initialization performed by the system firmware (BIOS or UEFI). A dump-capture kernel can be either a separate Linux kernel image built specifically for that purpose, or the primary kernel image can be reused on architectures that support relocatable kernels. [1] [3] [4] [5] :5–6

The contents of main memory (RAM) are preserved while booting into and running the dump-capture kernel by reserving a small amount of RAM in advance, into which the dump-capture kernel is preloaded so none of the RAM used by the primary kernel is overwritten when a kernel crash is handled. This reserved amount of RAM is used solely by the dump-capture kernel and is otherwise unused during normal system operation. Some architectures, including x86 and ppc64, require a small fixed-position portion of RAM to boot a kernel regardless of where it is loaded; in this case, kexec creates a copy of that portion of RAM so it is also accessible to the dump-capture kernel. Size and optional position of the reserved portion of RAM are specified through the kernel boot parameter crashkernel, and the kexec command-line utility is used after the primary kernel boots to preload a dump-capture kernel image and its associated initrd image into the reserved portion of RAM. [1] [3] [4]

In addition to the functionality that is part of the Linux kernel, additional userspace utilities support the kdump mechanism, including the kexec utility mentioned above. [1] [4] Besides the official utilities, which are provided as a patch to the kexec's suite of userspace utilities, some Linux distributions provide additional utilities that simplify the configuration of kdump's operation, including the setup of automated saving of memory dump files. [6] [7] [8] Created memory dump files can be analyzed using the GNU Debugger (gdb), or by using Red Hat's dedicated crash utility. [9] [10]

History

kdump functionality, together with kexec, was merged into the Linux kernel mainline in kernel version 2.6.13, which was released on August 29, 2005. [11]

See also

Related Research Articles

In computing, a core dump, memory dump, crash dump, storage dump, system dump, or ABEND dump consists of the recorded state of the working memory of a computer program at a specific time, generally when the program has crashed or otherwise terminated abnormally. In practice, other key pieces of program state are usually dumped at the same time, including the processor registers, which may include the program counter and stack pointer, memory management information, and other processor and operating system flags and information. A snapshot dump is a memory dump requested by the computer operator or by the running program, after which the program is able to continue. Core dumps are often used to assist in diagnosing and debugging errors in computer programs.

<span class="mw-page-title-main">Kernel panic</span> Fatal error condition associated with Unix-like computer operating systems

A kernel panic is a safety measure taken by an operating system's kernel upon detecting an internal fatal error in which either it is unable to safely recover or continuing to run the system would have a higher risk of major data loss. The term is largely specific to Unix and Unix-like systems. The equivalent on Microsoft Windows operating systems is a stop error, often called a "blue screen of death".

<span class="mw-page-title-main">GNU GRUB</span> Boot loader package

GNU GRUB is a boot loader package from the GNU Project. GRUB is the reference implementation of the Free Software Foundation's Multiboot Specification, which provides a user the choice to boot one of multiple operating systems installed on a computer or select a specific kernel configuration available on a particular operating system's partitions.

<span class="mw-page-title-main">UEFI</span> Operating system and firmware specification

UEFI is a set of specifications written by the UEFI Forum. They define the architecture of the platform firmware used for booting and its interface for interaction with the operating system. Examples of firmware that implement these specifications are AMI Aptio, Phoenix SecureCore Tiano, TianoCore EDK II and InsydeH2O.

vmlinux Executable file containing the Linux kernel

vmlinux is a statically linked executable file that contains the Linux kernel in one of the object file formats supported by Linux, which includes Executable and Linkable Format (ELF) and Common Object File Format (COFF). The vmlinux file might be required for kernel debugging, symbol table generation or other operations, but must be made bootable before being used as an operating system kernel by adding a multiboot header, bootsector and setup routines.

In Linux systems, initrd is a scheme for loading a temporary root file system into memory, to be used as part of the Linux startup process. initrd and initramfs refer to two different methods of achieving this. Both are commonly used to make preparations before the real root file system can be mounted.

<span class="mw-page-title-main">Linux kernel oops</span>

In computing, an oops is a serious but non-fatal error in the Linux kernel. An oops may precede a kernel panic, but it may also allow continued operation with compromised reliability. The term does not stand for anything, other than that it is a simple mistake.

The following tables compare general and technical information for a number of available bootloaders.

kexec, abbreviated from kernel execute and analogous to the Unix/Linux kernel call exec, is a mechanism of the Linux kernel that allows booting of a new kernel from the currently running one. Essentially, kexec skips the bootloader stage and hardware initialization phase performed by the system firmware, and directly loads the new kernel into main memory and starts executing it immediately. This avoids the long times associated with a full reboot, and can help systems to meet high-availability requirements by minimizing downtime.

ktrace is a utility included with certain versions of BSD Unix and Mac OS X that traces kernel interaction with a program and dumps it to disk for the purposes of debugging and analysis. Traced kernel operations include system calls, namei translations, signal processing, and I/O.

<span class="mw-page-title-main">Blue screen of death</span> Error screen displayed after a fatal system error on a computer running Microsoft Windows or ReactOS

The Blue Screen of Death (BSoD), officially known as a Stop error, Blue screen error, fatal error, or bugcheck is an error screen displayed by the Microsoft Windows or ReactOS operating systems.

In computing, rebooting is the process by which a running computer system is restarted, either intentionally or unintentionally. Reboots can be either a cold reboot in which the power to the system is physically turned off and back on again ; or a warm reboot in which the system restarts while still powered up. The term restart is used to refer to a reboot when the operating system closes all programs and finalizes all pending input and output operations before initiating a soft reboot.

perf is a performance analyzing tool in Linux, available from Linux kernel version 2.6.31 in 2009. Userspace controlling utility, named perf, is accessed from the command line and provides a number of subcommands; it is capable of statistical profiling of the entire system.

Dracut is a set of tools that provide enhanced functionality for automating the Linux boot process. The tool named dracut is used to create a Linux boot image (initramfs) by copying tools and files from an installed system and combining it with the Dracut framework, which is usually found in /usr/lib/dracut/modules.d.

Memory forensics is forensic analysis of a computer's memory dump. Its primary application is investigation of advanced computer attacks which are stealthy enough to avoid leaving data on the computer's hard drive. Consequently, the memory (RAM) must be analyzed for forensic information.

kGraft is a feature of the Linux kernel that implements live patching of a running kernel, which allows kernel patches to be applied while the kernel is still running. By avoiding the need for rebooting the system with a new kernel that contains the desired patches, kGraft aims to maximize the system uptime and availability. At the same time, kGraft allows kernel-related security updates to be applied without deferring them to scheduled downtimes. Internally, kGraft allows entire functions in a running kernel to be replaced with their patched versions, doing that safely by selectively using original versions of functions to ensure per-process consistency while the live patching is performed.

kpatch is a feature of the Linux kernel that implements live patching of a running kernel, which allows kernel patches to be applied while the kernel is still running. By avoiding the need for rebooting the system with a new kernel that contains the desired patches, kpatch aims to maximize the system uptime and availability. At the same time, kpatch allows kernel-related security updates to be applied without deferring them to scheduled downtimes. Internally, kpatch allows entire functions in a running kernel to be replaced with their patched versions, doing that safely by stopping all running processes while the live patching is performed.

kdump may refer to:

ftrace is a tracing framework for the Linux kernel. Although its original name, Function Tracer, came from ftrace's ability to record information related to various function calls performed while the kernel is running, ftrace's tracing capabilities cover a much broader range of kernel's internal operations.

<span class="mw-page-title-main">ProcDump</span>

ProcDump is a command-line application used for monitoring an application for CPU spikes and creating crash dumps during a spike. The crash dumps can then be used by an administrator or software developer to determine the cause of the spike. ProcDump supports monitoring of hung windows and unhandled exceptions. It can also create dumps based on the values of system performance counters.

References

  1. 1 2 3 4 Jonathan Corbet (October 27, 2004). "Crash dumps with kexec". LWN.net . Retrieved August 9, 2014.
  2. "13.2 About Kdump (Chapter 13: Support Diagnostic Tools)". Oracle Corporation. 2012. Retrieved August 9, 2014.
  3. 1 2 3 Vivek Goyal; Eric W. Biederman; Hariprasad Nellitheertha (June 14, 2006). "Kdump: A Kexec-based Kernel Crash Dumping Mechanism" (PDF). lse.sourceforge.net. Retrieved August 9, 2014.
  4. 1 2 3 "Linux kernel documentation: Documentation/kdump/kdump.txt". kernel.org. August 12, 2013. Retrieved August 9, 2014.
  5. Takashi Iwai (July 26, 2006). "Debugging using Kdump" (PDF). SUSE . Retrieved August 9, 2014.
  6. "29.2.2. Using the Kernel Dump Configuration Utility (Red Hat Enterprise Linux 6 Deployment Guide)". Red Hat . Retrieved August 9, 2014.
  7. "kexec and kdump: Basic kdump Configuration (System Analysis and Tuning Guide)". SUSE. April 25, 2014. Retrieved August 9, 2014.
  8. "How to use kdump to debug kernel crashes". Fedora. April 9, 2014. Retrieved August 9, 2014.
  9. David Anderson (August 27, 2010). "White Paper: Red Hat Crash Utility". Red Hat . Retrieved August 9, 2014.
  10. "kexec and kdump: Analyzing the Crash Dump (System Analysis and Tuning Guide)". SUSE. April 25, 2014. Retrieved August 9, 2014.
  11. "Linux kernel 2.6.13". kernelnewbies.org. August 29, 2005. Retrieved August 9, 2014.