Kernel panic

Last updated
Kernel panic in Ubuntu 4.10, this one due to a VFS error. Ubuntu 4.10 kernel paic.png
Kernel panic in Ubuntu 4.10, this one due to a VFS error.

A kernel panic message from a Linux system Kernel-panic.jpg
A kernel panic message from a Linux system
An OpenSolaris kernel panic. Osol-dtrace-kp.png
An OpenSolaris kernel panic.
Kernel panic in Ubuntu 13.04 "Raring Ringtail" (Linux kernel 3.8) in Oracle VM VirtualBox Ubuntu 13.04 VirtualBox Kernel Panic.png
Kernel panic in Ubuntu 13.04 "Raring Ringtail" (Linux kernel 3.8) in Oracle VM VirtualBox

A kernel panic (sometimes abbreviated as KP [1] ) is a safety measure taken by an operating system's kernel upon detecting an internal fatal error in which either it is unable to safely recover or continuing to run the system would have a higher risk of major data loss. The term is largely specific to Unix and Unix-like systems. The equivalent on Microsoft Windows operating systems is a stop error, often called a "blue screen of death".

Contents

The kernel routines that handle panics, known as panic() in AT&T-derived and BSD Unix source code, are generally designed to output an error message to the console, dump an image of kernel memory to disk for post-mortem debugging, and then either wait for the system to be manually rebooted, or initiate an automatic reboot. [2] The information provided is of a highly technical nature and aims to assist a system administrator or software developer in diagnosing the problem. Kernel panics can also be caused by errors originating outside kernel space. For example, many Unix operating systems panic if the init process, which runs in user space, terminates. [3] [4]

History

The Unix kernel maintains internal consistency and runtime correctness with assertions as the fault detection mechanism. The basic assumption is that the hardware and the software should perform correctly and a failure of an assertion results in a panic, i.e. a voluntary halt to all system activity. [5] The kernel panic was introduced in an early version of Unix and demonstrated a major difference between the design philosophies of Unix and its predecessor Multics. Multics developer Tom van Vleck recalls a discussion of this change with Unix developer Dennis Ritchie:

I remarked to Dennis that easily half the code I was writing in Multics was error recovery code. He said, "We left all that stuff out. If there's an error, we have this routine called panic, and when it is called, the machine crashes, and you holler down the hall, 'Hey, reboot it.'" [6]

The original panic() function was essentially unchanged from Fifth Edition UNIX to the VAX-based UNIX 32V and output only an error message with no other information, then dropped the system into an endless idle loop.

Source code of panic() function in V6 UNIX: [7]

/* * In case console is off, * panicstr contains argument to last * call to panic. */char*panicstr;/* * Panic is called on unresolvable * fatal errors. * It syncs, prints "panic: mesg" and * then loops. */panic(s)char*s;{panicstr=s;update();printf("panic: %s\n",s);for(;;)idle();}

As the Unix codebase was enhanced, the panic() function was also enhanced to dump various forms of debugging information to the console.

Causes

A panic may occur as a result of a hardware failure or a software bug in the operating system. In many cases, the operating system is capable of continued operation after an error has occurred. If the system is in an unstable state, rather than risking security breaches and data corruption, the operating system stops in order to prevent further damage, which helps to facilitate diagnosis of the error and may restart automatically. [8]

After recompiling a kernel binary image from source code, a kernel panic while booting the resulting kernel is a common problem if the kernel was not correctly configured, compiled or installed. [9] Add-on hardware or malfunctioning RAM could also be sources of fatal kernel errors during start up, due to incompatibility with the OS or a missing device driver. [10] A kernel may also go into panic() if it is unable to locate a root file system. [11] During the final stages of kernel userspace initialization, a panic is typically triggered if the spawning of init fails. A panic might also be triggered if the init process terminates, as the system would then be unusable. [12]

The following is an implementation of the Linux kernel final initialization in kernel_init(): [13]

staticint__refkernel_init(void*unused){.../*         * We try each of these until one succeeds.         *         * The Bourne shell can be used instead of init if we are         * trying to recover a really broken machine.         */if(execute_command){if(!run_init_process(execute_command))return0;pr_err("Failed to execute %s.  Attempting defaults...\n",execute_command);}if(!run_init_process("/sbin/init")||!run_init_process("/etc/init")||!run_init_process("/bin/init")||!run_init_process("/bin/sh"))return0;panic("No init found.  Try passing init= option to kernel. ""See Linux Documentation/init.txt for guidance.");}

Operating system specifics

Linux

Kernel panic as seen on an iKVM console Kernel panic message.png
Kernel panic as seen on an iKVM console

Kernel panics appear in Linux like in other Unix-like systems; however, serious but non-fatal errors can generate another kind of error condition, known as a kernel oops. [14] In this case, the kernel normally continues to run after killing the offending process. As an oops could cause some subsystems or resources to become unavailable, they can later lead to a full kernel panic.

On Linux, a kernel panic causes keyboard LEDs to blink as a visual indication of a critical condition. [15]

macOS

When a kernel panic occurs in Mac OS X 10.2 through 10.7, the computer displays a multilingual message informing the user that they need to reboot the system. [16] Prior to 10.2, a more traditional Unix-style panic message was displayed; in 10.8 and later, the computer automatically reboots and the message is only displayed as a skippable warning afterward. The format of the message varies from version to version: [17]

If five new kernel panics occur within three minutes of the first one, the Mac will display a prohibitory sign for thirty seconds, and then shut down; this is known as a "recurring kernel panic". [18]

In all versions above 10.2, the text is superimposed on a standby symbol and is not full screen. Debugging information is saved in NVRAM and written to a log file on reboot. In 10.7 there is a feature to automatically restart after a kernel panic. In some cases, on 10.2 and later, white text detailing the error may appear in addition to the standby symbol.

See also

Related Research Articles

Guru Meditation is an error notice originally displayed by the Amiga computer when it crashes. It is now also used by Varnish, a software component used by many content-heavy websites. This has led to many internet users seeing a "Guru Meditation" message when these websites suffer crashes or other issues. It is analogous to the "Blue Screen of Death" in Microsoft Windows operating systems, or a kernel panic in Unix.

<span class="mw-page-title-main">Operating system</span> Software that manages computer hardware resources

An operating system (OS) is system software that manages computer hardware and software resources, and provides common services for computer programs.

In computing, a core dump, memory dump, crash dump, storage dump, system dump, or ABEND dump consists of the recorded state of the working memory of a computer program at a specific time, generally when the program has crashed or otherwise terminated abnormally. In practice, other key pieces of program state are usually dumped at the same time, including the processor registers, which may include the program counter and stack pointer, memory management information, and other processor and operating system flags and information. A snapshot dump is a memory dump requested by the computer operator or by the running program, after which the program is able to continue. Core dumps are often used to assist in diagnosing and debugging errors in computer programs.

<span class="mw-page-title-main">Crash (computing)</span> Unexpected program exit due to an error

In computing, a crash, or system crash, occurs when a computer program such as a software application or an operating system stops functioning properly and exits. On some operating systems or individual applications, a crash reporting service will report the crash and any details relating to it, usually to the developer(s) of the application. If the program is a critical part of the operating system, the entire system may crash or hang, often resulting in a kernel panic or fatal system error.

Hexspeak is a novelty form of variant English spelling using the hexadecimal digits. Created by programmers as memorable magic numbers, hexspeak words can serve as a clear and unique identifier with which to mark memory or data.

In computing, the process identifier is a number used by most operating system kernels—such as those of Unix, macOS and Windows—to uniquely identify an active process. This number may be used as a parameter in various function calls, allowing processes to be manipulated, such as adjusting the process's priority or killing it altogether.

Signals are standardized messages sent to a running program to trigger specific behavior, such as quitting or error handling. They are a limited form of inter-process communication (IPC), typically used in Unix, Unix-like, and other POSIX-compliant operating systems.

init UNIX system component

In Unix-based computer operating systems, init is the first process started during booting of the operating system. Init is a daemon process that continues running until the system is shut down. It is the direct or indirect ancestor of all other processes and automatically adopts all orphaned processes. Init is started by the kernel during the booting process; a kernel panic will occur if the kernel is unable to start it, or it should die for any reason. Init is typically assigned process identifier 1.

The bomb icon (💣) has several different applications in computing, and typically indicates a fatal system error.

<span class="mw-page-title-main">Fatal system error</span> Error that stops the operating system

A fatal system error occurs when an operating system halts because it has reached a condition where it can no longer operate safely.

<span class="mw-page-title-main">Error message</span> Computer message indicating an error

An error message is the information displayed when an unforeseen problem occurs, usually on a computer or other device. Modern operating systems with graphical user interfaces, often display error messages using dialog boxes. Error messages are used when user intervention is required, to indicate that a desired operation has failed, or to relay important warnings. Error messages are seen widely throughout computing, and are part of every operating system or computer hardware device. The proper design of error messages is an important topic in usability and other fields of human–computer interaction.

<span class="mw-page-title-main">Screen of death</span> Fatal error displays in operating systems

In computing, a screen of death, colloquially referred to as a blue screen of death, is an informal term for a type of a computer operating system error message displayed onscreen when the system has experienced a fatal system error. The fatal error typically results in unsaved work being lost and often indicates serious problems with the system's hardware or software. These error screens are usually the result of a kernel panic, although the terms are frequently used interchangeably. Most screens of death are displayed on an even background color with a message advising the user to restart the computer.

<span class="mw-page-title-main">Blue screen of death</span> Fatal system error screen

The blue screen of death is a critical error screen displayed by the Microsoft Windows operating systems. It indicates a system crash, in which the operating system reaches a critical condition where it can no longer operate safely.

<span class="mw-page-title-main">Kernel (operating system)</span> Core of a computer operating system

The kernel is a computer program at the core of a computer's operating system and generally has complete control over everything in the system. The kernel is also responsible for preventing and mitigating conflicts between different processes. It is the portion of the operating system code that is always resident in memory and facilitates interactions between hardware and software components. A full kernel controls all hardware resources via device drivers, arbitrates conflicts between processes concerning such resources, and optimizes the utilization of common resources e.g. CPU & cache usage, file systems, and network sockets. On most systems, the kernel is one of the first programs loaded on startup. It handles the rest of startup as well as memory, peripherals, and input/output (I/O) requests from software, translating them into data-processing instructions for the central processing unit.

<span class="mw-page-title-main">Unix-like</span> Operating system that behaves similarly to Unix, e.g. Linux

A Unix-like operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification. A Unix-like application is one that behaves like the corresponding Unix command or shell. Although there are general philosophies for Unix design, there is no technical standard defining the term, and opinions can differ about the degree to which a particular operating system or application is Unix-like.

<span class="mw-page-title-main">Unix</span> Family of computer operating systems

Unix is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and others.

ptrace is a system call found in Unix and several Unix-like operating systems. By using ptrace one process can control another, enabling the controller to inspect and manipulate the internal state of its target. ptrace is used by debuggers and other code-analysis tools, mostly as aids to software development.

<span class="mw-page-title-main">Command-line interface</span> Computer interface that uses text

A command-line interface (CLI) is a means of interacting with a computer program by inputting lines of text called command-lines. Command-line interfaces emerged in the mid-1960s, on computer terminals, as an interactive and more user-friendly alternative to the non-interactive interface available with punched cards.

In computing, rebooting is the process by which a running computer system is restarted, either intentionally or unintentionally. Reboots can be either a cold reboot in which the power to the system is physically turned off and back on again ; or a warm reboot in which the system restarts while still powered up. The term restart is used to refer to a reboot when the operating system closes all programs and finalizes all pending input and output operations before initiating a soft reboot.

Comparison of user features of operating systems refers to a comparison of the general user features of major operating systems in a narrative format. It does not encompass a full exhaustive comparison or description of all technical details of all operating systems. It is a comparison of basic roles and the most prominent features. It also includes the most important features of the operating system's origins, historical development, and role.

References

  1. "KP - Kernel Panic (Linux) | AcronymFinder". www.acronymfinder.com. Archived from the original on October 26, 2015. Retrieved January 6, 2016.
  2. "FreeBSD 11.0 - man page for panic (freebsd section 9) - Unix & Linux Commands". www.unix.com. Archived from the original on April 1, 2024. Retrieved October 26, 2010.
  3. "boot failure-init died - Unix Linux Forums - HP-UX". www.unix.com. Archived from the original on April 1, 2024. Retrieved June 12, 2013.
  4. Randolph J. Herber (September 1, 1999). "Re: PANIC: init died". Newsgroup:  comp.sys.sgi.admin. Archived from the original on January 22, 2011. Retrieved December 9, 2017.
  5. Daniel P. Siewiorek; Robert S. Swarz (1998). Reliable computer systems: design and evaluation. A K Peters, Ltd. p. 622. ISBN   978-1-56881-092-8 . Retrieved May 6, 2011.
  6. "Unix and Multics". www.multicians.org. Archived from the original on August 5, 2012. Retrieved May 25, 2005.
  7. "Source code /usr/sys/ken/prf.c". Archived from the original on February 24, 2021. from V6 UNIX
  8. Steven M. Hancock (November 22, 2002). Tru64 UNIX troubleshooting: diagnosing and correcting system problemsHP Technologies SeriesITPro collection. Digital Press. pp. 119–126. ISBN   978-1-55558-274-6 . Retrieved May 3, 2011.
  9. Michael Jang (2006). Linux annoyances for geeks. O'Reilly Media, Inc. pp. 267–274. ISBN   978-0-596-00801-7 . Retrieved April 29, 2011.
  10. David Pogue (December 17, 2009). Switching to the Mac: The Missing Manual, Snow Leopard Edition. O'Reilly Media, Inc. p. 589. ISBN   978-0-596-80425-1 . Retrieved May 4, 2011.
  11. Greg Kroah-Hartman (2007). Linux kernel in a nutshell. O'Reilly Media, Inc. p. 59. ISBN   978-0-596-10079-7 . Retrieved May 3, 2011.
  12. Wolfgang Mauerer (September 26, 2008). Professional Linux Kernel Architecture. John Wiley and Sons. pp. 1238–1239. ISBN   978-0-470-34343-2. Archived from the original on April 1, 2024. Retrieved May 3, 2011.
  13. "linux/init/main.c". LXR Cross Referencer . Archived from the original on October 6, 2022.
  14. "Linux Device Drivers, Chapter 4" (PDF). Archived (PDF) from the original on November 14, 2014. Retrieved July 21, 2016.
  15. James Kirkland; David Carmichael; Christopher L. Tinker; Gregory L. Tinker (May 2006). Linux Troubleshooting for System Administrators and Power Users. Prentice Hall. p. 62. ISBN   9780132797399. Archived from the original on April 1, 2024. Retrieved February 5, 2016.
  16. "OS X: About kernel panics - Apple Support". support.apple.com. Archived from the original on May 21, 2013.
  17. "A New Screen of Death for Mac OS X". OSXBook.com. Archived from the original on May 1, 2012. Retrieved April 30, 2011.
  18. "OS X: About kernel panics". Apple Support. Apple. Archived from the original on May 24, 2018.