Timeout Detection and Recovery (TDR) is a feature of the Windows operating system (OS) introduced in Windows Vista. It detects response problems from a graphics card (GPU) and, if a timeout occurs, attempts to reset the card to restore a functional and responsive desktop environment. If the attempt is unsuccessful, the result is a Blue Screen of Death (BSOD). The recovery is meant to spare end users from needlessly rebooting a device that has become unresponsive. [1]
When the GPU takes more than the allotted time to process a request, the system's GPU scheduler picks up the anomaly and tries to preempt the offending task. This operation is subject to the TDR timeout, which is 2 seconds by default. [1] [2]
If the timeout expires and the task has been neither completed nor preempted, the kernel determines that the GPU is frozen and informs the respective driver about the detected timeout. It is then the driver's responsibility to properly reset and reinitialize the underlying GPU. [1] [2]
The OS then performs the remaining recovery steps needed for the system to regain responsiveness. If the entire operation succeeds, the end user might see some visual artefacts, and a message describing what happened is shown on the screen ("Display driver stopped responding and has recovered."); otherwise, a BSOD might ensue. [1] [2]
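The sequence described above can be sketched as a small decision function. This is a toy model for illustration only, not how Windows implements TDR: the names `GpuTask`, `Outcome`, `handle_task`, `preempt_latency` and `driver_reset_ok` are hypothetical, and the only value taken from the text is the default timeout of 2 seconds.

```python
from dataclasses import dataclass
from enum import Enum, auto

TDR_TIMEOUT_SECONDS = 2.0  # default TDR timeout stated in the text


class Outcome(Enum):
    COMPLETED = auto()   # task finished; no recovery was needed
    PREEMPTED = auto()   # scheduler preempted the task within the timeout
    RECOVERED = auto()   # timeout expired, driver reset the GPU successfully
    BUGCHECK = auto()    # timeout expired and the reset failed (BSOD)


@dataclass
class GpuTask:
    run_time: float         # seconds the task needs to finish (hypothetical)
    preempt_latency: float  # seconds until the GPU honours a preempt request


def handle_task(task: GpuTask, allotted: float, driver_reset_ok: bool,
                timeout: float = TDR_TIMEOUT_SECONDS) -> Outcome:
    """Walk one task through the detect / preempt / reset sequence."""
    if task.run_time <= allotted:
        return Outcome.COMPLETED

    # The scheduler noticed the overrun and requests preemption.
    # The task may still finish on its own while the timeout is running.
    remaining = task.run_time - allotted
    if min(remaining, task.preempt_latency) <= timeout:
        if remaining <= task.preempt_latency:
            return Outcome.COMPLETED
        return Outcome.PREEMPTED

    # Timeout expired: the GPU is treated as frozen and the driver is
    # asked to reset and reinitialize it.
    return Outcome.RECOVERED if driver_reset_ok else Outcome.BUGCHECK
```

For example, a task that never responds to preemption (`preempt_latency=float("inf")`) and needs far longer than the timeout ends in `Outcome.RECOVERED` when the simulated driver reset succeeds, and in `Outcome.BUGCHECK` when it does not.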
A failed recovery, and the resulting BSOD, can have several probable causes: [2] [3]
Possible BSOD stop codes emitted if the attempted recovery failed:
A kernel panic is a safety measure taken by an operating system's kernel upon detecting an internal fatal error in which either it is unable to safely recover or continuing to run the system would have a higher risk of major data loss. The term is largely specific to Unix and Unix-like systems. The equivalent on Microsoft Windows operating systems is a stop error, often called a "blue screen of death".
In computing, a crash, or system crash, occurs when a computer program such as a software application or an operating system stops functioning properly and exits. On some operating systems or individual applications, a crash reporting service will report the crash and any details relating to it, usually to the developer(s) of the application. If the program is a critical part of the operating system, the entire system may crash or hang, often resulting in a kernel panic or fatal system error. On Windows, this can result in a Blue Screen.
NTLDR is the boot loader for all releases of the Windows NT operating system from Windows NT 3.1, released in 1993, up to and including Windows XP and Windows Server 2003. From Windows Vista onwards it was replaced by the BOOTMGR bootloader. NTLDR is typically run from the primary storage device, but it can also run from portable storage devices such as a CD-ROM, USB flash drive, or floppy disk. NTLDR can also load a non-NT-based operating system given the appropriate boot sector in a file.
A graphics processing unit (GPU) is a specialized electronic circuit initially designed for digital image processing and to accelerate computer graphics, being present either as a discrete video card or embedded on motherboards, mobile phones, personal computers, workstations, and game consoles. After their initial design, GPUs were found to be useful for non-graphic calculations involving embarrassingly parallel problems due to their parallel structure. Other non-graphical uses include the training of neural networks and cryptocurrency mining.
CONFIG.SYS is the primary configuration file for the DOS and OS/2 operating systems. It is a special ASCII text file that contains user-accessible setup or configuration directives evaluated by the operating system's DOS BIOS during boot. CONFIG.SYS was introduced with DOS 2.0.
A watchdog timer, sometimes called a computer operating properly timer, is an electronic or software timer that is used to detect and recover from computer malfunctions. Watchdog timers are widely used in computers to facilitate automatic correction of temporary hardware faults, and to prevent errant or malevolent software from disrupting system operation.
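A minimal sketch of a software watchdog timer, assuming a polling design: the monitored code must call `kick()` regularly, and if it fails to do so before the deadline, a recovery callback runs. The class name `Watchdog` and its methods are hypothetical and chosen for illustration; hardware watchdogs and the Windows GPU watchdog work differently in detail.

```python
import time


class Watchdog:
    """Software watchdog: fires on_expire if not kicked within `timeout`."""

    def __init__(self, timeout, on_expire, clock=time.monotonic):
        self.timeout = timeout      # seconds allowed between kicks
        self.on_expire = on_expire  # recovery action to run on expiry
        self.clock = clock          # injectable clock, useful for testing
        self.expired = False
        self.deadline = self.clock() + self.timeout

    def kick(self):
        """Signal that the monitored code is still making progress."""
        self.deadline = self.clock() + self.timeout
        self.expired = False

    def poll(self):
        """Check the deadline; run the recovery action once per expiry."""
        if not self.expired and self.clock() >= self.deadline:
            self.expired = True
            self.on_expire()
        return self.expired
```

A supervising loop would call `poll()` periodically. Passing a fake `clock` function makes the behaviour deterministic in tests, since no real time has to elapse.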
The Advanced Host Controller Interface (AHCI) is a technical standard defined by Intel that specifies the register-level interface of Serial ATA (SATA) host controllers in a non-implementation-specific manner in its motherboard chipsets.
In computing, CHKDSK is a system tool and command in DOS and Microsoft Windows, as well as Digital Research FlexOS, IBM/Toshiba 4690 OS, and IBM OS/2. It verifies the integrity of the file system on a volume and attempts to fix logical file system errors. Logical errors are typically defined as software-level problems with a filesystem as a result of prior software malfunction or irregular use. Logical errors are contrasted with, and are usually less severe than, hardware-level errors, which cannot be fixed with CHKDSK and may instead require data recovery software or expert assistance. CHKDSK is similar to the fsck command in Unix and to Microsoft ScanDisk, which co-existed with CHKDSK in Windows 9x and MS-DOS 6.x.
A free and open-source graphics device driver is a software stack which controls computer-graphics hardware and supports graphics-rendering application programming interfaces (APIs) and is released under a free and open-source software license. Graphics device drivers are written for specific hardware to work within a specific operating system kernel and to support a range of APIs used by applications to access the graphics hardware. They may also control output to the display if the display driver is part of the graphics hardware. Most free and open-source graphics device drivers are developed by the Mesa project. The driver is made up of a compiler, a rendering API, and software which manages access to the graphics hardware.
A fatal system error occurs when an operating system halts because it has reached a condition where it can no longer operate safely.
An error message is the information displayed when an unforeseen problem occurs, usually on a computer or other device. Modern operating systems with graphical user interfaces often display error messages using dialog boxes. Error messages are used when user intervention is required, to indicate that a desired operation has failed, or to relay important warnings. Error messages are seen widely throughout computing, and are part of every operating system or computer hardware device. The proper design of error messages is an important topic in usability and other fields of human–computer interaction.
A machine check exception (MCE) is a type of computer error that occurs when a problem involving the computer's hardware is detected. With most mass-market personal computers, an MCE indicates faulty or misconfigured hardware.
Windows Display Driver Model is the graphic driver architecture for video card drivers running Microsoft Windows versions beginning with Windows Vista.
In computing, a screen of death, colloquially referred to as a blue screen of death, is an informal term for a type of computer operating system error message displayed onscreen when the system has experienced a fatal system error. The fatal error typically results in unsaved work being lost and often indicates serious problems with the system's hardware or software. These error screens are usually the result of a kernel panic, although the terms are frequently used interchangeably. Most screens of death are displayed on an even background color with a message advising the user to restart the computer.
Intel oneAPI DPC++/C++ Compiler and Intel C++ Compiler Classic are Intel’s C, C++, SYCL, and Data Parallel C++ (DPC++) compilers for Intel processor-based systems, available for Windows, Linux, and macOS operating systems.
The blue screen of death (BSoD) – or blue screen error, blue screen, fatal error, bugcheck, and officially known as a stop error – is a critical error screen displayed by the Microsoft Windows operating systems to indicate a system crash, in which the operating system reaches a critical condition where it can no longer operate safely.
In computing, a hang or freeze occurs when either a process or system ceases to respond to inputs. A typical example is when a computer's graphical user interface no longer responds to the user typing on the keyboard or moving the mouse. The term covers a wide range of behaviors in both clients and servers, and is not limited to graphical user interface issues.
Driver Verifier is a tool included in Microsoft Windows that replaces the default operating system subroutines with ones that are specifically developed to catch device driver bugs. Once enabled, it monitors and stresses drivers to detect illegal function calls or actions that may be causing system corruption. It acts within the kernel mode and can target specific device drivers for continual checking or make driver verifier functionality multithreaded, so that several device drivers can be stressed at the same time. It can simulate certain conditions such as low memory, I/O verification, pool tracking, IRQL checking, deadlock detection, DMA checks, IRP logging, etc. The verifier works by forcing drivers to work with minimal resources, making potential errors that might happen only rarely in a working system manifest immediately. Typically fatal system errors are generated by the stressed drivers in the test environment, producing core dumps that can be analysed and debugged immediately; without stressing, intermittent faults would occur in the field, without proper troubleshooting facilities or personnel.
GPU switching is a mechanism used on computers with multiple graphic controllers. This mechanism allows the user to either maximize the graphic performance or prolong battery life by switching between the graphic cards. It is mostly used on gaming laptops which usually have an integrated graphic device and a discrete video card.
In computing, rebooting is the process by which a running computer system is restarted, either intentionally or unintentionally. Reboots can be either a cold reboot, in which the power to the system is physically turned off and back on again, or a warm reboot, in which the system restarts while still powered up. The term restart is used to refer to a reboot when the operating system closes all programs and finalizes all pending input and output operations before initiating a soft reboot.