Pentium F00F bug

The Pentium F00F bug is a design flaw in the majority of Intel Pentium, Pentium MMX, and Pentium OverDrive processors (all in the P5 microarchitecture). Discovered in 1997, it can result in the processor ceasing to function until the computer is physically rebooted. The bug has been circumvented through operating system updates.

The name is shorthand for F0 0F C7 C8, the hexadecimal encoding of one offending instruction. [1] More formally, the bug is called the invalid operand with locked CMPXCHG8B instruction bug. [2]

Description

In the x86 architecture, the byte sequence F0 0F C7 C8 represents the instruction lock cmpxchg8b eax (locked compare and exchange of 8 bytes in register EAX). The bug also applies to opcodes ending in C9 through CF, which specify register operands other than EAX. The F0 0F C7 C8 instruction does not require any special privileges.

This instruction encoding is invalid. The cmpxchg8b instruction compares the value in the EDX and EAX registers with an 8-byte value in a memory location. In this case, however, a register is specified instead of a memory location, which is not allowed.
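The encoding can be checked mechanically. The following is a minimal sketch, assuming only the exact four-byte form given above (LOCK prefix, opcode 0F C7, then a ModRM byte); the function name `is_f00f_sequence` is hypothetical, and the check deliberately ignores any other prefix arrangement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the bytes at p are F0 0F C7 followed by a ModRM byte in
 * the range C8..CF, i.e. LOCK CMPXCHG8B with a register destination.
 * Hypothetical helper for illustration only. */
bool is_f00f_sequence(const uint8_t *p, size_t len)
{
    assert(p != NULL);
    if (len < 4)
        return false;
    if (p[0] != 0xF0)                  /* LOCK prefix */
        return false;
    if (p[1] != 0x0F || p[2] != 0xC7)  /* two-byte opcode 0F C7 */
        return false;

    uint8_t modrm = p[3];
    uint8_t mod = modrm >> 6;          /* 3 means a register operand */
    uint8_t reg = (modrm >> 3) & 7;    /* 1 selects CMPXCHG8B under 0F C7 */

    /* The low three bits (rm) pick the register: 0 = EAX ... 7 = EDI. */
    return mod == 3 && reg == 1;
}
```

A ModRM byte such as 08 (mod = 0) names a memory operand and is therefore a valid form, which is why only the register forms C8 through CF are flagged.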

Under normal circumstances, this would simply result in an exception; however, when used with the lock prefix (normally used to prevent two processors from interfering with the same memory location), the CPU erroneously uses locked bus cycles to read the illegal instruction exception-handler descriptor. Locked reads must be paired with locked writes, and the CPU's bus interface enforces this by forbidding other memory accesses until the corresponding writes occur. As none are forthcoming, after performing these bus cycles all CPU activity stops, and the CPU must be reset to recover.

Because Intel microprocessors were so widespread, the fact that this instruction could be executed without any privileges was considered a serious issue at the time. Operating system vendors responded by implementing workarounds that detected the condition and prevented the crash. [3] Information about the bug first appeared on the Internet on or around 8 November 1997. [4] Since the F00F bug became common knowledge, the term has sometimes been used to describe similar hardware design flaws such as the Cyrix coma bug.

No permanent hardware damage results from executing the F00F instruction on a vulnerable system; it simply locks up until rebooted. However, unsaved data is likely to be lost if the disk buffers have not been flushed, if drives were interrupted during a write operation, or if some other non-atomic operation was interrupted.

The myB2 stepping solved this issue for Intel's Pentium processors. [2]

The F00F instruction can be considered an example of a Halt and Catch Fire (HCF) instruction.

Workarounds

Although a definitive solution to this problem required some sort of hardware/firmware revision, workarounds proposed at the time [1] prevented the bug from being exploited to mount a denial-of-service attack on an affected machine. All of them worked by forcibly breaking up the pattern of faulty bus accesses responsible for the processor hang. Intel's proposed (and therefore "official") solutions required setting up the interrupt descriptor table in an unusual way that forced the processor to take an intervening page fault before it could access the memory containing the descriptor for the undefined-opcode exception. These extra memory accesses turned out to be sufficient for the bus interface to release the locking requirement that was the root cause of the bug.

Specifically, the interrupt descriptor table, which normally resides on a single memory page, is instead split over two pages such that the descriptors for the first seven exception handlers reside at the end of one page and the remainder of the table lies on the following page. The descriptor for the undefined-opcode exception is then the last descriptor on the first page, while the descriptor for the page-fault exception resides on the second page. The first page can now be made not-present (usually signifying a page that has been swapped out to disk to make room for some other data), so that any attempt to read the undefined-opcode descriptor forces the processor to fetch the descriptor for the page-fault exception handler instead. This descriptor, residing on the second page of the table, is present in memory as usual (if it were not, the processor would double- and then triple-fault, leading to a shutdown). These extra memory cycles override the memory locking requirement issued by the original illegal instruction (since faulting instructions are supposed to be restartable after the exception handler returns). The page-fault handler has to be modified, however, to cope with having to provide the missing page for the first part of the interrupt descriptor table, a task it is not usually required to perform.
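The placement described above comes down to a small address calculation. The sketch below assumes 4 KiB pages and 8-byte protected-mode descriptors, with the undefined-opcode exception at vector 6 and the page fault at vector 14; the helper names are hypothetical and the code only computes addresses, it does not touch any real descriptor table.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE       4096u
#define IDT_ENTRY_SIZE  8u     /* size of one protected-mode gate descriptor */
#define VEC_INVALID_OP  6u     /* undefined-opcode exception */
#define VEC_PAGE_FAULT  14u    /* page-fault exception */

/* Given the address of the second of two consecutive pages, return an IDT
 * base such that vectors 0..6 occupy the last 56 bytes of the first page
 * and vector 7 starts exactly on the page boundary. */
uintptr_t idt_base_for_split(uintptr_t second_page)
{
    assert(second_page % PAGE_SIZE == 0);
    return second_page - (VEC_INVALID_OP + 1) * IDT_ENTRY_SIZE;
}

/* Address of the descriptor for a given vector. */
uintptr_t idt_entry_addr(uintptr_t idt_base, unsigned vec)
{
    return idt_base + (uintptr_t)vec * IDT_ENTRY_SIZE;
}

/* Start address of the page containing addr. */
uintptr_t page_of(uintptr_t addr)
{
    return addr & ~(uintptr_t)(PAGE_SIZE - 1);
}
```

With the second page at 0x2000, for example, the base becomes 0x1FC8: the vector 6 descriptor occupies 0x1FF8 to 0x1FFF on the first page, and the vector 14 descriptor sits at 0x2038 on the second.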

The second official workaround from Intel proposed keeping all pages present in memory, but marking the first page read-only. Since the originating illegal instruction was supposed to issue a memory write cycle, this is again enough to force the intervention of the page-fault handler. This variant has the advantage that the modifications required to the page-fault handler are very minor compared to those required for the first variant; it basically just needs to redirect to the undefined-opcode exception handler when appropriate. However, this variant requires that the operating system itself be prevented from writing to read-only pages (by setting a global processor flag, the WP bit in control register CR0), and not all kernels are designed this way; more recent kernels generally are, since this is the same basic mechanism used for implementing copy-on-write.
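The redirect that the modified page-fault handler performs can be sketched as a pure decision function. This is an illustration under stated assumptions rather than any particular kernel's code: the faulting linear address is assumed to have been read from CR2 by the caller, descriptors are assumed to be 8 bytes, and the names `classify_fault` and `fault_action` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define IDT_ENTRY_SIZE  8u   /* size of one protected-mode gate descriptor */
#define VEC_INVALID_OP  6u   /* undefined-opcode exception */

enum fault_action {
    HANDLE_AS_PAGE_FAULT,       /* ordinary page-fault processing */
    HANDLE_AS_INVALID_OPCODE    /* redirect to the undefined-opcode handler */
};

/* Decide how to treat a page fault, given the faulting address and the
 * base address of the interrupt descriptor table. A fault on the bytes
 * of the vector 6 descriptor is taken to be the F00F case. */
enum fault_action classify_fault(uintptr_t fault_addr, uintptr_t idt_base)
{
    if (fault_addr >= idt_base) {
        uintptr_t vec = (fault_addr - idt_base) / IDT_ENTRY_SIZE;
        if (vec == VEC_INVALID_OP)
            return HANDLE_AS_INVALID_OPCODE;
    }
    return HANDLE_AS_PAGE_FAULT;
}

/* Small usage example with an IDT base of 0x1FC8. */
void classify_fault_demo(void)
{
    assert(classify_fault(0x1FF8, 0x1FC8) == HANDLE_AS_INVALID_OPCODE);
    assert(classify_fault(0x1FF0, 0x1FC8) == HANDLE_AS_PAGE_FAULT);
}
```

A real handler would also have to deal with faults on the other descriptors sharing the protected page; the sketch covers only the vector 6 case discussed in the text.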

Workarounds other than the official ones from Intel have also been proposed; in many cases these proved as effective and much easier to implement. [1] The simplest one involved merely marking the page containing the interrupt descriptors as non-cacheable. Again, the extra memory cycles that the processor was forced to go through to fetch data from RAM every time it needed to invoke an exception handler appeared to be all that was needed to prevent the processor from locking up. In this case, no modification whatsoever to any exception handler was required. Although not strictly necessary, the same split of the interrupt descriptor table was performed in this case too, with only the first page marked non-cacheable. This was done for performance reasons, as the page containing most of the descriptors (and, in fact, the ones needed more often) could stay in cache.
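Seen at the level of the page table, the three variants differ mainly in which attribute of the entry mapping the first IDT page is changed. The following is a minimal sketch, assuming the standard 32-bit x86 page-table entry layout (bit 0 present, bit 1 writable, bit 4 cache disable); the helper names are hypothetical, and a real kernel would also need to flush the stale TLB entry after changing the mapping.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT   (1u << 0)   /* P:   page is present in memory */
#define PTE_WRITABLE  (1u << 1)   /* R/W: writes are permitted */
#define PTE_PCD       (1u << 4)   /* PCD: page-level cache disable */

/* First official variant: mark the first IDT page not-present. */
uint32_t pte_mark_not_present(uint32_t pte)
{
    return pte & ~PTE_PRESENT;
}

/* Second official variant: keep the page present but read-only.
 * Supervisor writes only fault if CR0.WP (bit 16) is also set. */
uint32_t pte_mark_read_only(uint32_t pte)
{
    return pte & ~PTE_WRITABLE;
}

/* Unofficial variant: leave permissions alone, disable caching. */
uint32_t pte_mark_uncacheable(uint32_t pte)
{
    return pte | PTE_PCD;
}

/* Small usage example on an entry mapping frame 0x1000, present and writable. */
void pte_demo(void)
{
    uint32_t pte = 0x00001000u | PTE_PRESENT | PTE_WRITABLE;
    assert(pte_mark_not_present(pte) == 0x00001002u);
    assert(pte_mark_read_only(pte)   == 0x00001001u);
    assert(pte_mark_uncacheable(pte) == 0x00001013u);
}
```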

For unknown reasons, Intel never endorsed these additional, unofficial workarounds. One possibility is that they were suspected not to work with all affected processor versions.

References

  1. Collins, Robert R. (1998-05-01). "The Pentium F00F Bug". Dr. Dobb's Journal. Retrieved 2015-07-27.
  2. "81. Invalid Operand with Locked CMPXCHG8B Instruction". Pentium Processor Specification Update, Version-041 [Release Date January 1999] (PDF). Santa Clara, California, USA: Intel Corporation. 1998. pp. 51f. Order Number 242480-041. Archived from the original (PDF) on 2016-03-04. Retrieved 2015-07-27. PROBLEM: The CMPXCHG8B instruction compares an 8-byte value in EDX and EAX with an 8-byte value in memory (the destination operand). The only valid destination operands for this instruction are memory operands. If the destination operand is a register, the processor should generate an invalid opcode exception, execution of the CMPXCHG8B instruction should be halted and the processor should execute the invalid opcode exception handler. This erratum occurs if the LOCK prefix is used with the CMPXCHG8B instruction with an (invalid) register destination operand. In this case, the processor may not start execution of the invalid opcode exception handler because the bus is locked. This results in a system hang. IMPLICATION: If an (invalid) register destination operand is used with the CMPXCHG8B instruction and the LOCK prefix, the system may hang. No memory data is corrupted and the user can perform a system reset to return to normal operation. Note that the specific invalid code sequence necessary for this erratum to occur is not normally generated in the course of programming nor is such a sequence known by Intel to be generated by commercially available software. This erratum only applies to Pentium processors, Pentium processors with MMX technology, Pentium OverDrive processors and Pentium OverDrive processors with MMX technology. Pentium Pro processors, Pentium II processors and i486™ and earlier processors are not affected […]
  3. "torvalds/linux". GitHub. Archived from the original on 2022-06-23. Retrieved 2021-07-09.
  4. Hovers, Onno; et al. (1997-11-08). "Nieuwe Intel Pentium Bug" [New Intel…] (newsgroup thread, 38 posts by 22 authors) (in Dutch). Newsgroup: nl.comp.hardware. Retrieved 2015-07-27. (Translated from Dutch:) In case you haven't heard about it yet, there is a new Intel Pentium BUG. It makes it possible to crash the Pentium completely from userspace with 1 instruction. The bug occurs on the Intel Pentium and the Intel Pentium MMX. The bug does not occur on the Intel Pentium Pro, the Intel Pentium II, or the chips from AMD, Cyrix and the like. This bug only matters to some people who run a multiuser (shell) system on an Intel Pentium. On such a system any user can crash the system…