Global Descriptor Table

The Global Descriptor Table (GDT) is a data structure used by Intel x86-family processors, starting with the 80286, to define the characteristics of the various memory areas used during program execution, including their base address, size, and access privileges such as executability and writability. These memory areas are called segments in Intel terminology.

Description

The GDT is a table of 8-byte entries. Each entry is a descriptor, which may describe an ordinary memory segment, a Task State Segment (TSS), a Local Descriptor Table (LDT), or a call gate. Call gates were designed for transferring control between x86 privilege levels, although most modern operating systems do not use this mechanism.

There is also a Local Descriptor Table (LDT). Multiple LDTs can be defined in the GDT, but only one is current at any given time, usually the one associated with the current task. While the LDT contains memory segments which are private to a specific process, the GDT contains global segments. The x86 processors have facilities for automatically switching the current LDT on specific machine events, but no facilities for automatically switching the GDT.

Every memory access performed by a process always goes through a segment. On the 80386 processor and later, because of 32-bit segment offsets and limits, it is possible to make segments cover the entire addressable memory, which makes segment-relative addressing transparent to the user.
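
Many 32-bit operating systems exploit this with a "flat" layout built from just two such segments. The raw descriptor constants below are a conventional illustration in C, not values mandated by the architecture:

    #include <stdint.h>

    /* Classic "flat" 32-bit descriptors: base 0, limit 0xFFFFF with 4 KiB
       granularity, so each segment spans the full 4 GiB address space. */
    static const uint64_t FLAT_CODE_DESC = 0x00CF9A000000FFFFull; /* ring-0 code, read/execute */
    static const uint64_t FLAT_DATA_DESC = 0x00CF92000000FFFFull; /* ring-0 data, read/write   */

With segments like these, a segment-relative offset is numerically equal to the linear address, which is why segmentation becomes effectively invisible to application code.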

In order to reference a segment, a program must use its index inside the GDT or the LDT. Such an index, together with a bit selecting between the two tables and a requested privilege level, forms a segment selector (often simply called a selector). The selector must be loaded into a segment register to be used. Apart from the machine instructions that set or get the position of the GDT (and of the Interrupt Descriptor Table, IDT) in memory, every machine instruction referencing memory has an implicit segment register, occasionally two. Most of the time this segment register can be overridden by adding a segment prefix before the instruction.
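
A rough sketch of how such a selector is composed (the helper name is ours, not part of any API):

    #include <stdint.h>

    /* Compose a 16-bit segment selector: bits 3..15 hold the descriptor index,
       bit 2 selects the table (0 = GDT, 1 = LDT), bits 0..1 hold the requested
       privilege level (RPL). */
    static uint16_t make_selector(uint16_t index, int use_ldt, int rpl)
    {
        return (uint16_t)((index << 3) | ((use_ldt ? 1 : 0) << 2) | (rpl & 3));
    }

For example, make_selector(1, 0, 0) yields 0x08, the selector for the second GDT entry at privilege level 0, a value commonly loaded into CS by simple protected-mode startup code.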

Loading a selector into a segment register causes the processor to read the corresponding GDT or LDT entry at that moment and cache the properties of the segment in a hidden part of the register. Subsequent modifications to the GDT or LDT do not take effect until the segment register is reloaded.

Figure: Format of a segment descriptor
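
As a sketch of this descriptor format (the helper and its field comments spell out the conventional bit positions; the names are ours, not from any particular system), the base, limit and attribute bits are scattered across the 8 bytes as follows:

    #include <stdint.h>

    /* Pack base, limit and attribute bits into a raw 8-byte descriptor. */
    static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                    uint8_t access, uint8_t flags)
    {
        uint64_t d = 0;
        d |= (uint64_t)(limit & 0xFFFFu);              /* limit bits 0..15             */
        d |= (uint64_t)(base & 0xFFFFFFu) << 16;       /* base bits 0..23              */
        d |= (uint64_t)access << 40;                   /* access byte: P, DPL, S, type */
        d |= (uint64_t)((limit >> 16) & 0xFu) << 48;   /* limit bits 16..19            */
        d |= (uint64_t)(flags & 0xFu) << 52;           /* flags: G, D/B, L, AVL        */
        d |= (uint64_t)((base >> 24) & 0xFFu) << 56;   /* base bits 24..31             */
        return d;
    }

For instance, make_descriptor(0, 0xFFFFF, 0x9A, 0xC) reproduces the flat code descriptor 0x00CF9A000000FFFF shown earlier.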

GDT in 64-bit

The GDT is still present in 64-bit mode; a GDT must be defined, but it is generally never changed or used for segmentation. The GDTR register that locates the table has been extended from 48 to 80 bits, and 64-bit descriptors are always "flat" (thus covering the range from 0x0000000000000000 to 0xFFFFFFFFFFFFFFFF). However, the bases of FS and GS are not constrained to 0, and they continue to be used as pointers to the offset of items such as the process environment block and the thread information block.
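
A minimal sketch of such a table and its pseudo-descriptor, assuming a conventional null/code/data layout (the constants are common choices, not architectural requirements):

    #include <stdint.h>

    /* Minimal long-mode GDT: a null entry, a 64-bit code segment and a data
       segment. Base and limit are ignored here; what matters are a few
       attribute bits, notably L = 1 in the code descriptor. */
    static const uint64_t gdt64[] = {
        0x0000000000000000ull,  /* null descriptor     */
        0x00AF9A000000FFFFull,  /* ring-0 code, L = 1  */
        0x00CF92000000FFFFull,  /* ring-0 data         */
    };

    /* Pseudo-descriptor loaded with LGDT: in 64-bit mode it is 10 bytes
       (2-byte limit plus 8-byte base), i.e. the 80 bits mentioned above. */
    struct gdtr {
        uint16_t limit;  /* size of the table in bytes, minus one */
        uint64_t base;   /* linear address of the table           */
    } __attribute__((packed));

The FS and GS bases themselves are normally set through the IA32_FS_BASE and IA32_GS_BASE model-specific registers (or the WRFSBASE/WRGSBASE instructions where available) rather than through GDT entries.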

If the System bit (bit 4 of the access byte) is clear, the descriptor is 16 bytes long instead of 8. This is because, even though the base and limit of code and data segments are ignored in this mode, those of a TSS are not; the TSS base can be a full 64-bit pointer, so the descriptor needs the extra space to hold its upper 32 bits.
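
An illustrative C layout of such a 16-byte descriptor (field names are ours) shows where the extra base bits go:

    #include <stdint.h>

    /* 16-byte system descriptor in long mode, e.g. a 64-bit TSS descriptor:
       the low 8 bytes keep the legacy layout, and the upper 8 bytes hold
       bits 32..63 of the base address plus a reserved field. */
    struct tss_descriptor64 {
        uint16_t limit_low;    /* limit bits 0..15                          */
        uint16_t base_low;     /* base bits 0..15                           */
        uint8_t  base_mid;     /* base bits 16..23                          */
        uint8_t  access;       /* type (0x9 = available 64-bit TSS), DPL, P */
        uint8_t  limit_flags;  /* limit bits 16..19 and flags               */
        uint8_t  base_high;    /* base bits 24..31                          */
        uint32_t base_upper;   /* base bits 32..63                          */
        uint32_t reserved;     /* must be zero                              */
    } __attribute__((packed));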

64-bit versions of Windows forbid hooking of the GDT; attempting to do so will cause the machine to bug check. [1]

Local Descriptor Table

A Local Descriptor Table (LDT) is a memory table used in the x86 architecture in protected mode, containing memory segment descriptors just like the GDT: starting address in linear memory, size, executability, writability, access privilege, actual presence in memory, and so on.

LDTs are the siblings of the Global Descriptor Table (GDT), and each defines up to 8192 memory segments accessible to programs. Unlike the GDT, the zeroth entry of an LDT is a valid entry and can be used like any other. Also unlike the GDT, the LDT cannot be used to store certain system entries, namely TSSs or LDTs; call gates and task gates are fine, however.

History

On x86 processors without paging features, like the Intel 80286, the LDT is essential to implementing separate address spaces for multiple processes. There is generally one LDT per user process, describing privately held memory, while shared memory and kernel memory are described by the GDT. The operating system switches the current LDT when scheduling a new process, either with the LLDT machine instruction or implicitly when using a TSS. In contrast, the GDT is generally not switched (although this may happen if virtual machine monitors like VMware are running on the computer).
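
As a sketch of the switching mechanism (GCC-style inline assembly; the wrapper name is ours), loading a new LDT amounts to executing LLDT with a selector that refers to an LDT descriptor in the GDT:

    #include <stdint.h>

    /* Make the LDT referenced by 'selector' current. The selector must index
       an LDT descriptor in the GDT; LLDT is privileged, so this can only run
       in kernel code. */
    static inline void load_ldt(uint16_t selector)
    {
        __asm__ volatile("lldt %0" : : "r"(selector));
    }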

The lack of symmetry between the two tables is underlined by the fact that the current LDT can be automatically switched on certain events, notably if TSS-based multitasking is used, while this is not possible for the GDT. The LDT also cannot store certain privileged types of memory segments (e.g. TSSs). Finally, the LDT is actually defined by a descriptor inside the GDT, while the GDT is directly defined by a linear address.

Creating shared memory through the GDT has some drawbacks; notably, such memory is visible to every process, with equal rights. In order to restrict visibility and to differentiate the protection of shared memory, for example to allow only read-only access for some processes, one can use separate LDT entries, pointing at the same physical memory areas and created only in the LDTs of processes which have requested access to a given shared memory area.
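
On Linux, for example, a process can install such an entry through the modify_ldt system call; the sketch below creates a read-only data entry, with a base address, size and slot number that are purely hypothetical:

    #define _GNU_SOURCE
    #include <asm/ldt.h>       /* struct user_desc */
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Install a read-only 32-bit data entry in this process's LDT. */
    static int install_readonly_alias(void)
    {
        struct user_desc desc;
        memset(&desc, 0, sizeof(desc));
        desc.entry_number   = 0;           /* hypothetical LDT slot              */
        desc.base_addr      = 0x40000000;  /* hypothetical shared mapping        */
        desc.limit          = 0xFFF;       /* 4096 pages, see limit_in_pages     */
        desc.seg_32bit      = 1;
        desc.contents       = 0;           /* ordinary expand-up data segment    */
        desc.read_exec_only = 1;           /* no write access through this alias */
        desc.limit_in_pages = 1;
        desc.useable        = 1;
        /* func = 0x11: write an LDT entry with the modern semantics. */
        return (int)syscall(SYS_modify_ldt, 0x11, &desc, sizeof(desc));
    }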

LDT (and GDT) entries which point to identical memory areas are called aliases. Aliases are also typically created in order to get write access to code segments: an executable selector cannot be used for writing. (Protected mode programs constructed in the so-called tiny memory model, where everything is located in the same memory segment, must use separate selectors for code and data/stack, making both selectors technically "aliases" as well.) In the case of the GDT, aliases are also created in order to get access to system segments like the TSSes.

Segments have a "Present" flag in their descriptors, allowing them to be removed from memory if the need arises. For example, code segments or unmodified data segments can be thrown away, and modified data segments can be swapped out to disk. However, because entire segments need to be operated on as a unit, it is necessary to limit their size in order to ensure that swapping can happen in a timely fashion. However, using smaller, more easily swappable segments means that segment registers must be reloaded more frequently which is itself a time-consuming operation.

Modern usage

The Intel 80386 microprocessor introduced paging, which allocates separate physical memory pages (themselves very small units of memory) at the same virtual addresses, with the advantage that disk paging is far faster and more efficient than segment swapping. Therefore, modern 32-bit x86 operating systems use the LDT very little, primarily to run legacy 16-bit code.

Should 16-bit code need to run in a 32-bit environment while sharing memory (this happens e.g. when running OS/2 1.x programs on OS/2 2.0 and later), the LDT must be written in such a way that every flat (paged) address also has a selector in the LDT (typically this results in the LDT being filled with entries describing 64 KiB segments). This technique is sometimes called LDT tiling. The limited size of the LDT means the virtual flat address space has to be limited to 512 MiB (8191 × 64 KiB); this is what happens on OS/2, although this limitation was fixed in version 4.5. It is also necessary to make sure that objects allocated in the 32-bit environment do not cross 64 KiB boundaries; this generates some address space waste.
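
Under this convention, converting a flat address into a 16:16 far pointer is pure bit manipulation; the sketch below assumes the usual tiling layout in which LDT entry n describes the n-th 64 KiB block:

    #include <stdint.h>

    /* Split a flat 32-bit address into a selector:offset pair under LDT tiling.
       The selector has TI = 1 (LDT) and RPL = 3, hence the low three bits 111. */
    static void flat_to_far16(uint32_t flat, uint16_t *selector, uint16_t *offset)
    {
        *selector = (uint16_t)(((flat >> 16) << 3) | 0x7);  /* same as (flat >> 13) | 7 */
        *offset   = (uint16_t)(flat & 0xFFFF);
    }

The 8191 usable entries, each covering 64 KiB, are where the 512 MiB limit mentioned above comes from.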

If 32-bit code does not have to pass arbitrary memory objects to 16-bit code, e.g. presumably in the OS/2 1.x emulation present in Windows NT or in the Windows 3.1 emulation layer, it is not necessary to artificially limit the size of the 32-bit address space.

Related Research Articles

x86: family of instruction set architectures

x86 is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introduced in 1978 as a fully 16-bit extension of Intel's 8-bit 8080 microprocessor, with memory segmentation as a solution for addressing more memory than can be covered by a plain 16-bit address. The term "x86" came into being because the names of several successors to Intel's 8086 processor end in "86", including the 80186, 80286, 80386 and 80486 processors. Colloquially, their names were "186", "286", "386" and "486".

<span class="mw-page-title-main">Endianness</span> Order of bytes in a computer word

In computing, endianness is the order in which the bytes of a word of digital data are transmitted over a data communication medium or stored in computer memory. Endianness is primarily expressed as big-endian (BE) or little-endian (LE), terms introduced into computer science by Danny Cohen for data ordering in an Internet Experiment Note published in 1980. The adjective endian has its origin in the writings of 18th-century Anglo-Irish writer Jonathan Swift. In the 1726 novel Gulliver's Travels, he portrays the conflict between sects of Lilliputians divided into those breaking the shell of a boiled egg from the big end and those breaking it from the little end. By analogy, a CPU may read a digital word big end first, or little end first.

x86 memory segmentation refers to the implementation of memory segmentation in the Intel x86 computer instruction set architecture. Segmentation was introduced on the Intel 8086 in 1978 as a way to allow programs to address more than 64 KB (65,536 bytes) of memory. The Intel 80286 introduced a second version of segmentation in 1982 that added support for virtual memory and memory protection. At this point the original mode was renamed to real mode, and the new version was named protected mode. The x86-64 architecture, introduced in 2003, has largely dropped support for segmentation in 64-bit mode.

In computing, protected mode, also called protected virtual address mode, is an operational mode of x86-compatible central processing units (CPUs). It allows system software to use features such as segmentation, virtual memory, paging and safe multi-tasking designed to increase an operating system's control over application software.

<span class="mw-page-title-main">Memory management unit</span> Hardware translating virtual addresses to physical address

A memory management unit (MMU), sometimes called paged memory management unit (PMMU), is a computer hardware unit that examines all memory references on the memory bus, translating these requests, known as virtual memory addresses, into physical addresses in main memory.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

x86-64: 64-bit version of the x86 architecture

x86-64 is a 64-bit version of the x86 instruction set, first announced in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging mode.

In computing, Physical Address Extension (PAE), sometimes referred to as Page Address Extension, is a memory management feature for the x86 architecture. PAE was first introduced by Intel in the Pentium Pro, and later by AMD in the Athlon processor. It defines a page table hierarchy of three levels (instead of two), with table entries of 64 bits each instead of 32, allowing these CPUs to directly access a physical address space larger than 4 gigabytes (2^32 bytes).

<span class="mw-page-title-main">Extended memory</span>

In DOS memory management, extended memory refers to memory above the first megabyte (2^20 bytes) of address space in an IBM PC or compatible with an 80286 or later processor. The term is mainly used under the DOS and Windows operating systems. DOS programs, running in real mode or virtual x86 mode, cannot directly access this memory, but are able to do so through an application programming interface called the Extended Memory Specification (XMS). This API is implemented by a driver (such as HIMEM.SYS) or the operating system, which takes care of memory management and copying memory between conventional and extended memory, by temporarily switching the processor into protected mode. In this context, the term "extended memory" may refer to either the whole of the extended memory or only the portion available through this API.

Memory protection is a way to control memory access rights on a computer, and is a part of most modern instruction set architectures and operating systems. The main purpose of memory protection is to prevent a process from accessing memory that has not been allocated to it. This prevents a bug or malware within a process from affecting other processes, or the operating system itself. Protection may encompass all accesses to a specified area of memory, write accesses, or attempts to execute the contents of the area. An attempt to access unauthorized memory results in a hardware fault, e.g., a segmentation fault, storage violation exception, generally causing abnormal termination of the offending process. Memory protection for computer security includes additional techniques such as address space layout randomization and executable space protection.

<span class="mw-page-title-main">General protection fault</span>

A general protection fault (GPF) in the x86 instruction set architectures (ISAs) is a fault initiated by ISA-defined protection mechanisms in response to an access violation caused by some running code, either in the kernel or a user program. The mechanism is first described in Intel manuals and datasheets for the Intel 80286 CPU, which was introduced in 1983; it is also described in section 9.8.13 in the Intel 80386 programmer's reference manual from 1986. A general protection fault is implemented as an interrupt. Some operating systems may also classify some exceptions not related to access violations, such as illegal opcode exceptions, as general protection faults, even though they have nothing to do with memory protection. If a CPU detects a protection violation, it stops executing the code and sends a GPF interrupt. In most cases, the operating system removes the failing process from the execution queue, signals the user, and continues executing other processes. If, however, the operating system fails to catch the general protection fault, i.e. another protection violation occurs before the operating system returns from the previous GPF interrupt, the CPU signals a double fault, stopping the operating system. If yet another failure occurs, the CPU is unable to recover; since 80286, the CPU enters a special halt state called "Shutdown", which can only be exited through a hardware reset. The IBM PC AT, the first PC-compatible system to contain an 80286, has hardware that detects the Shutdown state and automatically resets the CPU when it occurs. All descendants of the PC AT do the same, so in a PC, a triple fault causes an immediate system reset.

The x86 instruction set refers to the set of instructions that x86-compatible microprocessors support. The instructions are usually part of an executable program, often stored as a computer file and executed on the processor.

The NX bit (no-execute) is a technology used in CPUs to segregate areas of a virtual address space to store either data or processor instructions. An operating system with support for the NX bit may mark certain areas of an address space as non-executable. The processor will then refuse to execute any code residing in these areas of the address space. The general technique, known as executable space protection, also called Write XOR Execute, is used to prevent certain types of malicious software from taking over computers by inserting their code into another program's data storage area and running their own code from within this section; one class of such attacks is known as the buffer overflow attack.

In x86 computing, unreal mode, also big real mode, flat real mode, or voodoo mode is a variant of real mode, in which one or more segment descriptors has been loaded with non-standard values, like 32-bit limits allowing access to the entire memory. Contrary to its name, it is not a separate addressing mode that the x86 processors can operate in. It is used in the 80286 and later x86 processors.

Memory segmentation is an operating system memory management technique of dividing a computer's primary memory into segments or sections. In a computer system using segmentation, a reference to a memory location includes a value that identifies a segment and an offset within that segment. Segments or sections are also used in object files of compiled programs when they are linked together into a program image and when the image is loaded into memory.

The interrupt descriptor table (IDT) is a data structure used by the x86 architecture to implement an interrupt vector table. The IDT is used by the processor to determine the correct response to interrupts and exceptions.

A call gate is a mechanism in Intel's x86 architecture for changing the privilege level of a process when it executes a predefined function call using a CALL FAR instruction.

The task state segment (TSS) is a structure on x86-based computers which holds information about a task. It is used by the operating system kernel for task management. Specifically, it holds the processor register state, I/O port permissions, inner-level stack pointers, and a link to the previous TSS.

In memory addressing for Intel x86 computer architectures, segment descriptors are a part of the segmentation unit, used for translating a logical address to a linear address. Segment descriptors describe the memory segment referred to in the logical address. The segment descriptor contains the following fields:

  1. A segment base address
  2. The segment limit which specifies the segment size
  3. Access rights byte containing the protection mechanism information
  4. Control bits

Intel 5-level paging, referred to simply as 5-level paging in Intel documents, is a processor extension for the x86-64 line of processors. It extends the size of virtual addresses from 48 bits to 57 bits by adding an additional level to x86-64's multilevel page tables, increasing the addressable virtual memory from 256 TB to 128 PB. The extension was first implemented in the Ice Lake processors.

References

  1. "Patching Policy for x64-Based Systems". 8 October 2009. If the operating system detects one of these modifications or any other unauthorized patch, it will generate a bug check and shut down the system.