Long mode

Last updated

In the x86-64 computer architecture, long mode is the mode where a 64-bit operating system can access 64-bit instructions and registers. 64-bit programs are run in a sub-mode called 64-bit mode, while 32-bit programs and 16-bit protected mode programs are executed in a sub-mode called compatibility mode. Real mode or virtual 8086 mode programs cannot be natively run in long mode.

Contents

Overview

An x86-64 processor acts identically to an IA-32 processor when running in real mode or protected mode, which are supported modes when the processor is not in long mode.

A bit in the CPUID extended attributes field informs programs in real or protected modes if the processor can go to long mode, which allows a program to detect an x86-64 processor. This is similar to the CPUID attributes bit that Intel IA-64 processors use to allow programs to detect if they are running under IA-32 emulation.

With a computer running legacy BIOS, the BIOS and the boot loader run in real mode. After execution passes to an operating system kernel which supports x86-64, the kernel verifies CPU support for long mode and then executes the instructions to enter it. With a computer running UEFI, the UEFI firmware (except CSM and legacy Option ROM), any UEFI boot loader, and the operating system kernel all run in Long mode.

Memory limitations

While register sizes have increased to 64 bits from the previous x86 architecture, memory addressing has not yet been increased to the full 64 bits. For the time being, it is impractical to equip computers with sufficient memory to require a full 64 bits. As long as that remains the case, load/store unit(s), cache tags, MMUs and TLBs can be simplified without any loss of usable memory. Despite this limitation, software is programmed using full 64-bit pointers, and will therefore be able to use progressively larger address spaces as they become supported by future processors and operating systems.

Current limits

The first CPUs implementing the x86-64 architecture, namely the AMD Athlon 64 / Opteron (K8) CPUs, had 48-bit virtual [1] :129–130 and 40-bit physical addressing. [1] :4

The virtual address space of these processors is divided into two 47-bit regions, one starting at the lowest possible address, the other extending down from the largest. Attempting to use addresses falling outside this range will cause a general protection fault.

The limit of physical addressing constrains how much installed RAM is able to be accessed by the computer. On a ccNUMA multiprocessor system (Opteron) this includes the memory which is installed in the remote nodes, because the CPUs can directly address (and cache) all memory regardless if it is on the home node or remote. The 1  TB limit (40-bit) for physical memory for the K8 is huge by typical personal computer standards, but might have been a limitation for use in supercomputers. Consequently, the K10 (or "10h") microarchitecture implements 48-bit physical addresses and so can address up to 256  TB of RAM. [2]

When there is need, the microarchitecture can be expanded step by step without side-effects from software and simultaneously save cost with its implementation. For future expansion, the architecture supports expanding virtual address space to 64 bits, and physical memory addressing to 52 bits (limited by the page table entry format). [3] This would allow the processor to address 264 bytes (16 exabytes) of virtual address space and 252 bytes (4 petabytes) of physical address space.

See also

Related Research Articles

x86 Family of instruction set architectures

x86 is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introduced in 1978 as a fully 16-bit extension of Intel's 8-bit 8080 microprocessor, with memory segmentation as a solution for addressing more memory than can be covered by a plain 16-bit address. The term "x86" came into being because the names of several successors to Intel's 8086 processor end in "86", including the 80186, 80286, 80386 and 80486 processors. Colloquially, their names were "186", "286", "386" and "486".

In computer architecture, 64-bit integers, memory addresses, or other data units are those that are 64 bits wide. Also, 64-bit central processing units (CPU) and arithmetic logic units (ALU) are those that are based on processor registers, address buses, or data buses of that size. A computer that uses such a processor is a 64-bit computer.

<span class="mw-page-title-main">Athlon 64</span> Series of CPUs by AMD

The Athlon 64 is a ninth-generation, AMD64-architecture microprocessor produced by Advanced Micro Devices (AMD), released on September 23, 2003. It is the third processor to bear the name Athlon, and the immediate successor to the Athlon XP. The Athlon 64 was the second processor to implement the AMD64 architecture and the first 64-bit processor targeted at the average consumer. Variants of the Athlon 64 have been produced for Socket 754, Socket 939, Socket 940, and Socket AM2. It was AMD's primary consumer CPU, and primarily competed with Intel's Pentium 4, especially the Prescott and Cedar Mill core revisions.

x86-64 64-bit version of x86 architecture

x86-64 is a 64-bit version of the x86 instruction set, first announced in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging mode.

In computing, Physical Address Extension (PAE), sometimes referred to as Page Address Extension, is a memory management feature for the x86 architecture. PAE was first introduced by Intel in the Pentium Pro, and later by AMD in the Athlon processor. It defines a page table hierarchy of three levels (instead of two), with table entries of 64 bits each instead of 32, allowing these CPUs to directly access a physical address space larger than 4 gigabytes (232 bytes).

The x86 instruction set refers to the set of instructions that x86-compatible microprocessors support. The instructions are usually part of an executable program, often stored as a computer file and executed on the processor.

In the 80386 microprocessor and later, virtual 8086 mode allows the execution of real mode applications that are incapable of running directly in protected mode while the processor is running a protected mode operating system. It is a hardware virtualization technique that allowed multiple 8086 processors to be emulated by the 386 chip. It emerged from the painful experiences with the 80286 protected mode, which by itself was not suitable to run concurrent real-mode applications well. John Crawford developed the Virtual Mode bit at the register set, paving the way to this environment.

The NX bit (no-execute) is a technology used in CPUs to segregate areas of a virtual address space to store either data or processor instructions. An operating system with support for the NX bit may mark certain areas of an address space as non-executable. The processor will then refuse to execute any code residing in these areas of the address space. The general technique, known as executable space protection, also called Write XOR Execute, is used to prevent certain types of malicious software from taking over computers by inserting their code into another program's data storage area and running their own code from within this section; one class of such attacks is known as the buffer overflow attack.

x86 virtualization is the use of hardware-assisted virtualization capabilities on an x86/x86-64 CPU.

The AMD K8 Hammer, also code-named SledgeHammer, is a computer processor microarchitecture designed by AMD as the successor to the AMD K7 Athlon microarchitecture. The K8 was the first implementation of the AMD64 64-bit extension to the x86 instruction set architecture.

The AMD Family 10h, or K10, is a microprocessor microarchitecture by AMD based on the K8 microarchitecture. The first third-generation Opteron products for servers were launched on September 10, 2007, with the Phenom processors for desktops following and launching on November 11, 2007 as the immediate successors to the K8 series of processors.

In computer security, executable-space protection marks memory regions as non-executable, such that an attempt to execute machine code in these regions will cause an exception. It makes use of hardware features such as the NX bit, or in some cases software emulation of those features. However, technologies that emulate or supply an NX bit will usually impose a measurable overhead while using a hardware-supplied NX bit imposes no measurable overhead.

In the x86 architecture, the CPUID instruction is a processor supplementary instruction allowing software to discover details of the processor. It was introduced by Intel in 1993 with the launch of the Pentium and SL-enhanced 486 processors.

The Time Stamp Counter (TSC) is a 64-bit register present on all x86 processors since the Pentium. It counts the number of CPU cycles since its reset. The instruction RDTSC returns the TSC in EDX:EAX. In x86-64 mode, RDTSC also clears the upper 32 bits of RAX and RDX. Its opcode is 0F 31. Pentium competitors such as the Cyrix 6x86 did not always have a TSC and may consider RDTSC an illegal instruction. Cyrix included a Time Stamp Counter in their MII.

Memory type range registers (MTRRs) are a set of processor supplementary capability control registers that provide system software with control of how accesses to memory ranges by the CPU are cached. It uses a set of programmable model-specific registers (MSRs) which are special registers provided by most modern CPUs. Possible access modes to memory ranges can be uncached, write-through, write-combining, write-protect, and write-back. In write-back mode, writes are written to the CPU's cache and the cache is marked dirty, so that its contents are written to memory later.

In computing, PSE-36 refers to a feature of x86 processors that extends the physical memory addressing capabilities from 32 bits to 36 bits, allowing addressing to up to 64 GB of memory. Compared to the Physical Address Extension (PAE) method, PSE-36 is a simpler alternative to addressing more than 4 GB of memory. It uses the Page Size Extension (PSE) mode and a modified page directory table to map 4 MB pages into a 64 GB physical address space. PSE-36's downside is that, unlike PAE, it doesn't have 4-KB page granularity above the 4 GB mark.

The maximum random access memory (RAM) installed in any computer system is limited by hardware, software and economic factors. The hardware may have a limited number of address bus bits, limited by the processor package or design of the system. Some of the address space may be shared between RAM, peripherals, and read-only memory. In the case of a microcontroller with no external RAM, the size of the RAM array is limited by the size of the integrated circuit die. In a packaged system, only enough RAM may be provided for the system's required functions, with no provision for addition of memory after manufacture.

Second Level Address Translation (SLAT), also known as nested paging, is a hardware-assisted virtualization technology which makes it possible to avoid the overhead associated with software-managed shadow page tables.

<span class="mw-page-title-main">Intel 5-level paging</span> Processor extension for the x86-64 line of processors

Intel 5-level paging, referred to simply as 5-level paging in Intel documents, is a processor extension for the x86-64 line of processors. It extends the size of virtual addresses from 48 bits to 57 bits, increasing the addressable virtual memory from 256 TB to 128 PB. The extension was first implemented in the Ice Lake processors, and the 4.14 Linux kernel adds support for it. Windows 10 and 11 with server versions also support this extension in their latest updates, where it is provided by a separate kernel of the system called ntkrla57.exe.

References

  1. 1 2 "AMD64 Architecture Programmer's Manual Volume 2: System Programming" (PDF). 2016. Retrieved 2015-04-09.
  2. "BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h Processors" (PDF). p. 30. Retrieved 2015-04-09. Physical address space increased to 48 bits.
  3. AMD 2016, p. 24: "The AMD64 architecture enhances this support to allow translation of 64-bit virtual addresses into 52-bit physical addresses, although processor implementations can support smaller virtual-address and physical-address spaces."