Intel Memory Model

Last updated April 29, 2021

In computing, Intel Memory Model refers to a set of six different memory models of the x86 CPU operating in real mode which control how the segment registers are used and the default size of pointers.

Memory segmentation

Four registers are used to refer to four segments on the 16-bit x86 segmented memory architecture. DS (data segment), CS (code segment), SS (stack segment), and ES (extra segment). Another 16-bit register can act as an offset into a given segment, and so a logical address on this platform is written segment:offset, typically in hexadecimal notation. In real mode, in order to calculate the physical address of a byte of memory, the hardware shifts the contents of the appropriate segment register 4 bits left (effectively multiplying by 16), and then adds the offset.

For example, the logical address 7522:F139 yields the 20-bit physical address:

	75220
+	F139
	84359

Note that this process leads to aliasing of memory, such that any given physical address has up to 4096 corresponding logical addresses. This complicates the comparison of pointers to different segments.

Pointer sizes

Pointer formats are known as near, far, or huge.

Near pointers are 16-bit offsets within the reference segment, i.e. DS for data and CS for code. They are the fastest pointers, but are limited to point to 64 KB of memory (to the associated segment of the data type). Near pointers can be held in registers (typically SI and DI).

movbx,word[reg]movax,word[bx]movdx,word[bx+2]

Far pointers are 32-bit pointers containing a segment and an offset. To use them the segment register ES is used by using the instruction les [reg]|[mem],dword [mem]|[reg].^[1] They may reference up to 1024 KiB of memory. Note that pointer arithmetic (addition and subtraction) does not modify the segment portion of the pointer, only its offset. Operations which exceed the bounds of zero or 65535 (0xFFFF) will undergo modulo 64K operation just as any normal 16-bit operation. For example, if the segment register is set to 0x5000 and the offset is being incremented, the moment this counter offset becomes (0x10000), the resulting absolute address will roll over to 0x5000:0000.

lesbx,dword[reg]movax,word[es:bx]movdx,word[es:bx+2]

Huge pointers are essentially far pointers, but are (mostly) normalized every time they are modified so that they have the highest possible segment for that address. This is very slow but allows the pointer to point to multiple segments, and allows for accurate pointer comparisons, as if the platform were a flat memory model: It forbids the aliasing of memory as described above, so two huge pointers that reference the same memory location are always equal.

lesbx,dword[reg]movax,word[es:bx]addbx,2testbx,0xfff0jzlblsubbx,0x10movdx,esincdxmoves,dxlbl:movdx,word[es:bx]

Memory models

The memory models are:

Model	Data	Code	Definition
Tiny*	near		CS=DS=SS
Small	near**	near	DS=SS
Medium	near**	far	DS=SS, multiple code segments
Compact	far	near	single code segment, multiple data segments
Large	far	far	multiple code and data segments
Huge	huge	far	multiple code and data segments; single array may be >64 KB

* In the Tiny model, all four segment registers point to the same segment.
** In all models with near data pointers, SS equals DS.
*** Stack is always limited to at most 64 KByte.

Other platforms

In protected mode a segment cannot be both writable and executable.^[2]^[3] Therefore, when implementing the Tiny memory model the code segment register must point to the same physical address and have the same limit as the data segment register. This defeated one of the features of the 80286, which makes sure data segments are never executable and code segments are never writable (which means that self-modifying code is never allowed). However, on the 80386, with its paged memory management unit it is possible to protect individual memory pages against writing.^[4]^[5]

Memory models are not limited to 16-bit programs. It is possible to use segmentation in 32-bit protected mode as well (resulting in 48-bit pointers) and there exist C language compilers which support that.^[6] However segmentation in 32-bit mode does not allow to access a larger address space than what a single segment would cover, unless some segments are not always present in memory and the linear address space is just used as a cache over a larger segmented virtual space.^{[ citation needed ]} It allows better protection for access to various objects (areas up to 1 MB long can benefit from a one-byte access protection granularity, versus the coarse 4 KiB granularity offered by sole paging), and is therefore only used in specialized applications, like telecommunications software.^{[ citation needed ]} Technically, the "flat" 32-bit address space is a "tiny" memory model for the segmented address space. Under both reigns all four segment registers contain one and the same value.

x86-64

On the x86-64 platform, a total of seven memory models exist,^[7] as the majority of symbol references are only 32 bits wide, and if the addresses are known at link time (as opposed to position-independent code). This does not affect the pointers used, which are always flat 64-bit pointers, but only how values that have to be accessed via symbols can be placed.

Bibliography

Turbo C++ Version 3.0 User's Guide. Borland International, Copyright 1992.

Related Research Articles

IA-32 is the 32-bit version of the x86 instruction set architecture, designed by Intel and first implemented in the 80386 microprocessor in 1985. IA-32 is the first incarnation of x86 that supports 32-bit computing; as a result, the "IA-32" term may be used as a metonym to refer to all x86 versions that support 32-bit computing.

The 8086 is a 16-bit microprocessor chip designed by Intel between early 1976 and June 8, 1978, when it was released. The Intel 8088, released July 1, 1979, is a slightly modified chip with an external 8-bit data bus, and is notable as the processor used in the original IBM PC design.

The Intel 80386, also known as i386 or just 386, is a 32-bit microprocessor introduced in 1985. The first versions had 275,000 transistors and were the CPU of many workstations and high-end personal computers of the time. As the original implementation of the 32-bit extension of the 80286 architecture, the 80386 instruction set, programming model, and binary encodings are still the common denominator for all 32-bit x86 processors, which is termed the i386-architecture, x86, or IA-32, depending on context.

x86 is a family of instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introduced in 1978 as a fully 16-bit extension of Intel's 8-bit 8080 microprocessor, with memory segmentation as a solution for addressing more memory than can be covered by a plain 16-bit address. The term "x86" came into being because the names of several successors to Intel's 8086 processor end in "86", including the 80186, 80286, 80386 and 80486 processors.

The Intel MCS-51 is a single chip microcontroller (MCU) series developed by Intel in 1980 for use in embedded systems. The architect of the Intel MCS-51 instruction set was John H. Wharton. Intel's original versions were popular in the 1980s and early 1990s and enhanced binary compatible derivatives remain popular today. It is an example of a complex instruction set computer, and has separate memory spaces for program instructions and data.

In computer architecture, 64-bit integers, memory addresses, or other data units are those that are 64 bits wide. Also, 64-bit central processing unit (CPU) and arithmetic logic unit (ALU) architectures are those that are based on processor registers, address buses, or data buses of that size. 64-bit microcomputers are computers in which 64-bit microprocessors are the norm. From the software perspective, 64-bit computing means the use of machine code with 64-bit virtual memory addresses. However, not all 64-bit instruction sets support full 64-bit virtual memory addresses; x86-64 and ARMv8, for example, support only 48 bits of virtual address, with the remaining 16 bits of the virtual address required to be all 0's or all 1's, and several 64-bit instruction sets support fewer than 64 bits of physical memory address.

x86 memory segmentation refers to the implementation of memory segmentation in the Intel x86 computer instruction set architecture. Segmentation was introduced on the Intel 8086 in 1978 as a way to allow programs to address more than 64 KB (65,536 bytes) of memory. The Intel 80286 introduced a second version of segmentation in 1982 that added support for virtual memory and memory protection. At this point the original model was renamed real mode, and the new version was named protected mode. The x86-64 architecture, introduced in 2003, has largely dropped support for segmentation in 64-bit mode.

In computing, protected mode, also called protected virtual address mode, is an operational mode of x86-compatible central processing units (CPUs). It allows system software to use features such as virtual memory, paging and safe multi-tasking designed to increase an operating system's control over application software.

x86 assembly language is a family of backward-compatible assembly languages, which provide some level of compatibility all the way back to the Intel 8008 introduced in April 1972. x86 assembly languages are used to produce object code for the x86 class of processors. Like all assembly languages, it uses short mnemonics to represent the fundamental instructions that the CPU in a computer can understand and follow. Compilers sometimes produce assembly code as an intermediate step when translating a high level program into machine code. Regarded as a programming language, assembly coding is machine-specific and low level. Assembly languages are more typically used for detailed and time critical applications such as small real-time embedded systems or operating system kernels and device drivers.

x86-64 is a 64-bit version of the x86 instruction set, first released in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging mode.

In computing, Physical Address Extension (PAE), sometimes referred to as Page Address Extension, is a memory management feature for the x86 architecture. PAE was first introduced by Intel in the Pentium Pro, and later by AMD in the Athlon processor. It defines a page table hierarchy of three levels (instead of two), with table entries of 64 bits each instead of 32, allowing these CPUs to directly access a physical address space larger than 4 gigabytes (2³² bytes).

Fetching the instruction opcodes from program memory well in advance is known as prefetching and it is served by using prefetch input queue (PIQ).The pre-fetched instructions are stored in data structure - namely a queue. The fetching of opcodes well in advance, prior to their need for execution increases the overall efficiency of the processor boosting its speed. The processor no longer has to wait for the memory access operations for the subsequent instruction opcode to complete. This architecture was prominently used in the Intel 8086 microprocessor.

In computing, a memory address is a reference to a specific memory location used at various levels by software and hardware. Memory addresses are fixed-length sequences of digits conventionally displayed and manipulated as unsigned integers. Such numerical semantic bases itself upon features of CPU, as well upon use of the memory like an array endorsed by various programming languages.

Addressing modes are an aspect of the instruction set architecture in most central processing unit (CPU) designs. The various addressing modes that are defined in a given instruction set architecture define how the machine language instructions in that architecture identify the operand(s) of each instruction. An addressing mode specifies how to calculate the effective memory address of an operand by using information held in registers and/or constants contained within a machine instruction or elsewhere.

Memory segmentation is an operating system memory management technique of division of a computer's primary memory into segments or sections. In a computer system using segmentation, a reference to a memory location includes a value that identifies a segment and an offset within that segment. Segments or sections are also used in object files of compiled programs when they are linked together into a program image and when the image is loaded into memory.

The Global Descriptor Table (GDT) is a data structure used by Intel x86-family processors starting with the 80286 in order to define the characteristics of the various memory areas used during program execution, including the base address, the size, and access privileges like executability and writability. These memory areas are called segments in Intel terminology.

A call gate is a mechanism in Intel's x86 architecture for changing the privilege level of a process when it executes a predefined function call using a CALL FAR instruction.

The task state segment (TSS) is a structure on x86-based computers which holds information about a task. It is used by the operating system kernel for task management. Specifically, the following information is stored in the TSS:

This article describes the calling conventions used when programming x86 architecture microprocessors.

A stack register is a computer central processor register whose purpose is to keep track of a call stack. On an accumulator-based architecture machine, this may be a dedicated register such as SP on an Intel x86 machine. On a general register machine, it may be a register which is reserved by convention, such as on the PDP-11 or RISC machines. Some designs such as the Data General Eclipse had no dedicated register, but used a reserved hardware memory address for this function.

References

↑ "Intel Instruction Set - LES". Intel Instruction Set pages. Istanbul Teknik Üniversitesi/Intel . Retrieved October 19, 2015.CS1 maint: discouraged parameter (link)
↑ "Intel 64 and IA-32 Architectures Developer's Manual: Vol. 3A". Intel. pp. 3–17. Retrieved September 13, 2011.CS1 maint: discouraged parameter (link)
↑ "AMD64 Architecture Programmer's Manual Volume 2: System Programming" (PDF). AMD. pp. 82–84. Retrieved September 13, 2011.CS1 maint: discouraged parameter (link)
↑ "Intel 64 and IA-32 Architectures Developer's Manual: Vol. 3A". Intel. pp. 4–41. Retrieved September 13, 2011.CS1 maint: discouraged parameter (link)
↑ "AMD64 Architecture Programmer's Manual Volume 2: System Programming" (PDF). AMD. p. 139. Retrieved September 13, 2011.CS1 maint: discouraged parameter (link)
↑ "Open Watcom C Language Reference version 2" (PDF). github.com/open-watcom. Open Watcom. Retrieved January 10, 2018.CS1 maint: discouraged parameter (link)
↑ "System V Application binary Interface, AMD64 Architecture Processor Supplement, Draft Version 0.99.7" (PDF). pp. 33–35.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[LES_instruction-1] "Intel Instruction Set - LES". Intel Instruction Set pages. Istanbul Teknik Üniversitesi/Intel . Retrieved October 19, 2015.CS1 maint: discouraged parameter (link)

[2] "Intel 64 and IA-32 Architectures Developer's Manual: Vol. 3A". Intel. pp. 3–17. Retrieved September 13, 2011.CS1 maint: discouraged parameter (link)

[3] "AMD64 Architecture Programmer's Manual Volume 2: System Programming" (PDF). AMD. pp. 82–84. Retrieved September 13, 2011.CS1 maint: discouraged parameter (link)

[4] "Intel 64 and IA-32 Architectures Developer's Manual: Vol. 3A". Intel. pp. 4–41. Retrieved September 13, 2011.CS1 maint: discouraged parameter (link)

[5] "AMD64 Architecture Programmer's Manual Volume 2: System Programming" (PDF). AMD. p. 139. Retrieved September 13, 2011.CS1 maint: discouraged parameter (link)

[48bits-6] "Open Watcom C Language Reference version 2" (PDF). github.com/open-watcom. Open Watcom. Retrieved January 10, 2018.CS1 maint: discouraged parameter (link)

[7] "System V Application binary Interface, AMD64 Architecture Processor Supplement, Draft Version 0.99.7" (PDF). pp. 33–35.

[1]

[2]

[3]

[4]

[5]

[6]

[7]