Offset (computer science)

Last updated

In computer science, an offset within an array or other data structure object is an integer indicating the distance (displacement) between the beginning of the object and a given element or point, presumably within the same object. The concept of a distance is valid only if all elements of the object are of the same size (typically given in bytes or words).

For example, if A is an array of characters containing "abcdef", the fourth element containing the character 'd' has an offset of three from the start of A.

In assembly language

In computer engineering and low-level programming (such as assembly language), an offset usually denotes the number of address locations added to a base address in order to get to a specific absolute address. In this (original) meaning of offset, only the basic address unit, usually the 8-bit byte, is used to specify the offset's size. In this context an offset is sometimes called a relative address.

In IBM System/360 instructions, a 12-bit offset embedded within certain instructions provided a range of between 0 and 4096 bytes. For example, within an unconditional branch instruction (X'47F0Fxxx'), the xxx 12bit hexadecimal offset provided the byte offset from the base register (15) to branch to. An odd offset would cause a program check (unless the base register itself also contained an odd address)—since instructions had to be aligned on half-word boundaries to execute without a program or hardware interrupt.

The previous example describes an indirect way to address to a memory location in the format of segment:offset. For example, assume we want to refer to memory location 0xF867. One way this can be accomplished is by first defining a segment with beginning address 0xF000, and then defining an offset of 0x0867. Further, we are also allowed to shift the hexadecimal segment to reach the final absolute memory address. One thing to note here is that we can reach our final absolute address in many ways.

An offset is not always relative to the base address of the module, for example: If you have a class (GUI) and you want to retrieve the "color" attribute of this class, the offset may be 0x0100, but this offset has to be added to the offset of the class itself, not the base address. If the class's offset is 0xFF881 and the base address is 0x0A100, then to retrieve the "color" attribute both offsets are added to the base address. 0x0A100 (base) + 0xFF881 (class) + 0x0100 (attribute). Ultimately the attribute's address will be 0x109A81.

See also

Related Research Articles

x86 memory segmentation refers to the implementation of memory segmentation in the Intel x86 computer instruction set architecture. Segmentation was introduced on the Intel 8086 in 1978 as a way to allow programs to address more than 64 KB (65,536 bytes) of memory. The Intel 80286 introduced a second version of segmentation in 1982 that added support for virtual memory and memory protection. At this point the original mode was renamed to real mode, and the new version was named protected mode. The x86-64 architecture, introduced in 2003, has largely dropped support for segmentation in 64-bit mode.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

<span class="mw-page-title-main">Pointer (computer programming)</span> Object which stores memory addresses in a computer program

In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer references a location in memory, and obtaining the value stored at that location is known as dereferencing the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying computer architecture.

In computer science, a union is a value that may have any of several representations or formats within the same position in memory; that consists of a variable that may hold such a data structure. Some programming languages support special data types, called union types, to describe such values and variables. In other words, a union type definition will specify which of a number of permitted primitive types may be stored in its instances, e.g., "float or long integer". In contrast with a record, which could be defined to contain both a float and an integer; in a union, there is only one value at any given time.

<span class="mw-page-title-main">Memory address</span> Reference to a specific memory location

In computing, a memory address is a reference to a specific memory location used at various levels by software and hardware. Memory addresses are fixed-length sequences of digits conventionally displayed and manipulated as unsigned integers. Such numerical semantic bases itself upon features of CPU, as well upon use of the memory like an array endorsed by various programming languages.

In computing, position-independent code (PIC) or position-independent executable (PIE) is a body of machine code that, being placed somewhere in the primary memory, executes properly regardless of its absolute address. PIC is commonly used for shared libraries, so that the same library code can be loaded in a location in each program address space where it does not overlap with other memory in use. PIC was also used on older computer systems that lacked an MMU, so that the operating system could keep applications away from each other even within the single address space of an MMU-less system.

<span class="mw-page-title-main">COM file</span>

A COM file is a type of simple executable file. On the Digital Equipment Corporation (DEC) VAX operating systems of the 1970s, .COM was used as a filename extension for text files containing commands to be issued to the operating system. With the introduction of Digital Research's CP/M, the type of files commonly associated with COM extension changed to that of executable files. This convention was later carried over to DOS. Even when complemented by the more general EXE file format for executables, the compact COM files remained viable and frequently used under DOS.

Addressing modes are an aspect of the instruction set architecture in most central processing unit (CPU) designs. The various addressing modes that are defined in a given instruction set architecture define how the machine language instructions in that architecture identify the operand(s) of each instruction. An addressing mode specifies how to calculate the effective memory address of an operand by using information held in registers and/or constants contained within a machine instruction or elsewhere.

Loop unrolling, also known as loop unwinding, is a loop transformation technique that attempts to optimize a program's execution speed at the expense of its binary size, which is an approach known as space–time tradeoff. The transformation can be undertaken manually by the programmer or by an optimizing compiler. On modern processors, loop unrolling is often counterproductive, as the increased code size can cause more cache misses; cf. Duff's device.

The zero page or base page is the block of memory at the very beginning of a computer's address space; that is, the page whose starting address is zero. The size of a page depends on the context, and the significance of zero page memory versus higher addressed memory is highly dependent on machine architecture. For example, the Motorola 6800 and MOS Technology 6502 processor families treat the first 256 bytes of memory specially, whereas many other processors do not.

Memory segmentation is an operating system memory management technique of division of a computer's primary memory into segments or sections. In a computer system using segmentation, a reference to a memory location includes a value that identifies a segment and an offset within that segment. Segments or sections are also used in object files of compiled programs when they are linked together into a program image and when the image is loaded into memory.

<span class="mw-page-title-main">WDC 65C02</span> CMOS microprocessor in the 6502 family

The Western Design Center (WDC) 65C02 microprocessor is an enhanced CMOS version of the popular nMOS-based 8-bit MOS Technology 6502. The 65C02 fixed several problems in the original 6502 and added some new instructions, but its main feature was greatly lowered power usage, on the order of 10 to 20 times less than the original 6502 running at the same speed. The reduced power consumption made the 65C02 useful in portable computer roles and microcontroller systems in industrial settings. It has been used in some home computers, as well as in embedded applications, including medical-grade implanted devices.

Relocation is the process of assigning load addresses for position-dependent code and data of a program and adjusting the code and data to reflect the assigned addresses. Prior to the advent of multiprocess systems, and still in many embedded systems, the addresses for objects were absolute starting at a known location, often zero. Since multiprocessing systems dynamically link and switch between programs it became necessary to be able to relocate objects using position-independent code. A linker usually performs relocation in conjunction with symbol resolution, the process of searching files and libraries to replace symbolic references or names of libraries with actual usable addresses in memory before running a program.

Data structure alignment is the way data is arranged and accessed in computer memory. It consists of three separate but related issues: data alignment, data structure padding, and packing.

INT 13h is shorthand for BIOS interrupt call 13hex, the 20th interrupt vector in an x86-based computer system. The BIOS typically sets up a real mode interrupt handler at this vector that provides sector-based hard disk and floppy disk read and write services using cylinder-head-sector (CHS) addressing. Modern PC BIOSes also include INT 13h extension functions, originated by IBM and Microsoft in 1992, that provide those same disk access services using 64-bit LBA addressing; with minor additions, these were quasi-standardized by Phoenix Technologies and others as the EDD BIOS extensions.

Intel hexadecimal object file format, Intel hex format or Intellec Hex is a file format that conveys binary information in ASCII text form. It is commonly used for programming microcontrollers, EPROMs, and other types of programmable logic devices and hardware emulators. In a typical application, a compiler or assembler converts a program's source code to machine code and outputs it into a HEX file. Some also use it as a container format holding packets of stream data. Common file extensions used for the resulting files are .HEX or .H86. The HEX file is then read by a programmer to write the machine code into a PROM or is transferred to the target system for loading and execution.

In computing, Intel Memory Model refers to a set of six different memory models of the x86 CPU operating in real mode which control how the segment registers are used and the default size of pointers.

<span class="mw-page-title-main">Word addressing</span> Support by a hardware architecture of accessing memory only in units of words larger than a byte

In computer architecture, word addressing means that addresses of memory on a computer uniquely identify words of memory. It is usually used in contrast with byte addressing, where addresses uniquely identify bytes. Almost all modern computer architectures use byte addressing, and word addressing is largely only of historical interest. A computer that uses word addressing is sometimes called a word machine.

In IBM System/360 through present day z/Architecture, an address constant or "adcon" is an assembly language data type which contains the address of a location in computer memory. An address constant can be one, two, three or four bytes long, although an adcon of less than four bytes is conventionally used to hold an expression for a small integer such as a length, a relative address, or an index value, and does not represent an address at all. Address constants are defined using an assembler language "DC" statement.

The GOFF specification was developed for IBM's MVS operating system to supersede the IBM OS/360 Object File Format to compensate for weaknesses in the older format.