Bit field

Last updated July 30, 2024

A bit field is a data structure that maps to one or more adjacent bits which have been allocated for specific purposes, so that any single bit or group of bits within the structure can be set or inspected.^[1]^[2] A bit field is most commonly used to represent integral types of known, fixed bit-width, such as single-bit Booleans.

Within CPUs and other logic devices, collections of bit fields called flags are commonly used to control or to indicate the outcome of particular operations.^[4] Processors have a status register that is composed of flags. For example, if the result of an addition cannot be represented in the destination, an arithmetic overflow flag is set. The flags can be used to decide subsequent operations, such as conditional jump instructions. For example, a JE ... (Jump if Equal) instruction in the x86 assembly language will result in a jump if the Z (zero) flag was set by some previous operation.

A bit field is distinguished from a bit array in that the latter is used to store a large set of bits indexed by integers and is often wider than any integral type supported by the language.^{[ citation needed ]} Bit fields, on the other hand, typically fit within a machine word,^[3] and the denotation of bits is independent of their numerical index.^[2]
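For contrast, a bit array might be sketched as follows. This is a minimal illustration only: the names BitArray, bitarray_set and bitarray_test, and the capacity of 1000 bits, are assumptions made for the example. The point is that a bit is addressed by an integer index and the storage spans many words, whereas the bits of a bit field are addressed by name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NBITS     1000u  /* hypothetical capacity, wider than any integer type */
#define WORD_BITS 32u    /* bits per storage word (uint32_t) */

typedef struct {
    uint32_t words[(NBITS + WORD_BITS - 1u) / WORD_BITS];
} BitArray;

/* Set bit i: pick the word with i / 32, the bit within it with i % 32. */
void bitarray_set(BitArray *b, size_t i) {
    b->words[i / WORD_BITS] |= (uint32_t)1 << (i % WORD_BITS);
}

/* Test bit i using the same word/bit split. */
bool bitarray_test(const BitArray *b, size_t i) {
    return (b->words[i / WORD_BITS] >> (i % WORD_BITS)) & 1u;
}
```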

Implementation

Bit fields can be used to reduce memory consumption when a program requires a number of integer variables that will always have low values. For example, in many systems, storing an integer value requires two bytes (16 bits) of memory; sometimes the values to be stored actually need only one or two bits. Having a number of these tiny variables share a bit field allows efficient packaging of data in the memory.^[5]
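The saving can be illustrated with a small sketch. The structure names and members below are hypothetical, and the actual sizes depend on the compiler and target; on a typical platform with a 32-bit int, the first structure occupies 16 bytes and the second 4.

```c
#include <stddef.h>

/* Four small values stored as ordinary integers. */
struct FlagsUnpacked {
    unsigned int ready;     /* only ever 0 or 1 */
    unsigned int error;     /* only ever 0 or 1 */
    unsigned int mode;      /* only ever 0..3   */
    unsigned int priority;  /* only ever 0..7   */
};

/* The same values declared as bit fields: 1 + 1 + 2 + 3 = 7 bits in total. */
struct FlagsPacked {
    unsigned int ready    : 1;
    unsigned int error    : 1;
    unsigned int mode     : 2;
    unsigned int priority : 3;
};

/* Bytes saved per record; the exact figure is implementation-defined. */
size_t bytes_saved(void) {
    return sizeof(struct FlagsUnpacked) - sizeof(struct FlagsPacked);
}
```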

In C, native implementation-defined bit fields can be created using int,^{[lower-alpha 1]} unsigned int, signed int, _Bool (in C99), _BitInt(N), unsigned _BitInt(N) (in C23) or other implementation-defined types. In C++, they can be created using any integral or enumeration type; most C compilers also allow this. In this case, the programmer can declare a structure for a bit field which labels and determines the width of several subfields.^[6] Adjacently declared bit fields of the same type can then be packed by the compiler into a reduced number of words, compared with the memory used if each 'field' were to be declared separately.

For languages lacking native bit fields, or where the programmer wants control over the resulting bit representation, it is possible to manually manipulate bits within a larger word type. In this case, the programmer can set, test, and change the bits in the field using combinations of masking and bitwise operations.^[7]
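A minimal sketch of the manual approach for a multi-bit field, assuming a hypothetical 16-bit word with a 4-bit "kind" in bits 0 to 3 and a 6-bit "length" in bits 4 to 9. The layout, macro names and helper names are invented for illustration.

```c
#include <stdint.h>

/* Hypothetical layout: bits 0-3 kind, bits 4-9 length, bits 10-15 unused. */
#define KIND_SHIFT   0u
#define KIND_MASK    0x000Fu  /* 4 bits */
#define LENGTH_SHIFT 4u
#define LENGTH_MASK  0x03F0u  /* 6 bits */

/* Read a field: mask off the other bits, then shift the field down. */
static inline unsigned get_field(uint16_t word, unsigned mask, unsigned shift) {
    return (word & mask) >> shift;
}

/* Write a field: clear its old bits, then OR in the new value.
   Values too wide for the field are truncated by the mask. */
static inline uint16_t set_field(uint16_t word, unsigned mask, unsigned shift,
                                 unsigned value) {
    return (uint16_t)((word & ~mask) | ((value << shift) & mask));
}
```

Because the masks and shifts are written out explicitly, the bit layout is the same on every compiler, which is the control that native bit fields do not give.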

Examples

C programming language

Declaring a bit field in C and C++:^[6]

    // opaque and show
    #define YES 1
    #define NO  0

    // line styles
    #define SOLID  1
    #define DOTTED 2
    #define DASHED 3

    // primary colors
    #define BLUE  0b100
    #define GREEN 0b010
    #define RED   0b001

    // mixed colors
    #define BLACK   0
    #define YELLOW  (RED | GREEN)        /* 011 */
    #define MAGENTA (RED | BLUE)         /* 101 */
    #define CYAN    (GREEN | BLUE)       /* 110 */
    #define WHITE   (RED | GREEN | BLUE) /* 111 */

    const char *colors[8] = {"Black", "Red", "Green", "Yellow", "Blue", "Magenta", "Cyan", "White"};

    // bit field box properties
    struct BoxProps
    {
        unsigned int  opaque       : 1;
        unsigned int  fill_color   : 3;
        unsigned int               : 4; // fill to 8 bits
        unsigned int  show_border  : 1;
        unsigned int  border_color : 3;
        unsigned int  border_style : 2;
        unsigned char              : 0; // fill to nearest byte (16 bits)
        unsigned char width        : 4, // Split a byte into 2 fields of 4 bits
                      height       : 4;
    };
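Members of such a structure are read and written like ordinary struct members. The sketch below uses a cut-down, hypothetical MiniBox structure rather than the full BoxProps declaration; the helper names are also invented for the example. It also shows that a value too wide for an unsigned bit field is reduced modulo 2^width on assignment.

```c
#include <stdio.h>

/* Cut-down, hypothetical variant of the BoxProps structure. */
struct MiniBox {
    unsigned int opaque     : 1;
    unsigned int fill_color : 3;  /* 0..7, usable as an index into a color table */
};

/* Store a color and return what was actually kept.
   For an unsigned 3-bit field the stored value is color modulo 8. */
unsigned int set_fill_color(struct MiniBox *box, unsigned int color) {
    box->fill_color = color;
    return box->fill_color;
}

/* Print both fields; the casts make the printf argument types explicit. */
void describe(const struct MiniBox *box) {
    printf("opaque=%u fill_color=%u\n",
           (unsigned)box->opaque, (unsigned)box->fill_color);
}
```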

The layout of bit fields in a C struct is implementation-defined. For behavior that remains predictable across compilers, it may be preferable to emulate bit fields with a primitive and bit operators:

    /* Each of these preprocessor directives defines a single bit,
       corresponding to one button on the controller.
       Button order matches that of the Nintendo Entertainment System. */
    #define KEY_RIGHT  0b00000001
    #define KEY_LEFT   0b00000010
    #define KEY_DOWN   0b00000100
    #define KEY_UP     0b00001000
    #define KEY_START  0b00010000
    #define KEY_SELECT 0b00100000
    #define KEY_B      0b01000000
    #define KEY_A      0b10000000

    unsigned char gameControllerStatus = 0;

    /* Sets the gameControllerStatus using OR */
    void KeyPressed(unsigned char key)
    {
        gameControllerStatus |= key;
    }

    /* Clears the gameControllerStatus using AND and ~ (binary NOT) */
    void KeyReleased(unsigned char key)
    {
        gameControllerStatus &= ~key;
    }

    /* Tests whether a bit is set using AND */
    unsigned char IsPressed(unsigned char key)
    {
        return gameControllerStatus & key;
    }

Processor status register

The status register of a processor is a bit field consisting of several flag bits. Each flag bit describes information about the processor's current state.^[8] As an example, the status register of the 6502 processor is shown below:

6502 status register
Bit 7	Bit 6	Bit 5	Bit 4	Bit 3	Bit 2	Bit 1	Bit 0
Negative flag	oVerflow flag	-	Break flag	Decimal flag	Interrupt-disable flag	Zero flag	Carry flag

These bits are set by the processor following the result of an operation. Certain bits (such as the Carry, Interrupt-disable, and Decimal flags) may be explicitly controlled using set and clear instructions. Branching instructions are also defined to alter execution based on the current state of a flag.

For instance, after an ADC (Add with Carry) instruction, the BVS (Branch on oVerflow Set) instruction may be used to branch if the processor set the overflow flag as a result of the addition.
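The interaction between ADC and BVS can be modeled in C. The following is a simplified sketch, not an emulator: it assumes binary mode (Decimal flag clear), updates only the N, V, Z and C flags, and uses hypothetical function names adc and bvs_taken. The flag masks follow the bit positions in the table above.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLAG_C 0x01u  /* bit 0: Carry    */
#define FLAG_Z 0x02u  /* bit 1: Zero     */
#define FLAG_V 0x40u  /* bit 6: oVerflow */
#define FLAG_N 0x80u  /* bit 7: Negative */

/* Simplified binary-mode ADC: returns the new accumulator value and
   rewrites the C, Z, V and N bits of *status. */
uint8_t adc(uint8_t a, uint8_t m, uint8_t *status) {
    unsigned sum = a + m + (*status & FLAG_C);
    uint8_t result = (uint8_t)sum;

    *status &= (uint8_t)~(FLAG_C | FLAG_Z | FLAG_V | FLAG_N);
    if (sum > 0xFF)  *status |= FLAG_C;  /* unsigned carry out of bit 7 */
    if (result == 0) *status |= FLAG_Z;
    /* Signed overflow: operands share a sign that the result does not. */
    if (~(a ^ m) & (a ^ result) & 0x80) *status |= FLAG_V;
    if (result & 0x80) *status |= FLAG_N;
    return result;
}

/* The decision BVS makes: branch only if the overflow flag is set. */
bool bvs_taken(uint8_t status) {
    return (status & FLAG_V) != 0;
}
```

Adding 0x50 to 0x50 gives 0xA0, which is negative when read as a signed byte although both operands were positive, so the overflow flag is set and a following BVS would branch.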

Extracting bits from flag words

A subset of flags in a flag field may be extracted by ANDing with a mask. A large number of languages support the shift operator (<<) where 1 << n aligns a single bit to the nth position. Most also support the use of the AND operator (&) to isolate the value of one or more bits.

Suppose the status byte from a device is 0x67 and the 5th flag bit (bit 5, counting from bit 0) indicates data-ready. The mask byte is 2^5 = 0x20. ANDing the status byte 0x67 (0110 0111 in binary) with the mask byte 0x20 (0010 0000 in binary) evaluates to 0x20. This means the flag bit is set, i.e., the device has data ready. If the flag bit had not been set, the result would have been 0, i.e., no data is available from the device.
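The same arithmetic written as C, as a small sketch. The macro DATA_READY_MASK and the function data_ready are names chosen for this example and do not correspond to any particular device.

```c
#include <stdbool.h>
#include <stdint.h>

#define DATA_READY_MASK (1u << 5)  /* 2^5 = 0x20 */

/* True when bit 5 of the status byte is set. */
bool data_ready(uint8_t status) {
    return (status & DATA_READY_MASK) != 0;
}
```

With a status byte of 0x67 the AND yields 0x20, so data_ready returns true; with 0x47 (0100 0111 in binary) bit 5 is clear and it returns false.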

To check the nth bit of a variable v, use either of the following equivalent expressions:

    bool nth_is_set = (v & (1 << n)) != 0;
    bool nth_is_set = (v >> n) & 1;
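Wrapped as functions, the two forms might look like the sketch below. The function names are hypothetical. The constant is written 1u rather than 1 as a precaution: shifting a signed 1 into the sign bit is undefined behavior in C, and both helpers assume n is smaller than the bit width of unsigned int.

```c
#include <stdbool.h>

/* Form 1: build a single-bit mask and AND it with the value. */
bool bit_set_by_mask(unsigned v, unsigned n) {
    return (v & (1u << n)) != 0;
}

/* Form 2: shift the wanted bit down to position 0 and isolate it. */
bool bit_set_by_shift(unsigned v, unsigned n) {
    return (v >> n) & 1u;
}
```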

Changing bits in flag words

Writing, reading or toggling bits in flags can be done using only the OR, AND, XOR and NOT operations, which the processor can perform quickly. To set a bit, OR the status byte with a mask byte. Any bits set in the mask byte or the status byte will be set in the result. To clear a bit, AND the status byte with the complement (NOT) of the mask byte.

To toggle a bit, XOR the status byte and the mask byte. This will set a bit if it is cleared or clear a bit if it is set.
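The operations on a status byte can be collected into three small helpers. This is an illustrative sketch with hypothetical function names; each takes the status byte and a mask byte and returns the new status byte.

```c
#include <stdint.h>

/* Set every bit that is 1 in mask. */
uint8_t set_bits(uint8_t status, uint8_t mask) {
    return (uint8_t)(status | mask);
}

/* Clear every bit that is 1 in mask (AND with the complemented mask). */
uint8_t clear_bits(uint8_t status, uint8_t mask) {
    return (uint8_t)(status & ~mask);
}

/* Flip every bit that is 1 in mask. */
uint8_t toggle_bits(uint8_t status, uint8_t mask) {
    return (uint8_t)(status ^ mask);
}
```

Toggling twice with the same mask restores the original value, which is a quick way to check the XOR behavior.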

Notes

In C, it is implementation-defined whether a bit-field of type int is signed or unsigned. In C++, it is always signed to match the underlying type.

References

1. Penn Brumm; Don Brumm (August 1988). 80386 Assembly Language: A Complete Tutorial and Subroutine Library. McGraw-Hill School Education Group. p. 606. ISBN 978-0-8306-9047-3.
2. Steve Oualline (1997). Practical C Programming. O'Reilly Media, Inc. pp. 403–. ISBN 978-1-56592-306-5.
3. Michael A. Miller (January 1992). The 68000 Microprocessor Family: Architecture, Programming, and Applications. Merrill. p. 323. ISBN 978-0-02-381560-7.
4. Ian Griffiths; Matthew Adams; Jesse Liberty (30 July 2010). Programming C# 4.0: Building Windows, Web, and RIA Applications for the .NET 4.0 Framework. O'Reilly Media, Inc. pp. 81–. ISBN 978-1-4493-9972-6.
5. Tibet Mimar (1991). Programming and Designing with the 68000 Family: Including 68000, 68010/12, 68020, and the 68030. Prentice Hall. p. 275. ISBN 978-0-13-731498-0.
6. Prata, Stephen (2007). C primer plus (5th ed.). Indianapolis, Ind: Sams. ISBN 978-0-672-32696-7.
7. Mark E. Daggett (13 November 2013). Expert JavaScript. Apress. pp. 68–. ISBN 978-1-4302-6097-4.
8. InCider. W. Green. January 1986. p. 108.

External links

Explanation from a book
Description from another wiki
Use case in a C++ guide
C++ libbit bit library (alternative URL)
Bit Twiddling Hacks: Several snippets of C code manipulating bit fields

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
