Bit field

Last updated

A bit field is a data structure that consists of one or more adjacent bits which have been allocated for specific purposes, so that any single bit or group of bits within the structure can be set or inspected. [1] [2] A bit field is most commonly used to represent integral types of known, fixed bit-width, such as single-bit Booleans.

Contents

The meaning of the individual bits within the field is determined by the programmer; for example, the first bit in a bit field (located at the field's base address) is sometimes used to determine the state of a particular attribute associated with the bit field. [3]

Within CPUs and other logic devices, collections of bit fields called flags are commonly used to control or to indicate the outcome of particular operations. [4] Processors have a status register that is composed of flags. For example, if the result of an addition cannot be represented in the destination an arithmetic overflow is set. The flags can be used to decide subsequent operations, such as conditional jump instructions. For example, a JE... (Jump if Equal) instruction in the x86 assembly language will result in a jump if the Z (zero) flag was set by some previous operation.

A bit field is distinguished from a bit array in that the latter is used to store a large set of bits indexed by integers and is often wider than any integral type supported by the language.[ citation needed ] Bit fields, on the other hand, typically fit within a machine word, [3] and the denotation of bits is independent of their numerical index. [2]

Implementation

Bit fields can be used to reduce memory consumption when a program requires a number of integer variables which always will have low values. For example, in many systems, storing an integer value requires two bytes (16-bits) of memory; sometimes the values to be stored actually need only one or two bits. Having a number of these tiny variables share a bit field allows efficient packaging of data in the memory. [5]

In C, native implementation-defined bit fields can be created using int, [lower-alpha 1] unsigned int, signed int, _Bool (in C99), _BitInt(N), unsigned _BitInt(N) (in C23) or other implementation-defined types. In C++, they can be created using any integral or enumeration type; most C compilers also allow this. In this case, the programmer can declare a structure for a bit field which labels and determines the width of several subfields. [6] Adjacently declared bit fields of the same type can then be packed by the compiler into a reduced number of words, compared with the memory used if each 'field' were to be declared separately.

For languages lacking native bit fields, or where the programmer wants control over the resulting bit representation, it is possible to manually manipulate bits within a larger word type. In this case, the programmer can set, test, and change the bits in the field using combinations of masking and bitwise operations. [7]

Examples

C programming language

Declaring a bit field in C and C++: [6]

// opaque and show#define YES 1#define NO  0// line styles#define SOLID  1#define DOTTED 2#define DASHED 3// primary colors#define BLUE  0b100#define GREEN 0b010#define RED   0b001// mixed colors#define BLACK   0                    /* 000 */#define YELLOW  (RED | GREEN)        /* 011 */#define MAGENTA (RED | BLUE)         /* 101 */#define CYAN    (GREEN | BLUE)       /* 110 */#define WHITE   (RED | GREEN | BLUE) /* 111 */constchar*colors[8]={"Black","Red","Green","Yellow","Blue","Magenta","Cyan","White"};// bit field box propertiesstructBoxProps{unsignedintopaque:1;unsignedintfill_color:3;unsignedint:4;// fill to 8 bitsunsignedintshow_border:1;unsignedintborder_color:3;unsignedintborder_style:2;unsignedchar:0;// fill to nearest byte (16 bits)unsignedcharwidth:4,// Split a byte into 2 fields of 4 bitsheight:4;};


The layout of bit fields in a C struct is implementation-defined. For behavior that remains predictable across compilers, it may be preferable to emulate bit fields with a primitive and bit operators:

/* Each of these preprocessor directives defines a single bit,   corresponding to one button on the controller.     Button order matches that of the Nintendo Entertainment System. */#define KEY_RIGHT  0b00000001#define KEY_LEFT   0b00000010#define KEY_DOWN   0b00000100#define KEY_UP     0b00001000#define KEY_START  0b00010000#define KEY_SELECT 0b00100000#define KEY_B      0b01000000#define KEY_A      0b10000000unsignedchargameControllerStatus=0;/* Sets the gameControllerStatus using OR */voidKeyPressed(unsignedcharkey){gameControllerStatus|=key;}/* Clears the gameControllerStatus using AND and ~ (binary NOT)*/voidKeyReleased(unsignedcharkey){gameControllerStatus&=~key;}/* Tests whether a bit is set using AND */unsignedcharIsPressed(unsignedcharkey){returngameControllerStatus&key;}

Processor status register

The status register of a processor is a bit field consisting of several flag bits. Each flag bit describes information about the processor's current state. [8] As an example, the status register of the 6502 processor is shown below:

6502 status register
Bit 7Bit 6Bit 5Bit 4Bit 3Bit 2Bit 1Bit 0
Negative flagoVerflow flag-Break flagDecimal flagInterrupt-disable flagZero flagCarry flag

These bits are set by the processor following the result of an operation. Certain bits (such as the Carry, Interrupt-disable, and Decimal flags) may be explicitly controlled using set and clear instructions. Additionally, branching instructions are also defined to alter execution based on the current state of a flag.

For an instance, after an ADC (Add with Carry) instruction, the BVS (Branch on oVerflow Set) instruction may be used to jump based on whether the overflow flag was set by the processor following the result of the addition instruction.

Extracting bits from flag words

A subset of flags in a flag field may be extracted by ANDing with a mask. A large number of languages support the shift operator (<<) where 1 << n aligns a single bit to the nth position. Most also support the use of the AND operator (&) to isolate the value of one or more bits.

If the status-byte from a device is 0x67 and the 5th flag bit indicates data-ready. The mask-byte is 2^5 = 0x20. ANDing the status-byte 0x67 (0110 0111 in binary) with the mask-byte 0x20(0010 0000 in binary) evaluates to 0x20. This means the flag bit is set i.e., the device has data ready. If the flag-bit had not been set, this would have evaluated to 0 i.e., there is no data available from the device.

To check the nth bit from a variable v, perform either of the following: (both are equivalent)

bool nth_is_set = (v & (1 << n)) != 0; bool nth_is_set = (v >> n) & 1;

Changing bits in flag words

Writing, reading or toggling bits in flags can be done only using the OR, AND and NOT operations – operations which can be performed quickly in the processor. To set a bit, OR the status byte with a mask byte. Any bits set in the mask byte or the status byte will be set in the result.

To toggle a bit, XOR the status byte and the mask byte. This will set a bit if it is cleared or clear a bit if it is set.

See also

Notes

  1. In C, it is implementation-defined whether a bit-field of type int is signed or unsigned. In C++, it is always signed to match the underlying type.

Related Research Articles

In computer science, an integer is a datum of integral data type, a data type that represents some range of mathematical integers. Integral data types may be of different sizes and may or may not be allowed to contain negative values. Integers are commonly represented in a computer as a group of binary digits (bits). The size of the grouping varies so the set of integer sizes available varies between different types of computers. Computer hardware nearly always provides a way to represent a processor register or memory address as an integer.

<span class="mw-page-title-main">Motorola 68000</span> Microprocessor

The Motorola 68000 is a 16/32-bit complex instruction set computer (CISC) microprocessor, introduced in 1979 by Motorola Semiconductor Products Sector.

AltiVec is a single-precision floating point and integer SIMD instruction set designed and owned by Apple, IBM, and Freescale Semiconductor — the AIM alliance. It is implemented on versions of the PowerPC processor architecture, including Motorola's G4, IBM's G5 and POWER6 processors, and P.A. Semi's PWRficient PA6T. AltiVec is a trademark owned solely by Freescale, so the system is also referred to as Velocity Engine by Apple and VMX by IBM and P.A. Semi.

<span class="mw-page-title-main">MCS-51</span> Single chip microcontroller series by Intel

The Intel MCS-51 is a single chip microcontroller (MCU) series developed by Intel in 1980 for use in embedded systems. The architect of the Intel MCS-51 instruction set was John H. Wharton. Intel's original versions were popular in the 1980s and early 1990s, and enhanced binary compatible derivatives remain popular today. It is a complex instruction set computer, but also has some of the features of RISC architectures, such as a large register set and register windows, and has separate memory spaces for program instructions and data.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral at the level of its individual bits. It is a fast and simple action, basic to the higher-level arithmetic operations and directly supported by the processor. Most bitwise operations are presented as two-operand instructions where the result replaces one of the input operands.

In computer science, primitive data types are a set of basic data types from which all other data types are constructed. Specifically it often refers to the limited set of data representations in use by a particular processor, which all compiled programs must use. Most processors support a similar set of primitive data types, although the specific representations vary. More generally, "primitive data types" may refer to the standard data types built into a programming language. Data types which are not primitive are referred to as derived or composite.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

In computer science, a mask or bitmask is data that is used for bitwise operations, particularly in a bit field. Using a mask, multiple bits in a byte, nibble, word, etc. can be set either on or off, or inverted from on to off in a single bitwise operation. An additional use of masking involves predication in vector processing, where the bitmask is used to select which element operations in the vector are to be executed and which are not.

In computer science, a union is a value that may have any of several representations or formats within the same position in memory; that consists of a variable that may hold such a data structure. Some programming languages support special data types, called union types, to describe such values and variables. In other words, a union type definition will specify which of a number of permitted primitive types may be stored in its instances, e.g., "float or long integer". In contrast with a record, which could be defined to contain both a float and an integer; in a union, there is only one value at any given time.

A bit array is an array data structure that compactly stores bits. It can be used to implement a simple set data structure. A bit array is effective at exploiting bit-level parallelism in hardware to perform operations quickly. A typical bit array stores kw bits, where w is the number of bits in the unit of storage, such as a byte or word, and k is some nonnegative integer. If w does not divide the number of bits to be stored, some space is wasted due to internal fragmentation.

IEC 61131-3 is the third part of the international standard IEC 61131 for programmable logic controllers. It was first published in December 1993 by the IEC; the current (third) edition was published in February 2013.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

<span class="mw-page-title-main">C data types</span> Data types supported by the C programming language

In the C programming language, data types constitute the semantics and characteristics of storage of data elements. They are expressed in the language syntax in form of declarations for memory locations or variables. Data types also determine the types of operations or methods of processing of data elements.

In computer programming, the term hooking covers a range of techniques used to alter or augment the behaviour of an operating system, of applications, or of other software components by intercepting function calls or messages or events passed between software components. Code that handles such intercepted function calls, events or messages is called a hook.

A class in C++ is a user-defined type or data structure declared with any of the keywords class, struct or union that has data and functions as its members whose access is governed by the three access specifiers private, protected or public. By default access to members of a C++ class declared with the keyword class is private. The private members are not accessible outside the class; they can be accessed only through member functions of the class. The public members form an interface to the class and are accessible outside the class.

ALGOL 68RS is the second ALGOL 68 compiler written by I. F. Currie and J. D. Morrison, at the Royal Signals and Radar Establishment (RSRE). Unlike the earlier ALGOL 68-R, it was designed to be portable, and implemented the language of the Revised Report.

The IBM System/360 architecture is the model independent architecture for the entire S/360 line of mainframe computers, including but not limited to the instruction set architecture. The elements of the architecture are documented in the IBM System/360 Principles of Operation and the IBM System/360 I/O Interface Channel to Control Unit Original Equipment Manufacturers' Information manuals.

MessagePack is a computer data interchange format. It is a binary form for representing simple data structures like arrays and associative arrays. MessagePack aims to be as compact and simple as possible. The official implementation is available in a variety of languages such as C, C++, C#, D, Erlang, Go, Haskell, Java, JavaScript (NodeJS), Lua, OCaml, Perl, PHP, Python, Ruby, Rust, Scala, Smalltalk, and Swift.

In the C programming language, operations can be performed on a bit level using bitwise operators.

References

  1. Penn Brumm; Don Brumm (August 1988). 80386 Assembly Language: A Complete Tutorial and Subroutine Library. McGraw-Hill School Education Group. p. 606. ISBN   978-0-8306-9047-3.
  2. 1 2 Steve Oualline (1997). Practical C Programming . "O'Reilly Media, Inc.". pp.  403–. ISBN   978-1-56592-306-5.
  3. 1 2 Michael A. Miller (January 1992). The 68000 Microprocessor Family: Architecture, Programming, and Applications. Merrill. p. 323. ISBN   978-0-02-381560-7.
  4. Ian Griffiths; Matthew Adams; Jesse Liberty (30 July 2010). Programming C# 4.0: Building Windows, Web, and RIA Applications for the .NET 4.0 Framework. "O'Reilly Media, Inc.". pp. 81–. ISBN   978-1-4493-9972-6.
  5. Tibet Mimar (1991). Programming and Designing with the 68000 Family: Including 68000, 68010/12, 68020, and the 68030. Prentice Hall. p. 275. ISBN   978-0-13-731498-0.
  6. 1 2 Prata, Stephen (2007). C primer plus (5th ed.). Indianapolis, Ind: Sams. ISBN   978-0-672-32696-7.
  7. Mark E. Daggett (13 November 2013). Expert JavaScript. Apress. pp. 68–. ISBN   978-1-4302-6097-4.
  8. InCider. W. Green. January 1986. p. 108.