Bit manipulation

Last updated October 14, 2023

Bit manipulation is the act of algorithmically manipulating bits or other pieces of data shorter than a word. Computer programming tasks that require bit manipulation include low-level device control, error detection and correction algorithms, data compression, encryption algorithms, and optimization. For most other tasks, modern programming languages allow the programmer to work directly with abstractions instead of bits that represent those abstractions.

Source code that does bit manipulation makes use of the bitwise operations: AND, OR, XOR, NOT, and possibly other operations analogous to the boolean operators; there are also bit shifts and operations to count ones and zeros, find high and low one or zero, set, reset and test bits, extract and insert fields, mask and zero fields, gather and scatter bits to and from specified bit positions or fields. Integer arithmetic operators can also effect bit-operations in conjunction with the other operators.

Bit manipulation, in some cases, can obviate or reduce the need to loop over a data structure and can give manyfold speed-ups, as bit manipulations are processed in parallel.

Terminology

Bit twiddling, bit fiddling, bit bashing, and bit gymnastics are often used interchangeably with bit manipulation, but sometimes exclusively refer to clever or non-obvious ways or uses of bit manipulation, or tedious or challenging low-level device control data manipulation tasks.

The term bit twiddling dates from early computing hardware, where computer operators would make adjustments by tweaking or twiddling computer controls. As computer programming languages evolved, programmers adopted the term to mean any handling of data that involved bit-level computation.

Bitwise operation

A bitwise operation operates on one or more bit patterns or binary numerals at the level of their individual bits. It is a fast, primitive action directly supported by the central processing unit (CPU), and is used to manipulate values for comparisons and calculations.

On most processors, the majority of bitwise operations are single cycle - substantially faster than division and multiplication and branches. While modern processors usually perform some arithmetic and logical operations just as fast as bitwise operations due to their longer instruction pipelines and other architectural design choices, bitwise operations do commonly use less power because of the reduced use of resources.

Example of bit manipulation

To determine if a number is a power of two, conceptually we may repeatedly do integer divide by two until the number won't divide by 2 evenly; if the only factor left is 1, the original number was a power of 2. Using bit and logical operators, there is a simple expression which will return true (1) or false (0):

boolisPowerOfTwo=(x!=0)&&((x&(x-1))==0);

The second half uses the fact that powers of two have one and only one bit set in their binary representation:

x         == 0...010...0 x-1       == 0...001...1 x & (x-1) == 0...000...0

If the number is neither zero nor a power of two, it will have '1' in more than one place:

x         == 0...1...010...0 x-1       == 0...1...001...1 x & (x-1) == 0...1...000...0

If inline assembly language code is used, then an instruction (popcnt) that counts the number of 1's or 0's in the operand might be available; an operand with exactly one '1' bit is a power of 2. However, such an instruction may have greater latency than the bitwise method above.

Bit manipulation operations

Processors typically provide only a subset of the useful bit operators. Programming languages don't directly support most bit operations, so idioms must be used to code them. The 'C' programming language, for example provides only bit-wise AND(&), OR(|), XOR(^) and NOT(~). Fortran provides AND(.and.), OR (.or.), XOR (.neqv.) and EQV(.eqv.). Algol provides syntactic bitfield extract and insert. When languages provide bit operations that don't directly map to hardware instructions, compilers must synthesize the operation from available operators.

An especially useful bit operation is count leading zeros used to find the high set bit of a machine word, though it may have different names on various architectures.^[1] There's no simple programming language idiom, so it must be provided by a compiler intrinsic or system library routine. Without that operator, it is very expensive (see Find first set#CLZ) to do any operations with regard to the high bit of a word, due to the asymmetric carry-propagate of arithmetic operations. Fortunately, most cpu architectures have provided that since the middle 1980s. An accompanying operation count ones, also called POPCOUNT, which counts the number of set bits in a machine word, is also usually provided as a hardware operator. Simpler bit operations like bit set, reset, test and toggle are often provided as hardware operators, but are easily simulated if they aren't - for example (SET R0, 1; LSHFT R0, i; OR x, R0) sets bit i in operand x.

Some of the more useful and complex bit operations that must be coded as idioms in the programming language and synthesized by compilers include:

clear from specified bit position up (leave lower part of word)
clear from specified bit position down (leave upper part of word)
mask from low bit down (clear lower word)
mask from high bit up (clear lower word)
bitfield extract
bitfield insert
bitfield scatter/gather operations which distribute contiguous portions of a bitfield over a machine word, or gather disparate bitfields in the word into a contiguous portion of a bitfield (see recent Intel PEXT/PDEP operators). Used by cryptography and video encoding.
matrix inversion

Some arithmetic operations can be reduced to simpler operations and bit operations:

reduce multiply by constant to sequence of shift-add

Multiply by 9 for example, is copy operand, shift up by 3 (multiply by 8), and add to original operand.

reduce division by constant to sequence of shift-subtract

Masking

A mask is data that is used for bitwise operations, particularly in a bit field.

Using a mask, multiple bits in a Byte, nibble, word (etc.) can be set either on, off or inverted from on to off (or vice versa) in a single bitwise operation. More comprehensive applications of masking, when applied conditionally to operations, are termed predication.

Related Research Articles

<span class="mw-page-title-main">Logical conjunction</span> Logical connective AND

In logic, mathematics and linguistics, and is the truth-functional operator of conjunction or logical conjunction. The logical connective of this operator is typically represented as $or or (prefix) or or in which is the most modern and widely used.$

In computer programming, an arithmetic shift is a shift operator, sometimes termed a signed shift. The two basic types are the arithmetic left shift and the arithmetic right shift. For binary numbers it is a bitwise operation that shifts all of the bits of its operand; every bit in the operand is simply moved a given number of bit positions, and the vacant bit-positions are filled in. Instead of being filled with all 0s, as in logical shift, when shifting to the right, the leftmost bit is replicated to fill in all the vacant positions.

In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an implementation.

In computer programming, the exclusive or swap is an algorithm that uses the exclusive or bitwise operation to swap the values of two variables without using the temporary variable which is normally required.

In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral at the level of its individual bits. It is a fast and simple action, basic to the higher-level arithmetic operations and directly supported by the processor. Most bitwise operations are presented as two-operand instructions where the result replaces one of the input operands.

A bitboard is a specialized bit array data structure commonly used in computer systems that play board games, where each bit corresponds to a game board space or piece. This allows parallel bitwise operations to set or query the game state, or determine moves or plays in the game.

In computer science, a mask or bitmask is data that is used for bitwise operations, particularly in a bit field. Using a mask, multiple bits in a byte, nibble, word, etc. can be set either on or off, or inverted from on to off in a single bitwise operation. An additional use of masking involves predication in vector processing, where the bitmask is used to select which element operations in the vector are to be executed and which are not.

The Hamming weight of a string is the number of symbols that are different from the zero-symbol of the alphabet used. It is thus equivalent to the Hamming distance from the all-zero string of the same length. For the most typical case, a string of bits, this is the number of 1's in the string, or the digit sum of the binary representation of a given number and the ℓ₁ norm of a bit vector. In this binary case, it is also called the population count, popcount, sideways sum, or bit summation.

A bit array is an array data structure that compactly stores bits. It can be used to implement a simple set data structure. A bit array is effective at exploiting bit-level parallelism in hardware to perform operations quickly. A typical bit array stores kw bits, where w is the number of bits in the unit of storage, such as a byte or word, and k is some nonnegative integer. If w does not divide the number of bits to be stored, some space is wasted due to internal fragmentation.

In computer science, a logical shift is a bitwise operation that shifts all the bits of its operand. The two base variants are the logical left shift and the logical right shift. This is further modulated by the number of bit positions a given value shall be shifted, such as shift left by 1 or shift right by n. Unlike an arithmetic shift, a logical shift does not preserve a number's sign bit or distinguish a number's exponent from its significand (mantissa); every bit in the operand is simply moved a given number of bit positions, and the vacant bit-positions are filled, usually with zeros, and possibly ones.

In combinatorial mathematics, a circular shift is the operation of rearranging the entries in a tuple, either by moving the final entry to the first position, while shifting all other entries to the next position, or by performing the inverse operation. A circular shift is a special kind of cyclic permutation, which in turn is a special kind of permutation. Formally, a circular shift is a permutation σ of the n entries in the tuple such that either

A bit field is a data structure that consists of one or more adjacent bits which have been allocated for specific purposes, so that any single bit or group of bits within the structure can be set or inspected. A bit field is most commonly used to represent integral types of known, fixed bit-width, such as single-bit Booleans.

In computer processors, the overflow flag is usually a single bit in a system status register used to indicate when an arithmetic overflow has occurred in an operation, indicating that the signed two's-complement result would not fit in the number of bits used for the result. Some architectures may be configured to automatically generate an exception on an operation resulting in overflow.

<span class="mw-page-title-main">TI-990</span> Series of 16-bit computers by Texas Instruments.

The TI-990 was a series of 16-bit minicomputers sold by Texas Instruments (TI) in the 1970s and 1980s. The TI-990 was a replacement for TI's earlier minicomputer systems, the TI-960 and the TI-980. It had several unique features, and was easier to program than its predecessors.

Little Computer 3, or LC-3, is a type of computer educational programming language, an assembly language, which is a type of low-level programming language.

In computing, an arithmetic logic unit (ALU) is a combinational digital circuit that performs arithmetic and bitwise operations on integer binary numbers. This is in contrast to a floating-point unit (FPU), which operates on floating point numbers. It is a fundamental building block of many types of computing circuits, including the central processing unit (CPU) of computers, FPUs, and graphics processing units (GPUs).

In computer software and hardware, find first set (ffs) or find first one is a bit operation that, given an unsigned machine word, designates the index or position of the least significant bit set to one in the word counting from the least significant bit position. A nearly equivalent operation is count trailing zeros (ctz) or number of trailing zeros (ntz), which counts the number of zero bits following the least significant one bit. The complementary operation that finds the index or position of the most significant set bit is log base 2, so called because it computes the binary logarithm $⌊log 2 (x)⌋$ . This is closely related to count leading zeros (clz) or number of leading zeros (nlz), which counts the number of zero bits preceding the most significant one bit. There are two common variants of find first set, the POSIX definition which starts indexing of bits at 1, herein labelled ffs, and the variant which starts indexing of bits at zero, which is equivalent to ctz and so will be called by that name.

In the C programming language, operations can be performed on a bit level using bitwise operators.

<span class="mw-page-title-main">Signetics 8X300</span> Signetics microprocessor

The 8X300 is a microprocessor produced and marketed by Signetics starting 1976 as a second source for the SMS 300 by Scientific Micro Systems, Inc. Although SMS developed the SMS 300, Signetics was the sole manufacturer of this product line. In 1978 Signetics purchased the rights to the SMS 300 series and renamed it 8X300.

The WD16 is a 16-bit microprocessor introduced by Western Digital in October 1976. It is based on the MCP-1600 chipset, a general-purpose design that was also used to implement the DEC LSI-11 low-end minicomputer and the Pascal MicroEngine processor. The three systems differed primarily in their microcode, giving each system a unique instruction set architecture (ISA).

References

↑ On most Intel chips, it's BSR (bitscan reverse), though newer SoCs also have LZCNT (count leading zeros)

External links

Anderson, Sean Eron, ed. (2009-11-26) [1997]. "Bit Twiddling Hacks". Stanford University. Archived from the original on 2020-06-01. Retrieved 2020-06-01.
Bit Manipulation Tricks with full explanations and source code
Intel Intrinsics Guide
xchg rax, rax: x86_64 riddles and hacks
The Aggregate Magic Algorithms from University of Kentucky

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] On most Intel chips, it's BSR (bitscan reverse), though newer SoCs also have LZCNT (count leading zeros)

[1]