Mask (computing)

Last updated

In computer science, a mask or bitmask is data that is used for bitwise operations, particularly in a bit field. Using a mask, multiple bits in a byte, nibble, word, etc. can be set either on or off, or inverted from on to off (or vice versa) in a single bitwise operation. An additional use of masking involves predication in vector processing, where the bitmask is used to select which element operations in the vector are to be executed (mask bit is enabled) and which are not (mask bit is clear).

Contents

Common bitmask functions

Masking bits to 1

To turn certain bits on, the bitwise OR operation can be used, following the principle that Y OR 1 = 1 and Y OR 0 = Y. Therefore, to make sure a bit is on, OR can be used with a 1. To leave a bit unchanged, OR is used with a 0.

Example: Masking on the higher nibble (bits 4, 5, 6, 7) while leaving the lower nibble (bits 0, 1, 2, 3) unchanged.

10010101   10100101  OR 11110000   11110000   = 11110101   11110101

Masking bits to 0

More often in practice, bits are "masked off" (or masked to 0) than "masked on" (or masked to 1). When a bit is ANDed with a 0, the result is always 0, i.e. Y AND 0 = 0. To leave the other bits as they were originally, they can be ANDed with 1 as Y AND 1 = Y

Example: Masking off the higher nibble (bits 4, 5, 6, 7) while leaving the lower nibble (bits 0, 1, 2, 3) unchanged.

10010101   10100101 AND 00001111   00001111   = 00000101   00000101

Querying the status of a bit

It is possible to use bitmasks to easily check the state of individual bits regardless of the other bits. To do this, turning off all the other bits using the bitwise AND is done as discussed above and the value is compared with 0. If it is equal to 0, then the bit was off, but if the value is any other value, then the bit was on. What makes this convenient is that it is not necessary to figure out what the value actually is, just that it is not 0.

Example: Querying the status of the 4th bit

    10011101   10010101 AND 00001000   00001000   = 00001000   00000000

Toggling bit values

So far the article has covered how to turn bits on and turn bits off, but not both at once. Sometimes it does not really matter what the value is, but it must be made the opposite of what it currently is. This can be achieved using the XOR (exclusive or) operation. XOR returns 1 if and only if an odd number of bits are 1. Therefore, if two corresponding bits are 1, the result will be a 0, but if only one of them is 1, the result will be 1. Therefore inversion of the values of bits is done by XORing them with a 1. If the original bit was 1, it returns 1 XOR 1 = 0. If the original bit was 0 it returns 0 XOR 1 = 1. Also note that XOR masking is bit-safe, meaning that it will not affect unmasked bits because Y XOR 0 = Y, just like an OR.

Example: Toggling bit values

    10011101   10010101 XOR 00001111   11111111   = 10010010   01101010

To write arbitrary 1s and 0s to a subset of bits, first write 0s to that subset, then set the high bits:

  register = (register & ~bitmask) | value;

Uses of bitmasks

A party trick to guess a number from which cards it is printed on uses the bits of the binary representation of the number. In the SVG file, click a card to toggle it. Binary guess number trick SMIL.svg
A party trick to guess a number from which cards it is printed on uses the bits of the binary representation of the number. In the SVG file, click a card to toggle it.

Arguments to functions

In programming languages such as C, bit fields are a useful way to pass a set of named Boolean arguments to a function. For example, in the graphics API OpenGL, there is a command, glClear() which clears the screen or other buffers. It can clear up to four buffers (the color, depth, accumulation, and stencil buffers), so the API authors could have had it take four arguments. But then a call to it would look like

glClear(1,1,0,0);// This is not how glClear actually works and would make for unstable code.

which is not very descriptive. Instead there are four defined field bits, GL_COLOR_BUFFER_BIT, GL_DEPTH_BUFFER_BIT, GL_ACCUM_BUFFER_BIT, and GL_STENCIL_BUFFER_BIT and glClear() is declared as

voidglClear(GLbitfieldbits);

Then a call to the function looks like this

glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT);

Internally, a function taking a bitfield like this can use binary and to extract the individual bits. For example, an implementation of glClear() might look like:

voidglClear(GLbitfieldbits){if((bits&GL_COLOR_BUFFER_BIT)!=0){// Clear color buffer.}if((bits&GL_DEPTH_BUFFER_BIT)!=0){// Clear depth buffer.}if((bits&GL_ACCUM_BUFFER_BIT)!=0){// Clear accumulation buffer.}if((bits&GL_STENCIL_BUFFER_BIT)!=0){// Clear stencil buffer.}}

The advantage to this approach is that function argument overhead is decreased. Since the minimum datum size is one byte, separating the options into separate arguments would be wasting seven bits per argument and would occupy more stack space. Instead, functions typically accept one or more 32-bit integers, with up to 32 option bits in each. While elegant, in the simplest implementation this solution is not type-safe. A GLbitfield is simply defined to be an unsigned int, so the compiler would allow a meaningless call to glClear(42) or even glClear(GL_POINTS). In C++ an alternative would be to create a class to encapsulate the set of arguments that glClear could accept and could be cleanly encapsulated in a library.

Inverse masks

Masks are used with IP addresses in IP ACLs (Access Control Lists) to specify what should be permitted and denied. To configure IP addresses on interfaces, masks start with 255 and have the large values on the left side: for example, IP address 203.0.113.129 with a 255.255.255.224 mask. Masks for IP ACLs are the reverse: for example, mask 0.0.0.255. This is sometimes called an inverse mask or a wildcard mask. When the value of the mask is broken down into binary (0s and 1s), the results determine which address bits are to be considered in processing the traffic. A 0-bit indicates that the address bit must be considered (exact match); a 1-bit in the mask is a "don't care". This table further explains the concept.

Mask example:

network address (traffic that is to be processed): 192.0.2.0

mask: 0.0.0.255

network address (binary): 11000000.00000000.00000010.00000000

mask (binary): 00000000.00000000.00000000.11111111

Based on the binary mask, it can be seen that the first three sets (octets) must match the given binary network address exactly (11000000.00000000.00000010). The last set of numbers is made of "don't cares" (.11111111). Therefore, all traffic that begins with "192.0.2." matches, since the last octet is "don't care". Therefore, with this mask, network addresses 192.0.2.1 through 192.0.2.255 (192.0.2.x) are processed.

Subtract the normal mask from 255.255.255.255 in order to determine the ACL inverse mask. In this example, the inverse mask is determined for network address 198.51.100.0 with a normal mask of 255.255.255.0.

255.255.255.255 - 255.255.255.0 (normal mask) = 0.0.0.255 (inverse mask)

ACL equivalents

The source/source-wildcard of 0.0.0.0/255.255.255.255 means "any".

The source/wildcard of 198.51.100.2/0.0.0.0 is the same as "host 198.51.100.2"

Image masks

Raster graphic sprites (left) and masks (right) Blit dot.gif
Raster graphic sprites (left) and masks (right)

In computer graphics, when a given image is intended to be placed over a background, the transparent areas can be specified through a binary mask. [1] This way, for each intended image there are actually two bitmaps: the actual image, in which the unused areas are given a pixel value with all bits set to 0s, and an additional mask, in which the correspondent image areas are given a pixel value of all bits set to 0s and the surrounding areas a value of all bits set to 1s. In the sample at right, black pixels have the all-zero bits and white pixels have the all-one bits.

At run time, to put the image on the screen over the background, the program first masks the screen pixel's bits with the image mask at the desired coordinates using the bitwise AND operation. This preserves the background pixels of the transparent areas while resets with zeros the bits of the pixels which will be obscured by the overlapped image.

Then, the program renders the image pixel's bits by combining them with the background pixel's bits using the bitwise OR operation. This way, the image pixels are appropriately placed while keeping the background surrounding pixels preserved. The result is a perfect compound of the image over the background.

Sprite rendering by binary image mask.png

This technique is used for painting pointing device cursors, in typical 2-D videogames for characters, bullets and so on (the sprites), for GUI icons, and for video titling and other image mixing applications. A faster method is to simply overwrite the background pixels with the foreground pixels if their alpha=1

Although related (due to being used for the same purposes), transparent colors and alpha channels are techniques which do not involve the image pixel mixage by binary masking.

Hash tables

To create a hashing function for a hash table, often a function is used that has a large domain. To create an index from the output of the function, a modulo can be taken to reduce the size of the domain to match the size of the array; however, it is often faster on many processors to restrict the size of the hash table to powers of two sizes and use a bitmask instead.

An example of both modulo and masking in C:

#include<stdint.h>#include<string.h>intmain(void){constuint32_tNUM_BUCKETS=0xFFFFFFFF;// 2^32 - 1constuint32_tMAX_RECORDS=1<<10;// 2^10constuint32_tHASH_BITMASK=0x3FF;// (2^10)-1char**token_array=NULL;// Handle memory allocation for token_array…chartoken[]="some hashable value";uint32_thashed_token=hash_function(token,strlen(token),NUM_BUCKETS);// Using modulosize_tindex=hashed_token%MAX_RECORDS;// OR// Using bitmasksize_tindex=hashed_token&HASH_BITMASK;*(token_array+index)=token;// Free the memory from token_array …return0;}

See also

Related Research Articles

<span class="mw-page-title-main">Hash function</span> Mapping arbitrary data to fixed-size values

A hash function is any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable-length output. The values returned by a hash function are called hash values, hash codes, hash digests, digests, or simply hashes. The values are usually used to index a fixed-size table called a hash table. Use of a hash function to index a hash table is called hashing or scatter-storage addressing.

Bit blit is a data operation commonly used in computer graphics in which several bitmaps are combined into one using a boolean function.

<span class="mw-page-title-main">Subnet</span> Logical subdivision of an IP network

A subnetwork, or subnet, is a logical subdivision of an IP network. The practice of dividing a network into two or more networks is called subnetting.

The BMP file format, or bitmap, is a raster graphics image file format used to store bitmap digital images, independently of the display device, especially on Microsoft Windows and OS/2 operating systems.

In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral at the level of its individual bits. It is a fast and simple action, basic to the higher-level arithmetic operations and directly supported by the processor. Most bitwise operations are presented as two-operand instructions where the result replaces one of the input operands.

In computing, signed number representations are required to encode negative numbers in binary number systems.

In computer science, a fusion tree is a type of tree data structure that implements an associative array on w-bit integers on a finite universe, where each of the input integers has size less than 2w and is non-negative. When operating on a collection of n key–value pairs, it uses O(n) space and performs searches in O(logwn) time, which is asymptotically faster than a traditional self-balancing binary search tree, and also better than the van Emde Boas tree for large values of w. It achieves this speed by using certain constant-time operations that can be done on a machine word. Fusion trees were invented in 1990 by Michael Fredman and Dan Willard.

A bit array is an array data structure that compactly stores bits. It can be used to implement a simple set data structure. A bit array is effective at exploiting bit-level parallelism in hardware to perform operations quickly. A typical bit array stores kw bits, where w is the number of bits in the unit of storage, such as a byte or word, and k is some nonnegative integer. If w does not divide the number of bits to be stored, some space is wasted due to internal fragmentation.

<span class="mw-page-title-main">Circular shift</span> Circular shifts: Mathematical concept and applications in software development

In combinatorial mathematics, a circular shift is the operation of rearranging the entries in a tuple, either by moving the final entry to the first position, while shifting all other entries to the next position, or by performing the inverse operation. A circular shift is a special kind of cyclic permutation, which in turn is a special kind of permutation. Formally, a circular shift is a permutation σ of the n entries in the tuple such that either

The ICO file format is an image file format for computer icons in Microsoft Windows. ICO files contain one or more small images at multiple sizes and color depths, such that they may be scaled appropriately. In Windows, all executables that display an icon to the user, on the desktop, in the Start Menu, or in file Explorer, must carry the icon in ICO format.

<span class="mw-page-title-main">Stencil buffer</span> Data buffer in graphics hardware

A stencil buffer is an extra data buffer, in addition to the color buffer and Z-buffer, found on modern graphics hardware. The buffer is per pixel and works on integer values, usually with a depth of one byte per pixel. The Z-buffer and stencil buffer often share the same area in the RAM of the graphics hardware.

A bit field is a data structure that maps to one or more adjacent bits which have been allocated for specific purposes, so that any single bit or group of bits within the structure can be set or inspected. A bit field is most commonly used to represent integral types of known, fixed bit-width, such as single-bit Booleans.

Bit manipulation is the act of algorithmically manipulating bits or other pieces of data shorter than a word. Computer programming tasks that require bit manipulation include low-level device control, error detection and correction algorithms, data compression, encryption algorithms, and optimization. For most other tasks, modern programming languages allow the programmer to work directly with abstractions instead of bits that represent those abstractions.

In computing, bit numbering is the convention used to identify the bit positions in a binary number.

<span class="mw-page-title-main">Computation of cyclic redundancy checks</span> Overview of the computation of cyclic redundancy checks

Computation of a cyclic redundancy check is derived from the mathematics of polynomial division, modulo two. In practice, it resembles long division of the binary message string, with a fixed number of zeroes appended, by the "generator polynomial" string except that exclusive or operations replace subtractions. Division of this type is efficiently realised in hardware by a modified shift register, and in software by a series of equivalent algorithms, starting with simple code close to the mathematics and becoming faster through byte-wise parallelism and space–time tradeoffs.

A wildcard mask is a mask of bits that indicates which parts of an IP address are available for examination. In the Cisco IOS, they are used in several places, for example:

bcrypt is a password-hashing function designed by Niels Provos and David Mazières, based on the Blowfish cipher and presented at USENIX in 1999. Besides incorporating a salt to protect against rainbow table attacks, bcrypt is an adaptive function: over time, the iteration count can be increased to make it slower, so it remains resistant to brute-force search attacks even with increasing computation power.

MurmurHash is a non-cryptographic hash function suitable for general hash-based lookup. It was created by Austin Appleby in 2008 and, as of 8 January 2016, is hosted on GitHub along with its test suite named SMHasher. It also exists in a number of variants, all of which have been released into the public domain. The name comes from two basic operations, multiply (MU) and rotate (R), used in its inner loop.

In the C programming language, operations can be performed on a bit level using bitwise operators.

The Quite OK Image Format (QOI) is a specification for lossless image compression of 24-bit or 32-bit color raster (bitmapped) images, invented by Dominic Szablewski and first announced on 24 November 2021.

References

  1. "Mask R-CNN with OpenCV". PyImageSearch. 2018-11-19. Retrieved 2020-04-05.