Circular shift

Last updated
Example permutation matrix; circular shift, left.svg
Example permutation matrix; circular shift, right.svg
Matrices of 8-element circular shifts to the left and right

In combinatorial mathematics, a circular shift is the operation of rearranging the entries in a tuple, either by moving the final entry to the first position, while shifting all other entries to the next position, or by performing the inverse operation. A circular shift is a special kind of cyclic permutation, which in turn is a special kind of permutation. Formally, a circular shift is a permutation σ of the n entries in the tuple such that either

Contents

modulo n, for all entries i = 1, ..., n

or

modulo n, for all entries i = 1, ..., n.

The result of repeatedly applying circular shifts to a given tuple are also called the circular shifts of the tuple.

For example, repeatedly applying circular shifts to the four-tuple (a, b, c, d) successively gives

and then the sequence repeats; this four-tuple therefore has four distinct circular shifts. However, not all n-tuples have n distinct circular shifts. For instance, the 4-tuple (a, b, a, b) only has 2 distinct circular shifts. The number of distinct circular shifts of an n-tuple is , where k is a divisor of n, indicating the maximal number of repeats over all subpatterns.

In computer programming, a bitwise rotation, also known as a circular shift, is a bitwise operation that shifts all bits of its operand. Unlike an arithmetic shift, a circular shift does not preserve a number's sign bit or distinguish a floating-point number's exponent from its significand. Unlike a logical shift, the vacant bit positions are not filled in with zeros but are filled in with the bits that are shifted out of the sequence.

Implementing circular shifts

Circular shifts are used often in cryptography in order to permute bit sequences. Unfortunately, many programming languages, including C, do not have operators or standard functions for circular shifting, even though virtually all processors have bitwise operation instructions for it (e.g. Intel x86 has ROL and ROR). However, some compilers may provide access to the processor instructions by means of intrinsic functions. In addition, some constructs in standard ANSI C code may be optimized by a compiler to the "rotate" assembly language instruction on CPUs that have such an instruction. Most C compilers recognize the following idiom, and compile it to a single 32-bit rotate instruction. [1] [2]

/* * Shift operations in C are only defined for shift values which are * not negative and smaller than sizeof(value) * CHAR_BIT. * The mask, used with bitwise-and (&), prevents undefined behaviour * when the shift count is 0 or >= the width of unsigned int. */#include<stdint.h>  // for uint32_t, to get 32-bit-wide rotates, regardless of the size of int.#include<limits.h>  // for CHAR_BITuint32_trotl32(uint32_tvalue,unsignedintcount){constunsignedintmask=CHAR_BIT*sizeof(value)-1;count&=mask;return(value<<count)|(value>>(-count&mask));}uint32_trotr32(uint32_tvalue,unsignedintcount){constunsignedintmask=CHAR_BIT*sizeof(value)-1;count&=mask;return(value>>count)|(value<<(-count&mask));}

This safe and compiler-friendly implementation was developed by John Regehr, [3] and further polished by Peter Cordes. [4] [5]

A simpler version is often seen when the count is limited to the range of 1 to 31 bits:

uint32_trotl32(uint32_tvalue,unsignedintcount){return(value<<count)|(value>>(32-count));}

This version is dangerous because if the count is 0 or 32, it asks for a 32-bit shift, which is undefined behaviour in the C language standard. However, it tends to work anyway, because most microprocessors implement value >> 32 as either a 32-bit shift (producing 0) or a 0-bit shift (producing the original value), and either one produces the correct result in this application.

Example

If the bit sequence 0001 0111 were subjected to a circular shift of one bit position... (see images below)

  • to the left would yield: 0010 1110
Left circular shift. Rotate left.svg
Left circular shift.
  • to the right would yield: 1000 1011.
Right circular shift. Rotate right.svg
Right circular shift.

If the bit sequence 1001 0110 were subjected to the following operations:

left circular shift by 1 position:0010 1101       
left circular shift by 2 positions:0101 1010
left circular shift by 3 positions:1011 0100
left circular shift by 4 positions:0110 1001
left circular shift by 5 positions:1101 0010
left circular shift by 6 positions:1010 0101
left circular shift by 7 positions:0100 1011
left circular shift by 8 positions:1001 0110
right circular shift by 1 position:0100 1011
right circular shift by 2 positions:1010 0101
right circular shift by 3 positions:1101 0010
right circular shift by 4 positions:0110 1001
right circular shift by 5 positions:1011 0100
right circular shift by 6 positions:0101 1010
right circular shift by 7 positions:0010 1101
right circular shift by 8 positions:1001 0110

Applications

Cyclic codes are a kind of block code with the property that the circular shift of a codeword will always yield another codeword. This motivates the following general definition: For a string s over an alphabet Σ, let shift(s) denote the set of circular shifts of s, and for a set L of strings, let shift(L) denote the set of all circular shifts of strings in L. If L is a cyclic code, then shift(L) ⊆ L; this is a necessary condition for L being a cyclic language. The operation shift(L) has been studied in formal language theory. For instance, if L is a context-free language, then shift(L) is again context-free. [6] [7] Also, if L is described by a regular expression of length n, there is a regular expression of length O(n3) describing shift(L). [8]

See also

Related Research Articles

In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral at the level of its individual bits. It is a fast and simple action, basic to the higher-level arithmetic operations and directly supported by the processor. Most bitwise operations are presented as two-operand instructions where the result replaces one of the input operands.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in the C language. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

<span class="mw-page-title-main">Pointer (computer programming)</span> Object which stores memory addresses in a computer program

In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer references a location in memory, and obtaining the value stored at that location is known as dereferencing the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying computer architecture.

In computer science, a mask or bitmask is data that is used for bitwise operations, particularly in a bit field. Using a mask, multiple bits in a byte, nibble, word, etc. can be set either on or off, or inverted from on to off in a single bitwise operation. An additional use of masking involves predication in vector processing, where the bitmask is used to select which element operations in the vector are to be executed and which are not.

In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres. This is different from unspecified behavior, for which the language specification does not prescribe a result, and implementation-defined behavior that defers to the documentation of another component of the platform.

<span class="mw-page-title-main">XTEA</span> Block cipher

In cryptography, XTEA is a block cipher designed to correct weaknesses in TEA. The cipher's designers were David Wheeler and Roger Needham of the Cambridge Computer Laboratory, and the algorithm was presented in an unpublished technical report in 1997. It is not subject to any patents.

The Hamming weight of a string is the number of symbols that are different from the zero-symbol of the alphabet used. It is thus equivalent to the Hamming distance from the all-zero string of the same length. For the most typical case, a string of bits, this is the number of 1's in the string, or the digit sum of the binary representation of a given number and the ₁ norm of a bit vector. In this binary case, it is also called the population count, popcount, sideways sum, or bit summation.

de Bruijn sequence Cycle through all length-k sequences

In combinatorial mathematics, a de Bruijn sequence of order n on a size-k alphabet A is a cyclic sequence in which every possible length-n string on A occurs exactly once as a substring (i.e., as a contiguous subsequence). Such a sequence is denoted by B(k, n) and has length kn, which is also the number of distinct strings of length n on A. Each of these distinct strings, when taken as a substring of B(k, n), must start at a different position, because substrings starting at the same position are not distinct. Therefore, B(k, n) must have at leastkn symbols. And since B(k, n) has exactlykn symbols, de Bruijn sequences are optimally short with respect to the property of containing every string of length n at least once.

In computer science, a logical shift is a bitwise operation that shifts all the bits of its operand. The two base variants are the logical left shift and the logical right shift. This is further modulated by the number of bit positions a given value shall be shifted, such as shift left by 1 or shift right by n. Unlike an arithmetic shift, a logical shift does not preserve a number's sign bit or distinguish a number's exponent from its significand (mantissa); every bit in the operand is simply moved a given number of bit positions, and the vacant bit-positions are filled, usually with zeros, and possibly ones.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

<span class="mw-page-title-main">C data types</span> Data types supported by the C programming language

In the C programming language, data types constitute the semantics and characteristics of storage of data elements. They are expressed in the language syntax in form of declarations for memory locations or variables. Data types also determine the types of operations or methods of processing of data elements.

The bitap algorithm is an approximate string matching algorithm. The algorithm tells whether a given text contains a substring which is "approximately equal" to a given pattern, where approximate equality is defined in terms of Levenshtein distance – if the substring and pattern are within a given distance k of each other, then the algorithm considers them equal. The algorithm begins by precomputing a set of bitmasks containing one bit for each element of the pattern. Then it is able to do most of the work with bitwise operations, which are extremely fast.

Data structure alignment is the way data is arranged and accessed in computer memory. It consists of three separate but related issues: data alignment, data structure padding, and packing.

A bit field is a data structure that consists of one or more adjacent bits which have been allocated for specific purposes, so that any single bit or group of bits within the structure can be set or inspected. A bit field is most commonly used to represent integral types of known, fixed bit-width, such as single-bit Booleans.

sizeof is a unary operator in the programming languages C and C++. It generates the storage size of an expression or a data type, measured in the number of char-sized units. Consequently, the construct sizeof (char) is guaranteed to be 1. The actual number of bits of type char is specified by the preprocessor macro CHAR_BIT, defined in the standard include file limits.h. On most modern computing platforms this is eight bits. The result of sizeof has an unsigned integer type that is usually denoted by size_t.

In computer science, a type punning is any programming technique that subverts or circumvents the type system of a programming language in order to achieve an effect that would be difficult or impossible to achieve within the bounds of the formal language.

In computing, the producer-consumer problem is a family of problems described by Edsger W. Dijkstra since 1965.

In computer software and hardware, find first set (ffs) or find first one is a bit operation that, given an unsigned machine word, designates the index or position of the least significant bit set to one in the word counting from the least significant bit position. A nearly equivalent operation is count trailing zeros (ctz) or number of trailing zeros (ntz), which counts the number of zero bits following the least significant one bit. The complementary operation that finds the index or position of the most significant set bit is log base 2, so called because it computes the binary logarithm ⌊log2(x)⌋. This is closely related to count leading zeros (clz) or number of leading zeros (nlz), which counts the number of zero bits preceding the most significant one bit. There are two common variants of find first set, the POSIX definition which starts indexing of bits at 1, herein labelled ffs, and the variant which starts indexing of bits at zero, which is equivalent to ctz and so will be called by that name.

In the C programming language, operations can be performed on a bit level using bitwise operators.

A permuted congruential generator (PCG) is a pseudorandom number generation algorithm developed in 2014 by Dr. M.E. O'Neill which applies an output permutation function to improve the statistical properties of a modulo-2n linear congruential generator. It achieves excellent statistical performance with small and fast code, and small state size.

References

  1. GCC: "Optimize common rotate constructs"
  2. "Cleanups in ROTL/ROTR DAG combiner code" mentions that this code supports the "rotate" instruction in the CellSPU
  3. Safe, Efficient, and Portable Rotate in C/C++
  4. Stackoverflow: Best practices for rotates in C/C++
  5. Near constant time rotate that does not violate the standards
  6. T. Oshiba, "Closure property of the family of context-free languages under the cyclic shift operation", Transactions of IECE, 55D:119–122, 1972.
  7. A. N. Maslov, "Cyclic shift operation for languages", Problems of Information Transmission 9:333–338, 1973.
  8. Gruber, Hermann; Holzer, Markus (2009). "Language operations with regular expressions of polynomial size" (PDF). Theoretical Computer Science. 410 (35): 3281–3289. doi: 10.1016/j.tcs.2009.04.009 . Zbl   1176.68105. Archived from the original (PDF) on 2017-08-29. Retrieved 2012-09-06..