Arithmetic shift

Last updated
A right arithmetic shift of a binary number by 1. The empty position in the most significant bit is filled with a copy of the original MSB. Rotate right arithmetically.svg
A right arithmetic shift of a binary number by 1. The empty position in the most significant bit is filled with a copy of the original MSB.
A left arithmetic shift of a binary number by 1. The empty position in the least significant bit is filled with a zero. Rotate left logically.svg
A left arithmetic shift of a binary number by 1. The empty position in the least significant bit is filled with a zero.
Arithmetic shift operators in various programming languages and processors
Language or processorLeftRight
ActionScript 3, Java, JavaScript, Python,
PHP, Ruby, C, C++, [1] D, C#, Go, Julia,
Rust [2] (signed types only),
Swift (signed types only) [note 1]
<< >>
Ada Shift_Left [3] Shift_Right_Arithmetic
Kotlin shl shr
Fortran SHIFTL SHIFTA [note 2]
Standard ML << ~>>
Verilog <<< >>> [note 3]
OpenVMS macro language@ [note 4]
Scheme arithmetic-shift [note 5]
Common Lisp ash
OCaml lslasr
Haskell Data.Bits.shift [note 6]
Assembly, 68k ASLASR
Assembly, x86 SALSAR
VHDL sla [note 7] sra
RISC-V sll, slli [5] sra, srai
Z80 SLA [6] SRA

In computer programming, an arithmetic shift is a shift operator, sometimes termed a signed shift (though it is not restricted to signed operands). The two basic types are the arithmetic left shift and the arithmetic right shift. For binary numbers it is a bitwise operation that shifts all of the bits of its operand; every bit in the operand is simply moved a given number of bit positions, and the vacant bit-positions are filled in. Instead of being filled with all 0s, as in logical shift, when shifting to the right, the leftmost bit (usually the sign bit in signed integer representations) is replicated to fill in all the vacant positions (this is a kind of sign extension).

Contents

Some authors prefer the terms sticky right-shift and zero-fill right-shift for arithmetic and logical shifts respectively. [7]

Arithmetic shifts can be useful as efficient ways to perform multiplication or division of signed integers by powers of two. Shifting left by n bits on a signed or unsigned binary number has the effect of multiplying it by 2n. Shifting right by n bits on a two's complement signed binary number has the effect of dividing it by 2n, but it always rounds down (towards negative infinity). This is different from the way rounding is usually done in signed integer division (which rounds towards 0). This discrepancy has led to bugs in a number of compilers. [8]

For example, in the x86 instruction set, the SAR instruction (arithmetic right shift) divides a signed number by a power of two, rounding towards negative infinity. [9] However, the IDIV instruction (signed divide) divides a signed number, rounding towards zero. So a SAR instruction cannot be substituted for an IDIV by power of two instruction nor vice versa.

Formal definition

The formal definition of an arithmetic shift, from Federal Standard 1037C is that it is:

A shift, applied to the representation of a number in a fixed radix numeration system and in a fixed-point representation system, and in which only the characters representing the fixed-point part of the number are moved. An arithmetic shift is usually equivalent to multiplying the number by a positive or a negative integral power of the radix, except for the effect of any rounding; compare the logical shift with the arithmetic shift, especially in the case of floating-point representation.

An important word in the FS 1073C definition is "usually".

Equivalence of arithmetic and logical left shifts and multiplication

Arithmetic left shifts are equivalent to multiplication by a (positive, integral) power of the radix (e.g., a multiplication by a power of 2 for binary numbers). Logical left shifts are also equivalent, except multiplication and arithmetic shifts may trigger arithmetic overflow whereas logical shifts do not [ citation needed ].

Non-equivalence of arithmetic right shift and division

However, arithmetic right shifts are major traps for the unwary, specifically in treating rounding of negative integers. For example, in the usual two's complement representation of negative integers, −1 is represented as all 1's. For an 8-bit signed integer this is 1111 1111. An arithmetic right-shift by 1 (or 2, 3, ..., 7) yields 1111 1111 again, which is still −1. This corresponds to rounding down (towards negative infinity), but is not the usual convention for division.

It is frequently stated that arithmetic right shifts are equivalent to division by a (positive, integral) power of the radix (e.g., a division by a power of 2 for binary numbers), and hence that division by a power of the radix can be optimized by implementing it as an arithmetic right shift. (A shifter is much simpler than a divider. On most processors, shift instructions will execute faster than division instructions.) Large number of 1960s and 1970s programming handbooks, manuals, and other specifications from companies and institutions such as DEC, IBM, Data General, and ANSI make such incorrect statements [10] [ page needed ].

Logical right shifts are equivalent to division by a power of the radix (usually 2) only for positive or unsigned numbers. Arithmetic right shifts are equivalent to logical right shifts for positive signed numbers. Arithmetic right shifts for negative numbers in N1's complement (usually two's complement) is roughly equivalent to division by a power of the radix (usually 2), where for odd numbers rounding downwards is applied (not towards 0 as usually expected).

Arithmetic right shifts for negative numbers are equivalent to division using rounding towards 0 in one's complement representation of signed numbers as was used by some historic computers, but this is no longer in general use.

Handling the issue in programming languages

The (1999) ISO standard for the programming language C defines the right shift operator in terms of divisions by powers of 2. [11] Because of the above-stated non-equivalence, the standard explicitly excludes from that definition the right shifts of signed numbers that have negative values. It does not specify the behaviour of the right shift operator in such circumstances, but instead requires each individual C compiler to define the behaviour of shifting negative values right. [note 8]

Like C, C++ had an implementation-defined right shift for signed integers until C++20. Starting in the C++20 standard, right shift of a signed integer is defined to be an arithmetic shift. [13]

Applications

In applications where consistent rounding down is desired, arithmetic right shifts for signed values are useful. An example is in downscaling raster coordinates by a power of two, which maintains even spacing. For example, right shift by 1 sends 0, 1, 2, 3, 4, 5, ... to 0, 0, 1, 1, 2, 2, ..., and −1, −2, −3, −4, ... to −1, −1, −2, −2, ..., maintaining even spacing as −2, −2, −1, −1, 0, 0, 1, 1, 2, 2, ... In contrast, integer division with rounding towards zero sends −1, 0, and 1 all to 0 (3 points instead of 2), yielding −2, −1, −1, 0, 0, 0, 1, 1, 2, 2, ... instead, which is irregular at 0.

Notes

  1. The >> operator in C and C++ is not necessarily an arithmetic shift. Usually it is only an arithmetic shift if used with a signed integer type on its left-hand side. If it is used on an unsigned integer type instead, it will be a logical shift.
  2. Fortran 2008.
  3. The Verilog arithmetic right shift operator only actually performs an arithmetic shift if the first operand is signed. If the first operand is unsigned, the operator actually performs a logical right shift.
  4. In the OpenVMS macro language, whether an arithmetic shift is left or right is determined by whether the second operand is positive or negative. This is unusual. In most programming languages the two directions have distinct operators, with the operator specifying the direction, and the second operand is implicitly positive. (Some languages, such as Verilog, require that negative values be converted to unsigned positive values. Some languages, such as C and C++, have no defined behaviour if negative values are used.) [4] [ page needed ]
  5. In Scheme arithmetic-shift can be both left and right shift, depending on the second operand, very similar to the OpenVMS macro language, although R6RS Scheme adds both -right and -left variants.
  6. The Bits class from Haskell's Data.Bits module defines both shift taking a signed argument and shiftL/shiftR taking unsigned arguments. These are isomorphic; for new definitions the programmer need provide only one of the two forms and the other form will be automatically defined in terms of the provided one.
  7. The VHDL arithmetic left shift operator is unusual. Instead of filling the LSB of the result with zero, it copies the original LSB into the new LSB. While this is an exact mirror image of the arithmetic right shift, it is not the conventional definition of the operator, and is not equivalent to multiplication by a power of 2. In the VHDL 2008 standard this strange behavior was left unchanged (for backward compatibility) for argument types that do not have forced numeric interpretation (e.g., BIT_VECTOR) but 'SLA' for unsigned and signed argument types behaves in the expected way (i.e., rightmost positions are filled with zeros). VHDL's shift left logical (SLL) function does implement the aforementioned 'standard' arithmetic shift.
  8. The C standard was intended to not restrict the C language to either ones' complement or two's complement architectures. In cases where the behaviours of ones' complement and two's complement representations differ, such as this, the standard requires individual C compilers to document the behaviour of their target architectures. The documentation for GNU Compiler Collection (GCC), for example, documents its behaviour as employing sign-extension. [12]

Related Research Articles

In computer science, an integer is a datum of integral data type, a data type that represents some range of mathematical integers. Integral data types may be of different sizes and may or may not be allowed to contain negative values. Integers are commonly represented in a computer as a group of binary digits (bits). The size of the grouping varies so the set of integer sizes available varies between different types of computers. Computer hardware nearly always provides a way to represent a processor register or memory address as an integer.

A binary number is a number expressed in the base-2 numeral system or binary numeral system, a method of mathematical expression which uses only two symbols: typically "0" (zero) and "1" (one).

In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral at the level of its individual bits. It is a fast and simple action, basic to the higher-level arithmetic operations and directly supported by the processor. Most bitwise operations are presented as two-operand instructions where the result replaces one of the input operands.

Two's complement is the most common method of representing signed integers on computers, and more generally, fixed point binary values. Two's complement uses the binary digit with the greatest place value as the sign to indicate whether the binary number is positive or negative. When the most significant bit is 1, the number is signed as negative; and when the most significant bit is 0 the number is signed as positive.

<span class="mw-page-title-main">Method of complements</span> Method of subtraction

In mathematics and computing, the method of complements is a technique to encode a symmetric range of positive and negative integers in a way that they can use the same algorithm for addition throughout the whole range. For a given number of places half of the possible representations of numbers encode the positive numbers, the other half represents their respective additive inverses. The pairs of mutually additive inverse numbers are called complements. Thus subtraction of any number is implemented by adding its complement. Changing the sign of any number is encoded by generating its complement, which can be done by a very simple and efficient algorithm. This method was commonly used in mechanical calculators and is still used in modern computers. The generalized concept of the radix complement is also valuable in number theory, such as in Midy's theorem.

In computing, fixed-point is a method of representing fractional (non-integer) numbers by storing a fixed number of digits of their fractional part. Dollar amounts, for example, are often stored with exactly two fractional digits, representing the cents. More generally, the term may refer to representing fractional values as integer multiples of some fixed small unit, e.g. a fractional amount of hours as an integer multiple of ten-minute intervals. Fixed-point number representation is often contrasted to the more complicated and computationally demanding floating-point representation.

In computing, signed number representations are required to encode negative numbers in binary number systems.

In computing, signedness is a property of data types representing numbers in computer programs. A numeric variable is signed if it can represent both positive and negative numbers, and unsigned if it can only represent non-negative numbers.

Saturation arithmetic is a version of arithmetic in which all operations, such as addition and multiplication, are limited to a fixed range between a minimum and maximum value.

In computer science, a logical shift is a bitwise operation that shifts all the bits of its operand. The two base variants are the logical left shift and the logical right shift. This is further modulated by the number of bit positions a given value shall be shifted, such as shift left by 1 or shift right by n. Unlike an arithmetic shift, a logical shift does not preserve a number's sign bit or distinguish a number's exponent from its significand (mantissa); every bit in the operand is simply moved a given number of bit positions, and the vacant bit-positions are filled, usually with zeros, and possibly ones.

<span class="mw-page-title-main">Integer overflow</span> Computer arithmetic error

In computer programming, an integer overflow occurs when an arithmetic operation attempts to create a numeric value that is outside of the range that can be represented with a given number of digits – either higher than the maximum or lower than the minimum representable value.

Bit manipulation is the act of algorithmically manipulating bits or other pieces of data shorter than a word. Computer programming tasks that require bit manipulation include low-level device control, error detection and correction algorithms, data compression, encryption algorithms, and optimization. For most other tasks, modern programming languages allow the programmer to work directly with abstractions instead of bits that represent those abstractions.

Signed zero is zero with an associated sign. In ordinary arithmetic, the number 0 does not have a sign, so that −0, +0 and 0 are identical. However, in computing, some number representations allow for the existence of two zeros, often denoted by −0 and +0, regarded as equal by the numerical comparison operations but with possible different behaviors in particular operations. This occurs in the sign-magnitude and ones' complement signed number representations for integers, and in most floating-point number representations. The number 0 is usually encoded as +0, but can be represented by either +0 or −0.

In computer processors, the overflow flag is usually a single bit in a system status register used to indicate when an arithmetic overflow has occurred in an operation, indicating that the signed two's-complement result would not fit in the number of bits used for the result. Some architectures may be configured to automatically generate an exception on an operation resulting in overflow.

The Q notation is a way to specify the parameters of a binary fixed point number format. For example, in Q notation, the number format denoted by Q8.8 means that the fixed point numbers in this format have 8 bits for the integer part and 8 bits for the fraction part.

A binary multiplier is an electronic circuit used in digital electronics, such as a computer, to multiply two binary numbers.

A redundant binary representation (RBR) is a numeral system that uses more bits than needed to represent a single binary digit so that most numbers have several representations. An RBR is unlike usual binary numeral systems, including two's complement, which use a single bit for each digit. Many of an RBR's properties differ from those of regular binary representation systems. Most importantly, an RBR allows addition without using a typical carry. When compared to non-redundant representation, an RBR makes bitwise logical operation slower, but arithmetic operations are faster when a greater bit width is used. Usually, each digit has its own sign that is not necessarily the same as the sign of the number represented. When digits have signs, that RBR is also a signed-digit representation.

<span class="mw-page-title-main">Arithmetic logic unit</span> Combinational digital circuit

In computing, an arithmetic logic unit (ALU) is a combinational digital circuit that performs arithmetic and bitwise operations on integer binary numbers. This is in contrast to a floating-point unit (FPU), which operates on floating point numbers. It is a fundamental building block of many types of computing circuits, including the central processing unit (CPU) of computers, FPUs, and graphics processing units (GPUs).

In computer software and hardware, find first set (ffs) or find first one is a bit operation that, given an unsigned machine word, designates the index or position of the least significant bit set to one in the word counting from the least significant bit position. A nearly equivalent operation is count trailing zeros (ctz) or number of trailing zeros (ntz), which counts the number of zero bits following the least significant one bit. The complementary operation that finds the index or position of the most significant set bit is log base 2, so called because it computes the binary logarithm ⌊log2(x)⌋. This is closely related to count leading zeros (clz) or number of leading zeros (nlz), which counts the number of zero bits preceding the most significant one bit. There are two common variants of find first set, the POSIX definition which starts indexing of bits at 1, herein labelled ffs, and the variant which starts indexing of bits at zero, which is equivalent to ctz and so will be called by that name.

In the C programming language, operations can be performed on a bit level using bitwise operators.

References

Cross-reference

  1. "Bit manipulation - Dlang Tour". tour.dlang.org. Retrieved 2019-06-23.
  2. "Operator Expressions: Arithmetic and Logical Binary Operators". doc.rust-lang.org. Retrieved 2022-11-13.
  3. "Annotated Ada 2012 Reference Manual".
  4. HP 2001.
  5. "The RISC-V Instruction Set Manual, Volume I: Unprivileged ISA" (PDF). GitHub . 2019-12-13. pp. 18–20. Archived (PDF) from the original on 2022-10-09. Retrieved 2021-08-07.
  6. "Z80 Assembler Syntax".
  7. Thomas R. Cain and Alan T. Sherman. "How to break Gifford's cipher". Section 8.1: "Sticky versus Non-Sticky Bit-shifting". Cryptologia. 1997.
  8. Steele Jr, Guy. "Arithmetic Shifting Considered Harmful" (PDF). MIT AI Lab. Archived (PDF) from the original on 2022-10-09. Retrieved 20 May 2013.
  9. Hyde 1996, § 6.6.2.2 SAR.
  10. Steele 1977.
  11. ISOIEC9899 1999, § 6.5.7 Bitwise shift operators.
  12. FSF 2008, § 4.5 Integers implementation.
  13. ISOCPP20 2020, § 7.6.7 Shift operators.

Sources used

PD-icon.svg This article incorporates public domain material from Federal Standard 1037C. General Services Administration. Archived from the original on 2022-01-22.