A division algorithm is an algorithm which, given two integers N and D (respectively the numerator and the denominator), computes their quotient and/or remainder, the result of Euclidean division. Some are applied by hand, while others are employed by digital circuit designs and software.
Division algorithms fall into two main categories: slow division and fast division. Slow division algorithms produce one digit of the final quotient per iteration. Examples of slow division include restoring, non-performing restoring, non-restoring, and SRT division. Fast division methods start with a close approximation to the final quotient and produce twice as many digits of the final quotient on each iteration. [1] Newton–Raphson and Goldschmidt algorithms fall into this category.
Variants of these algorithms allow the use of fast multiplication algorithms. As a result, for large integers, the computer time needed for a division is the same, up to a constant factor, as the time needed for a multiplication, whichever multiplication algorithm is used.
Discussion will refer to the form N/D = (Q, R), where
N = numerator (dividend) and D = denominator (divisor) are the inputs, and
Q = quotient and R = remainder are the outputs.
The simplest division algorithm, historically incorporated into a greatest common divisor algorithm presented in Euclid's Elements, Book VII, Proposition 1, finds the remainder given two positive integers using only subtractions and comparisons:
R := N
Q := 0
while R ≥ D do
  R := R − D
  Q := Q + 1
end
return (Q, R)
The proof that the quotient and remainder exist and are unique (described at Euclidean division) gives rise to a complete division algorithm, applicable to both negative and positive numbers, using additions, subtractions, and comparisons:
function divide(N, D)
  if D = 0 then error(DivisionByZero) end
  if D < 0 then (Q, R) := divide(N, −D); return (−Q, R) end
  if N < 0 then
    (Q, R) := divide(−N, D)
    if R = 0 then return (−Q, 0)
    else return (−Q − 1, D − R) end
  end
  -- At this point, N ≥ 0 and D > 0
  return divide_unsigned(N, D)
end

function divide_unsigned(N, D)
  Q := 0; R := N
  while R ≥ D do
    Q := Q + 1
    R := R − D
  end
  return (Q, R)
end
This procedure always produces R ≥ 0. Although very simple, it takes Ω(Q) steps, and so is exponentially slower than even slow division algorithms like long division. It is useful if Q is known to be small (being an output-sensitive algorithm), and can serve as an executable specification.
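As an executable specification, the pseudocode above translates directly into Python; the following is a minimal sketch whose function names simply mirror the pseudocode:

def divide_unsigned(n, d):
    """Repeated subtraction; takes Theta(q) iterations, so only practical for small quotients."""
    q, r = 0, n
    while r >= d:
        r -= d
        q += 1
    return q, r

def divide(n, d):
    """Euclidean division: returns (q, r) with n == q*d + r and 0 <= r < abs(d)."""
    if d == 0:
        raise ZeroDivisionError("division by zero")
    if d < 0:
        q, r = divide(n, -d)
        return -q, r
    if n < 0:
        q, r = divide(-n, d)
        return (-q, 0) if r == 0 else (-q - 1, d - r)
    return divide_unsigned(n, d)   # here n >= 0 and d > 0

assert divide(-7, 2) == (-4, 1)    # -7 == -4*2 + 1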
Long division is the standard algorithm used for pen-and-paper division of multi-digit numbers expressed in decimal notation. It shifts gradually from the left to the right end of the dividend, subtracting the largest possible multiple of the divisor (at the digit level) at each stage; the multiples then become the digits of the quotient, and the final difference is then the remainder.
When used with a binary radix, this method forms the basis for the (unsigned) integer division with remainder algorithm below. Short division is an abbreviated form of long division suitable for one-digit divisors. Chunking – also known as the partial quotients method or the hangman method – is a less-efficient form of long division which may be easier to understand. By allowing one to subtract more multiples of the divisor than the partial remainder allows at a given stage, more freeform variants of long division can be developed as well.
The following algorithm, the binary version of the famous long division, will divide N by D, placing the quotient in Q and the remainder in R. In the following pseudo-code, all values are treated as unsigned integers.
if D = 0 then error(DivisionByZeroException) end
Q := 0                   -- Initialize quotient and remainder to zero
R := 0
for i := n − 1 .. 0 do   -- Where n is number of bits in N
  R := R << 1            -- Left-shift R by 1 bit
  R(0) := N(i)           -- Set the least-significant bit of R equal to bit i of the numerator
  if R ≥ D then
    R := R − D
    Q(i) := 1
  end
end
If we take N = 1100₂ (12₁₀) and D = 100₂ (4₁₀):
Step 1: Set R=0 and Q=0
Step 2: Take i=3 (one less than the number of bits in N)
Step 3: R=00 (left shifted by 1)
Step 4: R=01 (setting R(0) to N(i))
Step 5: R < D, so skip statement
Step 2: Set i=2
Step 3: R=010
Step 4: R=011
Step 5: R < D, statement skipped
Step 2: Set i=1
Step 3: R=0110
Step 4: R=0110
Step 5: R ≥ D, statement entered
Step 5b: R=10 (R−D)
Step 5c: Q=10 (setting Q(i) to 1)
Step 2: Set i=0
Step 3: R=100
Step 4: R=100
Step 5: R ≥ D, statement entered
Step 5b: R=0 (R−D)
Step 5c: Q=11 (setting Q(i) to 1)
end
Q = 11₂ (3₁₀) and R = 0.
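The same procedure can be written compactly in Python; the following sketch assumes non-negative integer inputs, with bit_length playing the role of n:

def long_divide(n, d):
    """Binary long division of unsigned integers; returns (q, r)."""
    if d == 0:
        raise ZeroDivisionError("division by zero")
    q, r = 0, 0
    for i in range(n.bit_length() - 1, -1, -1):
        r = (r << 1) | ((n >> i) & 1)   # bring down bit i of the numerator
        if r >= d:
            r -= d
            q |= 1 << i                 # set bit i of the quotient
    return q, r

assert long_divide(0b1100, 0b100) == (0b11, 0)   # the worked example: 12 / 4 = 3 r 0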
Slow division methods are all based on a standard recurrence equation [2]

R_{j+1} = B × R_j − q_{n−(j+1)} × D,

where:
R_j is the j-th partial remainder of the division,
B is the radix (base, usually 2 internally in computers and calculators),
q_{n−(j+1)} is the digit of the quotient in position n−(j+1), where the digit positions are numbered from least-significant 0 to most-significant n−1,
n is the number of digits in the quotient, and
D is the denominator.
Restoring division operates on fixed-point fractional numbers and depends on the assumption 0 < D < N.[citation needed]
The quotient digits q are formed from the digit set {0,1}.
The basic algorithm for binary (radix 2) restoring division is:
R := N
D := D << n              -- R and D need twice the word width of N and Q
for i := n − 1 .. 0 do   -- For example 31..0 for 32 bits
  R := 2 * R − D         -- Trial subtraction from shifted value (multiplication by 2 is a shift in binary representation)
  if R ≥ 0 then
    q(i) := 1            -- Result-bit 1
  else
    q(i) := 0            -- Result-bit 0
    R := R + D           -- New partial remainder is (restored) shifted value
  end
end
-- Where: N = numerator, D = denominator, n = #bits, R = partial remainder, q(i) = bit #i of quotient
Non-performing restoring division is similar to restoring division except that the value of 2R is saved, so D does not need to be added back in for the case of R < 0.
Non-restoring division uses the digit set {−1, 1} for the quotient digits instead of {0, 1}. The algorithm is more complex, but has the advantage when implemented in hardware that there is only one decision and addition/subtraction per quotient bit; there is no restoring step after the subtraction, [3] which potentially halves the number of operations and allows faster execution. [4] The basic algorithm for binary (radix 2) non-restoring division of non-negative numbers is:
R:=ND:=D<<n-- R and D need twice the word width of N and Qfori=n−1..0do-- for example 31..0 for 32 bitsifR>=0thenq(i):=+1R:=2*R−Delseq(i):=−1R:=2*R+Dendifend-- Note: N=numerator, D=denominator, n=#bits, R=partial remainder, q(i)=bit #i of quotient.
Following this algorithm, the quotient is in a non-standard form consisting of digits of −1 and +1. This form needs to be converted to binary to form the final quotient. Example:
Convert the following quotient to the digit set {0, 1}:
Start:                       Q = 1 1 −1 −1 −1   (digits in {−1, +1})
1. Form the positive term:   P = 11000₂
2. Mask the negative term*:  M = 00111₂
3. Subtract P − M:           Q = 10001₂
*(Signed binary notation with ones' complement without two's complement)
If the −1 digits of Q are stored as zeros (0) as is common, then P is Q and computing M is trivial: perform a ones' complement (bit-by-bit complement) on the original Q.
Q := Q − bit.bnot(Q)   -- Appropriate if −1 digits in Q are represented as zeros as is common.
Finally, quotients computed by this algorithm are always odd, and the remainder in R is in the range −D ≤ R < D. For example, 5 / 2 = 3 R −1. To convert to a positive remainder, do a single restoring step after Q is converted from non-standard form to standard form:
if R < 0 then
  Q := Q − 1
  R := R + D   -- Needed only if the remainder is of interest.
end if
The actual remainder is R >> n. (As with restoring division, the low-order bits of R are used up at the same rate as bits of the quotient Q are produced, and it is common to use a single shift register for both.)
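Putting the loop, the digit-set conversion, and the final restoring step together, a Python sketch of radix-2 non-restoring division (with −1 digits held as 0 bits, as described above, and assuming the quotient fits in the given number of bits) might read:

def nonrestoring_divide(n, d, bits):
    """Radix-2 non-restoring division of unsigned n by d; returns (q, r)."""
    r = n
    ds = d << bits                  # R and D use twice the word width of N and Q
    q = 0
    for i in range(bits - 1, -1, -1):
        if r >= 0:
            q |= 1 << i             # quotient digit +1, stored as bit 1
            r = 2 * r - ds
        else:                       # quotient digit -1, stored as bit 0
            r = 2 * r + ds
    q -= (~q) & ((1 << bits) - 1)   # Q := Q - bnot(Q): non-standard form to binary
    if r < 0:                       # single restoring step for a non-negative remainder
        q -= 1
        r += ds
    return q, r >> bits

assert nonrestoring_divide(5, 2, 3) == (2, 1)   # 5 / 2 = 2 r 1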
SRT division is a popular method for division in many microprocessor implementations. [5] [6] The algorithm is named after D. W. Sweeney of IBM, James E. Robertson of University of Illinois, and K. D. Tocher of Imperial College London. They all developed the algorithm independently at approximately the same time (published in February 1957, September 1958, and January 1958 respectively). [7] [8] [9]
SRT division is similar to non-restoring division, but it uses a lookup table based on the dividend and the divisor to determine each quotient digit.
The most significant difference is that a redundant representation is used for the quotient. For example, when implementing radix-4 SRT division, each quotient digit is chosen from five possibilities: { −2, −1, 0, +1, +2 }. Because of this, the choice of a quotient digit need not be perfect; later quotient digits can correct for slight errors. (For example, the quotient digit pairs (0, +2) and (1, −2) are equivalent, since 0×4+2 = 1×4−2.) This tolerance allows quotient digits to be selected using only a few most-significant bits of the dividend and divisor, rather than requiring a full-width subtraction. This simplification in turn allows a radix higher than 2 to be used.
Like non-restoring division, the final steps are a final full-width subtraction to resolve the last quotient bit, and conversion of the quotient to standard binary form.
The Intel Pentium processor's infamous floating-point division bug was caused by an incorrectly coded lookup table. Five of the 1066 entries had been mistakenly omitted. [10] [11]
Newton–Raphson uses Newton's method to find the reciprocal of D and multiplies that reciprocal by N to find the final quotient Q.
The steps of Newton–Raphson division are:
1. Calculate an estimate X_0 for the reciprocal 1/D of the divisor D.
2. Compute successively more accurate estimates X_1, X_2, … of the reciprocal. This is where one employs the Newton–Raphson method as such.
3. Compute the quotient by multiplying the dividend by the reciprocal of the divisor: Q = N × X_k.
In order to apply Newton's method to find the reciprocal of D, it is necessary to find a function f(X) that has a zero at X = 1/D. The obvious such function is f(X) = D X − 1, but the Newton–Raphson iteration for this is unhelpful, since it cannot be computed without already knowing the reciprocal of D (moreover it attempts to compute the exact reciprocal in one step, rather than allowing for iterative improvements). A function that does work is f(X) = 1/X − D, for which the Newton–Raphson iteration gives

X_{i+1} = X_i − f(X_i)/f′(X_i) = X_i + X_i (1 − D X_i) = X_i (2 − D X_i),

which can be calculated from X_i using only multiplication and subtraction, or using two fused multiply–adds.
From a computation point of view, the expressions X_i + X_i (1 − D X_i) and X_i (2 − D X_i) are not equivalent. To obtain a result with a precision of 2n bits while making use of the second expression, one must compute the product between X_i and (2 − D X_i) with double the given precision of X_i (2n bits).[citation needed] In contrast, the product between X_i and (1 − D X_i) need only be computed with a precision of n bits, because the leading n bits (after the binary point) of (1 − D X_i) are zeros.
If the error is defined as ε_i = 1 − D X_i, then:

ε_{i+1} = 1 − D X_{i+1} = 1 − D X_i (2 − D X_i) = (1 − D X_i)² = ε_i².
This squaring of the error at each iteration step – the so-called quadratic convergence of Newton–Raphson's method – has the effect that the number of correct digits in the result roughly doubles for every iteration, a property that becomes extremely valuable when the numbers involved have many digits (e.g. in the large integer domain). But it also means that the initial convergence of the method can be comparatively slow, especially if the initial estimate is poorly chosen.
For the subproblem of choosing an initial estimate X_0, it is convenient to apply a bit-shift to the divisor D to scale it so that 0.5 ≤ D ≤ 1; by applying the same bit-shift to the numerator N, one ensures the quotient does not change. Then one could use a linear approximation in the form

X_0 = T_1 + T_2 × D

to initialize Newton–Raphson. To minimize the maximum of the absolute value of the error of this approximation on the interval [0.5, 1], one should use

X_0 = 48/17 − (32/17) × D.
The coefficients of the linear approximation are determined as follows. The absolute value of the error is |ε_0| = |1 − D (T_1 + T_2 × D)|. The minimum of the maximum absolute value of the error is determined by the Chebyshev equioscillation theorem applied to F(D) = 1 − D (T_1 + T_2 × D). The local minimum of F(D) occurs when F′(D) = 0, which has solution D = −T_1/(2 T_2). The function at that minimum must be of opposite sign as the function at the endpoints, namely, F(1/2) = F(1) = −F(−T_1/(2 T_2)). The two equations in the two unknowns have a unique solution T_1 = 48/17 and T_2 = −32/17, and the maximum error is F(1) = 1/17. Using this approximation, the absolute value of the error of the initial value is less than

|ε_0| ≤ 1/17 ≈ 0.059.
It is possible to generate a polynomial fit of degree larger than 1, computing the coefficients using the Remez algorithm. The trade-off is that the initial guess requires more computational cycles but hopefully in exchange for fewer iterations of Newton–Raphson.
Since for this method the convergence is exactly quadratic, it follows that

S = ⌈ log₂( (P + 1) / log₂ 17 ) ⌉

steps are enough to calculate the value up to P binary places. This evaluates to 3 for IEEE single precision and 4 for both double precision and double extended formats.
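This bound is easy to check numerically; a small Python sketch using the precisions of the common IEEE formats:

import math

def nr_steps(p):
    """Newton-Raphson iterations needed for p binary places."""
    return math.ceil(math.log2((p + 1) / math.log2(17)))

assert nr_steps(24) == 3   # IEEE single precision
assert nr_steps(53) == 4   # IEEE double precision
assert nr_steps(64) == 4   # double extended format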
The following computes the quotient of N and D with a precision of P binary places:
Express D as M × 2^e where 1 ≤ M < 2   (standard floating point representation)
D' := D / 2^(e+1)                      -- scale between 0.5 and 1, can be performed with bit shift / exponent subtraction
N' := N / 2^(e+1)
X := 48/17 − 32/17 × D'                -- precompute constants with same precision as D
repeat ⌈ log₂( (P + 1) / log₂ 17 ) ⌉ times   -- can be precomputed based on fixed P
  X := X + X × (1 − D' × X)
end
return N' × X
For example, for a double-precision floating-point division, this method uses 10 multiplies, 9 adds, and 2 shifts.
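A floating-point rendering of this pseudocode in Python might look as follows; this is a sketch in which math.frexp supplies the scaling exponent and the inputs are assumed to be positive:

import math

def newton_raphson_divide(n, d, steps=4):
    """Approximate n/d by Newton-Raphson reciprocal iteration; assumes n, d > 0."""
    _, e = math.frexp(d)          # d = m * 2**e with 0.5 <= m < 1
    dp = d / 2.0**e               # scaled divisor D' in [0.5, 1)
    np_ = n / 2.0**e              # numerator scaled by the same power of two
    x = 48/17 - (32/17) * dp      # linear initial estimate of 1/D'
    for _ in range(steps):        # 4 steps suffice for double precision
        x += x * (1.0 - dp * x)   # quadratically convergent refinement
    return np_ * x

assert abs(newton_raphson_divide(355.0, 113.0) - 355.0/113.0) < 1e-12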
The Newton–Raphson division method can be modified to be slightly faster as follows. After shifting N and D so that D is in [0.5, 1.0], initialize with

X := 140/33 + D × (−64/11 + D × 256/99).

This is the best quadratic fit to 1/D and gives an absolute value of the error less than or equal to 1/99. It is chosen to make the error equal to a re-scaled third-order Chebyshev polynomial of the first kind. The coefficients should be pre-calculated and hard-coded.
Then in the loop, use an iteration which cubes the error:

E := 1 − D × X
Y := X × E
X := X + Y + Y × E

The Y×E term is new.
If the loop is performed until X agrees with 1/D on its leading P bits, then the number of iterations will be no more than

⌈ log₃( (P + 1) / log₂ 99 ) ⌉,

which is the number of times 99 must be cubed to get to 2^(P+1). Then

Q := N × X

is the quotient to P bits.
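The cubic variant differs from the quadratic sketch above only in its initialization and loop body; again a hedged sketch for positive inputs:

import math

def newton_raphson_divide_cubic(n, d, steps=2):
    """Approximate n/d with an iteration whose error cubes each step; n, d > 0."""
    _, e = math.frexp(d)
    dp, np_ = d / 2.0**e, n / 2.0**e             # scale so dp is in [0.5, 1)
    x = 140/33 + dp * (-64/11 + dp * (256/99))   # best quadratic fit to 1/dp
    for _ in range(steps):            # 2 steps: (1/99)**(3**2) < 2**-59
        err = 1.0 - dp * x
        y = x * err
        x = x + y + y * err           # the y*err term cubes the error
    return np_ * x

assert abs(newton_raphson_divide_cubic(355.0, 113.0) - 355.0/113.0) < 1e-12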
Using higher degree polynomials in either the initialization or the iteration results in a degradation of performance because the extra multiplications required would be better spent on doing more iterations.
Goldschmidt division [12] (after Robert Elliott Goldschmidt) [13] uses an iterative process of repeatedly multiplying both the dividend and divisor by a common factor F_i, chosen such that the divisor converges to 1. This causes the dividend to converge to the sought quotient Q:

Q = N/D × F_1/F_1 × F_2/F_2 × ⋯
The steps for Goldschmidt division are:
1. Generate an estimate for the multiplication factor F_i.
2. Multiply the dividend and divisor by F_i.
3. If the divisor is sufficiently close to 1, return the dividend; otherwise, loop to step 1.
Assuming N/D has been scaled so that 0 < D < 1, each F_i is based on D:

F_{i+1} = 2 − D_i.

Multiplying the dividend and divisor by the factor yields:

N_{i+1} / D_{i+1} = (N_i × F_{i+1}) / (D_i × F_{i+1}).

After a sufficient number k of iterations, Q = N_k.
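In Python the iteration can be sketched as follows, assuming positive inputs and using the same power-of-two pre-scaling as in the Newton–Raphson sketch above:

import math

def goldschmidt_divide(n, d, steps=6):
    """Approximate n/d by Goldschmidt iteration; assumes n, d > 0."""
    _, e = math.frexp(d)
    d, n = d / 2.0**e, n / 2.0**e   # scale divisor into [0.5, 1)
    for _ in range(steps):
        f = 2.0 - d                 # common factor F_i driving d toward 1
        n, d = n * f, d * f         # both multiplications can run in parallel
    return n                        # d has converged to 1, so n approximates Q

assert abs(goldschmidt_divide(355.0, 113.0) - 355.0/113.0) < 1e-12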
The Goldschmidt method is used in AMD Athlon CPUs and later models. [14] [15] It is also known as Anderson Earle Goldschmidt Powers (AEGP) algorithm and is implemented by various IBM processors. [16] [17] Although it converges at the same rate as a Newton–Raphson implementation, one advantage of the Goldschmidt method is that the multiplications in the numerator and in the denominator can be done in parallel. [17]
The Goldschmidt method can be used with factors that allow simplifications by the binomial theorem. Assume N/D has been scaled by a power of two such that D ∈ (1/2, 1]. We choose D = 1 − x and F_{i+1} = 1 + x^(2^i). This yields

N/D = N (1 + x) / (1 − x²) = N (1 + x)(1 + x²) / (1 − x⁴) = ⋯

After n steps, the denominator 1 − x^(2^n) can be rounded to 1 with a relative error

ε_n = x^(2^n),

which is maximum at 2^(−2^n) when x = 1/2, thus providing a minimum precision of 2^n binary digits.
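A corresponding Python sketch of the binomial variant, assuming the divisor has already been scaled into (1/2, 1]:

def binomial_divide(n, d, steps=6):
    """Goldschmidt division with binomial factors; assumes 0.5 < d <= 1."""
    x = 1.0 - d                # write the divisor as 1 - x, with 0 <= x < 1/2
    for _ in range(steps):
        n *= 1.0 + x           # factor F = 1 + x**(2**i)
        x *= x                 # denominator is now 1 - x**(2**(i+1))
    return n                   # 6 steps give at least 2**6 = 64 binary digits

assert abs(binomial_divide(1.0, 0.75) - 1.0/0.75) < 1e-12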
Methods designed for hardware implementation generally do not scale to integers with thousands or millions of decimal digits; these frequently occur, for example, in modular reductions in cryptography. For these large integers, more efficient division algorithms transform the problem to use a small number of multiplications, which can then be done using an asymptotically efficient multiplication algorithm such as the Karatsuba algorithm, Toom–Cook multiplication or the Schönhage–Strassen algorithm. The result is that the computational complexity of the division is of the same order (up to a multiplicative constant) as that of the multiplication. Examples include reduction to multiplication by Newton's method as described above, [18] as well as the slightly faster Burnikel-Ziegler division, [19] Barrett reduction and Montgomery reduction algorithms. [20] [ verification needed ] Newton's method is particularly efficient in scenarios where one must divide by the same divisor many times, since after the initial Newton inversion only one (truncated) multiplication is needed for each division.
The division by a constant D is equivalent to the multiplication by its reciprocal. Since the denominator is constant, so is its reciprocal (1/D). Thus it is possible to compute the value of (1/D) once at compile time, and at run time perform the multiplication N·(1/D) rather than the division N/D. In floating-point arithmetic the use of (1/D) presents little problem, but in integer arithmetic the reciprocal will always evaluate to zero (assuming |D| > 1).
It is not necessary to use specifically (1/D); any value (X/Y) that reduces to (1/D) may be used. For example, for division by 3, the factors 1/3, 2/6, 3/9, or 194/582 could be used. Consequently, if Y were a power of two the division step would reduce to a fast right bit shift. The effect of calculating N/D as (N·X)/Y replaces a division with a multiply and a shift. Note that the parentheses are important, as N·(X/Y) will evaluate to zero.
However, unless D itself is a power of two, there are no X and Y that satisfy the conditions above. Fortunately, (N·X)/Y gives exactly the same result as N/D in integer arithmetic even when (X/Y) is not exactly equal to 1/D, but "close enough" that the error introduced by the approximation is in the bits that are discarded by the shift operation. [21] [22] [23] Barrett reduction uses powers of 2 for the value of Y to make division by Y a simple right shift.
As a concrete fixed-point arithmetic example, for 32-bit unsigned integers, division by 3 can be replaced with a multiply by 2863311531/2^33, a multiplication by 2863311531 (hexadecimal 0xAAAAAAAB) followed by a 33-bit right shift. The value of 2863311531 is calculated as 2^33/3, then rounded up. Likewise, division by 10 can be expressed as a multiplication by 3435973837 (0xCCCCCCCD) followed by division by 2^35 (or a 35-bit right shift). [25]: pp. 230–234 OEIS provides sequences of the constants for multiplication as A346495 and for the right shift as A346496.
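A quick Python spot-check of these two constants (a sketch; exhaustive verification over all 2^32 inputs is also feasible):

def div3(n):
    """32-bit unsigned division by 3 via multiply and shift."""
    return (n * 0xAAAAAAAB) >> 33

def div10(n):
    """32-bit unsigned division by 10 via multiply and shift."""
    return (n * 0xCCCCCCCD) >> 35

for n in (0, 1, 2, 3, 9, 10, 12345, 2**32 - 1):
    assert div3(n) == n // 3 and div10(n) == n // 10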
For general x-bit unsigned integer division where the divisor D is not a power of 2, the following identity converts the division into two x-bit additions/subtractions, one x-bit by x-bit multiplication (where only the upper half of the result is used) and several shifts, after precomputing k = x + ⌈log₂ D⌉ and a = ⌈2^k / D⌉ − 2^x:

⌊N/D⌋ = ( b + ((N − b) >> 1) ) >> (k − x − 1),  where b = (a × N) >> x.
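A Python sketch of this identity; the helper name b is illustrative, and the ranges 2 ≤ D < 2^x and 0 ≤ N < 2^x are assumed:

def divide_by_constant(n, d, x=32):
    """Unsigned n//d via one multiply-high, two add/subs, and shifts."""
    k = x + (d - 1).bit_length()      # k = x + ceil(log2(d)), computed exactly
    a = -(-(1 << k) // d) - (1 << x)  # a = ceil(2**k / d) - 2**x, precomputed
    b = (a * n) >> x                  # upper half of the x-bit multiplication
    return (b + ((n - b) >> 1)) >> (k - x - 1)

for n in (0, 1, 6, 7, 8, 1000, 2**32 - 1):
    assert divide_by_constant(n, 7) == n // 7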
In some cases, division by a constant can be accomplished in even less time by converting the "multiply by a constant" into a series of shifts and adds or subtracts. [26] Of particular interest is division by 10, for which the exact quotient is obtained, with remainder if required. [27]
When a division operation is performed, the exact quotient Q and remainder R are approximated to fit within the computer's precision limits. The division algorithm states:

N = Q × D + R,

where 0 ≤ R < |D|.

In floating-point arithmetic, the quotient is represented as Q̃ and the remainder as R̃, introducing rounding errors ε_Q and ε_R:

Q̃ = Q + ε_Q,  R̃ = R + ε_R.
This rounding causes a small error, which can propagate and accumulate through subsequent calculations. Such errors are particularly pronounced in iterative processes and when subtracting nearly equal values, a phenomenon known as loss of significance. To mitigate these errors, techniques such as the use of guard digits or higher precision arithmetic are employed. [28] [29]
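A tiny Python illustration of the effect (not from the source; the residual is generally nonzero because the rounded quotient is not the exact one, while exact rational arithmetic removes it):

from fractions import Fraction

n, d = 1.0, 49.0
q = n / d                 # rounded quotient fl(n/d)
print(q * d - n)          # residual from rounding; generally nonzero

q_exact = Fraction(1, 49) # exact rational arithmetic for comparison
print(q_exact * 49 - 1)   # exactly 0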