Kochanski multiplication

Last updated

Kochanski multiplication [1] is an algorithm that allows modular arithmetic (multiplication or operations based on it, such as exponentiation) to be performed efficiently when the modulus is large (typically several hundred bits). This has particular application in number theory and in cryptography: for example, in the RSA cryptosystem and Diffie–Hellman key exchange.

Contents

The most common way of implementing large-integer multiplication in hardware is to express the multiplier in binary and enumerate its bits, one bit at a time, starting with the most significant bit, perform the following operations on an accumulator:

  1. Double the contents of the accumulator (if the accumulator stores numbers in binary, as is usually the case, this is a simple "shift left" that requires no actual computation).
  2. If the current bit of the multiplier is 1, add the multiplicand into the accumulator; if it is 0, do nothing.

For an n-bit multiplier, this will take n clock cycles (where each cycle does either a shift or a shift-and-add).

To convert this into an algorithm for modular multiplication, with a modulus r, it is necessary to subtract r conditionally at each stage:

  1. Double the contents of the accumulator.
  2. If the result is greater than or equal to r, subtract r. (Equivalently, subtract r from the accumulator and store the result back into the accumulator if and only if it is non-negative).
  3. If the current bit of the multiplier is 1, add the multiplicand into the accumulator; if it is 0, do nothing.
  4. If the result of the addition is greater than or equal to r, subtract r. If no addition took place, do nothing.

This algorithm works. However, it is critically dependent on the speed of addition.

Addition of long integers suffers from the problem that carries have to be propagated from right to left and the final result is not known until this process has been completed. Carry propagation can be speeded up with carry look-ahead logic, but this still makes addition very much slower than it needs to be (for 512-bit addition, addition with carry look-ahead is 32 times slower than addition without carries at all).

Non-modular multiplication can make use of carry-save adders, which save time by storing the carries from each digit position and using them later: for example, by computing 111111111111+000000000010 as 111111111121 instead of waiting for the carry to propagate through the whole number to yield the true binary value 1000000000001. That final propagation still has to be done to yield a binary result but this only needs to be done once at the very end of the multiplication.

Unfortunately, the modular multiplication method outlined above needs to know the magnitude of the accumulated value at every step, in order to decide whether to subtract r: for example, if it needs to know whether the value in the accumulator is greater than 1000000000000, the carry-save representation 111111111121 is useless and needs to be converted to its true binary value for the comparison to be made.

It therefore seems that one can have either the speed of carry-save or modular multiplication, but not both.

Outline of the algorithm

The principle of the Kochanski algorithm is one of making guesses as to whether or not r should be subtracted, based on the most significant few bits of the carry-save value in the accumulator. Such a guess will be wrong some of the time, since there is no way of knowing whether latent carries in the less significant digits (which have not been examined) might not invalidate the result of the comparison. Thus:

What is happening is essentially a race between the errors that result from wrong guesses, which double with every shift left, and the corrections made by adding or subtracting multiples of r based on a guess of what the errors may be.

It turns out [2] that examining the most significant 4 bits of the accumulator is sufficient to keep the errors within bounds and that the only values that need to be added to the accumulator are −2r, −r, 0, +r, and +2r, all of which can be generated instantaneously by simple shifts and negations.

At the end of a complete modular multiplication, the true binary result of the operation has to be evaluated and it is possible that an additional addition or subtraction of r will be needed as a result of the carries that are then discovered; but the cost of that extra step is small when amortized over the hundreds of shift-and-add steps that dominate the overall cost of the multiplication.

Alternatives

Brickell [3] has published a similar algorithm that requires greater complexity in the electronics for each digit of the accumulator.

Montgomery multiplication is an alternative algorithm which processes the multiplier "backwards" (least significant digit first) and uses the least significant digit of the accumulator to control whether or not the modulus should be added. This avoids the need for carries to propagate. However, the algorithm is impractical for single modular multiplications, since two or three additional Montgomery steps have to be performed to convert the operands into a special form before processing and to convert the result back into conventional binary at the end.

Related Research Articles

<span class="mw-page-title-main">Accumulator (computing)</span> Register in which intermediate arithmetic and logic results of a CPU are stored

In a computer's central processing unit (CPU), the accumulator is a register in which intermediate arithmetic logic unit results are stored.

<span class="mw-page-title-main">Arithmetic</span> Elementary branch of mathematics

Arithmetic is an elementary part of mathematics that consists of the study of the properties of the traditional operations on numbers—addition, subtraction, multiplication, division, exponentiation, and extraction of roots. In the 19th century, Italian mathematician Giuseppe Peano formalized arithmetic with his Peano axioms, which are highly important to the field of mathematical logic today.

<span class="mw-page-title-main">Linear congruential generator</span> Algorithm for generating pseudo-randomized numbers

A linear congruential generator (LCG) is an algorithm that yields a sequence of pseudo-randomized numbers calculated with a discontinuous piecewise linear equation. The method represents one of the oldest and best-known pseudorandom number generator algorithms. The theory behind them is relatively easy to understand, and they are easily implemented and fast, especially on computer hardware which can provide modular arithmetic by storage-bit truncation.

A multiplication algorithm is an algorithm to multiply two numbers. Depending on the size of the numbers, different algorithms are more efficient than others. Efficient multiplication algorithms have existed since the advent of the decimal system.

<span class="mw-page-title-main">Subtraction</span> One of the four basic arithmetic operations

Subtraction is one of the four arithmetic operations along with addition, multiplication and division. Subtraction is an operation that represents removal of objects from a collection. For example, in the adjacent picture, there are 5 − 2 peaches—meaning 5 peaches with 2 taken away, resulting in a total of 3 peaches. Therefore, the difference of 5 and 2 is 3; that is, 5 − 2 = 3. While primarily associated with natural numbers in arithmetic, subtraction can also represent removing or decreasing physical and abstract quantities using different kinds of objects including negative numbers, fractions, irrational numbers, vectors, decimals, functions, and matrices.

A binary number is a number expressed in the base-2 numeral system or binary numeral system, a method of mathematical expression which uses only two symbols: typically "0" (zero) and "1" (one).

Two's complement is the most common method of representing signed integers on computers, and more generally, fixed point binary values. Two's complement uses the binary digit with the greatest place value as the sign to indicate whether the binary number is positive or negative. When the most significant bit is 1, the number is signed as negative; and when the most significant bit is 0 the number is signed as positive.

In mathematics, finite field arithmetic is arithmetic in a finite field contrary to arithmetic in a field with an infinite number of elements, like the field of rational numbers.

In computing, signed number representations are required to encode negative numbers in binary number systems.

Modular exponentiation is exponentiation performed over a modulus. It is useful in computer science, especially in the field of public-key cryptography, where it is used in both Diffie-Hellman Key Exchange and RSA public/private keys.

In modular arithmetic computation, Montgomery modular multiplication, more commonly referred to as Montgomery multiplication, is a method for performing fast modular multiplication. It was introduced in 1985 by the American mathematician Peter L. Montgomery.

<span class="mw-page-title-main">Elementary arithmetic</span> Numbers and the basic operations on them

Elementary arithmetic is a branch of mathematics involving basic numerical operations, namely addition, subtraction, multiplication, and division. Due to its low level of abstraction, broad range of application, and position as the foundation of all mathematics, elementary arithmetic is generally the first critical branch of mathematics to be taught in schools.

Booth's multiplication algorithm is a multiplication algorithm that multiplies two signed binary numbers in two's complement notation. The algorithm was invented by Andrew Donald Booth in 1950 while doing research on crystallography at Birkbeck College in Bloomsbury, London. Booth's algorithm is of interest in the study of computer architecture.

The Intel BCD opcodes are a set of six x86 instructions that operate with binary-coded decimal numbers. The radix used for the representation of numbers in the x86 processors is 2. This is called a binary numeral system. However, the x86 processors do have limited support for the decimal numeral system.

Saturation arithmetic is a version of arithmetic in which all operations, such as addition and multiplication, are limited to a fixed range between a minimum and maximum value.

<span class="mw-page-title-main">Stepped reckoner</span> Early mechanical calculator

The stepped reckoner or Leibniz calculator was a mechanical calculator invented by the German mathematician Gottfried Wilhelm Leibniz around 1672 and completed in 1694. The name comes from the translation of the German term for its operating mechanism, Staffelwalze, meaning "stepped drum". It was the first calculator that could perform all four basic arithmetic operations.

A division algorithm is an algorithm which, given two integers N and D, computes their quotient and/or remainder, the result of Euclidean division. Some are applied by hand, while others are employed by digital circuit designs and software.

A carry-save adder is a type of digital adder, used to efficiently compute the sum of three or more binary numbers. It differs from other digital adders in that it outputs two numbers, and the answer of the original summation can be achieved by adding these outputs together. A carry save adder is typically used in a binary multiplier, since a binary multiplier involves addition of more than two binary numbers after multiplication. A big adder implemented using this technique will usually be much faster than conventional addition of those numbers.

A binary multiplier is an electronic circuit used in digital electronics, such as a computer, to multiply two binary numbers.

The Lehmer random number generator, sometimes also referred to as the Park–Miller random number generator, is a type of linear congruential generator (LCG) that operates in multiplicative group of integers modulo n. The general formula is

References

  1. Kochanski, Martin J. (1985). "Developing an RSA Chip". Advances in Cryptology – Proceedings of CRYPTO 85. Berlin: Springer-Verlag. pp. 350–357. doi:10.1007/3-540-39799-X_25. ISBN   3-540-16463-4. S2CID   35095400.
  2. Kochanski, Martin J. (19 August 2003). "A New Method of Serial Modular Multiplication" (PDF). Archived from the original (PDF) on 2018-07-16. Describes the algorithm in full detail.
  3. Brickell, Ernest F. (1983). "A Fast Modular Multiplication Algorithm with Applications to Two Key Cryptography". Advances in Cryptology – Proceedings of CRYPTO '82. New York: Plenum. pp. 51–60. doi:10.1007/978-1-4757-0602-4_5. ISBN   0-306-41366-3.