Baby-step giant-step

Last updated January 25, 2025

In group theory, a branch of mathematics, the baby-step giant-step algorithm is a meet-in-the-middle algorithm, due to Daniel Shanks, for computing the discrete logarithm or the order of an element in a finite abelian group.^[1] The discrete logarithm problem is of fundamental importance to public-key cryptography.

Many of the most commonly used cryptographic systems rely on the assumption that the discrete logarithm is extremely difficult to compute; the harder the problem, the more security the system provides. One way to increase the difficulty of the discrete logarithm problem is to base the cryptosystem on a larger group.

Theory

The algorithm is based on a space–time tradeoff. It is a fairly simple modification of trial multiplication, the naive method of finding discrete logarithms.

Given a cyclic group $G$ of order $n$ , a generator $\alpha$ of the group and a group element $\beta$ , the problem is to find an integer $x$ such that

\alpha ^{x}=\beta \,.
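
For comparison, the naive method (trial multiplication) can be sketched as follows. This is only an illustration: it assumes the group is the multiplicative group of integers modulo a prime p, and the function name and the parameter p are hypothetical choices that do not appear in the text above.

```python
def trial_multiplication(alpha, beta, n, p):
    """Return x in [0, n) with alpha**x == beta (mod p), or None if none exists.

    Assumption for illustration: the group is the multiplicative group
    modulo p, and n is the order of alpha (or an upper bound on it).
    """
    beta %= p
    gamma = 1 % p  # alpha**0
    for x in range(n):
        if gamma == beta:
            return x
        gamma = gamma * alpha % p  # advance to alpha**(x + 1)
    return None


# 5 generates the multiplicative group modulo 23, which has order 22.
print(trial_multiplication(5, 3, 22, 23))  # prints 16
```

In the worst case the loop performs about n multiplications, which is the cost that baby-step giant-step reduces.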

The baby-step giant-step algorithm is based on rewriting $x$ :

x=im+j

m=\left\lceil {\sqrt {n}}\right\rceil

0\leq i<m

0\leq j<m

Therefore, we have:

\alpha ^{x}=\beta \,

\alpha ^{im+j}=\beta \,

\alpha ^{j}=\beta \left(\alpha ^{-m}\right)^{i}\,

The algorithm precomputes $\alpha ^{j}$ for every $j$ with $0\leq j<m$ . It then tries successive values of $i$ in the right-hand side of the congruence above, in the manner of trial multiplication. For each $i$ it tests whether the congruence is satisfied for any value of $j$ , by looking the result up among the precomputed values of $\alpha ^{j}$ .
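
A small worked example may make the rewriting concrete. The numbers are chosen here for illustration and are not from the original text: take $G$ to be the multiplicative group modulo 23, with generator $\alpha = 5$, order $n = 22$ and target $\beta = 3$.

```latex
m = \left\lceil \sqrt{22} \right\rceil = 5, \qquad \alpha^{-m} = 5^{-5} \equiv 15 \pmod{23}

\text{Baby steps: } \alpha^{0}, \alpha^{1}, \ldots, \alpha^{4} \equiv 1,\ 5,\ 2,\ 10,\ 4 \pmod{23}

\text{Giant steps: } \beta \left(\alpha^{-m}\right)^{i} \equiv 3,\ 22,\ 8,\ 5 \pmod{23} \quad (i = 0, 1, 2, 3)

\beta \left(\alpha^{-m}\right)^{3} \equiv 5 = \alpha^{1} \;\Rightarrow\; i = 3,\ j = 1,\ x = 3 \cdot 5 + 1 = 16
```

Indeed $5^{16} \equiv 3 \pmod{23}$, and the match was found after 5 baby steps and 4 giant-step checks rather than 17 trial multiplications.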

The algorithm

Input: A cyclic group G of order n, having a generator α and an element β.

Output: A value x satisfying $\alpha ^{x}=\beta$ .

m ← Ceiling(√n)
For all j where 0 ≤ j < m:
    1. Compute α^j and store the pair (j, α^j) in a table. (See § In practice)
Compute α^−m.
γ ← β. (set γ = β)
For all i where 0 ≤ i < m:
    1. Check to see if γ is the second component (α^j) of any pair in the table.
    2. If so, return im + j.
    3. If not, γ ← γ • α^−m.
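
The steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the group is the multiplicative group modulo a prime p (the parameter p and the function name `bsgs` are illustrative), and it uses a dictionary keyed by α^j as the lookup table.

```python
import math


def bsgs(alpha, beta, n, p):
    """Return x with alpha**x == beta (mod p), or None if no match is found.

    Assumptions for illustration: arithmetic is modulo a prime p, and n is
    the order of alpha (or an upper bound on it).
    """
    m = math.isqrt(n - 1) + 1  # ceil(sqrt(n)) for n >= 1

    # Baby steps: map alpha**j -> j for 0 <= j < m.
    table = {}
    gamma = 1 % p
    for j in range(m):
        table.setdefault(gamma, j)  # keep the smallest j for each value
        gamma = gamma * alpha % p

    # Giant steps: multiply by alpha**(-m) until a table entry is hit.
    factor = pow(alpha, -m, p)  # negative exponent needs Python 3.8+
    gamma = beta % p
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * factor % p
    return None


print(bsgs(5, 3, 22, 23))  # prints 16, since 5**16 % 23 == 3
```

Returning None covers the case where β is not a power of α, which cannot happen when α generates the whole group.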

In practice

The best way to speed up the baby-step giant-step algorithm is to use an efficient table lookup scheme. The best in this case is a hash table. The hashing is done on the second component, and to perform the check in step 1 of the main loop, γ is hashed and the resulting memory address checked. Since hash tables can retrieve and add elements in expected $O(1)$ time (constant time), this does not slow down the overall baby-step giant-step algorithm.

Both the space complexity and the time complexity of the algorithm are $O({\sqrt {n}})$ . This running time is better than the $O(n)$ running time of the naive brute force calculation.

The baby-step giant-step algorithm could be used by an eavesdropper to derive the private key generated in the Diffie–Hellman key exchange, when the modulus is a prime number that is not too large. If the modulus is not prime, the Pohlig–Hellman algorithm has a smaller algorithmic complexity, and potentially solves the same problem.^[2]
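
The eavesdropping scenario can be illustrated with a toy exchange. Everything specific here is an assumption made for the example: the small prime 104729, the base 2, the private exponents, and the helper `bsgs`, which is a compact sketch for the multiplicative group modulo p. Real parameters are far larger.

```python
import math


def bsgs(alpha, beta, n, p):
    """Compact baby-step giant-step modulo p; n bounds the order of alpha."""
    m = math.isqrt(n - 1) + 1  # ceil(sqrt(n))
    table = {}
    gamma = 1 % p
    for j in range(m):  # baby steps: alpha**j -> j
        table.setdefault(gamma, j)
        gamma = gamma * alpha % p
    factor = pow(alpha, -m, p)  # alpha**(-m), Python 3.8+
    gamma = beta % p
    for i in range(m):  # giant steps
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * factor % p
    return None


# Hypothetical toy parameters, far too small for real use.
p = 104729            # small prime modulus
g = 2                 # public base (not checked to be a generator)
a, b = 71234, 40321   # Alice's and Bob's private exponents

A = pow(g, a, p)      # Alice's public value, visible to the eavesdropper
B = pow(g, b, p)      # Bob's public value, visible to the eavesdropper

# The eavesdropper solves g**x == A (mod p), using p - 1 as an upper
# bound on the order of g.
x = bsgs(g, A, p - 1, p)

# x may differ from a by a multiple of the order of g, but it still
# produces the same shared secret.
shared_by_eavesdropper = pow(B, x, p)
shared_by_alice = pow(B, a, p)
print(shared_by_eavesdropper == shared_by_alice)  # prints True
```

With this modulus the table holds only 324 entries, which is why the attack is practical only when the prime is not too large.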

Notes

The baby-step giant-step algorithm is a generic algorithm. It works for every finite cyclic group.
It is not necessary to know the exact order of the group G in advance. The algorithm still works if n is merely an upper bound on the group order.
Usually the baby-step giant-step algorithm is used for groups whose order is prime. If the order of the group is composite then the Pohlig–Hellman algorithm is more efficient.
The algorithm requires O(m) memory. It is possible to use less memory by choosing a smaller m in the first step of the algorithm. Doing so increases the running time, which then is O(n/m). Alternatively one can use Pollard's rho algorithm for logarithms, which has about the same running time as the baby-step giant-step algorithm, but only a small memory requirement.
While this algorithm is credited to Daniel Shanks, who published the 1971 paper in which it first appears, a 1994 paper by Nechaev^[3] states that it was known to Gelfond in 1962.
There exist optimized versions of the original algorithm, such as one using collision-free truncated lookup tables^[4] and one using negation maps and Montgomery's simultaneous modular inversion.^[5]
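
The memory trade-off mentioned in the notes above can be sketched by letting the caller choose the table size m. This is an illustrative sketch, again assuming the multiplicative group modulo a prime p; the function name `bsgs_table_size` and its parameters are hypothetical. With a table of m entries, up to ⌈n/m⌉ giant steps are needed.

```python
def bsgs_table_size(alpha, beta, n, p, m):
    """Baby-step giant-step with a caller-chosen table size m >= 1.

    Stores m baby steps and performs at most ceil(n / m) giant steps.
    Assumption for illustration: arithmetic is modulo a prime p.
    """
    table = {}
    gamma = 1 % p
    for j in range(m):  # baby steps: alpha**j -> j
        table.setdefault(gamma, j)
        gamma = gamma * alpha % p

    factor = pow(alpha, -m, p)  # alpha**(-m), Python 3.8+
    giant_steps = -(-n // m)    # ceil(n / m)
    gamma = beta % p
    for i in range(giant_steps):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * factor % p
    return None


# Smaller tables need more giant steps but find the same logarithm.
for m in (1, 2, 5, 22):
    print(m, bsgs_table_size(5, 3, 22, 23, m))  # each line ends in 16
```

Choosing m = 1 degenerates to trial multiplication, while m = ⌈√n⌉ balances the two loops.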

References

[1] Daniel Shanks (1971), "Class number, a theory of factorization and genera", in Proc. Symp. Pure Math., vol. 20, Providence, R.I.: American Mathematical Society, pp. 415–440
[2] Maurer, Ueli M.; Wolf, Stefan (2000), "The Diffie-Hellman protocol", Designs, Codes and Cryptography, 19 (2–3): 147–171, doi:10.1023/A:1008302122286, MR 1759615
[3] V. I. Nechaev (1994), "Complexity of a determinate algorithm for the discrete logarithm", Mathematical Notes, 55 (2): 165–172
[4] Panagiotis Chatzigiannis, Konstantinos Chalkias and Valeria Nikolaenko (2021-06-30), "Homomorphic decryption in blockchains via compressed discrete-log lookup tables", CBT workshop 2021 (ESORICS). Retrieved 2021-09-07.
[5] Steven D. Galbraith, Ping Wang and Fangguo Zhang (2016-02-10), "Computing Elliptic Curve Discrete Logarithms with Improved Baby-step Giant-step Algorithm", Advances in Mathematics of Communications. Retrieved 2021-09-07.

External links

Baby step-Giant step – example C source code

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
