Halton sequence

Figure: The first 256 points of the 2,3 Halton sequence (top) compared with 256 points from a pseudorandom number source (bottom). The Halton sequence covers the space more evenly. (red = 1,..,10, blue = 11,..,100, green = 101,..,256)

In statistics, Halton sequences are sequences used to generate points in space for numerical methods such as Monte Carlo simulations. Although these sequences are deterministic, they are of low discrepancy; that is, they cover the space evenly and appear to be random for many purposes. They were first introduced in 1960 and are an example of a quasi-random number sequence. They generalize the one-dimensional van der Corput sequences.


Example of Halton sequence used to generate points in (0, 1) × (0, 1) in ℝ²

Figure: Illustration of the first 8 points of the 2,3 Halton sequence.

The Halton sequence is constructed according to a deterministic method that uses coprime numbers as its bases. As a simple example, let's take one dimension of the two-dimensional Halton sequence to be based on 2 and the other dimension on 3. To generate the sequence for 2, we start by dividing the interval (0,1) in half, then in fourths, eighths, etc., which generates

1/2,
1/4, 3/4,
1/8, 5/8, 3/8, 7/8,
1/16, 9/16,...

Equivalently, the nth number of this sequence is the number n written in binary representation, inverted, and written after the decimal point. This is true for any base. As an example, to find the sixth element of the above sequence, we'd write 6 = 1·2² + 1·2¹ + 0·2⁰ = 110₂, which can be inverted and placed after the decimal point to give 0.011₂ = 0·2⁻¹ + 1·2⁻² + 1·2⁻³ = 3/8. So the sequence above is the same as

0.1₂, 0.01₂, 0.11₂, 0.001₂, 0.101₂, 0.011₂, 0.111₂, 0.0001₂, 0.1001₂,...
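The digit-reversal rule can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the name `radical_inverse` is chosen here for convenience, and exact `Fraction` arithmetic is an assumption made so that the printed values match the fractions listed earlier.

```python
from fractions import Fraction


def radical_inverse(n: int, base: int) -> Fraction:
    """Reverse the base-`base` digits of n about the radix point.

    `radical_inverse` is an illustrative name, not taken from the article.
    """
    result = Fraction(0)
    place = Fraction(1, base)  # weight of the first digit after the point
    while n > 0:
        n, digit = divmod(n, base)  # peel off the least significant digit
        result += digit * place     # it becomes the most significant fraction digit
        place /= base
    return result


# First nine terms of the base-2 sequence, starting at n = 1.
first_nine = [radical_inverse(n, 2) for n in range(1, 10)]
print(", ".join(str(x) for x in first_nine))
# prints: 1/2, 1/4, 3/4, 1/8, 5/8, 3/8, 7/8, 1/16, 9/16
```

Calling `radical_inverse(6, 2)` reproduces the worked example: 6 is 110 in binary, and the result is 3/8.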

To generate the sequence for 3 for the other dimension, we divide the interval (0,1) in thirds, then ninths, twenty-sevenths, etc., which generates

1/3, 2/3, 1/9, 4/9, 7/9, 2/9, 5/9, 8/9, 1/27,...

Notice how we are dividing the interval into thirds: 1/3, 2/3. Then we divide each third into thirds, but only adding the first division of each third: 1/9 [from (0,1/3)], 4/9 [from (1/3,2/3)] and 7/9 [from (2/3,1)]. Then the second division: 2/9, 5/9 and 8/9 etc.

When we pair them up, we get a sequence of points in a unit square:

(1/2, 1/3), (1/4, 2/3), (3/4, 1/9), (1/8, 4/9), (5/8, 7/9), (3/8, 2/9), (7/8, 5/9), (1/16, 8/9), (9/16, 1/27).
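As a rough sketch, the pairing can be reproduced by running one one-dimensional sequence per base and zipping them together. The helper names `van_der_corput` and `halton_points` are hypothetical, and exact fractions are used only so the output can be compared with the list above.

```python
from fractions import Fraction
from itertools import islice


def van_der_corput(base):
    """Yield the digit-reversed fractions of 1, 2, 3, ... in the given base."""
    n = 0
    while True:
        n += 1
        value, denom, k = Fraction(0), 1, n
        while k > 0:
            k, digit = divmod(k, base)
            denom *= base
            value += Fraction(digit, denom)
        yield value


def halton_points(bases):
    """Yield tuples whose i-th coordinate is driven by bases[i]."""
    return zip(*(van_der_corput(b) for b in bases))


# First nine points of the (2, 3) sequence in the unit square.
points = list(islice(halton_points((2, 3)), 9))
for x, y in points:
    print(f"({x}, {y})")
```

Passing more bases, for example `(2, 3, 5)`, would give points in higher dimensions, provided the bases are pairwise coprime as the construction requires.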

Even though standard Halton sequences perform very well in low dimensions, correlation problems have been noted between sequences generated from higher primes. For example, if we started with the primes 17 and 19, the first 16 pairs of points, (1/17, 1/19), (2/17, 2/19), (3/17, 3/19), ..., (16/17, 16/19), would have perfect linear correlation. To avoid this, it is common to drop the first 20 entries, or some other predetermined quantity depending on the primes chosen. Several other methods have also been proposed. One of the most prominent solutions is the scrambled Halton sequence, which uses permutations of the coefficients used in the construction of the standard sequence. Another solution is the leaped Halton sequence, which skips points in the standard sequence. Using, for example, only every 409th point (other primes not used as bases of the Halton sequence are also possible) can achieve significant improvements. [1]
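The correlation problem, and the effect of leaping, can be checked numerically. The following is a small sketch under stated assumptions: the helpers `radical_inverse`, `pearson` and `halton_pairs` are illustrative names, floating-point arithmetic is used, and the leaped variant simply takes indices 409, 818, ... of the standard sequence.

```python
import math


def radical_inverse(n, base):
    """Digit-reversed fraction of n in the given base (floating point)."""
    result, f = 0.0, 1.0
    while n > 0:
        f /= base
        result += f * (n % base)
        n //= base
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def halton_pairs(indices, bases=(17, 19)):
    """2D Halton points for the given indices and pair of bases."""
    return [tuple(radical_inverse(i, b) for b in bases) for i in indices]


# First 16 points: (k/17, k/19) for k = 1..16, which lie on a straight line.
first_16 = halton_pairs(range(1, 17))
r_first = pearson(*zip(*first_16))

# Leaped variant: 16 points taken at every 409th index.
leap = 409
leaped = halton_pairs(range(leap, 17 * leap, leap))
r_leaped = pearson(*zip(*leaped))

print(f"first 16 points: r = {r_first:.6f}")
print(f"leaped points:   r = {r_leaped:.6f}")
```

In this sketch the first coefficient is 1 up to rounding, while the leaped sample is expected to show a much weaker linear relationship.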

Implementation

In pseudocode:

algorithm Halton-Sequence is
    inputs: index i
            base b
    output: result r

    f ← 1
    r ← 0

    while i > 0 do
        f ← f / b
        r ← r + f × (i mod b)
        i ← ⌊i / b⌋

    return r

An alternative implementation that produces subsequent numbers of a Halton sequence for base b is given in the following generator function (in Python). [2] This algorithm uses only integer numbers internally, which makes it robust against round-off errors.

    def halton_sequence(b):
        """Generator function for Halton sequence."""
        n, d = 0, 1
        while True:
            x = d - n
            if x == 1:
                n = 1
                d *= b
            else:
                y = d // b
                while x <= y:
                    y //= b
                n = (b + 1) * y - x
            yield n / d
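As a usage sketch, two such generators (bases 2 and 3) can be zipped to feed a simple quasi-Monte Carlo estimate. The example below estimates π from the fraction of points falling inside the quarter unit circle. The function `estimate_pi` and the sample size of 10,000 are assumptions made for illustration, and the generator is repeated only so the block runs on its own.

```python
from itertools import islice


def halton_sequence(b):
    """Generator function for Halton sequence (repeated from the listing above)."""
    n, d = 0, 1
    while True:
        x = d - n
        if x == 1:
            n = 1
            d *= b
        else:
            y = d // b
            while x <= y:
                y //= b
            n = (b + 1) * y - x
        yield n / d


def estimate_pi(num_points):
    """Estimate pi from 2D Halton points (bases 2 and 3) in the unit square."""
    points = islice(zip(halton_sequence(2), halton_sequence(3)), num_points)
    # Count points inside the quarter circle of radius 1 centred at the origin.
    inside = sum(1 for x, y in points if x * x + y * y < 1.0)
    return 4.0 * inside / num_points


pi_estimate = estimate_pi(10_000)
print(pi_estimate)
```

Because the sequence is deterministic, repeated runs give the same estimate, which is one practical difference from a pseudorandom source with a varying seed.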


References

  1. Kocis and Whiten, 1997
  2. Berblinger, Michael; Schlier, Christoph (1991). "Monte Carlo integration with quasi-random numbers: some experience". Computer Physics Communications. 66 (2–3): 157–166. Bibcode:1991CoPhC..66..157B. doi:10.1016/0010-4655(91)90064-R. ISSN 0010-4655.