The Marsaglia polar method [1] is a pseudo-random number sampling method for generating a pair of independent standard normal random variables. [2]
Standard normal random variables are frequently used in computer science, computational statistics, and in particular, in applications of the Monte Carlo method.
The polar method works by choosing random points (x, y) in the square −1 < x < 1, −1 < y < 1 until
and then returning the required pair of normal random variables as
or, equivalently,
where and represent the cosine and sine of the angle that the vector (x, y) makes with x axis.
The underlying theory may be summarized as follows:
If u is uniformly distributed in the interval 0 ≤ u < 1, then the point (cos(2πu), sin(2πu)) is uniformly distributed on the unit circumference x2 + y2 = 1, and multiplying that point by an independent random variable ρ whose distribution is
will produce a point
whose coordinates are jointly distributed as two independent standard normal random variables.
This idea dates back to Laplace, whom Gauss credits with finding the above
by taking the square root of
The transformation to polar coordinates makes evident that θ is uniformly distributed (constant density) from 0 to 2π, and that the radial distance r has density
(r2 has the appropriate chi square distribution.)
This method of producing a pair of independent standard normal variates by radially projecting a random point on the unit circumference to a distance given by the square root of a chi-square-2 variate is called the polar method for generating a pair of normal random variables,
A direct application of this idea,
is called the Box–Muller transform, in which the chi variate is usually generated as
but that transform requires logarithm, square root, sine and cosine functions. On some processors, the cosine and sine of the same argument can be calculated in parallel using a single instruction. [3] Notably for Intel-based machines, one can use fsincos assembler instruction or the expi instruction (available e.g. in D), to calculate complex
and just separate the real and imaginary parts.
Note: To explicitly calculate the complex-polar form use the following substitutions in the general form,
Let and Then
In contrast, the polar method here removes the need to calculate a cosine and sine. Instead, by solving for a point on the unit circle, these two functions can be replaced with the x and y coordinates normalized to the radius. In particular, a random point (x, y) inside the unit circle is projected onto the unit circumference by setting and forming the point
which is a faster procedure than calculating the cosine and sine. Some researchers argue that the conditional if instruction (for rejecting a point outside of the unit circle), can make programs slower on modern processors equipped with pipelining and branch prediction. [4] Also this procedure requires about 27% more evaluations of the underlying random number generator (only of generated points lie inside of unit circle).
That random point on the circumference is then radially projected the required random distance by means of
using the same s because that s is independent of the random point on the circumference and is itself uniformly distributed from 0 to 1.
A simple implementation in Python:
importmathimportrandomdefmarsaglia_sample():whileTrue:U1=random.uniform(-1,1)U2=random.uniform(-1,1)if(w:=U1**2+U2**2)<1:breakZ1=U1*math.sqrt(-2*math.log(w)/w)Z2=U2*math.sqrt(-2*math.log(w)/w)returnZ1,Z2
Simple implementation in Java using the mean and standard deviation:
privatestaticdoublespare;privatestaticbooleanhasSpare=false;publicstaticsynchronizeddoublegenerateGaussian(doublemean,doublestdDev){if(hasSpare){hasSpare=false;returnspare*stdDev+mean;}else{doubleu,v,s;do{u=Math.random()*2-1;v=Math.random()*2-1;s=u*u+v*v;}while(s>=1||s==0);s=Math.sqrt(-2.0*Math.log(s)/s);spare=v*s;hasSpare=true;returnmean+stdDev*u*s;}}
A non-thread safe implementation in C++ using the mean and standard deviation:
doublegenerateGaussian(doublemean,doublestdDev){staticdoublespare;staticboolhasSpare=false;if(hasSpare){hasSpare=false;returnspare*stdDev+mean;}else{doubleu,v,s;do{u=(rand()/((double)RAND_MAX))*2.0-1.0;v=(rand()/((double)RAND_MAX))*2.0-1.0;s=u*u+v*v;}while(s>=1.0||s==0.0);s=sqrt(-2.0*log(s)/s);spare=v*s;hasSpare=true;returnmean+stdDev*u*s;}}
C++11 GNU GCC libstdc++'s implementation of std::normal_distribution uses the Marsaglia polar method, as quoted from herein.
A simple Julia implementation:
""" marsagliasample(N)Generate `2N` samples from the standard normal distribution using the Marsaglia method."""functionmarsagliasample(N)z=Array{Float64}(undef,N,2);foriinaxes(z,1)s=Inf;whiles>1z[i,:].=2rand(2).-1;s=sum(abs2.(z[i,:]))endz[i,:].*=sqrt(-2log(s)/s);endvec(z)end""" marsagliasample(n,μ,σ)Generate `n` samples from the normal distribution with mean `μ` and standard deviation `σ` using the Marsaglia method."""functionmarsagliasample(n,μ,σ)μ.+σ*marsagliasample(cld(n,2))[1:n];end
The for loop can be parallelized by using the Threads.@threads
macro.
Bessel functions, first defined by the mathematician Daniel Bernoulli and then generalized by Friedrich Bessel, are canonical solutions y(x) of Bessel's differential equation for an arbitrary complex number , which represents the order of the Bessel function. Although and produce the same differential equation, it is conventional to define different Bessel functions for these two values in such a way that the Bessel functions are mostly smooth functions of .
In mathematics, the trigonometric functions are real functions which relate an angle of a right-angled triangle to ratios of two side lengths. They are widely used in all sciences that are related to geometry, such as navigation, solid mechanics, celestial mechanics, geodesy, and many others. They are among the simplest periodic functions, and as such are also widely used for studying periodic phenomena through Fourier analysis.
In mathematics, an n-sphere or hypersphere is an -dimensional generalization of the -dimensional circle and -dimensional sphere to any non-negative integer . The circle is considered 1-dimensional, and the sphere 2-dimensional, because the surfaces themselves are 1- and 2-dimensional respectively, not because they exist as shapes in 1- and 2-dimensional space. As such, the -sphere is the setting for -dimensional spherical geometry.
The Box–Muller transform, by George Edward Pelham Box and Mervin Edgar Muller, is a random number sampling method for generating pairs of independent, standard, normally distributed random numbers, given a source of uniformly distributed random numbers. The method was first mentioned explicitly by Raymond E. A. C. Paley and Norbert Wiener in their 1934 treatise on Fourier transforms in the complex domain. Given the status of these latter authors and the widespread availability and use of their treatise, it is almost certain that Box and Muller were well aware of its contents.
In mathematics, the Lambert W function, also called the omega function or product logarithm, is a multivalued function, namely the branches of the converse relation of the function f(w) = wew, where w is any complex number and ew is the exponential function. The function is named after Johann Lambert, who considered a related problem in 1758. Building on Lambert's work, Leonhard Euler described the W function per se in 1783.
In mathematics and physical science, spherical harmonics are special functions defined on the surface of a sphere. They are often employed in solving partial differential equations in many scientific fields. The table of spherical harmonics contains a list of common spherical harmonics.
Integration is the basic operation in integral calculus. While differentiation has straightforward rules by which the derivative of a complicated function can be found by differentiating its simpler component functions, integration does not, so tables of known integrals are often useful. This page lists some of the most common antiderivatives.
In mathematics, a Gaussian function, often simply referred to as a Gaussian, is a function of the base form and with parametric extension for arbitrary real constants a, b and non-zero c. It is named after the mathematician Carl Friedrich Gauss. The graph of a Gaussian is a characteristic symmetric "bell curve" shape. The parameter a is the height of the curve's peak, b is the position of the center of the peak, and c controls the width of the "bell".
In mathematics, the inverse trigonometric functions are the inverse functions of the trigonometric functions, under suitably restricted domains. Specifically, they are the inverses of the sine, cosine, tangent, cotangent, secant, and cosecant functions, and are used to obtain an angle from any of the angle's trigonometric ratios. Inverse trigonometric functions are widely used in engineering, navigation, physics, and geometry.
In mathematics, the digamma function is defined as the logarithmic derivative of the gamma function:
In mathematics, theta functions are special functions of several complex variables. They show up in many topics, including Abelian varieties, moduli spaces, quadratic forms, and solitons. As Grassmann algebras, they appear in quantum field theory.
In the mathematical field of complex analysis, contour integration is a method of evaluating certain integrals along paths in the complex plane.
A cone is a three-dimensional geometric shape that tapers smoothly from a flat base to a point called the apex or vertex.
In probability theory, a distribution is said to be stable if a linear combination of two independent random variables with this distribution has the same distribution, up to location and scale parameters. A random variable is said to be stable if its distribution is stable. The stable distribution family is also sometimes referred to as the Lévy alpha-stable distribution, after Paul Lévy, the first mathematician to have studied it.
Directional statistics is the subdiscipline of statistics that deals with directions, axes or rotations in Rn. More generally, directional statistics deals with observations on compact Riemannian manifolds including the Stiefel manifold.
In mathematics, the Stieltjes constants are the numbers that occur in the Laurent series expansion of the Riemann zeta function:
In probability theory and directional statistics, the von Mises distribution is a continuous probability distribution on the circle. It is a close approximation to the wrapped normal distribution, which is the circular analogue of the normal distribution. A freely diffusing angle on a circle is a wrapped normally distributed random variable with an unwrapped variance that grows linearly in time. On the other hand, the von Mises distribution is the stationary distribution of a drift and diffusion process on the circle in a harmonic potential, i.e. with a preferred orientation. The von Mises distribution is the maximum entropy distribution for circular data when the real and imaginary parts of the first circular moment are specified. The von Mises distribution is a special case of the von Mises–Fisher distribution on the N-dimensional sphere.
In mathematics, sine and cosine are trigonometric functions of an angle. The sine and cosine of an acute angle are defined in the context of a right triangle: for the specified angle, its sine is the ratio of the length of the side that is opposite that angle to the length of the longest side of the triangle, and the cosine is the ratio of the length of the adjacent leg to that of the hypotenuse. For an angle , the sine and cosine functions are denoted as and .
A ratio distribution is a probability distribution constructed as the distribution of the ratio of random variables having two other known distributions. Given two random variables X and Y, the distribution of the random variable Z that is formed as the ratio Z = X/Y is a ratio distribution.
The q-Gaussian is a probability distribution arising from the maximization of the Tsallis entropy under appropriate constraints. It is one example of a Tsallis distribution. The q-Gaussian is a generalization of the Gaussian in the same way that Tsallis entropy is a generalization of standard Boltzmann–Gibbs entropy or Shannon entropy. The normal distribution is recovered as q → 1.