Expander walk sampling

In the mathematical discipline of graph theory, the expander walk sampling theorem states, intuitively, that sampling vertices of an expander graph by taking a relatively short random walk can simulate sampling the vertices independently from a uniform distribution. The earliest version of this theorem is due to Ajtai, Komlós & Szemerédi (1987), and the more general version is typically attributed to Gillman (1998).

Statement

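Let $G=(V,E)$ be an $n$-vertex expander graph with positively weighted edges, and let $A\subset V$. Let $P$ denote the stochastic matrix of the graph, and let $\lambda_2$ be the second largest eigenvalue of $P$. Let $y_0, y_1, \ldots, y_{k-1}$ denote the vertices encountered in a $(k-1)$-step random walk on $G$ starting at vertex $y_0$, and let

$$\pi(A) := \lim_{k\to\infty}\frac{1}{k}\sum_{i=0}^{k-1}\mathbf{1}_A(y_i),$$

where $\mathbf{1}_A(y)=1$ if $y\in A$ and $\mathbf{1}_A(y)=0$ otherwise.

(It is well known [1] that for almost all trajectories $y_0, y_1, \ldots$ this average converges to a limiting value, $\pi(A)$, as $k\to\infty$.)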

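The theorem states that for a weighted graph $G=(V,E)$ and a random walk $y_0, y_1, \ldots, y_{k-1}$ where $y_0$ is chosen by an initial distribution $\mathbf{q}$, for all $\gamma>0$, we have the following bound:

$$\Pr\left[\left|\frac{1}{k}\sum_{i=0}^{k-1}\mathbf{1}_A(y_i)-\pi(A)\right|\ge\gamma\right]\le C\,e^{-\frac{1}{20}\gamma^2(1-\lambda_2)k},$$

where $C$ depends on $\mathbf{q}$ and $G$.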
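To get a feel for the bound, the following minimal sketch solves it for the walk length. It assumes the bound has the form $C e^{-\gamma^2(1-\lambda_2)k/20}$ used in Gillman-style statements; the function names and the sample values of `C`, `gamma`, `gap` and `delta` are illustrative assumptions, not values from the theorem.

```python
from math import ceil, exp, log

def tail_bound(k, gamma, gap, C):
    """Assumed bound C * exp(-gamma^2 * gap * k / 20), with gap = 1 - lambda_2."""
    return C * exp(-gamma ** 2 * gap * k / 20)

def walk_length(gamma, gap, delta, C):
    """Smallest k for which the assumed bound is at most delta."""
    return ceil(20 * log(C / delta) / (gamma ** 2 * gap))

# Illustrative values only: accuracy 0.1, spectral gap 0.5, failure probability 0.01.
k = walk_length(gamma=0.1, gap=0.5, delta=0.01, C=2.0)
print(k, tail_bound(k, 0.1, 0.5, 2.0))
```

The required length grows like $1/\gamma^2$ and $1/(1-\lambda_2)$ but only logarithmically in $1/\delta$, which is the same shape as a Chernoff bound for independent samples.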

The theorem bounds the rate of convergence to $\pi(A)$ in terms of the length of the random walk, and hence gives a more randomness-efficient method of estimating $\pi(A)$ than sampling the vertices of $G$ independently.
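The estimate can be illustrated with a small simulation. This is a sketch under stated assumptions: the graph is the 3-regular graph on $\mathbb{Z}_p$ with edges $x \to x\pm 1$ and $x \to x^{-1}$ (with $0$ mapped to itself), a standard textbook expander family chosen here purely for illustration. The set `A`, the prime `p` and the walk length are arbitrary choices. Because the graph is regular, the stationary measure is uniform and $\pi(A) = |A|/p$.

```python
import random

def neighbors(x, p):
    """Neighbours of x in the assumed graph on Z_p: x+1, x-1 and x^{-1} (0 -> 0)."""
    inv = pow(x, -1, p) if x else 0   # modular inverse, Python 3.8+
    return ((x + 1) % p, (x - 1) % p, inv)

def walk_estimate(p, in_A, k, rng):
    """Fraction of the walk vertices y_0, ..., y_{k-1} that lie in A."""
    y = rng.randrange(p)              # y_0 drawn from the uniform distribution
    hits = 0
    for _ in range(k):
        hits += in_A(y)
        y = rng.choice(neighbors(y, p))
    return hits / k

def iid_estimate(p, in_A, k, rng):
    """Same estimate from k independent uniform vertices."""
    return sum(in_A(rng.randrange(p)) for _ in range(k)) / k

p = 1009
in_A = lambda v: v < p // 4           # A = {0, ..., 251}
pi_A = (p // 4) / p
rng = random.Random(0)
est_walk = walk_estimate(p, in_A, 20000, rng)
est_iid = iid_estimate(p, in_A, 20000, rng)
print(pi_A, est_walk, est_iid)
```

Each walk step consumes one choice among three neighbours, whereas each independent sample consumes a fresh uniform vertex of $\mathbb{Z}_p$.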

Proof

In order to prove the theorem, we provide a few definitions followed by three lemmas.

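Let $w_{xy}$ be the weight of the edge $xy\in E(G)$ and let $w_x=\sum_{y:xy\in E(G)}w_{xy}$. Denote $\pi(x):=w_x/\sum_{y\in V}w_y$. Let $\frac{\mathbf{q}}{\sqrt{\pi}}$ be the vector with entries $\frac{\mathbf{q}(x)}{\sqrt{\pi(x)}}$, and let $N_{\pi,\mathbf{q}}=\left\|\frac{\mathbf{q}}{\sqrt{\pi}}\right\|_2$.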

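Let $D=\operatorname{diag}(1/w_i)$ and $M=(w_{ij})$. Let $P(r)=PE_r$, where $P$ is the stochastic matrix, $E_r=\operatorname{diag}(e^{r\mathbf{1}_A})$ and $r\ge 0$. Then:

$$P=\sqrt{D}\,S\,\sqrt{D^{-1}}\qquad\text{and}\qquad P(r)=\sqrt{DE_r^{-1}}\,S(r)\,\sqrt{E_rD^{-1}},$$

where $S:=\sqrt{D}M\sqrt{D}$ and $S(r):=\sqrt{DE_r}\,M\sqrt{DE_r}$. As $S$ and $S(r)$ are symmetric, they have real eigenvalues. Therefore, as $P(r)$ is similar to $S(r)$ and so has the same eigenvalues, the eigenvalues of $P(r)$ are real. Let $\lambda(r)$ and $\lambda_2(r)$ be the first and second largest eigenvalues of $P(r)$ respectively.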
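The symmetrisation can be checked numerically on a toy example. The sketch below assumes the usual reading of the definitions: $M$ is the symmetric weight matrix, $D=\operatorname{diag}(1/w_i)$, $P = DM$, and $E_r$ multiplies the columns indexed by $A$ by $e^r$. The weight matrix, the set `A` and the value of `r` are hypothetical.

```python
from math import exp, sqrt, isclose

# Hypothetical symmetric weight matrix of a 3-vertex weighted graph.
M = [[0.0, 2.0, 1.0],
     [2.0, 0.0, 3.0],
     [1.0, 3.0, 1.0]]
n = len(M)
w = [sum(row) for row in M]           # w_i: total weight at vertex i
A = {0}                               # illustrative choice of the set A

def matrices(r):
    """Return P(r), S(r) and the conjugate of S(r) that should equal P(r)."""
    e = [exp(r) if i in A else 1.0 for i in range(n)]       # diagonal of E_r
    P_r = [[M[i][j] / w[i] * e[j] for j in range(n)] for i in range(n)]
    S_r = [[sqrt(e[i] / w[i]) * M[i][j] * sqrt(e[j] / w[j])
            for j in range(n)] for i in range(n)]
    # sqrt(D E_r^{-1}) S(r) sqrt(E_r D^{-1}), written entrywise
    back = [[S_r[i][j] * sqrt(e[j] * w[j]) / sqrt(e[i] * w[i])
             for j in range(n)] for i in range(n)]
    return P_r, S_r, back

for r in (0.0, 0.5):                  # r = 0 recovers P and S themselves
    P_r, S_r, back = matrices(r)
    symmetric = all(isclose(S_r[i][j], S_r[j][i]) for i in range(n) for j in range(n))
    same = all(isclose(P_r[i][j], back[i][j]) for i in range(n) for j in range(n))
    print(r, symmetric, same)
```

Since the conjugating matrices are diagonal and invertible, $P(r)$ and $S(r)$ are similar, which is why their eigenvalues coincide.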

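For convenience of notation, let $t_k=\frac{1}{k}\sum_{i=0}^{k-1}\mathbf{1}_A(y_i)$, $\epsilon=\lambda-\lambda_2$, $\epsilon_r=\lambda(r)-\lambda_2(r)$ (where $\lambda=\lambda(0)=1$ and $\lambda_2=\lambda_2(0)$), and let $\mathbf{1}$ be the all-1 vector.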

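Lemma 1

$$\Pr\left[t_k-\pi(A)\ge\gamma\right]\le e^{-rk(\pi(A)+\gamma)+k\log\lambda(r)}\,\frac{\mathbf{q}P(r)^k\mathbf{1}}{\lambda(r)^k}$$

Proof:

By Markov’s inequality, for $r>0$,

$$\Pr\left[t_k\ge\pi(A)+\gamma\right]=\Pr\left[e^{rkt_k}\ge e^{rk(\pi(A)+\gamma)}\right]\le e^{-rk(\pi(A)+\gamma)}\,E_{\mathbf{q}}\,e^{rkt_k},$$

where $E_{\mathbf{q}}$ is the expectation with $x_0$ chosen according to the probability distribution $\mathbf{q}$. This expectation can be interpreted as a sum over all possible trajectories $x_0,x_1,\ldots,x_k$, hence:

$$E_{\mathbf{q}}\,e^{rkt_k}=\sum_{x_0,\ldots,x_k}e^{rkt_k}\,\mathbf{q}(x_0)\prod_{i=1}^{k}p_{x_{i-1}x_i}=\mathbf{q}P(r)^k\mathbf{1}.$$

Combining the two results, and writing $1=e^{k\log\lambda(r)}/\lambda(r)^k$, proves the lemma.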

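Lemma 2

For $0\le r\le 1$,

$$\frac{\mathbf{q}P(r)^k\mathbf{1}}{\lambda(r)^k}\le(1+r)N_{\pi,\mathbf{q}}.$$

Proof:

As the eigenvalues of $P(r)$ and $S(r)$ are equal, and $S(r)$ is symmetric with non-negative entries, $\|S(r)^k\|_2\le\lambda(r)^k$. Hence

$$\frac{\mathbf{q}P(r)^k\mathbf{1}}{\lambda(r)^k}=\frac{\left(\mathbf{q}\sqrt{DE_r^{-1}}\right)S(r)^k\left(\sqrt{E_rD^{-1}}\,\mathbf{1}\right)}{\lambda(r)^k}\le\left\|\mathbf{q}\sqrt{DE_r^{-1}}\right\|_2\left\|\sqrt{E_rD^{-1}}\,\mathbf{1}\right\|_2\le e^{r/2}N_{\pi,\mathbf{q}}\le(1+r)N_{\pi,\mathbf{q}}.$$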

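Lemma 3

If $r$ is a real number such that $0\le e^r-1\le\epsilon/4$,

$$\log\lambda(r)\le r\pi(A)+5r^2/\epsilon.$$

Proof summary:

We Taylor expand $\log\lambda(r)$ about the point $r=z$ to get:

$$\log\lambda(r)=\log\lambda(z)+m_z(r-z)+(r-z)^2\int_0^1(1-t)V_{z+(r-z)t}\,dt,$$

where $m_x, V_x$ are the first and second derivatives of $\log\lambda(r)$ at $r=x$. We show that $m_0=\lim_{k\to\infty}t_k=\pi(A)$. We then prove that (i) $\epsilon_r\ge 3\epsilon/4$ by matrix manipulation, and then prove (ii) $V_r\le 10/\epsilon$ using (i) and Cauchy’s estimate from complex analysis.

Since $\log\lambda(0)=0$, the results combine to show that

$$\log\lambda(r)=\log\lambda(0)+m_0r+r^2\int_0^1(1-t)V_{rt}\,dt\le r\pi(A)+5r^2/\epsilon.$$

A line-by-line proof can be found in Gillman (1998).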

Proof of theorem

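Combining Lemmas 1, 2 and 3, we get that

$$\Pr\left[t_k-\pi(A)\ge\gamma\right]\le(1+r)N_{\pi,\mathbf{q}}\,e^{-k(r\gamma-5r^2/\epsilon)}.$$

Interpreting the exponent on the right-hand side of the inequality as a quadratic in $r$ and minimising the expression (at $r=\gamma\epsilon/10$), we see that

$$\Pr\left[t_k-\pi(A)\ge\gamma\right]\le(1+\gamma\epsilon/10)N_{\pi,\mathbf{q}}\,e^{-k\gamma^2\epsilon/20}.$$

A similar bound

$$\Pr\left[t_k-\pi(A)\le-\gamma\right]\le(1+\gamma\epsilon/10)N_{\pi,\mathbf{q}}\,e^{-k\gamma^2\epsilon/20}$$

holds, hence setting $C=2(1+\gamma\epsilon/10)N_{\pi,\mathbf{q}}$ gives the desired result.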
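The minimisation step can be written out explicitly. The short derivation below assumes the exponent has the form $-k(r\gamma-5r^2/\epsilon)$, as in Gillman's argument; $f$ is a helper name introduced here for illustration.

```latex
\begin{align*}
f(r) &= r\gamma - \frac{5r^2}{\epsilon}, &
f'(r) &= \gamma - \frac{10r}{\epsilon} = 0
  \;\Longrightarrow\; r = \frac{\gamma\epsilon}{10}, \\
f\!\left(\frac{\gamma\epsilon}{10}\right)
  &= \frac{\gamma^2\epsilon}{10} - \frac{5}{\epsilon}\cdot\frac{\gamma^2\epsilon^2}{100}
   = \frac{\gamma^2\epsilon}{10} - \frac{\gamma^2\epsilon}{20}
   = \frac{\gamma^2\epsilon}{20}.
\end{align*}
```

Maximising $f$ minimises the bound $e^{-kf(r)}$, and with $\epsilon = 1-\lambda_2$ this yields the exponent $-k\gamma^2(1-\lambda_2)/20$.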

Uses

This theorem is useful for randomness reduction in the study of derandomization. Sampling from an expander walk is an example of a randomness-efficient sampler. Note that the number of random bits used to draw $k$ independent uniform samples from $n$ vertices is $k\log n$, whereas sampling along a walk on a graph from an infinite family of constant-degree expanders costs only $\log n+O(k)$ bits. Such families exist and are efficiently constructible, e.g. the Ramanujan graphs of Lubotzky-Phillips-Sarnak.
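The saving in random bits can be made concrete with a small calculation. This is a sketch assuming a $d$-regular expander, where each walk step costs about $\log_2 d$ bits; the function names and the sample values of `n`, `k` and `d` are illustrative.

```python
from math import ceil, log2

def bits_independent(n, k):
    """Random bits for k independent uniform vertices out of n."""
    return k * ceil(log2(n))

def bits_expander_walk(n, k, d):
    """Bits for one uniform start vertex plus k-1 steps on a d-regular graph."""
    return ceil(log2(n)) + (k - 1) * ceil(log2(d))

n, k, d = 2 ** 20, 100, 8             # illustrative sizes
print(bits_independent(n, k), bits_expander_walk(n, k, d))
```

For constant $d$ the walk cost is $\log n + O(k)$, so the dependence on $n$ is paid once rather than once per sample.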

References

  1. Doob, J. L. (1953). Stochastic Processes. Wiley. Theorem 6.1.