Randomness merger

In extractor theory, a randomness merger is a function which extracts randomness out of a set of random variables, provided that at least one of them is uniformly random. Its name stems from the fact that it can be seen as a procedure which "merges" all the variables into one while preserving at least some of the entropy contained in the uniformly random variable. Mergers are used in the explicit construction of randomness extractors.

Intuition and definition

Consider a set of random variables, $X_1, \ldots, X_k$, each distributed over $\{0,1\}^n$, at least one of which is uniformly random; but it is not known which one. Furthermore, the variables may be arbitrarily correlated: they may be functions of one another, they may be constant, and so on. However, since at least one of them is uniform, the set as a whole contains at least $n$ bits of entropy.

The job of the merger is to output a new random variable, also distributed over $\{0,1\}^n$, that retains as much of that entropy as possible. Ideally, if it were known which of the variables is uniform, it could be used as the output, but that information is not available. The idea behind mergers is that by using a small additional random seed, it is possible to get a good result even without knowing which variable is the uniform one.

A naive idea would be to take the XOR of all the variables. If one of them is uniformly distributed and independent of the other variables, then the output would be uniform. However, suppose that $X_1 = X_2$ and that both of them are uniformly distributed: each is uniform on its own, yet their contributions cancel, so this method fails. As shown in the sketch below, for $k = 2$ the output $X_1 \oplus X_2$ is the constant all-zero string and carries no entropy.
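
To make this failure concrete, here is a small Python sketch (the setup is ours, purely for illustration): two blocks that are identical copies of a single uniform string are XORed together, and the result is the all-zero string, which carries no entropy at all.

```python
import secrets

def xor_merge(blocks):
    """Naively merge equal-length blocks (given as bytes) by XORing them."""
    out = bytes(len(blocks[0]))
    for block in blocks:
        out = bytes(a ^ b for a, b in zip(out, block))
    return out

n_bytes = 16
x1 = secrets.token_bytes(n_bytes)  # a uniformly random block
x2 = x1                            # a correlated copy: X2 = X1

# Each input is uniform on its own, but the XOR is the constant zero string,
# so the naive merger destroys all of the entropy.
print(xor_merge([x1, x2]).hex())   # always '0000...0000'
```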

Definition (merger):

A function $f : (\{0,1\}^n)^k \times \{0,1\}^d \to \{0,1\}^n$ is called an $(m, \epsilon)$-merger if, for every set of random variables $X = (X_1, \ldots, X_k)$, each distributed over $\{0,1\}^n$ and at least one of which is uniform, the distribution of $f(X, U_d)$ has smooth min-entropy $H_\infty^\epsilon(f(X, U_d)) \geq m$. The variable $U_d$ denotes the uniform distribution over $d$ bits, and represents a truly random seed.

In other words, by using a small uniform seed of length $d$, the merger returns a string which is $\epsilon$-close to having min-entropy at least $m$; this means that its statistical distance from some distribution with min-entropy $m$ is no larger than $\epsilon$.

Reminder: There are several notions of measuring the randomness of a distribution; the min-entropy of a random variable $X$ is defined as $H_\infty(X) = -\log_2 \max_x \Pr[X = x]$, i.e. the largest $b$ such that the most probable value of $X$ occurs with probability no more than $2^{-b}$. The min-entropy of a string is an upper bound on the amount of randomness that can be extracted from it. [1]
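
As a small illustration of these notions, the following Python sketch (with helper names of our own choosing) computes the min-entropy of an explicitly given distribution and its statistical distance from the uniform distribution; together, these two quantities express what it means for a distribution to be $\epsilon$-close to min-entropy $m$.

```python
import math

def min_entropy(dist):
    """H_min(X) = -log2 of the largest probability assigned by `dist`.

    `dist` maps outcomes to probabilities that sum to 1.
    """
    return -math.log2(max(dist.values()))

def statistical_distance(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

# A 2-bit distribution slightly biased towards '00'.
p = {"00": 0.28, "01": 0.24, "10": 0.24, "11": 0.24}
u = {x: 0.25 for x in p}  # uniform over 2 bits, min-entropy exactly 2

print(round(min_entropy(p), 2))              # ~1.84, i.e. less than 2
print(round(statistical_distance(p, u), 2))  # 0.03: p is 0.03-close to min-entropy 2
```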

Parameters

There are three parameters to optimize when building mergers:

  1. The output's min-entropy $m$ should be as high as possible, for then more bits can be extracted from it.
  2. The error $\epsilon$ should be as small as possible, for then after applying an extractor to the merger's output, the result will be closer to uniform.
  3. The seed length $d$ should be as small as possible, for then the merger requires fewer initial truly random bits to work.

Explicit constructions for mergers with relatively good parameters are known. For example, Dvir and Wigderson give an explicit construction, based on curves over finite fields and analyzed via Kakeya-set bounds, which for every $\epsilon > 0$ and integer $k$ yields an $(m, \epsilon)$-merger $f : (\{0,1\}^n)^k \times \{0,1\}^d \to \{0,1\}^n$ with a short seed and small entropy loss; the precise parameters are stated in [2].

The proof is constructive and allows building such a merger in time polynomial in the given parameters.
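
To give a flavour of such constructions, the following Python sketch illustrates the curve-merger idea in a simplified form of our own devising: each block is viewed as a vector over a prime field, the unique low-degree curve passing through the blocks is computed by Lagrange interpolation, and the output is that curve evaluated at the random seed point. The choice of field size and the actual $(m, \epsilon)$ guarantees proved in [2] are not reproduced here.

```python
import random

P = 2**61 - 1  # a fixed prime; a real construction chooses the field size carefully

def lagrange_coeffs(k, t, p=P):
    """Coefficients c_i(t) such that C(t) = sum_i c_i(t) * X_i is the unique
    degree-(k-1) curve with C(i) = X_i for i = 1..k."""
    coeffs = []
    for i in range(1, k + 1):
        num, den = 1, 1
        for j in range(1, k + 1):
            if j != i:
                num = num * (t - j) % p
                den = den * (i - j) % p
        coeffs.append(num * pow(den, -1, p) % p)
    return coeffs

def curve_merge(blocks, seed, p=P):
    """Merge k equal-length vectors over F_p using a seed point t = `seed` in F_p."""
    k, length = len(blocks), len(blocks[0])
    c = lagrange_coeffs(k, seed, p)
    return [sum(c[i] * blocks[i][pos] for i in range(k)) % p
            for pos in range(length)]

# Example: two correlated blocks, one of them uniform, merged with a short seed.
rng = random.SystemRandom()
x1 = [rng.randrange(P) for _ in range(4)]  # a uniform block of 4 field elements
x2 = [(v + 1) % P for v in x1]             # a block fully determined by x1
t = rng.randrange(P)                       # the truly random seed
print(curve_merge([x1, x2], t))
```

With the seed fixed, the output is just a deterministic combination of the blocks; the role of the random seed is to ensure that, no matter which block was the uniform one, the output distribution still has high min-entropy.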

Usage

It is possible to use mergers in order to produce randomness extractors with good parameters. Recall that an extractor is a function which takes a random variable that has high min-entropy and returns a shorter random variable, one that is close to uniform. An extractor for sources of arbitrary min-entropy can be obtained using the following merger-based scheme: [2] [3]

  1. Transform the weak source into a somewhere-random source: a small collection of blocks, at least one of which is close to uniform.
  2. Apply the merger, with a short truly random seed, to merge the blocks into a single shorter string whose min-entropy is high relative to its length.
  3. Apply a known extractor for sources of very high min-entropy rate to obtain an output that is close to uniform.

The essence of the scheme above is to use the merger to transform a string with arbitrary min-entropy into a shorter string, without losing much min-entropy in the process. The new string has very high min-entropy relative to its length, and it is then possible to use older, known extractors which only work for strings of that type.
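
The following Python sketch shows only the shape of this pipeline; to_somewhere_random_source, merger and high_entropy_extractor are hypothetical placeholders for concrete components such as those described in [2] and [3], and the toy stand-ins at the end are not real constructions.

```python
import secrets

def merger_based_extractor(weak_source, to_somewhere_random_source,
                           merger, high_entropy_extractor, seed_len):
    """Wire a somewhere-random-source conversion, a merger, and a
    high-min-entropy-rate extractor into a single extractor.

    All three components are passed in as callables; only the data flow
    of the merger-based scheme is illustrated here.
    """
    # 1. Turn the weak source into blocks, at least one of which is ~uniform.
    blocks = to_somewhere_random_source(weak_source)

    # 2. Merge the blocks with a short truly random seed; the result is a
    #    shorter string that keeps high min-entropy relative to its length.
    merged = merger(blocks, secrets.token_bytes(seed_len))

    # 3. Apply an extractor that only needs to handle very high
    #    min-entropy-rate sources.
    return high_entropy_extractor(merged, secrets.token_bytes(seed_len))

# Toy wiring check with trivial stand-ins (not real constructions):
demo = merger_based_extractor(
    secrets.token_bytes(32),
    to_somewhere_random_source=lambda s: [s[:16], s[16:]],
    merger=lambda blocks, seed: bytes(a ^ b for a, b in zip(blocks[0], seed)),
    high_entropy_extractor=lambda s, seed: s[:8],
    seed_len=16,
)
print(demo.hex())
```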

See also

Randomness extractor

References

  1. De, Portmann, Vidick and Renner (2009). "Trevisan's extractor in the presence of quantum side information". SIAM Journal on Computing. 41 (4): 915–940. arXiv:0912.5514. doi:10.1137/100813683. S2CID 5387876. Section 2.2.
  2. Zeev Dvir & Avi Wigderson. "Kakeya sets, new mergers and old extractors" (PDF).
  3. Noam Nisan & Amnon Ta-Shma. "Extracting Randomness: A Survey and New Constructions" (PDF). Section 4.3.
  4. Amnon Ta-Shma. "Refining Randomness". PhD thesis.