McDiarmid's inequality

Last updated March 08, 2024

In probability theory and theoretical computer science, McDiarmid's inequality (named after Colin McDiarmid ^[1]) is a concentration inequality which bounds the deviation between the sampled value and the expected value of certain functions when they are evaluated on independent random variables. McDiarmid's inequality applies to functions that satisfy a bounded differences property, meaning that replacing a single argument to the function while leaving all other arguments unchanged cannot cause too large of a change in the value of the function.

Statement

A function $f:{\mathcal {X}}_{1}\times {\mathcal {X}}_{2}\times \cdots \times {\mathcal {X}}_{n}\rightarrow \mathbb {R}$ satisfies the bounded differences property if substituting the value of the $i$ th coordinate $x_{i}$ changes the value of $f$ by at most $c_{i}$ . More formally, if there are constants $c_{1},c_{2},\dots ,c_{n}$ such that for all $i\in [n]$ , and all $x_{1}\in {\mathcal {X}}_{1},\,x_{2}\in {\mathcal {X}}_{2},\,\ldots ,\,x_{n}\in {\mathcal {X}}_{n}$ ,

\sup _{x_{i}'\in {\mathcal {X}}_{i}}\left|f(x_{1},\dots ,x_{i-1},x_{i},x_{i+1},\ldots ,x_{n})-f(x_{1},\dots ,x_{i-1},x_{i}',x_{i+1},\ldots ,x_{n})\right|\leq c_{i}.

McDiarmid's Inequality^[2] — Let $f:{\mathcal {X}}_{1}\times {\mathcal {X}}_{2}\times \cdots \times {\mathcal {X}}_{n}\rightarrow \mathbb {R}$ satisfy the bounded differences property with bounds $c_{1},c_{2},\dots ,c_{n}$ .

Consider independent random variables $X_{1},X_{2},\dots ,X_{n}$ where $X_{i}\in {\mathcal {X}}_{i}$ for all $i$ . Then, for any $\varepsilon >0$ ,

{\text{P}}\left(f(X_{1},X_{2},\ldots ,X_{n})-\mathbb {E} [f(X_{1},X_{2},\ldots ,X_{n})]\geq \varepsilon \right)\leq \exp \left(-{\frac {2\varepsilon ^{2}}{\sum _{i=1}^{n}c_{i}^{2}}}\right),

{\text{P}}(f(X_{1},X_{2},\ldots ,X_{n})-\mathbb {E} [f(X_{1},X_{2},\ldots ,X_{n})]\leq -\varepsilon )\leq \exp \left(-{\frac {2\varepsilon ^{2}}{\sum _{i=1}^{n}c_{i}^{2}}}\right),

and as an immediate consequence,

{\text{P}}(|f(X_{1},X_{2},\ldots ,X_{n})-\mathbb {E} [f(X_{1},X_{2},\ldots ,X_{n})]|\geq \varepsilon )\leq 2\exp \left(-{\frac {2\varepsilon ^{2}}{\sum _{i=1}^{n}c_{i}^{2}}}\right).

Extensions

Unbalanced distributions

A stronger bound may be given when the arguments to the function are sampled from unbalanced distributions, such that resampling a single argument rarely causes a large change to the function value.

McDiarmid's Inequality (unbalanced)^[3]^[4] — Let $f:{\mathcal {X}}^{n}\rightarrow \mathbb {R}$ satisfy the bounded differences property with bounds $c_{1},c_{2},\dots ,c_{n}$ .

Consider independent random variables $X_{1},X_{2},\ldots ,X_{n}\in {\mathcal {X}}$ drawn from a distribution where there is a particular value $\chi _{0}\in {\mathcal {X}}$ which occurs with probability $1-p$ . Then, for any $\varepsilon >0$ ,

{\text{P}}(|f(X_{1},\ldots ,X_{n})-\mathbb {E} [f(X_{1},\ldots ,X_{n})]|\geq \varepsilon )\leq 2\exp \left({\frac {-\varepsilon ^{2}}{2p(2-p)\sum _{i=1}^{n}c_{i}^{2}+{\frac {2}{3}}\varepsilon \max _{i}c_{i}}}\right).

This may be used to characterize, for example, the value of a function on graphs when evaluated on sparse random graphs and hypergraphs, since in a sparse random graph, it is much more likely for any particular edge to be missing than to be present.

Differences bounded with high probability

McDiarmid's inequality may be extended to the case where the function being analyzed does not strictly satisfy the bounded differences property, but large differences remain very rare.

McDiarmid's Inequality (Differences bounded with high probability)^[5] — Let $f:{\mathcal {X}}_{1}\times {\mathcal {X}}_{2}\times \cdots \times {\mathcal {X}}_{n}\rightarrow \mathbb {R}$ be a function and ${\mathcal {Y}}\subseteq {\mathcal {X}}_{1}\times {\mathcal {X}}_{2}\times \cdots \times {\mathcal {X}}_{n}$ be a subset of its domain and let $c_{1},c_{2},\dots ,c_{n}\geq 0$ be constants such that for all pairs $(x_{1},\ldots ,x_{n})\in {\mathcal {Y}}$ and $(x'_{1},\ldots ,x'_{n})\in {\mathcal {Y}}$ ,

\left|f(x_{1},\ldots ,x_{n})-f(x'_{1},\ldots ,x'_{n})\right|\leq \sum _{i:x_{i}\neq x'_{i}}c_{i}.

Consider independent random variables $X_{1},X_{2},\dots ,X_{n}$ where $X_{i}\in {\mathcal {X}}_{i}$ for all $i$ . Let $p=1-\mathrm {P} ((X_{1},\ldots ,X_{n})\in {\mathcal {Y}})$ and let $m=\mathbb {E} [f(X_{1},\ldots ,X_{n})\mid (X_{1},\ldots ,X_{n})\in {\mathcal {Y}}]$ . Then, for any $\varepsilon >0$ ,

{\text{P}}\left(f(X_{1},\ldots ,X_{n})-m\geq \varepsilon \right)\leq p+\exp \left(-{\frac {2\max \left(0,\varepsilon -p\sum _{i=1}^{n}c_{i}\right)^{2}}{\sum _{i=1}^{n}c_{i}^{2}}}\right),

and as an immediate consequence,

{\text{P}}(|f(X_{1},\ldots ,X_{n})-m|\geq \varepsilon )\leq 2p+2\exp \left(-{\frac {2\max \left(0,\varepsilon -p\sum _{i=1}^{n}c_{i}\right)^{2}}{\sum _{i=1}^{n}c_{i}^{2}}}\right).

There exist stronger refinements to this analysis in some distribution-dependent scenarios,^[6] such as those that arise in learning theory.

Sub-Gaussian and sub-exponential norms

Let the $k$ th centered conditional version of a function $f$ be

f_{k}(X)(x):=f(x_{1},\ldots ,x_{k-1},X_{k},x_{k+1},\ldots ,x_{n})-\mathbb {E} _{X'_{k}}f(x_{1},\ldots ,x_{k-1},X'_{k},x_{k+1},\ldots ,x_{n}),

so that $f_{k}(X)$ is a random variable depending on random values of $x_{1},\ldots ,x_{k-1},x_{k+1},\ldots ,x_{n}$ .

McDiarmid's Inequality (Sub-Gaussian norm)^[7]^[8] — Let $f:{\mathcal {X}}_{1}\times {\mathcal {X}}_{2}\times \cdots \times {\mathcal {X}}_{n}\rightarrow \mathbb {R}$ be a function. Consider independent random variables $X=(X_{1},X_{2},\dots ,X_{n})$ where $X_{i}\in {\mathcal {X}}_{i}$ for all $i$ .

Let $f_{k}(X)$ refer to the $k$ th centered conditional version of $f$ . Let $\|\cdot \|_{\psi _{2}}$ denote the sub-Gaussian norm of a random variable.

Then, for any $\varepsilon >0$ ,

{\text{P}}\left(f(X_{1},\ldots ,X_{n})-m\geq \varepsilon \right)\leq \exp \left({\frac {-\varepsilon ^{2}}{32e\left\|\sum _{k\in [n]}\|f_{k}(X)\|_{\psi _{2}}^{2}\right\|_{\infty }}}\right).

McDiarmid's Inequality (Sub-exponential norm)^[8] — Let $f:{\mathcal {X}}_{1}\times {\mathcal {X}}_{2}\times \cdots \times {\mathcal {X}}_{n}\rightarrow \mathbb {R}$ be a function. Consider independent random variables $X=(X_{1},X_{2},\dots ,X_{n})$ where $X_{i}\in {\mathcal {X}}_{i}$ for all $i$ .

Let $f_{k}(X)$ refer to the $k$ th centered conditional version of $f$ . Let $\|\cdot \|_{\psi _{1}}$ denote the sub-exponential norm of a random variable.

Then, for any $\varepsilon >0$ ,

{\text{P}}\left(f(X_{1},\ldots ,X_{n})-m\geq \varepsilon \right)\leq \exp \left({\frac {-\varepsilon ^{2}}{4e^{2}\left\|\sum _{k\in [n]}\|f_{k}(X)\|_{\psi _{1}}^{2}\right\|_{\infty }+2\varepsilon e\max _{k\in [n]}\left\|\|f_{k}(X)\|_{\psi _{1}}\right\|_{\infty }}}\right).

Bennett and Bernstein forms

Refinements to McDiarmid's inequality in the style of Bennett's inequality and Bernstein inequalities are made possible by defining a variance term for each function argument. Let

{\begin{aligned}B&:=\max _{k\in [n]}\sup _{x_{1},\dots ,x_{k-1},x_{k+1},\dots ,x_{n}}\left|f(x_{1},\dots ,x_{k-1},X_{k},x_{k+1},\dots ,x_{n})-\mathbb {E} _{X_{k}}f(x_{1},\dots ,x_{k-1},X_{k},x_{k+1},\dots ,x_{n})\right|,\\V_{k}&:=\sup _{x_{1},\dots ,x_{k-1},x_{k+1},\dots ,x_{n}}\mathbb {E} _{X_{k}}\left(f(x_{1},\dots ,x_{k-1},X_{k},x_{k+1},\dots ,x_{n})-\mathbb {E} _{X_{k}}f(x_{1},\dots ,x_{k-1},X_{k},x_{k+1},\dots ,x_{n})\right)^{2},\\{\tilde {\sigma }}^{2}&:=\sum _{k=1}^{n}V_{k}.\end{aligned}}

McDiarmid's Inequality (Bennett form)^[4] — Let $f:{\mathcal {X}}^{n}\rightarrow \mathbb {R}$ satisfy the bounded differences property with bounds $c_{1},c_{2},\dots ,c_{n}$ . Consider independent random variables $X_{1},X_{2},\dots ,X_{n}$ where $X_{i}\in {\mathcal {X}}_{i}$ for all $i$ . Let $B$ and ${\tilde {\sigma }}^{2}$ be defined as at the beginning of this section.

Then, for any $\varepsilon >0$ ,

{\text{P}}(f(X_{1},\ldots ,X_{n})-\mathbb {E} [f(X_{1},\ldots ,X_{n})]\geq \varepsilon )\leq \exp \left(-{\frac {\varepsilon }{2B}}\log \left(1+{\frac {B\varepsilon }{{\tilde {\sigma }}^{2}}}\right)\right).

McDiarmid's Inequality (Bernstein form)^[4] — Let $f:{\mathcal {X}}^{n}\rightarrow \mathbb {R}$ satisfy the bounded differences property with bounds $c_{1},c_{2},\dots ,c_{n}$ . Let $B$ and ${\tilde {\sigma }}^{2}$ be defined as at the beginning of this section.

Then, for any $\varepsilon >0$ ,

{\text{P}}(f(X_{1},\ldots ,X_{n})-\mathbb {E} [f(X_{1},\ldots ,X_{n})]\geq \varepsilon )\leq \exp \left(-{\frac {\varepsilon ^{2}}{2\left({\tilde {\sigma }}^{2}+{\frac {B\varepsilon }{3}}\right)}}\right).

Proof

The following proof of McDiarmid's inequality^[2] constructs the Doob martingale tracking the conditional expected value of the function as more and more of its arguments are sampled and conditioned on, and then applies a martingale concentration inequality (Azuma's inequality). An alternate argument avoiding the use of martingales also exists, taking advantage of the independence of the function arguments to provide a Chernoff-bound-like argument.^[4]

For better readability, we will introduce a notational shorthand: $z_{i\rightharpoondown j}$ will denote $z_{i},\dots ,z_{j}$ for any $z\in {\mathcal {X}}^{n}$ and integers $1\leq i\leq j\leq n$ , so that, for example,

f(X_{1\rightharpoondown (i-1)},y,x_{(i+1)\rightharpoondown n}):=f(X_{1},\ldots ,X_{i-1},y,x_{i+1},\ldots ,x_{n}).

Pick any $x_{1}',x_{2}',\ldots ,x_{n}'$ . Then, for any $x_{1},x_{2},\ldots ,x_{n}$ , by triangle inequality,

{\begin{aligned}&|f(x_{1\rightharpoondown n})-f(x'_{1\rightharpoondown n})|\\[6pt]\leq {}&|f(x_{1\rightharpoondown \,n})-f(x'_{1\rightharpoondown (n-1)},x_{n})|+c_{n}\\\leq {}&|f(x_{1\rightharpoondown n})-f(x'_{1\rightharpoondown (n-2)},x_{(n-1)\rightharpoondown n})|+c_{n-1}+c_{n}\\\leq {}&\ldots \\\leq {}&\sum _{i=1}^{n}c_{i},\end{aligned}}

and thus $f$ is bounded.

Since $f$ is bounded, define the Doob martingale $\{Z_{i}\}$ (each $Z_{i}$ being a random variable depending on the random values of $X_{1},\ldots ,X_{i}$ ) as

Z_{i}:=\mathbb {E} [f(X_{1\rightharpoondown n})\mid X_{1\rightharpoondown i}]

for all $i\geq 1$ and $Z_{0}:=\mathbb {E} [f(X_{1\rightharpoondown n})]$ , so that $Z_{n}=f(X_{1\rightharpoondown n})$ .

Now define the random variables for each $i$

{\begin{aligned}U_{i}&:=\sup _{x\in {\mathcal {X}}_{i}}\mathbb {E} [f(X_{1\rightharpoondown (i-1)},x,X_{(i+1)\rightharpoondown n})\mid X_{1\rightharpoondown (i-1)},X_{i}=x]-\mathbb {[} f(X_{1\rightharpoondown (i-1)},X_{i\rightharpoondown n})\mid X_{1\rightharpoondown (i-1)}],\\L_{i}&:=\inf _{x\in {\mathcal {X}}_{i}}\mathbb {E} [f(X_{1\rightharpoondown (i-1)},x,X_{(i+1)\rightharpoondown n})\mid X_{1\rightharpoondown (i-1)},X_{i}=x]-\mathbb {[} f(X_{1\rightharpoondown (i-1)},X_{i\rightharpoondown n})\mid X_{1\rightharpoondown (i-1)}].\\\end{aligned}}

Since $X_{i},\ldots ,X_{n}$ are independent of each other, conditioning on $X_{i}=x$ does not affect the probabilities of the other variables, so these are equal to the expressions

{\begin{aligned}U_{i}&=\sup _{x\in {\mathcal {X}}_{i}}\mathbb {E} [f(X_{1\rightharpoondown (i-1)},x,X_{(i+1)\rightharpoondown n})-f(X_{1\rightharpoondown (i-1)},X_{i\rightharpoondown n})\mid X_{1\rightharpoondown (i-1)}],\\L_{i}&=\inf _{x\in {\mathcal {X}}_{i}}\mathbb {E} [f(X_{1\rightharpoondown (i-1)},x,X_{(i+1)\rightharpoondown n})-f(X_{1\rightharpoondown (i-1)},X_{i\rightharpoondown n})\mid X_{1\rightharpoondown (i-1)}].\\\end{aligned}}

Note that $L_{i}\leq Z_{i}-Z_{i-1}\leq U_{i}$ . In addition,

{\begin{aligned}U_{i}-L_{i}&=\sup _{u\in {\mathcal {X}}_{i},\ell \in {\mathcal {X}}_{i}}\mathbb {E} [f(X_{1\rightharpoondown (i-1)},u,X_{(i+1)\rightharpoondown n})\mid X_{1\rightharpoondown (i-1)}]-\mathbb {E} [f(X_{1\rightharpoondown (i-1)},\ell ,X_{(i+1)\rightharpoondown n})\mid X_{1\rightharpoondown (i-1)}]\\[6pt]&=\sup _{u\in {\mathcal {X}}_{i},\ell \in {\mathcal {X}}_{i}}\mathbb {E} [f(X_{1\rightharpoondown (i-1)},u,X_{(i+1)\rightharpoondown n})-f(X_{1\rightharpoondown (i-1)},l,X_{(i+1)\rightharpoondown n})\mid X_{1\rightharpoondown (i-1)}]\\&\leq \sup _{x_{u}\in {\mathcal {X}}_{i},x_{l}\in {\mathcal {X}}_{i}}\mathbb {E} [c_{i}\mid X_{1\rightharpoondown (i-1)}]\\[6pt]&\leq c_{i}\end{aligned}}

Then, applying the general form of Azuma's inequality to $\left\{Z_{i}\right\}$ , we have

{\text{P}}(f(X_{1},\ldots ,X_{n})-\mathbb {E} [f(X_{1},\ldots ,X_{n})]\geq \varepsilon )=\operatorname {P} (Z_{n}-Z_{0}\geq \varepsilon )\leq \exp \left(-{\frac {2\varepsilon ^{2}}{\sum _{i=1}^{n}c_{i}^{2}}}\right).

The one-sided bound in the other direction is obtained by applying Azuma's inequality to $\left\{-Z_{i}\right\}$ and the two-sided bound follows from a union bound. $\square$

Related Research Articles

In mathematics, the $L p$ spaces are function spaces defined using a natural generalization of the $p$ -norm for finite-dimensional vector spaces. They are sometimes called Lebesgue spaces, named after Henri Lebesgue, although according to the Bourbaki group they were first introduced by Frigyes Riesz.

In mathematics, an infinite series of numbers is said to converge absolutely if the sum of the absolute values of the summands is finite. More precisely, a real or complex series $is said to converge absolutely if for some real number Similarly, an improper integral of a function, is said to converge absolutely if the integral of the absolute value of the integrand is finite—that is, if$

In mathematical analysis, Hölder's inequality, named after Otto Hölder, is a fundamental inequality between integrals and an indispensable tool for the study of $L p$ spaces.

In mathematics, the uniform boundedness principle or Banach–Steinhaus theorem is one of the fundamental results in functional analysis. Together with the Hahn–Banach theorem and the open mapping theorem, it is considered one of the cornerstones of the field. In its basic form, it asserts that for a family of continuous linear operators whose domain is a Banach space, pointwise boundedness is equivalent to uniform boundedness in operator norm.

In mathematics, Fatou's lemma establishes an inequality relating the Lebesgue integral of the limit inferior of a sequence of functions to the limit inferior of integrals of these functions. The lemma is named after Pierre Fatou.

Vapnik–Chervonenkis theory was developed during 1960–1990 by Vladimir Vapnik and Alexey Chervonenkis. The theory is a form of computational learning theory, which attempts to explain the learning process from a statistical point of view.

In probability theory, the Azuma–Hoeffding inequality gives a concentration result for the values of martingales that have bounded differences.

In probability theory, a Chernoff bound is an exponentially decreasing upper bound on the tail of a random variable based on its moment generating function. The minimum of all such exponential bounds forms the Chernoff or Chernoff-Cramér bound, which may decay faster than exponential. It is especially useful for sums of independent random variables, such as sums of Bernoulli random variables.

In probability theory, Hoeffding's inequality provides an upper bound on the probability that the sum of bounded independent random variables deviates from its expected value by more than a certain amount. Hoeffding's inequality was proven by Wassily Hoeffding in 1963.

In extremal graph theory, Szemerédi’s regularity lemma states that a graph can be partitioned into a bounded number of parts so that the edges between parts are regular. The lemma shows that properties of random graphs can be applied to general graphs like counting the copies of a given subgraph within graphs. Endre Szemerédi proved the lemma over bipartite graphs for his theorem on arithmetic progressions in 1975 and for general graphs in 1978. Variants of the lemma use different notions of regularity and apply to other mathematical objects like hypergraphs.

In the theory of probability, the Glivenko–Cantelli theorem, named after Valery Ivanovich Glivenko and Francesco Paolo Cantelli, describes the asymptotic behaviour of the empirical distribution function as the number of independent and identically distributed observations grows. Specifically, the empirical distribution function converges uniformly to the true distribution function almost surely.

In the mathematical theory of probability, a Doob martingale is a stochastic process that approximates a given random variable and has the martingale property with respect to the given filtration. It may be thought of as the evolving sequence of best approximations to the random variable based on information accumulated up to a certain time.

In mathematics, the theory of optimal stopping or early stopping is concerned with the problem of choosing a time to take a particular action, in order to maximise an expected reward or minimise an expected cost. Optimal stopping problems can be found in areas of statistics, economics, and mathematical finance. A key example of an optimal stopping problem is the secretary problem. Optimal stopping problems can often be written in the form of a Bellman equation, and are therefore often solved using dynamic programming.

In the theory of probability and statistics, the Dvoretzky–Kiefer–Wolfowitz–Massart inequality provides a bound on the worst case distance of an empirically determined distribution function from its associated population distribution function. It is named after Aryeh Dvoretzky, Jack Kiefer, and Jacob Wolfowitz, who in 1956 proved the inequality

In mathematics, Doob's martingale inequality, also known as Kolmogorov’s submartingale inequality is a result in the study of stochastic processes. It gives a bound on the probability that a submartingale exceeds any given value over a given interval of time. As the name suggests, the result is usually given in the case that the process is a martingale, but the result is also valid for submartingales.

In probability theory, Bernstein inequalities give bounds on the probability that the sum of random variables deviates from its mean. In the simplest case, let X₁, ..., X_n be independent Bernoulli random variables taking values +1 and −1 with probability 1/2, then for every positive $,$

In mathematics, the Grothendieck inequality states that there is a universal constant $with the following property. If M ij is an n \times n matrix with$

Anatoly Alexeyevich Karatsuba was a Russian mathematician working in the field of analytic number theory, p-adic numbers and Dirichlet series.

In probability theory, concentration inequalities provide mathematical bounds on the probability of a random variable deviating from some value.

In additive number theory, an area of mathematics, the Erdős–Tetali theorem is an existence theorem concerning economical additive bases of every order. More specifically, it states that for every fixed integer $, there exists a subset of the natural numbers satisfying$

References

↑ McDiarmid, Colin (1989). "On the method of bounded differences". Surveys in Combinatorics, 1989: Invited Papers at the Twelfth British Combinatorial Conference: 148–188.
1 2 Doob, J. L. (1940). "Regularity properties of certain families of chance variables" (PDF). Transactions of the American Mathematical Society. 47 (3): 455–486. doi: 10.2307/1989964 . JSTOR 1989964.
↑ Chou, Chi-Ning; Love, Peter J.; Sandhu, Juspreet Singh; Shi, Jonathan (2022). "Limitations of Local Quantum Algorithms on Random Max-k-XOR and Beyond". 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). 229: 41:13. arXiv: 2108.06049 . doi:10.4230/LIPIcs.ICALP.2022.41 . Retrieved 8 July 2022.
1 2 3 4 Ying, Yiming (2004). "McDiarmid's inequalities of Bernstein and Bennett forms" (PDF). City University of Hong Kong. Retrieved 10 July 2022.
↑ Combes, Richard (2015). "An extension of McDiarmid's inequality". arXiv: 1511.05240 [cs.LG].
↑ Wu, Xinxing; Zhang, Junping (April 2018). "Distribution-dependent concentration inequalities for tighter generalization bounds". Science China Information Sciences. 61 (4): 048105:1–048105:3. arXiv: 1607.05506 . doi:10.1007/s11432-017-9225-2. S2CID 255199895 . Retrieved 10 July 2022.
↑ Kontorovich, Aryeh (22 June 2014). "Concentration in unbounded metric spaces and algorithmic stability". Proceedings of the 31st International Conference on Machine Learning. 32 (2): 28–36. arXiv: 1309.1007 . Retrieved 10 July 2022.
1 2 Maurer, Andreas; Pontil, Pontil (2021). "Concentration inequalities under sub-Gaussian and sub-exponential conditions" (PDF). Advances in Neural Information Processing Systems. 34: 7588–7597. Retrieved 10 July 2022.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] McDiarmid, Colin (1989). "On the method of bounded differences". Surveys in Combinatorics, 1989: Invited Papers at the Twelfth British Combinatorial Conference: 148–188.

[Doob-2] 1 2 Doob, J. L. (1940). "Regularity properties of certain families of chance variables" (PDF). Transactions of the American Mathematical Society. 47 (3): 455–486. doi: 10.2307/1989964 . JSTOR 1989964.

[3] Chou, Chi-Ning; Love, Peter J.; Sandhu, Juspreet Singh; Shi, Jonathan (2022). "Limitations of Local Quantum Algorithms on Random Max-k-XOR and Beyond". 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). 229: 41:13. arXiv: 2108.06049 . doi:10.4230/LIPIcs.ICALP.2022.41 . Retrieved 8 July 2022.

[ying-4] 1 2 3 4 Ying, Yiming (2004). "McDiarmid's inequalities of Bernstein and Bennett forms" (PDF). City University of Hong Kong. Retrieved 10 July 2022.

[5] Combes, Richard (2015). "An extension of McDiarmid's inequality". arXiv: 1511.05240 [cs.LG].

[6] Wu, Xinxing; Zhang, Junping (April 2018). "Distribution-dependent concentration inequalities for tighter generalization bounds". Science China Information Sciences. 61 (4): 048105:1–048105:3. arXiv: 1607.05506 . doi:10.1007/s11432-017-9225-2. S2CID 255199895 . Retrieved 10 July 2022.

[7] Kontorovich, Aryeh (22 June 2014). "Concentration in unbounded metric spaces and algorithmic stability". Proceedings of the 31st International Conference on Machine Learning. 32 (2): 28–36. arXiv: 1309.1007 . Retrieved 10 July 2022.

[subexponential-8] 1 2 Maurer, Andreas; Pontil, Pontil (2021). "Concentration inequalities under sub-Gaussian and sub-exponential conditions" (PDF). Advances in Neural Information Processing Systems. 34: 7588–7597. Retrieved 10 July 2022.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

McDiarmid's inequality

Contents

Statement

Extensions

Unbalanced distributions

Differences bounded with high probability

Sub-Gaussian and sub-exponential norms

Bennett and Bernstein forms

Proof

See also

Related Research Articles

References