
In statistics, **effective sample size** is a notion defined for a sample from a distribution when the observations in the sample are correlated or weighted. In 1965, Leslie Kish defined it as the original sample size divided by the design effect, reflecting the variance under the actual sampling design relative to what it would be if the sample were a simple random sample.^{[1]}^{[2]:162,259}

Suppose a sample of $n$ independent, identically distributed observations $X_1, \dots, X_n$ is drawn from a distribution with mean $\mu$ and standard deviation $\sigma$. Then the mean $\mu$ of this distribution is estimated by the mean of the sample:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$

In that case, the variance of $\bar{X}$ is given by

$$\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}.$$

However, if the observations in the sample are correlated (in the intraclass-correlation sense), then $\operatorname{Var}(\bar{X})$ is somewhat higher. For instance, if all observations in the sample are completely correlated ($\rho = 1$), then $\operatorname{Var}(\bar{X}) = \sigma^2$ regardless of $n$.

The effective sample size $n_{\text{eff}}$ is the unique value (not necessarily an integer) such that

$$\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n_{\text{eff}}}.$$

$n_{\text{eff}}$ is a function of the correlation between observations in the sample.

Suppose that all the (non-trivial) correlations are the same and greater than zero, i.e. if $i \neq j$, then $\operatorname{Corr}(X_i, X_j) = \rho > 0$. Then

$$\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}\bigl(1 + (n-1)\rho\bigr).$$

Therefore

$$n_{\text{eff}} = \frac{n}{1 + (n-1)\rho}.$$

In the case where $\rho = 0$, $n_{\text{eff}} = n$. Similarly, if $\rho = 1$ then $n_{\text{eff}} = 1$. And if $0 < \rho < 1$ then $1 < n_{\text{eff}} < n$.
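As a sanity check, here is a minimal Python sketch (assuming NumPy; the helper name `n_eff_equicorrelated` is ours, not a standard API) that verifies the formula by simulating equicorrelated normal samples:

```python
import numpy as np

def n_eff_equicorrelated(n, rho):
    """Effective sample size when all pairwise correlations equal rho."""
    return n / (1 + (n - 1) * rho)

# Empirical check: draw many equicorrelated samples and compare the
# observed variance of the sample mean with sigma^2 / n_eff.
rng = np.random.default_rng(0)
n, rho, sigma = 10, 0.3, 2.0
cov = sigma**2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))
samples = rng.multivariate_normal(np.zeros(n), cov, size=200_000)
print(samples.mean(axis=1).var())               # empirical Var(X-bar), ~1.48
print(sigma**2 / n_eff_equicorrelated(n, rho))  # theoretical sigma^2 / n_eff, 1.48
```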

The case where the correlations are not uniform is somewhat more complicated. Note that if the correlation is negative, the effective sample size may be larger than the actual sample size. If we allow estimators of the more general form $\hat{\mu} = \sum_{i=1}^{n} w_i X_i$ (where $\sum_{i=1}^{n} w_i = 1$), then it is possible to construct correlation matrices for which $n_{\text{eff}} > n$ even when all correlations are positive. Intuitively, the maximal value of $n_{\text{eff}}$ over all choices of the coefficients $w_i$ may be thought of as the information content of the observed data.
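To make the negative-correlation case concrete: for the weighted estimator above, $\operatorname{Var}(\hat{\mu}) = \sigma^2\, w^\mathsf{T} R w$ where $R$ is the correlation matrix, so $n_{\text{eff}} = 1 / (w^\mathsf{T} R w)$. The following sketch (the helper name `n_eff_weighted` is ours) shows that two observations at correlation $-0.5$ with equal weights yield an effective sample size of 4, double the actual sample size:

```python
import numpy as np

def n_eff_weighted(R, w):
    """Effective sample size 1 / (w^T R w) for weights summing to 1,
    where R is the correlation matrix of the observations."""
    w = np.asarray(w, dtype=float)
    return 1.0 / (w @ R @ w)

# Two negatively correlated observations: averaging partially cancels
# the noise, so n_eff exceeds the actual sample size n = 2.
R = np.array([[ 1.0, -0.5],
              [-0.5,  1.0]])
w = np.array([0.5, 0.5])
print(n_eff_weighted(R, w))  # 4.0 > 2
```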

If the data have been weighted (the weights need not be normalized, i.e. need not sum to 1, $n$, or some other constant), then each weighted observation behaves as if several copies of it had been drawn from the distribution with effectively 100% correlation among them. In this case, the quantity is known as Kish's effective sample size:^{[3]}^{[2]:162,259}

$$n_{\text{eff}} = \frac{\left(\sum_{i=1}^{n} w_i\right)^2}{\sum_{i=1}^{n} w_i^2}.$$
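A direct implementation is short; this sketch assumes NumPy and introduces the function name `kish_ess` purely for illustration:

```python
import numpy as np

def kish_ess(weights):
    """Kish's effective sample size: (sum w)^2 / sum(w^2).
    Invariant to rescaling all weights by a constant."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

print(kish_ess([1, 1, 1, 1]))        # 4.0: equal weights give the full sample size
print(kish_ess([1, 0.1, 0.1, 0.1]))  # ~1.64: one dominant weight shrinks n_eff
```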

In statistics, the **standard deviation** is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range.

In probability theory and statistics, **variance** is the expectation of the squared deviation of a random variable from its mean. In other words, it measures how far a set of numbers is spread out from their average value. Variance has a central role in statistics, where some ideas that use it include descriptive statistics, statistical inference, hypothesis testing, goodness of fit, and Monte Carlo sampling. Variance is an important tool in the sciences, where statistical analysis of data is common. The variance is the square of the standard deviation, the second central moment of a distribution, and the covariance of the random variable with itself, and it is often represented by $\sigma^2$, $s^2$, or $\operatorname{Var}(X)$.
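For instance, with NumPy (a sketch; the eight-value data set below is just an example):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(x.var())        # population variance (divide by n): 4.0
print(x.std())        # population standard deviation: 2.0
print(x.var(ddof=1))  # sample variance (divide by n-1): ~4.57
```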

The **weighted arithmetic mean** is similar to an ordinary arithmetic mean, except that instead of each of the data points contributing equally to the final average, some data points contribute more than others. The notion of weighted mean plays a role in descriptive statistics and also occurs in a more general form in several other areas of mathematics.
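A small NumPy example (the class-size interpretation of the weights is illustrative only):

```python
import numpy as np

values  = np.array([80.0, 90.0])   # e.g. mean scores of two classes
weights = np.array([20.0, 30.0])   # e.g. class sizes
# Weighted mean: sum(w * x) / sum(w); np.average implements this directly.
print(np.average(values, weights=weights))  # 86.0
```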

In probability theory and statistics, the **multivariate normal distribution**, **multivariate Gaussian distribution**, or **joint normal distribution** is a generalization of the one-dimensional (univariate) normal distribution to higher dimensions. One definition is that a random vector is said to be *k*-variate normally distributed if every linear combination of its *k* components has a univariate normal distribution. Its importance derives mainly from the multivariate central limit theorem. The multivariate normal distribution is often used to describe, at least approximately, any set of (possibly) correlated real-valued random variables each of which clusters around a mean value.

In statistics, **maximum likelihood estimation** (**MLE**) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data is most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. The logic of maximum likelihood is both intuitive and flexible, and as such the method has become a dominant means of statistical inference.
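As an illustrative sketch (assuming NumPy and SciPy), one can maximize the log-likelihood of a normal model with known scale numerically and confirm that it recovers the closed-form maximizer, the sample mean:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

# MLE of the mean of a normal with known sigma = 1: minimize the negative
# log-likelihood numerically, then compare with the analytic maximizer.
rng = np.random.default_rng(1)
data = rng.normal(loc=3.0, scale=1.0, size=1_000)

neg_log_lik = lambda mu: -norm.logpdf(data, loc=mu, scale=1.0).sum()
result = minimize_scalar(neg_log_lik, bounds=(-10, 10), method="bounded")
print(result.x, data.mean())  # both close to 3.0
```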

In probability theory, **Chebyshev's inequality** guarantees that, for a wide class of probability distributions, no more than a certain fraction of values can be more than a certain distance from the mean. Specifically, no more than 1/*k*^{2} of the distribution's values can be *k* or more standard deviations away from the mean. In statistics, the rule is often called Chebyshev's theorem. The inequality has great utility because it can be applied to any probability distribution in which the mean and variance are defined. For example, it can be used to prove the weak law of large numbers.
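A quick empirical check in Python (a sketch assuming NumPy; the exponential distribution is an arbitrary choice with both mean and variance defined):

```python
import numpy as np

# Empirical check of Chebyshev's bound P(|X - mu| >= k*sigma) <= 1/k^2
# on an exponential distribution (mean = std = 1).
rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=1_000_000)
mu, sigma, k = x.mean(), x.std(), 2.0
print((np.abs(x - mu) >= k * sigma).mean())  # observed fraction, ~0.05
print(1 / k**2)                              # Chebyshev upper bound, 0.25
```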

In statistics, **correlation** or **dependence** is any statistical relationship, whether causal or not, between two random variables or bivariate data. In the broadest sense, **correlation** is any statistical association, though it commonly refers to the degree to which a pair of variables are linearly related. Familiar examples of dependent phenomena include the correlation between the height of parents and their offspring, and the correlation between the price of a good and the quantity consumers are willing to purchase, as depicted in the so-called demand curve.

In probability theory and statistics, a **covariance matrix** is a square matrix giving the covariance between each pair of elements of a given random vector. Any covariance matrix is symmetric and positive semi-definite and its main diagonal contains variances.

In statistics, the **mean squared error** (**MSE**) or **mean squared deviation** (**MSD**) of an estimator measures the average of the squares of the errors—that is, the average squared difference between the estimated values and the actual value. MSE is a risk function, corresponding to the expected value of the squared error loss. The fact that MSE is almost always strictly positive is because of randomness or because the estimator does not account for information that could produce a more accurate estimate.
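In code, MSE is a one-liner; the helper name `mse` below is ours:

```python
import numpy as np

def mse(estimates, truth):
    """Mean squared error: average of squared differences from the truth."""
    e = np.asarray(estimates, dtype=float)
    return np.mean((e - truth) ** 2)

print(mse([2.1, 1.9, 2.2], 2.0))  # 0.02
```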

In statistics, the **Pearson correlation coefficient**, also referred to as **Pearson's r** or the **Pearson product-moment correlation coefficient** (**PPMCC**), is a measure of linear correlation between two sets of data. It is the ratio between the covariance of two variables and the product of their standard deviations; thus it is essentially a normalized measurement of the covariance, such that the result always lies between −1 and 1.
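For example, with NumPy's built-in `corrcoef` (the data are made up for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + np.array([0.1, -0.2, 0.0, 0.2, -0.1])  # nearly linear in x
print(np.corrcoef(x, y)[0, 1])  # close to 1.0
```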

In statistics, **Spearman's rank correlation coefficient** or **Spearman's ρ**, named after Charles Spearman and often denoted by the Greek letter $\rho$ (rho) or as $r_s$, is a nonparametric measure of rank correlation. It assesses how well the relationship between two variables can be described using a monotonic function.

In signal processing, **cross-correlation** is a measure of similarity of two series as a function of the displacement of one relative to the other. This is also known as a *sliding dot product* or *sliding inner-product*. It is commonly used for searching a long signal for a shorter, known feature. It has applications in pattern recognition, single particle analysis, electron tomography, averaging, cryptanalysis, and neurophysiology. The cross-correlation is similar in nature to the convolution of two functions. In an autocorrelation, which is the cross-correlation of a signal with itself, there will always be a peak at a lag of zero, and its size will be the signal energy.
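A minimal sketch of the sliding-dot-product view, using NumPy's `correlate` (the signal and feature below are toy data):

```python
import numpy as np

# Sliding dot product: locate a short, known feature inside a longer signal.
signal  = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0])
feature = np.array([1.0, 2.0, 1.0])
xcorr = np.correlate(signal, feature, mode="valid")
print(xcorr)           # peaks where the feature aligns with the signal
print(xcorr.argmax())  # 2: offset at which the feature starts
```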

In statistics, particularly in hypothesis testing, the **Hotelling's T-squared distribution** ($T^2$), proposed by Harold Hotelling, is a multivariate probability distribution that is tightly related to the F-distribution and arises as the distribution of a set of sample statistics that are natural generalizations of the statistics underlying Student's *t*-distribution.

In statistics, a **pivotal quantity** or **pivot** is a function of observations and unobservable parameters such that the function's probability distribution does not depend on the unknown parameters. A pivot quantity need not be a statistic—the function and its *value* can depend on the parameters of the model, but its *distribution* must not. If it is a statistic, then it is known as an *ancillary statistic.*

In statistics, the **bias** of an estimator is the difference between this estimator's expected value and the true value of the parameter being estimated. An estimator or decision rule with zero bias is called **unbiased**. In statistics, "bias" is an **objective** property of an estimator. Bias can also be measured with respect to the median, rather than the mean, in which case one distinguishes *median*-unbiased from the usual *mean*-unbiasedness property. Bias is a distinct concept from consistency. Consistent estimators converge in probability to the true value of the parameter, but may be biased or unbiased; see bias versus consistency for more.
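A classic concrete case is the divide-by-$n$ variance estimator, which is biased low; the simulation below (a sketch assuming NumPy) shows the bias and its correction:

```python
import numpy as np

# The divide-by-n variance estimator is biased low; dividing by n-1
# (Bessel's correction) makes it unbiased. Check by simulation with n = 5.
rng = np.random.default_rng(3)
true_var = 4.0
samples = rng.normal(0.0, np.sqrt(true_var), size=(100_000, 5))
print(samples.var(axis=1, ddof=0).mean())  # ~3.2 = (n-1)/n * 4, biased
print(samples.var(axis=1, ddof=1).mean())  # ~4.0, unbiased
```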

A **ratio distribution** is a probability distribution constructed as the distribution of the ratio of random variables having two other known distributions. Given two random variables *X* and *Y*, the distribution of the random variable *Z* that is formed as the ratio *Z* = *X*/*Y* is a *ratio distribution*.

In probability theory and statistics, the **negative multinomial distribution** is a generalization of the negative binomial distribution to more than two outcomes.

A **product distribution** is a probability distribution constructed as the distribution of the product of random variables having two other known distributions. Given two statistically independent random variables *X* and *Y*, the distribution of the random variable *Z* that is formed as the product *Z* = *XY* is a *product distribution*.

The **generalized functional linear model** (**GFLM**) is an extension of the generalized linear model (GLM) that allows one to regress univariate responses of various types on functional predictors, which are mostly random trajectories generated by square-integrable stochastic processes. As in the GLM, a link function relates the expected value of the response variable to a linear predictor, which in the case of the GFLM is obtained by forming the scalar product of the random predictor function with a smooth parameter function. Functional linear regression, functional Poisson regression, and functional binomial regression (with the important special case of functional logistic regression) are special cases of the GFLM. Applications of the GFLM include classification and discrimination of stochastic processes and functional data.

**Batch normalization** is a method used to make artificial neural networks faster and more stable through normalization of the layers' inputs by re-centering and re-scaling. It was proposed by Sergey Ioffe and Christian Szegedy in 2015.
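A minimal forward-pass sketch of the idea in NumPy (training-time statistics only; real implementations also learn gamma and beta and track running averages for inference, which this omits):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch, then re-scale and re-center."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

batch = np.array([[1.0, 100.0],
                  [2.0, 200.0],
                  [3.0, 300.0]])
out = batch_norm(batch)
print(out.mean(axis=0), out.std(axis=0))  # ~0 mean, ~1 std per feature
```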

- ↑ Tom Leinster (December 18, 2014). "Effective Sample Size".
- ↑ Kish, Leslie (1965). *Survey Sampling*. New York: John Wiley & Sons. ISBN 0-471-10949-5.
- ↑ "Design Effects and Effective Sample Size".

- Priestley, M. B. (1981). *Spectral Analysis and Time Series*, vol. 1. Academic Press, §5.3.
