In survey methodology, the design effect (generally denoted Deff or DEFF) is a measure of the expected impact of a sampling design on the variance of an estimator for some parameter. It is calculated as the ratio of the variance of an estimator based on a sample from an (often) complex sampling design, to the variance of an alternative estimator based on a simple random sample (SRS) of the same number of elements. [1]: 258 The Deff (be it estimated, or known a priori) can be used to adjust the variance of an estimator in cases where the sample is not drawn using simple random sampling. It may also be useful in sample size calculations and for quantifying the representativeness of a sample. The term "design effect" was coined by Leslie Kish in 1965.
The design effect is a positive real number: Deff > 1 indicates an inflation, and Deff < 1 a deflation, in the variance of an estimator for some parameter that is due to the study not using SRS (with Deff = 1 when the variances are identical). [2]: 53, 54
Some complex sampling designs (and related circumstances) that could introduce a Deff different from 1 include: cluster sampling (such as when there is correlation between observations), stratified sampling, cluster randomized controlled trials, disproportional (unequal probability) samples, non-coverage, non-response, statistical adjustments of the data, etc.
Deff can be used in sample size calculations, for quantifying the representativeness of a sample (relative to a target population), as well as for adjusting (often inflating) the variance of some estimator (in cases when we can calculate that estimator's variance assuming SRS). [3]
The term "Design effect" was coined by Leslie Kish in 1965. [1]: 88, 258 Ever since, many calculations (and estimators) have been proposed in the literature for describing the effect of a known sampling design on the increase or decrease in the variance of estimators of interest. In general, the design effect varies between statistics of interest, such as the total or the ratio mean; it also matters whether the design (e.g.: the selection probabilities) is correlated with the outcome of interest. And lastly, it is influenced by the distribution of the outcome itself. All of these should be considered when estimating and using design effects in practice. [4]: 13
The design effect, commonly denoted Deff (at times with different subscripts), is the ratio of two theoretical variances of estimators for some parameter θ: [1] [5]

Deff(θ̂) = Var_p(θ̂) / Var_srs(θ̂)

So that:

Var_p(θ̂) = Deff × Var_srs(θ̂)
Put differently, Deff is by how much the variance has increased (or decreased, in some cases) because our sample was drawn and adjusted to a specific sampling design (e.g.: using weights, or other measures), compared to what it would have been had the sample come from simple random sampling (without replacement). There are many ways of calculating Deff, depending on the parameter of interest (e.g.: population total, population mean, quantiles, ratios of quantities, etc.), the estimator used, and the sampling design (e.g.: clustered sampling, stratified sampling, post-stratification, multi-stage sampling, etc.).
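The definition above can be illustrated by simulation: estimate the variance of an estimator under a complex design and under SRS over many replications, and take their ratio. This is a minimal sketch with a made-up population, a made-up size measure (unrelated to the outcome, i.e. "haphazard"), and SRS taken with replacement for simplicity:

```python
import random

random.seed(0)

# Hypothetical finite population; values and the size measure are made-up.
N = 2000
y = [random.gauss(50, 10) for _ in range(N)]          # outcome values
size = [random.uniform(0.5, 2.0) for _ in range(N)]   # size measure, unrelated to y
tot_size = sum(size)
p = [s / tot_size for s in size]                      # single-draw probabilities

n, reps = 100, 4000

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Variance of the simple mean under SRS (with replacement, for simplicity).
srs_means = [sum(random.choices(y, k=n)) / n for _ in range(reps)]

# Variance of the inverse-probability weighted (Hajek-style) mean under
# unequal-probability (PPS, with replacement) draws.
pps_means = []
for _ in range(reps):
    s = random.choices(range(N), weights=p, k=n)
    w = [1.0 / p[i] for i in s]
    pps_means.append(sum(wi * y[i] for wi, i in zip(w, s)) / sum(w))

# The empirical design effect: ratio of the two variances.
deff = var(pps_means) / var(srs_means)
print(round(deff, 2))
```

With haphazard weights like these, the ratio comes out above 1, matching the intuition that unequal weighting inflates the variance of the mean.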
For estimating the population mean, the Deff (for some sampling design p) is: [4]: 4 [2]: 54

Deff_p(ȳ) = Var_p(ȳ) / ( (1 − f) S² / n )

Where n is the sample size, f = n/N is the fraction of the sample from the population, (1 − f) is the (squared) finite population correction (FPC), and S² is the unbiased sample variance.
Estimates of the variance under the complex design can be obtained by multiplying Deff by the element (unit) variance, so as to incorporate all the complexities of the sample design. [1]: 259
Notice how the definition of Deff is based on parameters of the population that we often do not know (i.e.: the variances of estimators under two different sampling designs). The process of estimating Deff for specific designs is described in the following sections. [6]: 98
A general formula for the (theoretical) design effect of estimating a total (not the mean), for some design, is given in Cochran 1977. [2] : 54
A related quantity to Deff, proposed by Kish in 1995, is called Deft (Design Effect Factor). [7]: 56 [4] It is defined as the square root of the variance ratio, and its denominator uses a simple random sample with replacement (srswr), instead of without replacement (srswor):

Deft(θ̂) = sqrt( Var_p(θ̂) / Var_srswr(θ̂) )
In this later definition (proposed in 1995, vs. 1965), it was argued that srs "without replacement" (with its positive effect on the variance) should be captured in the definition of the design effect, since it is part of the sampling design. Deft is also more directly related to its use in inference, since we often use θ̂ ± Z·Deft·SE, not θ̂ ± Z·Deff·Var, when creating confidence intervals. A further argument is that the finite population correction (FPC) is harder to compute in some situations. For many cases, when the population is very large, Deft is (almost) the square root of Deff (Deft ≈ sqrt(Deff)).
The original intention for Deft was to have it "express the effects of sample design beyond the elemental variability, removing both the unit of measurement and sample size as nuisance parameters". This was done in order to make the design effect generalizable to (relevant for) many statistics and variables within the same survey (and even between surveys). [7]: 55 However, follow-up works have shown that the calculation of the design effect, for parameters such as a population total or mean, depends on the variability of the outcome measure, which limits Kish's original aspiration for this measure. That said, the statement may loosely (i.e.: under some conditions) be true for the weighted mean. [4]: 5
The effective sample size, also defined by Kish in 1965, is the original sample size divided by the design effect. [1]: 162, 259 [8]: 190, 192 This quantity reflects the sample size that would be needed, under simple random sampling, to achieve the variance currently obtained by the existing design (and its relevant parameter estimator). [9]
Namely:

n_eff = n / Deff
Put differently, it says how many responses we are left with when using an estimator that correctly adjusts for the design effect of the sampling design. For example, using the weighted mean with inverse probability weighting, instead of the simple mean.
It is also possible to get the effective sample size ratio by taking the inverse of Deff (i.e.: n_eff / n = 1 / Deff).
When using Kish's design effect for unequal weights, you may use the following simplified formula for "Kish's Effective Sample Size": [10] [1]: 162, 259

n_eff = (Σ_i w_i)² / Σ_i w_i²
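Kish's effective sample size can be sketched in a few lines; the weights below are illustrative only:

```python
def kish_ess(weights):
    """Kish's effective sample size: (sum of w)^2 / (sum of w^2)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# Equal weights lose nothing: n_eff equals n regardless of the common value.
print(kish_ess([2.0] * 100))   # 100.0

# Unequal (made-up) weights shrink the effective sample size.
w = [1.0] * 50 + [3.0] * 50
print(kish_ess(w))             # 200^2 / (50*1 + 50*9) = 80.0
```

Note that the formula is scale-invariant: multiplying all weights by a constant leaves n_eff unchanged, so only the relative weights matter.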
Different sampling designs differ substantially in their impact on estimators (such as the mean) in terms of their bias and variance.
For example, in the cluster sampling case the units may have equal or unequal selection probabilities, irrespective of their intra-class correlation (and its negative effect of increasing the variance of our estimators). In the case of stratified sampling, the probabilities may be equal (EPSEM) or unequal. Regardless, using prior information on the stratum sizes in the population during the sampling stage can improve the statistical efficiency of our estimators. For example: if we know that gender is correlated with our outcome of interest, and also know that the male-female ratio for some population is 50%-50%, then by making sure to sample exactly half of each gender we reduce the variance of the estimators, because we have removed the variability caused by unequal proportions of males and females in our sample. Lastly, when adjusting for non-coverage, non-response, or some stratum split of the population (unavailable during the sampling stage), we may use statistical procedures (e.g.: post-stratification and others). Such procedures may lead to estimates of the sampling probabilities that are similar to, or very different from, the true sampling probabilities of the units. The quality of these estimates depends on the quality of the auxiliary information and on the missing-at-random assumptions used in creating them. Even when these sampling probability estimators (propensity scores) manage to capture most of the phenomenon that produced them, the impact of the variable selection probabilities on the estimators may be small or large, depending on the data (details in the next section).
Due to the large variety of sampling designs (with or without an effect on unequal selection probabilities), different formulas have been developed to capture the potential design effect, as well as to estimate the correct variance of estimators. Sometimes these different design effects can be compounded together (as in the case of unequal selection probability and cluster sampling; more details in the following sections). Whether to use these formulas or just assume SRS depends on the expected amount of bias reduced versus the increase in estimator variance (and on the overhead of methodological and technical complexity). [1]: 426
There are various ways to sample units so that each unit has the exact same probability of selection. Such methods are called equal probability sampling (EPSEM) methods. Some of the more basic methods include simple random sampling (SRS, either with or without replacement) and systematic sampling for getting a fixed sample size. There is also Bernoulli sampling with a random sample size. More advanced techniques such as stratified sampling and cluster sampling can also be designed to be EPSEM. For example, in cluster sampling we can make sure to sample each cluster with probability proportional to its size, and then measure all the units inside the cluster. A more complex method for cluster sampling is two-stage sampling, in which we sample clusters at the first stage (as before, proportional to cluster size), and sample from each cluster at the second stage using SRS with a fixed proportion (e.g.: sample half of the cluster). [11]: 3–8
In their works, Kish and others highlight several known reasons that lead to unequal selection probabilities: [1]: 425 [8]: 185 [7]: 69 [12]: 50, 395 [13]: 306
When the sampling design is fully known (leading to some known probability of selection π_h for each element from stratum h), and the non-response is measurable (i.e.: we know that only a fraction φ_h of the sampled observations answered in stratum h), then an exactly known inverse probability weight can be calculated for each element i from stratum h using: w_hi = 1 / (π_h · φ_h). [8]: 186 Sometimes a statistical adjustment, such as post-stratification or raking, is used for estimating the selection probability, e.g.: when comparing the sample we have with some target population, also known as matching to controls. The estimation process may be focused only on adjusting the existing population to an alternative population (for example, if trying to extrapolate from a panel drawn from several regions to an entire country). In such a case, the adjustment might be focused on some calibration factor, with the weights calculated by multiplying the base weights by it. [8]: 187 However, in other cases both the under-coverage and the non-response are modeled in one go as part of the statistical adjustment, which leads to an estimate of the overall sampling probability (say, p̂_i). In such a case, the weights are simply: w_i = 1 / p̂_i. Notice that when statistical adjustments are used, the sampling probability is often estimated based on some model. The formulations in the following sections assume this probability is known, which is not true for statistical adjustments (since we only have p̂_i). However, if it is assumed that the estimation error of p̂_i is very small, then the following sections can be used as if the probability were known. Whether this assumption holds depends on the size of the sample used for modeling, and is worth keeping in mind during analysis.
When the selection probabilities may differ, the sample size is random, and the selection of each unit is independent of the selection of the others, we call this Poisson sampling. [17]
When adjusting for unequal probability selection through "individual case weights" (e.g.: inverse probability weighting), we get various types of estimators for quantities of interest. Estimators such as the Horvitz–Thompson estimator yield unbiased estimators (if the selection probabilities are indeed known, or approximately known) for the total and the mean of the population. Deville and Särndal (1992) coined the term "calibration estimator" for estimators using weights that satisfy some condition, such as having the sum of weights equal the population size; and, more generally, that the weighted sum of some auxiliary variable equals a known population quantity, Σ_i w_i x_i = X (e.g.: that the weighted count of respondents in each age bucket equals the population size of that bucket). [18] [15]: 132 [19]: 1
The two primary ways to argue about the properties of calibration estimators are the randomization-based (design-based) framework, in which the randomness comes from the sampling design itself, and the model-based framework, in which the outcome is treated as the realization of some statistical model. [15]: 133–134 [20]
As we will see later, some proofs in the literature rely on the randomization-based framework, while others focus on the model-based perspective. Moving from the mean to the weighted mean adds more complexity. For example, in the context of survey methodology, the population size itself is oftentimes considered an unknown quantity to be estimated. So the calculation of the weighted mean is in fact based on a ratio estimator, with an estimator of the total in the numerator and an estimator of the population size in the denominator (making the variance calculation more complex). [21]
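The ratio form of the weighted mean (weighted total over the sum of weights, the latter acting as an estimate of the population size) can be sketched as follows; the values are illustrative:

```python
def weighted_mean(y, w):
    """Ratio-form weighted mean: estimated total over estimated population size."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

y = [10.0, 12.0, 9.0, 14.0]
w = [2.0, 1.0, 4.0, 1.0]       # e.g. made-up inverse selection probabilities
print(weighted_mean(y, w))      # (20 + 12 + 36 + 14) / 8 = 10.25
```

Because both numerator and denominator are random over repeated samples, the variance of this estimator is usually handled with linearization or replication methods rather than the simple SRS formula.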
There are many types (and subtypes) of weights, with different ways to use and interpret them. For some weights the absolute value carries important meaning, while for others only the values of the weights relative to each other matter. This section presents some of the more common types of weights so that they can be referenced in follow-up sections.
There are also indirect ways of applying "weighted" adjustments. For example, existing cases may be duplicated to impute missing observations (e.g.: from non-response), with the variance estimated using methods such as multiple imputation. A complementary way of handling the data is to remove (give a weight of 0 to) some cases, for example when wanting to reduce the influence of over-sampled groups that are less essential for some analysis. Both cases are similar in nature to inverse probability weighting, but the application in practice yields more or fewer rows of data (making the input potentially simpler to use in some software implementations), instead of applying an extra column of weights. Nevertheless, the consequences of such implementations are similar to just using weights. So while in the case of removing observations the data can easily be handled by common software implementations, the case of adding rows requires special adjustments to the uncertainty estimates. Not doing so may lead to erroneous conclusions (i.e.: there is no free lunch when using alternative representations of the underlying issues). [8]: 189, 190
The term "haphazard weights", coined by Kish, refers to weights that correspond to unequal selection probabilities, but ones that are not related to the expectation or variance of the selected elements. [8]: 190, 191
When taking an unrestricted sample of n elements, we can then randomly split these elements into H disjoint strata, each of them containing n_h elements so that Σ_h n_h = n. All elements in each stratum h have some (known) non-negative weight assigned to them (w_h). The weight can be produced by the inverse of some unequal selection probability for elements in each stratum (i.e.: inverse probability weighting following something like post-stratification). In this setting, Kish's design effect, for the increase in variance of the sample weighted mean due to this design (reflected in the weights), versus SRS of some outcome variable y (when there is no correlation between the weights and the outcome, i.e.: haphazard weights), is: [1]: 427 [8]: 191(4.2)

Deff = n · (Σ_h n_h w_h²) / (Σ_h n_h w_h)²
By treating each item as coming from its own stratum, Kish (in 1992) simplified the above formula to the (well known) following version: [8]: 191(4.3) [23]: 318 [4]: 8

Deff = n · (Σ_i w_i²) / (Σ_i w_i)²
This version of the formula is valid when one stratum has several observations taken from it (i.e.: each having the same weight), or when there are simply many strata, each with one observation, but with several strata sharing the same probability of selection. While the interpretation is slightly different, the calculation in the two scenarios comes out the same.
Notice that Kish's definition of the design effect is closely tied to the coefficient of variation (also termed relative variance, relvariance, or relvar for short) of the weights (when using the uncorrected, population-level, sample standard deviation for estimation). This has several notations in the literature: [8]: 191 [12]: 396

Deff = 1 + CV²(w) = 1 + L = 1 + σ_w² / w̄²

Where σ_w² is the population variance of the weights and w̄ is their mean. When the weights are normalized to sample size (so that their sum is equal to n and their mean is equal to 1), then w̄ = 1 and the formula reduces to Deff = 1 + σ_w². While it is true that we assume the weights are fixed, we can think of their variance as the variance of an empirical distribution defined by sampling (with equal probability) one weight from our set of weights (similar to how we would think about the correlation of x and y in a simple linear regression).
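The equivalence of the two formulations (Kish's n·Σw²/(Σw)² and 1 plus the relvariance of the weights) can be checked numerically; the weights below are made-up:

```python
def deff_kish(w):
    """Kish's design effect for unequal weights: n * sum(w^2) / (sum of w)^2."""
    n = len(w)
    return n * sum(x * x for x in w) / sum(w) ** 2

def relvar(w):
    """Relative variance (CV^2) of the weights, using the uncorrected
    (population-level, divide-by-n) variance."""
    m = sum(w) / len(w)
    return sum((x - m) ** 2 for x in w) / len(w) / m ** 2

w = [1.0, 1.0, 2.0, 4.0]    # illustrative weights
print(deff_kish(w))          # 4 * 22 / 64 = 1.375
print(1 + relvar(w))         # same value: 1.375
```

The identity holds for any set of positive weights, since n·Σw²/(Σw)² is just the second moment of the weights divided by the square of their mean.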
The above formula gives the increase in the variance of the weighted mean based on "haphazard" weights; that is, when the y are observations selected using unequal selection probabilities (with no within-cluster correlation, and no relationship to the expectation or variance of the outcome measurement), [8]: 190, 191 and y' are the observations we would have had from a simple random sample, then:

Var(ȳ_w) = Deff × Var(ȳ')
From a model-based perspective, [24] this formula holds when all n observations (y_i) are (at least approximately) uncorrelated (cov(y_i, y_j) = 0 for i ≠ j), with the same variance (var(y_i) = σ²) in the response variable of interest (y). It also assumes the weights themselves are not random variables but rather known constants (e.g.: the inverse of the probability of selection, for some pre-determined and known sampling design).
The following is a simplified proof for when there are no clusters (i.e.: no intraclass correlation between elements of the sample) and each stratum includes only one observation: [24]

Deff = Var(ȳ_w) / Var(ȳ) = Var(Σ_i w_i y_i / Σ_i w_i) / (σ²/n) = (Σ_i w_i² σ² / (Σ_i w_i)²) / (σ²/n) = n · Σ_i w_i² / (Σ_i w_i)²

Transitions: the first equality is the definition of the design effect; the second plugs in the weighted mean with fixed weights and the variance σ²/n of the simple mean under SRS; the third uses the assumption that the y_i are uncorrelated with equal variance σ² (so the variance of the weighted sum is Σ_i w_i² σ²); and the last cancels σ².
The conditions on y hold trivially if the y observations are i.i.d. with the same expectation and variance. In such a case we have Var(ȳ) = σ²/n, and we can estimate σ² using the sample variance of y. [8] [25] If the y's do not all have the same expectation, then we cannot use the estimated variance for this calculation, since that estimation assumes all y_i share the same expectation. Specifically, if there is a correlation between the weights and the outcome variable y, then the expectation of y is not the same for all observations (but rather dependent on the specific weight value for each observation). In such a case, while the design effect formula might still be correct (if the other conditions are met), it would require a different estimator for the variance of the weighted mean. For example, it might be better to use a weighted variance estimator.
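Where a weighted variance estimator is called for, one possible convention (of several used in practice) normalizes the weights to sum to n before applying the usual n − 1 correction. This is a sketch of that one convention, not a prescribed estimator:

```python
def weighted_mean(y, w):
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def weighted_variance(y, w):
    """One weighted-variance convention: normalize weights to sum to n,
    then divide the weighted sum of squared deviations by n - 1."""
    n, sw = len(y), sum(w)
    wn = [n * wi / sw for wi in w]                 # weights now sum to n
    m = weighted_mean(y, wn)
    return sum(wi * (yi - m) ** 2 for wi, yi in zip(wn, y)) / (n - 1)

y = [10.0, 12.0, 9.0, 14.0]
# With equal weights this reduces to the ordinary unbiased sample variance:
print(weighted_variance(y, [1.0, 1.0, 1.0, 1.0]))   # 14.75 / 3
```

Other conventions (frequency weights, reliability weights) rescale the denominator differently, so it is worth checking which one a given software package implements.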
If different y_i have different variances, then while the weighted variance could capture the correct population-level variance, Kish's formula for the design effect may no longer be true.
A similar issue happens if there is some correlation structure in the samples (such as when using cluster sampling).
It is worth noting that some sources in the literature give the following alternative definition to Kish's design effect, stating it is: "the ratio of the variance of the weighted survey mean under disproportionate stratified sampling to the variance under proportionate stratified sampling when all stratum unit variances are equal". [23] : 318 [12] : 396
This definition can be slightly misleading, since it might be interpreted to mean that "proportionate stratified sampling" was achieved via stratified sampling, in which a pre-determined number of units is selected from each stratum. Such selection would yield reduced variance (as compared with simple random sampling), since it removes some of the uncertainty in the specific number of elements per stratum. This is different from Kish's original definition, which compared the variance of the design to a simple random sample (which would yield approximately proportionate samples, but not exactly, due to the variance in sample sizes in each stratum). Park and Lee (2006) reflect on this by stating that "The rationale behind the above derivation is that the loss in precision of [the weighted mean] due to haphazard unequal weighting can be approximated by the ratio of the variance under disproportionate stratified sampling to that under the proportionate stratified sampling". [4]: 8 How far these two definitions differ from each other is not mentioned in the literature. [citation needed] In his book from 1977, Cochran provides a formula for the proportional increase in variance due to deviation from optimum allocation (what, in Kish's formulas, would be called L). [2]: 116 However, the connection from that formula to Kish's L is not apparent. [citation needed]
As more definitions of design effect appeared, Kish's design effect for unequal selection probabilities received its own notation in the literature. [8]: 192 [4]: 8 [12]: 396 [23]: 318 Kish's design effect is also known as the "Unequal Weighting Effect" (or just UWE), a term coined by Liu et al. in 2002. [26]: 2124
The estimator for the total is the "p-expanded with replacement" estimator (a.k.a. the pwr-estimator, or the Hansen–Hurwitz estimator). It is based on a simple random sample with replacement of m items from a population of size N. Each item k (k from 1 to N) has a probability p_k of being drawn in a single draw, with Σ_k p_k = 1 (i.e.: the draw counts follow a multinomial distribution). The probability that a specific item k will appear in our sample is 1 − (1 − p_k)^m. The "p-expanded with replacement" value is z_k = y_k / p_k, with expectation E(z) = Σ_k p_k (y_k / p_k) = Σ_k y_k. Hence the pwr-estimator, the average of the z values over the m draws, is an unbiased estimator for the sum total of y: [2]: 51

t̂_pwr = (1/m) Σ_{i=1}^{m} y_{k_i} / p_{k_i}
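A small simulation, with a made-up population and a made-up size measure driving the draw probabilities, illustrates the unbiasedness of the pwr-estimator:

```python
import random

random.seed(1)

# Hypothetical population; outcome and size measure are illustrative only.
y = [random.uniform(0, 100) for _ in range(500)]
total = sum(y)

size = [random.uniform(1, 3) for _ in y]      # size measure, unrelated to y
tot_size = sum(size)
p = [s / tot_size for s in size]              # single-draw probabilities, sum to 1

def pwr_estimate(m):
    """Hansen-Hurwitz / pwr estimator of the total: the average of y_k / p_k
    over m with-replacement draws made with probabilities p_k."""
    draws = random.choices(range(len(y)), weights=p, k=m)
    return sum(y[k] / p[k] for k in draws) / m

# Unbiasedness check: the average of many estimates approaches the true total.
est = sum(pwr_estimate(50) for _ in range(2000)) / 2000
print(round(total), round(est))
```

Each single estimate is noisy, but averaging over replications converges to the true total, which is what unbiasedness guarantees.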
In 2000, Bruce D. Spencer proposed a formula for estimating the design effect for the variance of estimating the total (not the mean) of some quantity, when there is correlation between the selection probabilities of the elements and the outcome variable of interest. [27]
In this setup, a sample of size n is drawn (with replacement) from a population of size N. Each item i is drawn with probability p_i (where Σ_i p_i = 1; i.e.: a multinomial distribution). The selection probabilities are used to define the normalized (convex) weights. Notice that for some random set of n items, the sum of weights equals 1 only in expectation, with some variability of the sum around it (i.e.: the sum of elements from a Poisson binomial distribution). The relationship between y_i and p_i is defined by the following (population) simple linear regression:

y_i = α + β p_i + ε_i

Where y_i is the outcome of element i, which linearly depends on p_i with intercept α and slope β. The residual from the fitted line is ε_i. We can also define the population variances of the outcome and of the residuals as σ_y² and σ_ε². The correlation between y and p is ρ.
Spencer's (approximate) design effect, for estimating the total of y, is: [27] : 138 [28] : 4 [12] : 401
Where:
This assumes that the regression model fits well, so that the probability of selection and the residuals are independent; this leads the residuals, and the squared residuals, to be uncorrelated with the weights, i.e.: that cor(w, ε) = 0 and also cor(w, ε²) = 0. [27]: 138
When the population size (N) is very large, the formula can be written as: [23] : 319
(since , where )
This approximation assumes that the linear relationship between P and y holds, and also that the correlations of the weights with the errors, and with the errors squared, are both zero, i.e.: cor(w, ε) = 0 and cor(w, ε²) = 0. [28]: 4
We notice that if β = 0, then α = ȳ (i.e.: the average of y), and the formula simplifies. If, in addition, the variance of y is much larger than its mean, then the right-most term is close to 0, which reduces Spencer's design effect (for the estimated total) to Kish's design effect (for the ratio mean). [28]: 5 Otherwise, the two formulas will yield different results, which demonstrates the difference between the design effect of the total and that of the mean.
In 2001, Park and Lee extended Spencer's formula to the case of the ratio-mean (i.e.: estimating the mean by dividing the estimator of the total with the estimator of the population size). It is: [28] : 4
Where:
Park and Lee's formula is exactly equal to Kish's formula when the correlation ρ is zero. Both formulas relate to the design effect of the mean of y (while Spencer's Deff relates to the estimation of the total).
In general, the Deff for the total () tends to be less efficient than the Deff for the ratio mean () when is small. And in general, impacts the efficiency of both design effects. [4] : 8
For data collected using cluster sampling we assume the following structure:
When clusters are all of the same size m, the design effect Deff, proposed by Kish in 1965 (and later re-visited by others), is given by: [1]: 162 [12]: 399 [4]: 9 [30] [31] [13]: 241

Deff = 1 + (m − 1)ρ

Where ρ is the intraclass correlation coefficient.
It is sometimes also denoted as . [26] : 2124
In various papers, when cluster sizes are not equal, the above formula is also used with m̄ as the average cluster size. [32] [24]: 105 In such cases, Kish's formula (using the average cluster size) serves as a conservative upper bound on the exact design effect. [24]: 106
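A minimal sketch of the cluster design-effect formula, using the average cluster size (the numbers are illustrative):

```python
def deff_cluster(m_bar, rho):
    """Kish's cluster design effect: 1 + (m_bar - 1) * rho,
    where m_bar is the (average) cluster size and rho the intraclass correlation."""
    return 1 + (m_bar - 1) * rho

# With an average of 20 interviews per cluster and a modest intraclass
# correlation of 0.05, the variance nearly doubles:
print(deff_cluster(20, 0.05))   # 1.95
```

Note how even a small ρ produces a substantial design effect once clusters are moderately large, which is why cluster size matters so much in sample size planning.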
Alternative formulas exist for unequal cluster sizes. [1]: 193 Follow-up work has discussed the sensitivity of using the average cluster size under various assumptions. [33]
In his paper from 1987, Kish proposed a combined design effect that incorporates both the effect of weighting for unequal selection probabilities and that of cluster sampling: [32] [24]: 105 [34]: 4 [28]: 2

Deff = ( n · Σ_i w_i² / (Σ_i w_i)² ) × ( 1 + (m̄ − 1)ρ )
With notations similar to above.
This formula received a model based justification, proposed in 1999 by Gabler et al. [24]
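The combined formula multiplies the unequal-weighting effect by the clustering effect; a sketch with illustrative weights, cluster size, and ρ:

```python
def deff_combined(w, m_bar, rho):
    """Kish's combined design effect: unequal-weighting part times clustering part."""
    n = len(w)
    deff_w = n * sum(x * x for x in w) / sum(w) ** 2   # unequal-weighting effect
    deff_c = 1 + (m_bar - 1) * rho                      # clustering effect
    return deff_w * deff_c

w = [1.0, 1.0, 2.0, 4.0]            # made-up weights
print(deff_combined(w, 20, 0.05))   # 1.375 * 1.95 = 2.68125
```

Because the two components multiply, a design that is only mildly inefficient on each dimension separately can still carry a sizable overall design effect.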
In 2000, Liu and Aragon proposed a decomposition of the unequal selection probabilities design effect for different strata in stratified sampling. [35] In 2002, Liu et al. extended that work to account for stratified samples where, within each stratum, there is a set of unequal selection probability weights. The cluster sampling may be either global or per stratum. [26] Similar work was done by Park et al. in 2003. [36]
Deff is primarily used for several purposes: [13] : 85
In his 1995 paper, Kish proposed the following categorization of when Deff is, and is not, useful: [7] : 57–62
When planning the sample size, work has been done to correct the design effect so as to separate the interviewer effect (measurement error) from the effects of the sampling design on the sampling variance. [37]
While Kish originally hoped the design effect could be as agnostic as possible to the underlying distribution of the data, the sampling probabilities, their correlations, and the statistics of interest, follow-up research has shown that these do influence the design effect. Hence, careful attention should be paid to these properties when deciding which Deff calculation to use and how to use it. [4]: 13 [28]: 6
Kish's design effect is implemented in various statistical software:
The term "Design effect" was introduced by Leslie Kish in 1965 in his book "Survey Sampling". [1] : 88, 258 In his paper from 1995, [7] : 73 Kish mentions that a similar concept, termed "Lexis ratio", was described at the end of the 19th century. The closely related Intraclass correlation was described by Fisher in 1950, while computations of ratios of variances were already published by Kish and others from the late 40s to the 50s. One of the precursors for Kish's definition was the work done by Cornfield in 1951. [39] [4]
In his original book from 1965, Kish proposed the general definition for the design effect (ratio of variances of two estimators, one from a sample with some design and the other from a simple random sample). In his book, Kish proposed the formula for the design effect of cluster sampling (with intraclass correlation); [1] : 162 as well as the famous design effect formula for unequal probability sampling. [1] : 427 These are often known as "Kish's design effect", and have been merged later into a single formula.
The Cauchy distribution, named after Augustin Cauchy, is a continuous probability distribution. It is also known, especially among physicists, as the Lorentz distribution, Cauchy–Lorentz distribution, Lorentz(ian) function, or Breit–Wigner distribution. The Cauchy distribution is the distribution of the x-intercept of a ray issuing from with a uniformly distributed angle. It is also the distribution of the ratio of two independent normally distributed random variables with mean zero.
In statistics and probability theory, the median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as "the middle" value. The basic feature of the median in describing data compared to the mean is that it is not skewed by a small proportion of extremely large or small values, and therefore provides a better representation of the center. Median income, for example, may be a better way to describe center of the income distribution because increases in the largest incomes alone have no effect on median. For this reason, the median is of central importance in robust statistics.
In statistics, the standard deviation is a measure of the amount of variation of a random variable expected about its mean. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range.
In probability theory and statistics, variance is the expected value of the squared deviation from the mean of a random variable. The standard deviation (SD) is obtained as the square root of the variance. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value. It is the second central moment of a distribution, and the covariance of the random variable with itself, and it is often represented by , , , , or .
The weighted arithmetic mean is similar to an ordinary arithmetic mean, except that instead of each of the data points contributing equally to the final average, some data points contribute more than others. The notion of weighted mean plays a role in descriptive statistics and also occurs in a more general form in several other areas of mathematics.
In probability theory and statistics, the beta distribution is a family of continuous probability distributions defined on the interval [0, 1] or in terms of two positive parameters, denoted by alpha (α) and beta (β), that appear as exponents of the variable and its complement to 1, respectively, and control the shape of the distribution.
In statistics, the Pearson correlation coefficient (PCC) is a correlation coefficient that measures linear correlation between two sets of data. It is the ratio between the covariance of two variables and the product of their standard deviations; thus, it is essentially a normalized measurement of the covariance, such that the result always has a value between −1 and 1. As with covariance itself, the measure can only reflect a linear correlation of variables, and ignores many other types of relationships or correlations. As a simple example, one would expect the age and height of a sample of teenagers from a high school to have a Pearson correlation coefficient significantly greater than 0, but less than 1.
In mathematics, the moments of a function are certain quantitative measures related to the shape of the function's graph. If the function represents mass density, then the zeroth moment is the total mass, the first moment is the center of mass, and the second moment is the moment of inertia. If the function is a probability distribution, then the first moment is the expected value, the second central moment is the variance, the third standardized moment is the skewness, and the fourth standardized moment is the kurtosis. The mathematical concept is closely related to the concept of moment in physics.
In statistics, an effect size is a value measuring the strength of the relationship between two variables in a population, or a sample-based estimate of that quantity. It can refer to the value of a statistic calculated from a sample of data, the value of a parameter for a hypothetical population, or to the equation that operationalizes how statistics or parameters lead to the effect size value. Examples of effect sizes include the correlation between two variables, the regression coefficient in a regression, the mean difference, or the risk of a particular event happening. Effect sizes complement statistical hypothesis testing, and play an important role in power analyses, sample size planning, and in meta-analyses. The cluster of data-analysis methods concerning effect sizes is referred to as estimation statistics.
Importance sampling is a Monte Carlo method for evaluating properties of a particular distribution, while only having samples generated from a different distribution than the distribution of interest. Its introduction in statistics is generally attributed to a paper by Teun Kloek and Herman K. van Dijk in 1978, but its precursors can be found in statistical physics as early as 1949. Importance sampling is also related to umbrella sampling in computational physics. Depending on the application, the term may refer to the process of sampling from this alternative distribution, the process of inference, or both.
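The idea can be illustrated with a toy estimate: computing E[X²] = 1 under a standard normal target p while drawing only from a wider proposal q, reweighting each sample by the likelihood ratio p/q. The densities and sample size here are illustrative assumptions:

```python
import math
import random

def norm_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

random.seed(0)
n = 200_000
# Draw from the proposal q = N(0, 4), not the target p = N(0, 1).
xs = [random.gauss(0, 2) for _ in range(n)]
# Average f(x) = x^2 weighted by the likelihood ratio p(x)/q(x).
est = sum(x * x * norm_pdf(x, 0, 1) / norm_pdf(x, 0, 2) for x in xs) / n
print(est)  # close to 1.0, the variance of N(0, 1)
```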
Estimation theory is a branch of statistics that deals with estimating the values of parameters based on measured empirical data that has a random component. The parameters describe an underlying physical setting in such a way that their value affects the distribution of the measured data. An estimator attempts to approximate the unknown parameters using the measurements. In estimation theory, two approaches are generally considered: the probabilistic approach and the set-membership approach.
In statistics, ordinary least squares (OLS) is a type of linear least squares method for choosing the unknown parameters in a linear regression model by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable in the input dataset and the output of the (linear) function of the independent variable.
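For simple linear regression, the least-squares principle has a closed form; `ols_fit` below is an illustrative sketch of that one-predictor case:

```python
def ols_fit(x, y):
    """Slope and intercept minimizing the sum of squared residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return slope, intercept

# Data lying exactly on y = 2x + 1 is recovered exactly.
print(ols_fit([0, 1, 2, 3], [1, 3, 5, 7]))  # (2.0, 1.0)
```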
Sample size determination or estimation is the act of choosing the number of observations or replicates to include in a statistical sample. The sample size is an important feature of any empirical study in which the goal is to make inferences about a population from a sample. In practice, the sample size used in a study is usually determined based on the cost, time, or convenience of collecting the data, and the need for it to offer sufficient statistical power. In complex studies, different sample sizes may be allocated, such as in stratified surveys or experimental designs with multiple treatment groups. In a census, data is sought for an entire population, hence the intended sample size is equal to the population.
In statistics, generalized least squares (GLS) is a method used to estimate the unknown parameters in a linear regression model. It is used when there is a non-zero amount of correlation between the residuals in the regression model. GLS is employed to improve statistical efficiency and reduce the risk of drawing erroneous inferences, as compared to conventional least squares and weighted least squares methods. It was first described by Alexander Aitken in 1935.
The mean absolute difference (univariate) is a measure of statistical dispersion equal to the average absolute difference of two independent values drawn from a probability distribution. A related statistic is the relative mean absolute difference, which is the mean absolute difference divided by the arithmetic mean, and equal to twice the Gini coefficient. The mean absolute difference is also known as the absolute mean difference and the Gini mean difference (GMD). The mean absolute difference is sometimes denoted by Δ or as MD.
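A sample version of the mean absolute difference averages |x_i − x_j| over all pairs; the convention used here (ordered pairs including i = j, i.e. a denominator of n²) is one of several in use, and the function name is illustrative:

```python
from itertools import product

def mean_abs_difference(xs):
    """Average |x_i - x_j| over all ordered pairs (denominator n^2)."""
    n = len(xs)
    return sum(abs(a - b) for a, b in product(xs, xs)) / (n * n)

xs = [1, 2, 3]
md = mean_abs_difference(xs)
print(md)                        # 8/9
# Relative mean absolute difference = MD / mean = twice the Gini coefficient.
print(md / (sum(xs) / len(xs)))  # 4/9
```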
In statistics, the jackknife is a cross-validation technique and, therefore, a form of resampling. It is especially useful for bias and variance estimation. The jackknife pre-dates other common resampling methods such as the bootstrap. Given a sample of size n, a jackknife estimator can be built by aggregating the parameter estimates from each subsample of size n − 1 obtained by omitting one observation.
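The leave-one-out scheme and the standard jackknife bias estimate can be sketched as follows; both function names are illustrative:

```python
def jackknife_estimates(data, statistic):
    """Leave-one-out estimates: the statistic on each subsample of size n - 1."""
    return [statistic(data[:i] + data[i + 1:]) for i in range(len(data))]

def jackknife_bias(data, statistic):
    """Jackknife bias estimate: (n - 1) * (mean of leave-one-out estimates - full estimate)."""
    n = len(data)
    theta_hat = statistic(data)
    theta_bar = sum(jackknife_estimates(data, statistic)) / n
    return (n - 1) * (theta_bar - theta_hat)

mean = lambda xs: sum(xs) / len(xs)
# The sample mean is unbiased, so its estimated bias is ~0.
print(jackknife_bias([1, 2, 3, 4], mean))
```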
The sample mean or empirical mean, and the sample covariance or empirical covariance are statistics computed from a sample of data on one or more random variables.
In statistics, efficiency is a measure of quality of an estimator, of an experimental design, or of a hypothesis testing procedure. Essentially, a more efficient estimator needs fewer input data or observations than a less efficient one to achieve the Cramér–Rao bound. An efficient estimator is characterized by having the smallest possible variance, indicating that there is a small deviation between the estimated value and the "true" value in the L2 norm sense.
Inverse probability weighting is a statistical technique for estimating quantities related to a population other than the one from which the data was collected. Study designs with a disparate sampling population and population of target inference are common in application. There may be prohibitive factors barring researchers from directly sampling from the target population such as cost, time, or ethical concerns. A solution to this problem is to use an alternate design strategy, e.g. stratified sampling. Weighting, when correctly applied, can potentially improve the efficiency and reduce the bias of unweighted estimators.
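A minimal sketch of the idea, using a Hájek-style (normalized) inverse-probability-weighted mean; the function name and data are illustrative assumptions:

```python
def ipw_mean(values, inclusion_probs):
    """Normalized inverse-probability-weighted mean: units sampled with low
    probability get large weights, correcting for unequal sampling."""
    weights = [1 / p for p in inclusion_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# The high values were twice as likely to be sampled, so they are
# down-weighted; the IPW mean falls below the naive mean of 15.
print(ipw_mean([10, 10, 20, 20], [0.2, 0.2, 0.4, 0.4]))  # 200/15 ≈ 13.33
```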
The ratio estimator is a statistical estimator for the ratio of means of two random variables. Ratio estimates are biased, and corrections must be made when they are used in experimental or survey work. Ratio estimates are also asymmetrical, so symmetrical tests such as the t-test should not be used to generate confidence intervals.
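The classical form of the estimator is the ratio of sample totals (equivalently, of sample means); this sketch shows only the point estimate and none of the bias corrections mentioned above:

```python
def ratio_estimate(ys, xs):
    """Classical ratio estimator: R-hat = sum(y) / sum(x) = y-bar / x-bar.
    Biased in general; bias corrections are applied in survey practice."""
    return sum(ys) / sum(xs)

# y is exactly twice x, so the estimated ratio is 2.
print(ratio_estimate([2, 4, 6], [1, 2, 3]))  # 2.0
```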