Cochran's C test

Last updated

Cochran's test, [1] named after William G. Cochran, is a one-sided upper limit variance outlier statistical test . The C test is used to decide if a single estimate of a variance (or a standard deviation) is significantly larger than a group of variances (or standard deviations) with which the single estimate is supposed to be comparable. The C test is discussed in many text books [2] [3] [4] and has been recommended by IUPAC [5] and ISO. [6] Cochran's C test should not be confused with Cochran's Q test, which applies to the analysis of two-way randomized block designs.

Contents

The C test assumes a balanced design, i.e. the considered full data set should consist of individual data series that all have equal size. The C test further assumes that each individual data series is normally distributed. Although primarily an outlier test, the C test is also in use as a simple alternative for regular homoscedasticity tests such as Bartlett's test, Levene's test and the Brown–Forsythe test to check a statistical data set for homogeneity of variances. An even simpler way to check homoscedasticity is provided by Hartley's Fmax test, [3] but Hartley's Fmax test has the disadvantage that it only accounts for the minimum and the maximum of the variance range, while the C test accounts for all variances within the range.

Description

The C test detects one exceptionally large variance value at a time. The corresponding data series is then omitted from the full data set. According to ISO standard 5725 [6] the C test may be iterated until no further exceptionally large variance values are detected, but such practice may lead to excessive rejections if the underlying data series are not normally distributed. The C test evaluates the ratio:

where:

Cj = Cochran's C statistic for data series j
Sj = standard deviation of data series j
N = number of data series that remain in the data set; N is decreased in steps of 1 upon each iteration of the C test
Si = standard deviation of data series i (1 ≤ iN)

The C test tests the null hypothesis (H0) against the alternative hypothesis (Ha):

H0: All variances are equal.
Ha: At least one variance value is significantly larger than the other variance values.

Critical values

The sample variance of data series j is considered an outlier at significance level α if Cj exceeds the upper limit critical value CUL. CUL depends on the desired significance level α, the number of considered data series N, and the number of data points (n) per data series. Selections of values for CUL have been tabulated at significance levels α = 0.01, [6] [7] [8] α = 0.025, [8] and α = 0.05. [6] [7] [8] CUL can also be calculated from: [8] [9]

Here:

CUL = upper limit critical value for one-sided test on a balanced design
α = significance level, e.g., 0.05
n = number of data points per data series
Fc = critical value of Fisher's F ratio; Fc can be obtained from tables of the F distribution [10] or using computer software for this function.

Generalization

The C test can be generalized to include unbalanced designs, one-sided lower limit tests and two-sided tests at any significance level α, for any number of data series N, and for any number of individual data points nj in data series j. [8] [9]

See also

Related Research Articles

Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures used to analyze the differences among means. ANOVA was developed by the statistician Ronald Fisher. ANOVA is based on the law of total variance, where the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal, and therefore generalizes the t-test beyond two means. In other words, the ANOVA is used to test the difference between two or more means.

<span class="mw-page-title-main">Standard deviation</span> In statistics, a measure of variation

In statistics, the standard deviation is a measure of the amount of variation of a random variable expected about its mean. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range.

<span class="mw-page-title-main">Student's t-distribution</span> Probability distribution

In probability and statistics, Student's t distribution is a continuous probability distribution that generalizes the standard normal distribution. Like the latter, it is symmetric around zero and bell-shaped.

<span class="mw-page-title-main">Outlier</span> Observation far apart from others in statistics and data science

In statistics, an outlier is a data point that differs significantly from other observations. An outlier may be due to a variability in the measurement, an indication of novel data, or it may be the result of experimental error; the latter are sometimes excluded from the data set. An outlier can be an indication of exciting possibility, but can also cause serious problems in statistical analyses.

<i>F</i>-test Statistical hypothesis test, mostly using multiple restrictions

An F-test is any statistical test used to compare the variances of two samples or the ratio of variances between multiple samples. The test statistic, random variable F, is used to determine if the tested data has an F-distribution under the true null hypothesis, and true customary assumptions about the error term (ε). It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact "F-tests" mainly arise when the models have been fitted to the data using least squares. The name was coined by George W. Snedecor, in honour of Ronald Fisher. Fisher initially developed the statistic as the variance ratio in the 1920s.

<i>Z</i>-test Statistical test

A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. Z-test tests the mean of a distribution. For each significance level in the confidence interval, the Z-test has a single critical value which makes it more convenient than the Student's t-test whose critical values are defined by the sample size. Both the Z-test and Student's t-test have similarities in that they both help determine the significance of a set of data. However, the z-test is rarely used in practice because the population deviation is difficult to determine.

In statistics, a studentized residual is the dimensionless ratio resulting from the division of a residual by an estimate of its standard deviation, both expressed in the same units. It is a form of a Student's t-statistic, with the estimate of error varying between points.

<span class="mw-page-title-main">One- and two-tailed tests</span> Alternative ways of computing the statistical significance of a parameter inferred from a data set

In statistical significance testing, a one-tailed test and a two-tailed test are alternative ways of computing the statistical significance of a parameter inferred from a data set, in terms of a test statistic. A two-tailed test is appropriate if the estimated value is greater or less than a certain range of values, for example, whether a test taker may score above or below a specific range of scores. This method is used for null hypothesis testing and if the estimated value exists in the critical areas, the alternative hypothesis is accepted over the null hypothesis. A one-tailed test is appropriate if the estimated value may depart from the reference value in only one direction, left or right, but not both. An example can be whether a machine produces more than one-percent defective products. In this situation, if the estimated value exists in one of the one-sided critical areas, depending on the direction of interest, the alternative hypothesis is accepted over the null hypothesis. Alternative names are one-sided and two-sided tests; the terminology "tail" is used because the extreme portions of distributions, where observations lead to rejection of the null hypothesis, are small and often "tail off" toward zero as in the normal distribution, colored in yellow, or "bell curve", pictured on the right and colored in green.

Sample size determination or estimation is the act of choosing the number of observations or replicates to include in a statistical sample. The sample size is an important feature of any empirical study in which the goal is to make inferences about a population from a sample. In practice, the sample size used in a study is usually determined based on the cost, time, or convenience of collecting the data, and the need for it to offer sufficient statistical power. In complex studies, different sample sizes may be allocated, such as in stratified surveys or experimental designs with multiple treatment groups. In a census, data is sought for an entire population, hence the intended sample size is equal to the population. In experimental design, where a study may be divided into different treatment groups, there may be different sample sizes for each group.

In statistics, Bartlett's test, named after Maurice Stevenson Bartlett, is used to test homoscedasticity, that is, if multiple samples are from populations with equal variances. Some statistical tests, such as the analysis of variance, assume that variances are equal across groups or samples, which can be verified with Bartlett's test.

In statistics, Levene's test is an inferential statistic used to assess the equality of variances for a variable calculated for two or more groups. This test is used because some common statistical procedures assume that variances of the populations from which different samples are drawn are equal. Levene's test assesses this assumption. It tests the null hypothesis that the population variances are equal. If the resulting p-value of Levene's test is less than some significance level (typically 0.05), the obtained differences in sample variances are unlikely to have occurred based on random sampling from a population with equal variances. Thus, the null hypothesis of equal variances is rejected and it is concluded that there is a difference between the variances in the population.

Omnibus tests are a kind of statistical test. They test whether the explained variance in a set of data is significantly greater than the unexplained variance, overall. One example is the F-test in the analysis of variance. There can be legitimate significant effects within a model even if the omnibus test is not significant. For instance, in a model with two independent variables, if only one variable exerts a significant effect on the dependent variable and the other does not, then the omnibus test may be non-significant. This fact does not affect the conclusions that may be drawn from the one significant variable. In order to test effects within an omnibus test, researchers often use contrasts.

In statistics and in particular statistical theory, unbiased estimation of a standard deviation is the calculation from a statistical sample of an estimated value of the standard deviation of a population of values, in such a way that the expected value of the calculation equals the true value. Except in some important situations, outlined later, the task has little relevance to applications of statistics since its need is avoided by standard procedures, such as the use of significance tests and confidence intervals, or by using Bayesian analysis.

In statistics, one-way analysis of variance is a technique to compare whether two samples' means are significantly different. This analysis of variance technique requires a numeric response variable "Y" and a single explanatory variable "X", hence "one-way".

Tukey's range test, also known as Tukey's test, Tukey method, Tukey's honest significance test, or Tukey's HSDtest, is a single-step multiple comparison procedure and statistical test. It can be used to find means that are significantly different from each other.

In statistics, Grubbs's test or the Grubbs test, also known as the maximum normalized residual test or extreme studentized deviate test, is a test used to detect outliers in a univariate data set assumed to come from a normally distributed population.

Durbin test is a non-parametric statistical test for balanced incomplete designs that reduces to the Friedman test in the case of a complete block design. In the analysis of designed experiments, the Friedman test is the most common non-parametric test for complete block designs.

Named after the Dutch mathematician Bartel Leendert van der Waerden, the Van der Waerden test is a statistical test that k population distribution functions are equal. The Van der Waerden test converts the ranks from a standard Kruskal-Wallis test to quantiles of the standard normal distribution. These are called normal scores and the test is computed from these normal scores.

Cochran's test is a non-parametric statistical test to verify whether k treatments have identical effects in the analysis of two-way randomized block designs where the response variable is binary. It is named after William Gemmell Cochran. Cochran's Q test should not be confused with Cochran's C test, which is a variance outlier test. Put in simple technical terms, Cochran's Q test requires that there only be a binary response and that there be more than 2 groups of the same size. The test assesses whether the proportion of successes is the same between groups. Often it is used to assess if different observers of the same phenomenon have consistent results.

<span class="mw-page-title-main">Homoscedasticity and heteroscedasticity</span> Statistical property

In statistics, a sequence of random variables is homoscedastic if all its random variables have the same finite variance; this is also known as homogeneity of variance. The complementary notion is called heteroscedasticity, also known as heterogeneity of variance. The spellings homoskedasticity and heteroskedasticity are also frequently used. Assuming a variable is homoscedastic when in reality it is heteroscedastic results in unbiased but inefficient point estimates and in biased estimates of standard errors, and may result in overestimating the goodness of fit as measured by the Pearson coefficient.

References

  1. W.G. Cochran, The distribution of the largest of a set of estimated variances as a fraction of their total, Annals of Human Genetics (London) 11(1), 47–52 (January 1941).
  2. D.L. Massart, B.G.M. Vandeginste, L.M.C. Buydens, S. de Jong, P.J. Lewi, J. Smeyers-Verbeke, Handbook of Chemometrics and Qualimetrics: Part A, Elsevier, Amsterdam, the Netherlands, 1997 ISBN   0-444-89724-0.
  3. 1 2 P. Konieczka, J. Namieśnik, Quality Assurance and Quality Control in the Analytical Chemical Laboratory – A Practical Approach, CRC Press, Boca Raton, Florida, 2009; ISBN   978-1-4200-8270-8.
  4. J.K. Taylor, Quality Assurance of Chemical Measurements, 4th printing, Lewis Publishers, Chelsea, Michigan, 1988; ISBN   0-87371-097-5.
  5. W. Horwitz, Harmonized protocol for the design and interpretation of collaborative studies, Trends in Analytical Chemistry 7(4), 118–120 (April 1988).
  6. 1 2 3 4 ISO Standard 5725–2:1994, “Accuracy (trueness and precision) of measurement methods and results – Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method”, International Organization for Standardization, Geneva, Switzerland, 1994; http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=11834
  7. 1 2 R. Moore, Mathematics Department, Macquarie University, Sydney, Australia, 1999: http://faculty.washington.edu/heagerty/Books/Biostatistics/TABLES/Cochran.
  8. 1 2 3 4 5 R.U.E. 't Lam, Scrutiny of variance results for outliers: Cochran's test optimized, Analytica Chimica Acta 659, 68–84 (2010); doi : 10.1016/j.aca.2009.11.032
  9. 1 2 R.U.E. 't Lam, Variance Outlier Test, blog: http://rtlam.blogspot.com/
  10. Table of critical values of the F-distribution: NIST