Cochran's Q test

Last updated

Cochran's test is a non-parametric statistical test to verify whether k treatments have identical effects in the analysis of two-way randomized block designs where the response variable is binary. [1] [2] [3] It is named after William Gemmell Cochran. Cochran's Q test should not be confused with Cochran's C test, which is a variance outlier test. Put in simple technical terms, Cochran's Q test requires that there only be a binary response (e.g. success/failure or 1/0) and that there be more than 2 groups of the same size. The test assesses whether the proportion of successes is the same between groups. Often it is used to assess if different observers of the same phenomenon have consistent results (interobserver variability). [4]

Contents

Background

Cochran's Q test assumes that there are k > 2 experimental treatments and that the observations are arranged in b blocks; that is,

Treatment 1Treatment 2Treatment k
Block 1X11X12X1k
Block 2X21X22X2k
Block 3X31X32X3k
Block bXb1Xb2Xbk

The "blocks" here might be individual people or other organisms. [5] For example, if b respondents in a survey had each been asked k Yes/No questions, the Q test could be used to test the null hypothesis that all questions were equally likely to elicit the answer "Yes".

Description

Cochran's Q test is

Null hypothesis (H0): the treatments are equally effective.
Alternative hypothesis (Ha): there is a difference in effectiveness between treatments.

The Cochran's Q test statistic is

where

k is the number of treatments
X j is the column total for the jth treatment
b is the number of blocks
Xi  is the row total for the ith block
N is the grand total

Critical region

For significance level α, the asymptotic critical region is

where Χ21 α,k 1 is the (1 α)-quantile of the chi-squared distribution with k 1 degrees of freedom. The null hypothesis is rejected if the test statistic is in the critical region. If the Cochran test rejects the null hypothesis of equally effective treatments, pairwise multiple comparisons can be made by applying Cochran's Q test on the two treatments of interest.

The exact distribution of the T statistic may be computed for small samples. This allows obtaining an exact critical region. A first algorithm had been suggested in 1975 by Patil [6] and a second one has been made available by Fahmy and Bellétoile [7] in 2017.

Assumptions

Cochran's Q test is based on the following assumptions:

  1. If the large sample approximation is used (and not the exact distribution), b is required to be "large".
  2. The blocks were randomly selected from the population of all possible blocks.
  3. The outcomes of the treatments can be coded as binary responses (i.e., a "0" or "1") in a way that is common to all treatments within each block.

Related Research Articles

Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures used to analyze the differences among means. ANOVA was developed by the statistician Ronald Fisher. ANOVA is based on the law of total variance, where the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal, and therefore generalizes the t-test beyond two means. In other words, the ANOVA is used to test the difference between two or more means.

<span class="mw-page-title-main">Kolmogorov–Smirnov test</span> Non-parametric statistical test between two distributions

In statistics, the Kolmogorov–Smirnov test is a nonparametric test of the equality of continuous, one-dimensional probability distributions that can be used to test whether a sample came from a given reference probability distribution, or to test whether two samples came from the same distribution. Intuitively, the test provides a method to qualitatively answer the question "How likely is it that we would see a collection of samples like this if they were drawn from that probability distribution?" or, in the second case, "How likely is it that we would see two sets of samples like this if they were drawn from the same probability distribution?". It is named after Andrey Kolmogorov and Nikolai Smirnov.

In statistics, the likelihood-ratio test assesses the goodness of fit of two competing statistical models, specifically one found by maximization over the entire parameter space and another found after imposing some constraint, based on the ratio of their likelihoods. If the constraint is supported by the observed data, the two likelihoods should not differ by more than sampling error. Thus the likelihood-ratio test tests whether this ratio is significantly different from one, or equivalently whether its natural logarithm is significantly different from zero.

<span class="mw-page-title-main">Chi-squared distribution</span> Probability distribution and special case of gamma distribution

In probability theory and statistics, the chi-squared distribution with degrees of freedom is the distribution of a sum of the squares of independent standard normal random variables. The chi-squared distribution is a special case of the gamma distribution and is one of the most widely used probability distributions in inferential statistics, notably in hypothesis testing and in construction of confidence intervals. This distribution is sometimes called the central chi-squared distribution, a special case of the more general noncentral chi-squared distribution.

<span class="mw-page-title-main">Chi-squared test</span> Statistical hypothesis test

A chi-squared test is a statistical hypothesis test used in the analysis of contingency tables when the sample sizes are large. In simpler terms, this test is primarily used to examine whether two categorical variables are independent in influencing the test statistic. The test is valid when the test statistic is chi-squared distributed under the null hypothesis, specifically Pearson's chi-squared test and variants thereof. Pearson's chi-squared test is used to determine whether there is a statistically significant difference between the expected frequencies and the observed frequencies in one or more categories of a contingency table. For contingency tables with smaller sample sizes, a Fisher's exact test is used instead.

Fisher's exact test is a statistical significance test used in the analysis of contingency tables. Although in practice it is employed when sample sizes are small, it is valid for all sample sizes. It is named after its inventor, Ronald Fisher, and is one of a class of exact tests, so called because the significance of the deviation from a null hypothesis can be calculated exactly, rather than relying on an approximation that becomes exact in the limit as the sample size grows to infinity, as with many statistical tests.

<span class="mw-page-title-main">One- and two-tailed tests</span> Alternative ways of computing the statistical significance of a parameter inferred from a data set

In statistical significance testing, a one-tailed test and a two-tailed test are alternative ways of computing the statistical significance of a parameter inferred from a data set, in terms of a test statistic. A two-tailed test is appropriate if the estimated value is greater or less than a certain range of values, for example, whether a test taker may score above or below a specific range of scores. This method is used for null hypothesis testing and if the estimated value exists in the critical areas, the alternative hypothesis is accepted over the null hypothesis. A one-tailed test is appropriate if the estimated value may depart from the reference value in only one direction, left or right, but not both. An example can be whether a machine produces more than one-percent defective products. In this situation, if the estimated value exists in one of the one-sided critical areas, depending on the direction of interest, the alternative hypothesis is accepted over the null hypothesis. Alternative names are one-sided and two-sided tests; the terminology "tail" is used because the extreme portions of distributions, where observations lead to rejection of the null hypothesis, are small and often "tail off" toward zero as in the normal distribution, colored in yellow, or "bell curve", pictured on the right and colored in green.

In statistics, ordinary least squares (OLS) is a type of linear least squares method for choosing the unknown parameters in a linear regression model by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable in the input dataset and the output of the (linear) function of the independent variable.

This glossary of statistics and probability is a list of definitions of terms and concepts used in the mathematical sciences of statistics and probability, their sub-disciplines, and related fields. For additional related terms, see Glossary of mathematics and Glossary of experimental design.

A permutation test is an exact statistical hypothesis test making use of the proof by contradiction. A permutation test involves two or more samples. The null hypothesis is that all samples come from the same distribution . Under the null hypothesis, the distribution of the test statistic is obtained by calculating all possible values of the test statistic under possible rearrangements of the observed data. Permutation tests are, therefore, a form of resampling.

An exact (significance) test is a test such that if the null hypothesis is true, then all assumptions made during the derivation of the distribution of the test statistic are met. Using an exact test provides a significance test that maintains the type I error rate of the test at the desired significance level of the test. For example, an exact test at a significance level of , when repeated over many samples where the null hypothesis is true, will reject at most of the time. This is in contrast to an approximate test in which the desired type I error rate is only approximately maintained, while this approximation may be made as close to as desired by making the sample size sufficiently large.

McNemar's test is a statistical test used on paired nominal data. It is applied to 2 × 2 contingency tables with a dichotomous trait, with matched pairs of subjects, to determine whether the row and column marginal frequencies are equal. It is named after Quinn McNemar, who introduced it in 1947. An application of the test in genetics is the transmission disequilibrium test for detecting linkage disequilibrium.

Omnibus tests are a kind of statistical test. They test whether the explained variance in a set of data is significantly greater than the unexplained variance, overall. One example is the F-test in the analysis of variance. There can be legitimate significant effects within a model even if the omnibus test is not significant. For instance, in a model with two independent variables, if only one variable exerts a significant effect on the dependent variable and the other does not, then the omnibus test may be non-significant. This fact does not affect the conclusions that may be drawn from the one significant variable. In order to test effects within an omnibus test, researchers often use contrasts.

The logrank test, or log-rank test, is a hypothesis test to compare the survival distributions of two samples. It is a nonparametric test and appropriate to use when the data are right skewed and censored. It is widely used in clinical trials to establish the efficacy of a new treatment in comparison with a control treatment when the measurement is the time to event. The test is sometimes called the Mantel–Cox test. The logrank test can also be viewed as a time-stratified Cochran–Mantel–Haenszel test.

Tukey's range test, also known as Tukey's test, Tukey method, Tukey's honest significance test, or Tukey's HSDtest, is a single-step multiple comparison procedure and statistical test. It can be used to correctly interpret the statistical significance of the difference between means that have been selected for comparison because of their extreme values.

Durbin test is a non-parametric statistical test for balanced incomplete designs that reduces to the Friedman test in the case of a complete block design. In the analysis of designed experiments, the Friedman test is the most common non-parametric test for complete block designs.

In statistics, the multinomial test is the test of the null hypothesis that the parameters of a multinomial distribution equal specified values; it is used for categorical data.

In statistics, the Cochran–Mantel–Haenszel test (CMH) is a test used in the analysis of stratified or matched categorical data. It allows an investigator to test the association between a binary predictor or treatment and a binary outcome such as case or control status while taking into account the stratification. Unlike the McNemar test, which can only handle pairs, the CMH test handles arbitrary strata sizes. It is named after William G. Cochran, Nathan Mantel and William Haenszel. Extensions of this test to a categorical response and/or to several groups are commonly called Cochran–Mantel–Haenszel statistics. It is often used in observational studies in which random assignment of subjects to different treatments cannot be controlled but confounding covariates can be measured.

In statistics and probability theory, the nonparametric skew is a statistic occasionally used with random variables that take real values. It is a measure of the skewness of a random variable's distribution—that is, the distribution's tendency to "lean" to one side or the other of the mean. Its calculation does not require any knowledge of the form of the underlying distribution—hence the name nonparametric. It has some desirable properties: it is zero for any symmetric distribution; it is unaffected by a scale shift; and it reveals either left- or right-skewness equally well. In some statistical samples it has been shown to be less powerful than the usual measures of skewness in detecting departures of the population from normality.

In statistics, the Jonckheere trend test is a test for an ordered alternative hypothesis within an independent samples (between-participants) design. It is similar to the Kruskal-Wallis test in that the null hypothesis is that several independent samples are from the same population. However, with the Kruskal–Wallis test there is no a priori ordering of the populations from which the samples are drawn. When there is an a priori ordering, the Jonckheere test has more statistical power than the Kruskal–Wallis test. The test was developed by Aimable Robert Jonckheere, who was a psychologist and statistician at University College London.

References

  1. William G. Cochran (December 1950). "The Comparison of Percentages in Matched Samples". Biometrika. 37 (3/4): 256–266. doi:10.1093/biomet/37.3-4.256. JSTOR   2332378.
  2. Conover, William Jay (1999). Practical Nonparametric Statistics (Third ed.). Wiley, New York, NY USA. pp. 388–395. ISBN   9780471160687.
  3. National Institute of Standards and Technology. Cochran Test
  4. Mohamed M. Shoukri (2004). Measures of interobserver agreement . Boca Raton: Chapman & Hall/CRC. ISBN   9780203502594. OCLC   61365784.
  5. Robert R. Sokal & F. James Rohlf (1969). Biometry (3rd ed.). New York: W. H. Freeman. pp. 786–787. ISBN   9780716724117.
  6. Kashinath D. Patil (March 1975). "Cochran's Q test: Exact distribution". Journal of the American Statistical Association. 70 (349): 186–189. doi:10.1080/01621459.1975.10480285. JSTOR   2285400.
  7. Fahmy T.; Bellétoile A. (October 2017). "Algorithm 983: Fast Computation of the Non-Asymptotic Cochran's Q Statistic for Heterogeneity Detection". ACM Transactions on Mathematical Software. 44 (2): 1–20. doi:10.1145/3095076.

PD-icon.svg This article incorporates public domain material from the National Institute of Standards and Technology