List of statistical tests

Statistical tests are used to test the fit between a hypothesis and the data. [1] [2] Choosing the right statistical test is not a trivial task. [1] The choice of the test depends on many properties of the research question. The vast majority of studies can be addressed by 30 of the 100 or so statistical tests in use. [3] [4] [5]

List of statistical tests

| Test name | Scaling | Assumptions | Data | Samples | Exact | Special case of | Application conditions |
|---|---|---|---|---|---|---|---|
| One-sample t-test | interval | normal | univariate | 1 | No [8] | Location test | |
| Unpaired t-test | interval | normal | unpaired | 2 | No [8] | Location test | Homoscedasticity [9] |
| Welch's t-test | interval | normal | unpaired | 2 | No [8] | Location test | |
| Paired t-test | interval | normal | paired | 2 | No | Location test | |
| F-test | interval | normal | | 2 | | | |
| Z-test | interval | normal | | 2 | No | | Variance is known |
| Permutation test | interval | non-parametric | unpaired | ≥2 | Yes | | |
| Kruskal–Wallis test | ordinal | non-parametric | unpaired | ≥2 | Yes | | Small sample size [10] |
| Mann–Whitney test | ordinal | non-parametric | unpaired | 2 | | Kruskal–Wallis test [11] | |
| Wilcoxon signed-rank test | interval | non-parametric | paired | ≥1 | | Location test | |
| Sign test | ordinal | non-parametric | paired | 2 | | | |
| Friedman test | ordinal | non-parametric | paired | >2 | | Location test | |
| χ² test | nominal [1] | non-parametric [12] | | | No | | Contingency table; sample size > ca. 60 [1]; every cell content ≥ 5 [13]; marginal totals fixed [13] |
| Pearson's χ² test | nominal/ordinal | non-parametric | | | No | χ² test | |
| Median test | ordinal | non-parametric | | | No | Pearson's χ² test | |
| Multinomial test | nominal | non-parametric | univariate | 1 | Yes | Location test | |
| McNemar's test | binary | non-parametric [14] | paired | 2 | Yes | | |
| Cochran's Q test | binary | non-parametric | paired | ≥2 | | | |
| Binomial test | binary | non-parametric | univariate | 1 | Yes | Multinomial test | |
| Siegel–Tukey test | ordinal | non-parametric | unpaired | 2 | | | |
| Chow test | interval | parametric, linear regression | | 2 | No | | Time series |
| Fisher's exact test | nominal | non-parametric | unpaired | ≥2 [13] | Yes | | Contingency table; marginal totals fixed [13] |
| Barnard's exact test | nominal | non-parametric | unpaired | 2 | Yes | | Contingency table |
| Boschloo's test | nominal | non-parametric | unpaired | 2 | Yes | | Contingency table |
| Shapiro–Wilk test | interval | | univariate | 1 | | Normality test | Sample size between 3 and 5000 [15] |
| Kolmogorov–Smirnov test | interval | | | 1 | | Normality test | Distribution parameters known [15] |
| Shapiro–Francia test | interval | | univariate | 1 | | Normality test | Simplification of the Shapiro–Wilk test |
| Lilliefors test | interval | | | 1 | | Normality test | |
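Several of the tests listed above are available in SciPy. A minimal sketch with made-up sample values, contrasting the unpaired t-test (normality and homoscedasticity assumed), Welch's t-test (equal variances not assumed), and the non-parametric Mann–Whitney test:

```python
from scipy import stats

# Illustrative data only: two small unpaired samples
a = [5.1, 4.9, 5.4, 5.0, 5.2, 4.8]
b = [5.6, 5.8, 5.5, 5.9, 5.7, 6.0]

# Unpaired t-test: interval scaling, normality and homoscedasticity assumed
t_stat, t_p = stats.ttest_ind(a, b)

# Welch's t-test: same comparison without the equal-variance assumption
w_stat, w_p = stats.ttest_ind(a, b, equal_var=False)

# Mann-Whitney test: ordinal scaling, non-parametric, unpaired samples
u_stat, u_p = stats.mannwhitneyu(a, b)

print(t_p, w_p, u_p)  # all three reject at the 5% level for these data
```

For these clearly separated samples all three tests agree; the tests differ in power and robustness when their assumptions are stressed.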

See also

Related Research Articles

Statistics: study of the collection, analysis, interpretation, and presentation of data

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments.

Statistical inference: process of using data analysis

Statistical inference is the process of using data analysis to infer properties of an underlying distribution of probability. Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates. It is assumed that the observed data set is sampled from a larger population.

Nonparametric statistics is a type of statistical analysis that makes minimal assumptions about the underlying distribution of the data being studied. Often these models are infinite-dimensional, rather than finite-dimensional as in parametric statistics. Nonparametric statistics can be used for descriptive statistics or statistical inference. Nonparametric tests are often used when the assumptions of parametric tests are evidently violated.

Mann–Whitney test is a nonparametric test of the null hypothesis that, for randomly selected values X and Y from two populations, the probability of X being greater than Y is equal to the probability of Y being greater than X.
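The quantity constrained by that null hypothesis, P(X > Y), can be read off the U statistic directly. A brief sketch with illustrative numbers, assuming SciPy's `mannwhitneyu` (which returns U for the first sample):

```python
from scipy import stats

# Illustrative data: x mostly below y, except one large value
x = [1, 2, 3, 4, 10]
y = [5, 6, 7, 8, 9]

u, p = stats.mannwhitneyu(x, y)
# U counts the pairs (xi, yj) with xi > yj (ties counted half),
# so U / (n*m) estimates P(X > Y)
prob_x_gt_y = u / (len(x) * len(y))  # 5 of 25 pairs -> 0.2 here
```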

Likert scale: psychometric measurement scale

A Likert scale is a psychometric scale named after its inventor, American social psychologist Rensis Likert, which is commonly used in research questionnaires. It is the most widely used approach to scaling responses in survey research, such that the term is often used interchangeably with rating scale, although there are other types of rating scales.

Student's t-test is a statistical test used to test whether the difference between the response of two groups is statistically significant or not. It is any statistical hypothesis test in which the test statistic follows a Student's t-distribution under the null hypothesis. It is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known. When the scaling term is estimated based on the data, the test statistic—under certain conditions—follows a Student's t distribution. The t-test's most common application is to test whether the means of two populations are significantly different. In many cases, a Z-test will yield very similar results to a t-test since the latter converges to the former as the size of the dataset increases.
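The convergence of the t-test to the Z-test can be seen in the critical values of the two distributions; a small sketch of this (the cut-off 0.975 corresponds to a two-sided 5% level):

```python
from scipy import stats

# 97.5th percentiles: two-sided 5% critical values
z_crit = stats.norm.ppf(0.975)              # about 1.96
t_crit_small = stats.t.ppf(0.975, df=5)     # about 2.57: small samples need wider margins
t_crit_large = stats.t.ppf(0.975, df=1000)  # nearly identical to the normal value
```

The extra width of the t distribution at low degrees of freedom is the price of estimating the scaling term from the data.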

Level of measurement or scale of measure is a classification that describes the nature of information within the values assigned to variables. Psychologist Stanley Smith Stevens developed the best-known classification with four levels, or scales, of measurement: nominal, ordinal, interval, and ratio. This framework of distinguishing levels of measurement originated in psychology and has since had a complex history, being adopted and extended in some disciplines and by some scholars, and criticized or rejected by others. Other classifications include those by Mosteller and Tukey, and by Chrisman.

In statistics, a categorical variable is a variable that can take on one of a limited, and usually fixed, number of possible values, assigning each individual or other unit of observation to a particular group or nominal category on the basis of some qualitative property. In computer science and some branches of mathematics, categorical variables are referred to as enumerations or enumerated types. Commonly, each of the possible values of a categorical variable is referred to as a level. The probability distribution associated with a random categorical variable is called a categorical distribution.

Mathematical statistics: branch of statistics

Mathematical statistics is the application of probability theory, a branch of mathematics, to statistics, as opposed to techniques for collecting statistical data. Specific mathematical techniques which are used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure theory.

Kruskal–Wallis test: non-parametric method for testing whether samples originate from the same distribution

The Kruskal–Wallis test by ranks, Kruskal–Wallis test, or one-way ANOVA on ranks is a non-parametric method for testing whether samples originate from the same distribution. It is used for comparing two or more independent samples of equal or different sample sizes. It extends the Mann–Whitney U test, which is used for comparing only two groups. The parametric equivalent of the Kruskal–Wallis test is the one-way analysis of variance (ANOVA).
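A minimal sketch with SciPy, using made-up samples of unequal size; the second call shows that identical samples yield H = 0, i.e. no evidence against a common distribution:

```python
from scipy import stats

# Three independent samples of unequal size (illustrative values)
g1 = [2.9, 3.0, 2.5, 2.6, 3.2]
g2 = [3.8, 2.7, 4.0, 2.4]
g3 = [2.8, 3.4, 3.7, 2.2, 2.0]

h, p = stats.kruskal(g1, g2, g3)  # H statistic and chi-squared p-value

# Identical samples: equal rank sums, so H is (numerically) zero
h0, p0 = stats.kruskal([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```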

A permutation test is an exact statistical hypothesis test involving two or more samples. The null hypothesis is that all samples come from the same distribution. Under the null hypothesis, the distribution of the test statistic is obtained by calculating its value under all possible rearrangements of the observed data. Permutation tests are, therefore, a form of resampling.
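The full enumeration can be sketched in a few lines. The data are made up, and the test statistic (absolute difference of means) is one common choice among many:

```python
from itertools import combinations

# Illustrative samples: complete separation, so only the observed split
# and its mirror image are as extreme as the data
a = [89, 92, 95]
b = [70, 72, 74]
pooled = a + b
n_a = len(a)
observed = abs(sum(a) / len(a) - sum(b) / len(b))

extreme = 0
total = 0
indices = range(len(pooled))
for group in combinations(indices, n_a):
    g1 = [pooled[i] for i in group]
    g2 = [pooled[i] for i in indices if i not in group]
    diff = abs(sum(g1) / len(g1) - sum(g2) / len(g2))
    total += 1
    if diff >= observed:
        extreme += 1

p_value = extreme / total  # 2 of the C(6,3) = 20 splits are this extreme: p = 0.1
```

For larger samples the full enumeration becomes infeasible and a random subset of rearrangements is drawn instead.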

An exact (significance) test is a test such that, if the null hypothesis is true, all assumptions made during the derivation of the distribution of the test statistic are met. Using an exact test provides a significance test that maintains the type I error rate at the desired significance level. For example, an exact test at significance level α, when repeated over many samples where the null hypothesis is true, will reject at most a proportion α of the time. This is in contrast to an approximate test, in which the desired type I error rate is only approximately maintained; the approximation may be made as close to α as desired by making the sample size sufficiently large.
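The binomial test from the table is a simple example of an exact test; a sketch with illustrative counts:

```python
from scipy import stats

# Two-sided exact binomial test: 14 successes in 20 trials against p = 0.5.
# The p-value is computed from the exact Binomial(20, 0.5) distribution,
# so the stated type I error rate is never exceeded.
result = stats.binomtest(14, n=20, p=0.5)
p_value = result.pvalue  # about 0.115: no rejection at the 5% level
```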

In statistics, the Kendall rank correlation coefficient, commonly referred to as Kendall's τ coefficient, is a statistic used to measure the ordinal association between two measured quantities. A τ test is a non-parametric hypothesis test for statistical dependence based on the τ coefficient. It is a measure of rank correlation: the similarity of the orderings of the data when ranked by each of the quantities. It is named after Maurice Kendall, who developed it in 1938, though Gustav Fechner had proposed a similar measure in the context of time series in 1897.
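A short sketch of the τ coefficient on two illustrative rankings of five items:

```python
from scipy import stats

# Two rankings of five items: 8 concordant pairs, 2 discordant, no ties
x = [1, 2, 3, 4, 5]
y = [2, 1, 3, 5, 4]

tau, p = stats.kendalltau(x, y)  # tau = (8 - 2) / 10 = 0.6
```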

In statistics, inter-rater reliability is the degree of agreement among independent observers who rate, code, or assess the same phenomenon.

In statistics, Goodman and Kruskal's gamma is a measure of rank correlation, i.e., the similarity of the orderings of the data when ranked by each of the quantities. It measures the strength of association of the cross tabulated data when both variables are measured at the ordinal level. It makes no adjustment for either table size or ties. Values range from −1 to +1. A value of zero indicates the absence of association.
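Gamma is straightforward to compute from concordant and discordant pairs; a minimal sketch on made-up paired ordinal data:

```python
from itertools import combinations

# Illustrative paired ordinal ratings
x = [1, 1, 2, 2, 3]
y = [1, 2, 2, 3, 3]

concordant = discordant = 0
for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
    s = (xi - xj) * (yi - yj)
    if s > 0:
        concordant += 1   # both variables order the pair the same way
    elif s < 0:
        discordant += 1   # the variables order the pair oppositely
    # s == 0: tied pair, ignored by gamma

gamma = (concordant - discordant) / (concordant + discordant)
# here every untied pair is concordant, so gamma = 1.0
```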

In statistics, groups of individual data points may be classified as belonging to any of various statistical data types, e.g. categorical, real number, odd number (1,3,5) etc. The data type is a fundamental component of the semantic content of the variable, and controls which sorts of probability distributions can logically be used to describe the variable, the permissible operations on the variable, the type of regression analysis used to predict the variable, etc. The concept of data type is similar to the concept of level of measurement, but more specific: For example, count data require a different distribution than non-negative real-valued data require, but both fall under the same level of measurement.

Univariate is a term commonly used in statistics to describe a type of data which consists of observations on only a single characteristic or attribute. A simple example of univariate data would be the salaries of workers in industry. Like all the other data, univariate data can be visualized using graphs, images or other analysis tools after the data is measured, collected, reported, and analyzed.

Ordinal data is a categorical, statistical data type where the variables have natural, ordered categories and the distances between the categories are not known. These data exist on an ordinal scale, one of four levels of measurement described by S. S. Stevens in 1946. The ordinal scale is distinguished from the nominal scale by having a ranking. It also differs from the interval scale and ratio scale by not having category widths that represent equal increments of the underlying attribute.

The Scheirer–Ray–Hare (SRH) test is a statistical test that can be used to examine whether a measure is affected by two or more factors. Since it does not require a normal distribution of the data, it is one of the non-parametric methods. It extends the Kruskal–Wallis test, the non-parametric equivalent of one-way analysis of variance (ANOVA), to designs with more than one factor, and is thus a non-parametric alternative to multifactorial ANOVA. The test is named after James Scheirer, William Ray and Nathan Hare, who published it in 1976.
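A rough sketch of the core idea, for the main effect of one factor only: rank the pooled data, form the between-group sum of squares of the ranks, and divide by the total mean square of the ranks (N(N+1)/12 when there are no ties) to obtain an H-like statistic referred to a chi-squared distribution. The data layout and values below are illustrative, and this simplification omits the tie correction and the interaction term of the published procedure:

```python
from scipy import stats

# Balanced 2x2 layout, 2 replicates per cell; the raw values are distinct,
# so their ranks over the pooled data are simply 1..8
cells = {('A1', 'B1'): [1, 2], ('A1', 'B2'): [3, 4],
         ('A2', 'B1'): [5, 6], ('A2', 'B2'): [7, 8]}

values = sorted(v for vs in cells.values() for v in vs)
rank = {v: i + 1 for i, v in enumerate(values)}
ranked = {k: [rank[v] for v in vs] for k, vs in cells.items()}

n = len(values)
grand_mean = (n + 1) / 2        # mean of the ranks 1..n
ms_total = n * (n + 1) / 12     # total mean square of untied ranks

def factor_ss(groups):
    """Between-group sum of squares of ranks for one factor."""
    return sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Main effect of factor A: pool each A level over the levels of B
a1 = ranked[('A1', 'B1')] + ranked[('A1', 'B2')]
a2 = ranked[('A2', 'B1')] + ranked[('A2', 'B2')]
h_a = factor_ss([a1, a2]) / ms_total  # 32 / 6 for these values
p_a = stats.chi2.sf(h_a, df=1)        # about 0.021: factor A matters here
```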

References

  1. Parab, Shraddha; Bhalerao, Supriya (2010). "Choosing statistical test". International Journal of Ayurveda Research. 1 (3): 187–191. doi:10.4103/0974-7788.72494. ISSN 0974-7788. PMC 2996580. PMID 21170214.
  2. "Entscheidbaum" [Decision tree] (in German). Retrieved 8 February 2024.
  3. Nayak, Barun K; Hazra, Avijit (2011). "How to choose the right statistical test?". Indian Journal of Ophthalmology. 59 (2): 85–86. doi:10.4103/0301-4738.77005. ISSN 0301-4738. PMC 3116565. PMID 21350275.
  4. Lewis, Nancy D.; Lewis, Nigel Da Costa; Lewis, N. D. (2013). 100 Statistical Tests in R: What to Choose, How to Easily Calculate, with Over 300 Illustrations and Examples. Heather Hills Press. ISBN 978-1-4840-5299-0.
  5. Kanji, Gopal K. (18 July 2006). 100 Statistical Tests. SAGE. ISBN 978-1-4462-2250-8.
  6. "What is the difference between categorical, ordinal and interval variables?". stats.oarc.ucla.edu. Retrieved 10 February 2024.
  7. Huth, R.; Pokorná, L. (1 March 2004). "Parametric versus non-parametric estimates of climatic trends". Theoretical and Applied Climatology. 77 (1): 107–112. Bibcode:2004ThApC..77..107H. doi:10.1007/s00704-003-0026-3. ISSN 1434-4483. S2CID 121539673.
  8. de Winter, J.C.F. (2019). "Using the Student's t-test with extremely small sample sizes". Practical Assessment, Research, and Evaluation. 18. doi:10.7275/e4r6-dj05.
  9. "t-Test für unabhängige Stichproben" [t-test for independent samples]. Hochschule Luzern (in German). Retrieved 10 February 2024.
  10. Choi, Won; Lee, Jae Won; Huh, Myung-Hoe; Kang, Seung-Ho (11 January 2003). "An Algorithm for Computing the Exact Distribution of the Kruskal–Wallis Test". Communications in Statistics – Simulation and Computation. 32 (4): 1029–1040. doi:10.1081/SAC-120023876. ISSN 0361-0918. S2CID 123037097.
  11. McKight, Patrick E.; Najab, Julius (30 January 2010). "Kruskal-Wallis Test". The Corsini Encyclopedia of Psychology. Wiley. p. 1. doi:10.1002/9780470479216.corpsy0491. ISBN 978-0-470-17024-3.
  12. McHugh, Mary L. (15 June 2013). "The Chi-square test of independence". Biochemia Medica. 23 (2): 143–149. doi:10.11613/BM.2013.018. PMC 3900058. PMID 23894860.
  13. Warner, Pamela (1 October 2013). "Testing association with Fisher's Exact test". Journal of Family Planning and Reproductive Health Care. 39 (4): 281–284. doi:10.1136/jfprhc-2013-100747. ISSN 1471-1893. PMID 24062499.
  14. Héberger, Károly; Rajkó, Róbert (1999). Pair-Correlation Method with parametric and non-parametric test-statistics for variable selection. Description of computer program and application for environmental data case studies. szef. pp. 82–91.
  15. Ahmad, Fiaz; Khan, Rehan Ahmad (8 September 2015). "A power comparison of various normality tests". Pakistan Journal of Statistics and Operation Research. 11 (3): 331–345. doi:10.18187/pjsor.v11i3.845. ISSN 1816-2711.