Null distribution

In statistical hypothesis testing, the null distribution is the probability distribution of the test statistic when the null hypothesis is true. [1] For example, in an F-test, the null distribution is an F-distribution. [2] The null distribution is a routine tool in experimental work: it describes how a test statistic computed from two sets of data behaves when the null hypothesis holds. If the observed results do not fall outside the range expected under the null distribution, the null hypothesis is not rejected.
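
The following is a minimal sketch (assuming NumPy and SciPy are available) of this idea for the F-test: when the null hypothesis of equal group means holds, the one-way ANOVA F statistic follows an F-distribution, which can be checked by simulation. The group sizes and seed below are illustrative.

```python
# Simulate the null distribution of the one-way ANOVA F statistic and compare
# it with the theoretical F distribution (a sketch; sizes are illustrative).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k, n, reps = 3, 20, 10_000              # 3 groups of 20 observations each

f_null = np.empty(reps)
for i in range(reps):
    groups = [rng.normal(0.0, 1.0, size=n) for _ in range(k)]   # H0 true: equal means
    f_null[i] = stats.f_oneway(*groups).statistic

# The simulated quantiles should match F(k - 1, k * (n - 1)).
print("simulated 95% quantile:  ", np.quantile(f_null, 0.95))
print("theoretical 95% quantile:", stats.f.ppf(0.95, dfn=k - 1, dfd=k * (n - 1)))
```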

[Figure: Null and alternative distribution]

Examples of application

A null hypothesis is often part of an experiment. It states that there is no statistical difference between two sets of results, for example between the outcomes of doing one thing and doing another. Suppose a scientist wants to show that people who walk two miles a day have healthier hearts than people who walk less than two miles a day. The scientist would test the heart health of people who walk two miles a day against that of people who walk less than two miles a day. If there were no real difference between the groups, the test statistic would follow the null distribution; if there were a significant difference, the test statistic would instead follow the alternative distribution.
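
A small simulation can make this distinction concrete (all data below are synthetic and the variable names are illustrative): when the two walking groups truly have the same mean heart rate, the two-sample t statistic is a draw from its null distribution; when the means differ, it is a draw from the alternative distribution.

```python
# Sketch of the walking example: compare the t statistic when H0 is true
# versus when a real group difference exists (simulated data only).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 40

# Simulated resting heart rates (beats per minute).
walk_two_miles = rng.normal(loc=70.0, scale=8.0, size=n)
walk_less_null = rng.normal(loc=70.0, scale=8.0, size=n)   # no real difference (H0 true)
walk_less_alt  = rng.normal(loc=76.0, scale=8.0, size=n)   # a real difference (H0 false)

print("H0 true:  t =", stats.ttest_ind(walk_two_miles, walk_less_null).statistic)
print("H0 false: t =", stats.ttest_ind(walk_two_miles, walk_less_alt).statistic)
```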

Obtaining the null distribution

In hypothesis testing, one needs the joint distribution of the test statistics in order to conduct the test and control Type I errors. In practice, however, the true data-generating distribution is unknown, so a suitable null distribution must be chosen to represent the data. For example, one-sample and two-sample tests of means can use t statistics, which have a Gaussian null distribution, while F statistics, which test k population means, have null distributions that are quadratic forms of Gaussian variables. [3] The null distribution can be defined as the asymptotic distribution of null quantile-transformed test statistics, based on a marginal null distribution. [4] In practice, the null distribution of the test statistics is often unknown, since it depends on the unknown data-generating distribution. Resampling procedures, such as the non-parametric or model-based bootstrap, can provide consistent estimators of the null distribution; another approach is to estimate a data-generating null distribution directly from the data. An improper choice of null distribution has a significant influence on the Type I error and power properties of the testing procedure.
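
As an illustration of the resampling approach, here is a minimal sketch (the helper name bootstrap_null and the sample sizes are illustrative, not from the cited sources) of a non-parametric bootstrap estimate of the null distribution of a difference in sample means: the two groups are pooled so that a common distribution, i.e. the null hypothesis, holds in the resampling world.

```python
# Non-parametric bootstrap estimate of the null distribution of a
# difference-in-means statistic (sketch under the assumptions stated above).
import numpy as np

def bootstrap_null(x, y, reps=10_000, seed=0):
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])          # pooling enforces the null hypothesis
    null_stats = np.empty(reps)
    for i in range(reps):
        xb = rng.choice(pooled, size=len(x), replace=True)
        yb = rng.choice(pooled, size=len(y), replace=True)
        null_stats[i] = xb.mean() - yb.mean()
    return null_stats

# Usage: compare the observed statistic with the estimated null distribution.
x = np.random.default_rng(1).normal(0.0, 1.0, size=30)
y = np.random.default_rng(2).normal(0.3, 1.0, size=30)
observed = x.mean() - y.mean()
null_stats = bootstrap_null(x, y)
print("two-sided bootstrap p-value:", np.mean(np.abs(null_stats) >= np.abs(observed)))
```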

Null distribution with large sample size

The null distribution plays a crucial role in large-scale testing. A large sample size makes it possible to use a more realistic empirical null distribution, which can be generated with an MLE fitting algorithm. [5] Under a Bayesian framework, large-scale studies allow the null distribution to be placed in a probabilistic context alongside its non-null counterparts. When the sample size n is large, say over 10,000, empirical nulls use the study's own data to estimate an appropriate null distribution. The key assumption is that, because the proportion of null cases is large (greater than 0.9), the data themselves reveal the null distribution. The theoretical null may fail in some settings; it is not completely wrong, but it needs to be adjusted accordingly. In large-scale data sets it is easy to find deviations from the ideal mathematical framework, e.g., from independent and identically distributed (i.i.d.) sampling. In addition, correlation across sampling units and unobserved covariates may make the theoretical null distribution incorrect. [6] Permutation methods are frequently used in multiple testing to obtain an empirical null distribution generated from the data. Empirical null methods were introduced, together with the central matching algorithm, in Efron's paper. [7]
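
A minimal sketch of the empirical-null idea follows: it fits a normal distribution N(delta0, sigma0^2) to the central z-values by maximum likelihood, on the assumption that the central part of the histogram is dominated by null cases. The central window [-1.5, 1.5], the simulated z-values, and all names below are illustrative choices, not Efron's exact algorithm.

```python
# Estimate an empirical null N(delta0, sigma0^2) by truncated-normal MLE on the
# central z-values (a simplified stand-in for the MLE fitting described above).
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)
# Mostly null z-values (slightly overdispersed) plus a small non-null fraction.
z = np.concatenate([rng.normal(0.0, 1.1, size=9500), rng.normal(3.0, 1.0, size=500)])

a, b = -1.5, 1.5                      # central window assumed to hold mostly null cases
z0 = z[(z >= a) & (z <= b)]

def neg_loglik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    # Truncated-normal log-likelihood of the central z-values.
    ll = stats.norm.logpdf(z0, mu, sigma).sum()
    ll -= len(z0) * np.log(stats.norm.cdf(b, mu, sigma) - stats.norm.cdf(a, mu, sigma))
    return -ll

res = optimize.minimize(neg_loglik, x0=[0.0, 0.0], method="Nelder-Mead")
delta0, sigma0 = res.x[0], np.exp(res.x[1])
print("empirical null: mean %.3f, sd %.3f" % (delta0, sigma0))  # vs theoretical N(0, 1)
```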

Several points should be kept in mind when using permutation methods. Permutation methods are not suitable for correlated sampling units, since the permutation process implies independence and relies on i.i.d. assumptions. Furthermore, the literature shows that the permutation distribution converges to N(0,1) quickly as n becomes large. In some cases, permutation techniques and empirical methods can be combined by using the permutation null in place of N(0,1) in the empirical algorithm. [8]
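
Below is a minimal sketch of a permutation null distribution (the helper permutation_null_t and the sample sizes are illustrative): group labels are shuffled, which is justified only when the sampling units are exchangeable under the null, and for moderate n the resulting distribution of the t statistic is already close to N(0,1), as noted above.

```python
# Permutation null distribution of a two-sample t statistic (sketch).
import numpy as np
from scipy import stats

def permutation_null_t(x, y, reps=5000, seed=0):
    rng = np.random.default_rng(seed)
    data = np.concatenate([x, y])
    n_x = len(x)
    t_null = np.empty(reps)
    for i in range(reps):
        perm = rng.permutation(data)                     # shuffle group labels
        t_null[i] = stats.ttest_ind(perm[:n_x], perm[n_x:]).statistic
    return t_null

rng = np.random.default_rng(1)
x, y = rng.normal(size=50), rng.normal(size=50)
t_null = permutation_null_t(x, y)
print("permutation 97.5% quantile:", np.quantile(t_null, 0.975))
print("N(0, 1)     97.5% quantile:", stats.norm.ppf(0.975))
```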

References

  1. Staley, Kent W. An Introduction to the Philosophy of Science. 2014. p. 142. ISBN 9780521112499.
  2. Jackson, Sally Ann. Random Factors in ANOVA. 1994. p. 38. ISBN 9780803950900.
  3. Dudoit, S., and M. J. van der Laan. Multiple Testing Procedures with Applications to Genomics. 2008.
  4. van der Laan, Mark J., and Alan E. Hubbard. "Quantile-function based null distribution in resampling based multiple testing." Statistical Applications in Genetics and Molecular Biology 5.1 (2006): 1199.
  5. Efron, Bradley, and Trevor Hastie. Computer Age Statistical Inference. Cambridge University Press, 2016.
  6. Efron, Bradley. Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction. Cambridge University Press, 2012.
  7. Efron, Bradley. "Large-scale simultaneous hypothesis testing: the choice of a null hypothesis." Journal of the American Statistical Association 99.465 (2004): 96–104.
  8. Efron, Bradley. Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction. Cambridge University Press, 2012.