In statistics, the **Lilliefors test** is a normality test based on the Kolmogorov–Smirnov test. It is used to test the null hypothesis that data come from a normally distributed population when the null hypothesis does not specify *which* normal distribution; i.e., it does not specify the expected value and variance of the distribution.[1] It is named after Hubert Lilliefors, professor of statistics at George Washington University.

A variant of the test can be used to test the null hypothesis that data come from an exponentially distributed population, when the null hypothesis does not specify which exponential distribution.[2]

The test proceeds as follows:[1]

- First estimate the population mean and population variance based on the data.
- Then find the maximum discrepancy between the empirical distribution function and the cumulative distribution function (CDF) of the normal distribution with the estimated mean and estimated variance. Just as in the Kolmogorov–Smirnov test, this will be the test statistic.
- Finally, assess whether the maximum discrepancy is large enough to be statistically significant, thus requiring rejection of the null hypothesis. This is where the test becomes more complicated than the Kolmogorov–Smirnov test. Because the hypothesised CDF has been moved closer to the data by estimating its parameters from those same data, the maximum discrepancy is smaller than it would have been if the null hypothesis had singled out just one normal distribution. Thus the "null distribution" of the test statistic, i.e. its probability distribution assuming the null hypothesis is true, is stochastically smaller than the Kolmogorov–Smirnov distribution. This is the **Lilliefors distribution**. To date, tables for this distribution have been computed only by Monte Carlo methods.
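The three steps above can be sketched numerically. The following is a minimal illustration, not the published tables: it estimates the parameters, computes the KS-type statistic against the fitted normal CDF, and approximates the Lilliefors null distribution by Monte Carlo simulation (the function names are illustrative; `statsmodels.stats.diagnostic.lilliefors` offers a ready-made implementation):

```python
import numpy as np
from scipy.stats import norm

def lilliefors_statistic(x):
    """KS-type statistic against a normal CDF with parameters estimated from x."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    mu, sigma = x.mean(), x.std(ddof=1)      # step 1: estimate mean and variance
    cdf = norm.cdf(x, loc=mu, scale=sigma)
    # step 2: maximum discrepancy between the empirical CDF and the fitted CDF
    d_plus = np.arange(1, n + 1) / n - cdf
    d_minus = cdf - np.arange(0, n) / n
    return max(d_plus.max(), d_minus.max())

def lilliefors_pvalue(x, n_sim=1000, seed=0):
    """Step 3: Monte Carlo approximation of the Lilliefors null distribution.

    The statistic is location-scale invariant, so simulating standard normal
    samples of the same size suffices under the null hypothesis.
    """
    rng = np.random.default_rng(seed)
    d_obs = lilliefors_statistic(x)
    sims = np.array([lilliefors_statistic(rng.standard_normal(len(x)))
                     for _ in range(n_sim)])
    return (np.sum(sims >= d_obs) + 1) / (n_sim + 1)
```

For clearly non-normal data (e.g. exponential), the Monte Carlo p-value comes out small; the parameter-estimation adjustment is what distinguishes this from a plain Kolmogorov–Smirnov test against a fixed normal CDF.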

In 1986 a corrected table of critical values for the test was published.[3]

In statistics, the **Kolmogorov–Smirnov test** is a nonparametric test of the equality of continuous, one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution, or to compare two samples. It is named after Andrey Kolmogorov and Nikolai Smirnov.
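Both usages are available in SciPy; a minimal sketch with arbitrary sample data:

```python
import numpy as np
from scipy.stats import kstest, ks_2samp

rng = np.random.default_rng(0)
sample_a = rng.standard_normal(200)
sample_b = rng.standard_normal(200) + 1.0   # same shape, shifted mean

# One-sample form: compare a sample with a fully specified reference CDF.
one_sample = kstest(sample_a, "norm")       # reference is N(0, 1)

# Two-sample form: compare the empirical CDFs of two samples.
two_sample = ks_2samp(sample_a, sample_b)
```

Note that the one-sample form requires the reference distribution to be fully specified in advance; estimating its parameters from the same data is exactly the situation the Lilliefors test addresses.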

**Statistical inference** is the process of using data analysis to infer properties of an underlying distribution of probability. Inferential statistical analysis infers properties of a population, for example by **testing hypotheses** and deriving estimates. It is assumed that the observed data set is sampled from a larger population.

**Nonparametric statistics** is the branch of statistics that is not based solely on parametrized families of probability distributions. Nonparametric statistics is based on either being distribution-free or having a specified distribution but with the distribution's parameters unspecified. Nonparametric statistics includes both descriptive statistics and statistical inference. Nonparametric tests are often used when the assumptions of parametric tests are violated.

An **F-test** is any statistical test in which the test statistic has an *F*-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled.

A **Z-test** is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. Z-tests test the mean of a distribution. For each significance level, the Z-test has a single critical value (for example, 1.96 for 5% two-tailed), which makes it more convenient than the Student's t-test, whose critical values depend on the sample size.

The **t-test** is any statistical hypothesis test in which the test statistic follows a Student's *t*-distribution under the null hypothesis. It is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the statistic were known, and that scaling term is instead estimated from the data.

In null hypothesis significance testing, the ** p-value** is the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct. A very small *p*-value means that such an extreme observed outcome would be very unlikely under the null hypothesis.

In statistics, a vector of random variables is **heteroscedastic** if the variability of the random disturbance is different across elements of the vector. Here, variability could be quantified by the variance or any other measure of statistical dispersion. Thus heteroscedasticity is the absence of homoscedasticity. A typical example is the set of observations of income in different cities.

In statistical significance testing, a **one-tailed test** and a **two-tailed test** are alternative ways of computing the statistical significance of a parameter inferred from a data set, in terms of a test statistic. A two-tailed test is appropriate if the estimated value may be greater or less than the reference value, for example, whether a test taker may score above or below a specific range of scores. This method is used in null hypothesis testing: if the estimated value falls in either critical area, the alternative hypothesis is accepted over the null hypothesis. A one-tailed test is appropriate if the estimated value may depart from the reference value in only one direction, left or right, but not both; an example is whether a machine produces more than one-percent defective products. In this situation, if the estimated value falls in the one-sided critical area corresponding to the direction of interest, the alternative hypothesis is accepted over the null hypothesis. Alternative names are **one-sided** and **two-sided** tests; the terminology "tail" is used because the extreme portions of distributions, where observations lead to rejection of the null hypothesis, are small and often "tail off" toward zero, as in the normal "bell curve" distribution.

A **test statistic** is a statistic used in statistical hypothesis testing. A hypothesis test is typically specified in terms of a test statistic, considered as a numerical summary of a data-set that reduces the data to one value that can be used to perform the hypothesis test. In general, a test statistic is selected or defined in such a way as to quantify, within observed data, behaviours that would distinguish the null from the alternative hypothesis, where such an alternative is prescribed, or that would characterize the null hypothesis if there is no explicitly stated alternative hypothesis.

The **goodness of fit** of a statistical model describes how well it fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under the model in question. Such measures can be used in statistical hypothesis testing, e.g. to test for normality of residuals, to test whether two samples are drawn from identical distributions, or whether outcome frequencies follow a specified distribution. In the analysis of variance, one of the components into which the variance is partitioned may be a lack-of-fit sum of squares.

The **Shapiro–Wilk test** is a test of normality in frequentist statistics. It was published in 1965 by Samuel Sanford Shapiro and Martin Wilk.
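A minimal SciPy illustration, with data chosen arbitrarily to be strongly skewed:

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(0)
skewed = rng.exponential(size=100)   # clearly non-normal data
w_stat, p_value = shapiro(skewed)
# w_stat lies in (0, 1]; values well below 1 indicate departure from normality
```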

The **Anderson–Darling test** is a statistical test of whether a given sample of data is drawn from a given probability distribution. In its basic form, the test assumes that there are no parameters to be estimated in the distribution being tested, in which case the test and its set of critical values is distribution-free. However, the test is most often used in contexts where a family of distributions is being tested, in which case the parameters of that family need to be estimated and account must be taken of this in adjusting either the test-statistic or its critical values. When applied to testing whether a normal distribution adequately describes a set of data, it is one of the most powerful statistical tools for detecting most departures from normality. **K-sample Anderson–Darling tests** are available for testing whether several collections of observations can be modelled as coming from a single population, where the distribution function does not have to be specified.
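SciPy's implementation of the normal-family case returns critical values rather than a p-value, reflecting the parameter-estimation adjustment described above; a sketch with arbitrary non-normal data:

```python
import numpy as np
from scipy.stats import anderson

rng = np.random.default_rng(0)
result = anderson(rng.exponential(size=200), dist="norm")

# The statistic is compared against critical values at the significance
# levels (in percent) listed in result.significance_level.
rejected_at_strictest = result.statistic > result.critical_values[-1]
```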

The **Wald–Wolfowitz runs test**, named after statisticians Abraham Wald and Jacob Wolfowitz, is a non-parametric statistical test that checks a randomness hypothesis for a two-valued data sequence. More precisely, it can be used to test the hypothesis that the elements of the sequence are mutually independent.

In statistics, **resampling** is any of a variety of methods for doing one of the following:

- Estimating the precision of sample statistics by using subsets of available data (**jackknifing**) or drawing randomly with replacement from a set of data points (**bootstrapping**)
- Exchanging labels on data points when performing significance tests
- Validating models by using random subsets
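As a sketch of the first of these, a bootstrap estimate of a statistic's standard error (the function name and defaults are illustrative):

```python
import numpy as np

def bootstrap_se(data, stat=np.mean, n_boot=2000, seed=0):
    """Estimate the standard error of `stat` by resampling with replacement."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    replicates = [stat(rng.choice(data, size=len(data), replace=True))
                  for _ in range(n_boot)]
    return np.std(replicates, ddof=1)
```

For the sample mean, the bootstrap estimate should come close to the analytic formula s/√n, which provides a quick sanity check.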

In statistics, **normality tests** are used to determine if a data set is well-modeled by a normal distribution and to compute how likely it is for a random variable underlying the data set to be normally distributed.

**Minimum-distance estimation** (**MDE**) is a conceptual method for fitting a statistical model to data, usually the empirical distribution. Often-used estimators such as ordinary least squares can be thought of as special cases of minimum-distance estimation.

In statistics, one purpose for the analysis of variance (ANOVA) is to analyze differences in means between groups. The test statistic, *F*, assumes independence of observations, homogeneous variances, and population normality. **ANOVA on ranks** is a procedure designed for situations in which the normality assumption has been violated.
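One widely used rank-based alternative to one-way ANOVA is the Kruskal–Wallis test, often described as one-way ANOVA on ranks; a SciPy sketch with arbitrary skewed groups:

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(0)
group_a = rng.exponential(size=40)
group_b = rng.exponential(size=40) + 1.0   # shifted group
group_c = rng.exponential(size=40)

h_stat, p_value = kruskal(group_a, group_b, group_c)
```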

The **Shapiro–Francia test** is a statistical test for the normality of a population, based on sample data. It was introduced by S. S. Shapiro and R. S. Francia in 1972 as a simplification of the Shapiro–Wilk test.

1. Lilliefors, Hubert W. (1967). "On the Kolmogorov–Smirnov Test for Normality with Mean and Variance Unknown". *Journal of the American Statistical Association*. **62** (318): 399–402. doi:10.1080/01621459.1967.10482916.
2. Lilliefors, Hubert W. (1969). "On the Kolmogorov–Smirnov Test for the Exponential Distribution with Mean Unknown". *Journal of the American Statistical Association*. **64** (325): 387–389. doi:10.1080/01621459.1969.10500983.
3. Dallal, Gerard E.; Wilkinson, Leland (1986). "An Analytic Approximation to the Distribution of Lilliefors's Test Statistic for Normality". *The American Statistician*. **40** (4): 294–296. doi:10.1080/00031305.1986.10475419.

- Conover, W. J. (1999). *Practical Nonparametric Statistics* (3rd ed.). New York: Wiley.

This page is based on this Wikipedia article

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.
