This is a list of important publications in statistics, organized by field.
Some reasons why a particular publication might be regarded as important:
Topic creator – a publication that created a new topic.
Breakthrough – a publication that changed scientific knowledge significantly.
Influence – a publication that has significantly influenced the world or has had a massive impact on the teaching of statistics.
Mathematical statistics:
Mathematical Methods of Statistics
Statistical Decision Functions
Testing Statistical Hypotheses
Probability and Bayesian statistics:
An Essay Towards Solving a Problem in the Doctrine of Chances
On Small Differences in Sensation
Truth and Probability
Bayesian Inference in Statistical Analysis
Theory of Probability
Introduction to Statistical Decision Theory
Multivariate analysis:
An Introduction to Multivariate Statistical Analysis
Time series analysis:
Time Series Analysis: Forecasting and Control
Applied statistics:
Statistical Methods for Research Workers
Statistical Methods
Principles and Procedures of Statistics with Special Reference to the Biological Sciences
Biometry: The Principles and Practice of Statistics in Biological Research
Statistical learning theory:
On the uniform convergence of relative frequencies of events to their probabilities
Likelihood and estimation:
On the mathematical foundations of theoretical statistics
Variance components:
Estimation of variance and covariance components
Maximum-likelihood estimation for the mixed analysis of variance model
Recovery of inter-block information when block sizes are unequal
Estimation of Variance and Covariance Components in Linear Models
Survival analysis:
Nonparametric estimation from incomplete observations
A generalized Wilcoxon test for comparing arbitrarily singly-censored samples
Evaluation of survival data and two new rank order statistics arising in its consideration
Regression Models and Life-Tables
The Statistical Analysis of Failure Time Data
Meta-analysis:
Report on Certain Enteric Fever Inoculation Statistics
The Probability Integral Transformation for Testing Goodness of Fit and Combining Independent Tests of Significance
Combining Independent Tests of Significance
The combination of estimates from different experiments
Design of experiments:
On Small Differences in Sensation
The Design and Analysis of Experiments
On the Experimental Attainment of Optimum Conditions (with discussion)
Related topics:
Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures used to analyze the differences among means. ANOVA was developed by the statistician Ronald Fisher. It is based on the law of total variance, under which the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal, thereby generalizing the t-test beyond two means.
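To make the variance partition concrete, the following minimal sketch computes a one-way ANOVA by hand on invented data and checks the F statistic against scipy.stats.f_oneway; all group values are hypothetical and NumPy and SciPy are assumed to be available.

```python
# One-way ANOVA sketch: partition the total sum of squares into
# between-group and within-group components and form the F statistic.
import numpy as np
from scipy import stats

groups = [np.array([4.1, 5.0, 4.8, 5.3]),   # hypothetical sample data
          np.array([5.9, 6.3, 5.7, 6.5]),
          np.array([4.9, 5.2, 5.6, 5.0])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()

# Between-group variation: group means around the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group variation: observations around their own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = len(groups) - 1
df_within = len(all_obs) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f_stat, stats.f_oneway(*groups).statistic)  # the two F values agree
```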
Bayesian probability is an interpretation of the concept of probability, in which, instead of frequency or propensity of some phenomenon, probability is interpreted as reasonable expectation representing a state of knowledge or as quantification of a personal belief.
The design of experiments, also known as experiment design or experimental design, is the design of any task that aims to describe and explain the variation of information under conditions that are hypothesized to reflect the variation. The term is generally associated with experiments in which the design introduces conditions that directly affect the variation, but may also refer to the design of quasi-experiments, in which natural conditions that influence the variation are selected for observation.
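As a minimal illustration of how a design deliberately introduces the conditions under study, the sketch below randomly assigns hypothetical experimental units to two treatments; the unit labels and treatment names are invented for the example.

```python
# Randomized assignment sketch: randomization is the core device by which
# a designed experiment introduces the conditions whose effect is studied.
import random

units = [f"plot_{i}" for i in range(8)]     # hypothetical experimental units
treatments = ["control", "fertilizer"] * 4  # balanced: 4 units per condition

random.seed(42)             # fixed seed so the illustration is reproducible
random.shuffle(treatments)
assignment = dict(zip(units, treatments))
print(assignment)
```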
Frequentist probability or frequentism is an interpretation of probability; it defines an event's probability as the limit of its relative frequency in infinitely many trials. Probabilities can thus be found, at least in principle, by a repeatable objective process. The continued use of frequentist methods in scientific inference, however, has been called into question.
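The limiting-relative-frequency idea can be illustrated by simulation. In this sketch, the proportion of simulated fair-coin "heads" settles toward the probability 0.5 as the number of trials grows; the seed and trial counts are arbitrary choices.

```python
# Relative frequency of heads in n simulated fair-coin flips: as n
# grows, the frequency approaches the probability 0.5.
import random

random.seed(0)  # fixed seed so the illustration is reproducible
for n in (10, 100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(n, heads / n)
```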
Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments.
Statistical inference is the process of using data analysis to infer properties of an underlying distribution of probability. Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates. It is assumed that the observed data set is sampled from a larger population.
The theory of statistics provides a basis for the whole range of techniques, in both study design and data analysis, that are used within applications of statistics. The theory covers approaches to statistical-decision problems and to statistical inference, and the actions and deductions that satisfy the basic principles stated for these different approaches. Within a given approach, statistical theory gives ways of comparing statistical procedures; it can find the best possible procedure within a given context for given statistical problems, or can provide guidance on the choice between alternative procedures.
The outline of statistics provides an overview of and topical guide to statistics.
A statistical hypothesis test is a method of statistical inference used to decide whether the data sufficiently supports a particular hypothesis. A statistical hypothesis test typically involves a calculation of a test statistic. Then a decision is made, either by comparing the test statistic to a critical value or equivalently by evaluating a p-value computed from the test statistic. Roughly 100 specialized statistical tests have been defined.
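A minimal sketch of the procedure, using hypothetical data and SciPy's one-sample t-test: a test statistic and p-value are computed, and the p-value is compared with a conventional significance level.

```python
# Hypothesis-test sketch: one-sample t-test of the null hypothesis
# that the population mean is 5.0, applied to made-up data.
from scipy import stats

sample = [5.2, 4.9, 5.6, 5.1, 5.4, 5.0, 5.3]   # hypothetical observations
result = stats.ttest_1samp(sample, popmean=5.0)

alpha = 0.05  # conventional significance level
print(result.statistic, result.pvalue)
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```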
In statistics, point estimation involves the use of sample data to calculate a single value which is to serve as a "best guess" or "best estimate" of an unknown population parameter. More formally, it is the application of a point estimator to the data to obtain a point estimate.
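For example, in the sketch below (data invented for illustration) the sample mean and sample variance serve as point estimates of the corresponding unknown population parameters; only the Python standard library is used.

```python
# Point-estimation sketch: the sample mean and the unbiased sample
# variance are point estimators of the population mean and variance.
import statistics

sample = [2.3, 1.9, 2.8, 2.5, 2.1, 2.6]  # hypothetical observations
print(statistics.mean(sample))      # point estimate of the population mean
print(statistics.variance(sample))  # point estimate (n - 1 denominator)
```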
Informally, in frequentist statistics, a confidence interval (CI) is an interval which is expected to typically contain the parameter being estimated. More specifically, given a confidence level γ (95% and 99% are typical values), a CI is a random interval which contains the parameter being estimated γ% of the time. The confidence level, degree of confidence, or confidence coefficient represents the long-run proportion of CIs that theoretically contain the true value of the parameter; this is tantamount to the nominal coverage probability. For example, out of all intervals computed at the 95% level, 95% of them should contain the parameter's true value.
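The coverage interpretation can be checked by simulation. The sketch below, with an assumed normal population and invented settings, computes a 95% t-interval for the mean in each of many replications and reports the fraction of intervals that contain the true mean.

```python
# Coverage sketch: repeatedly sample from a normal population with known
# true mean 0, build a 95% t-interval from each sample, and count how
# often the interval contains the true mean.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_mean, n, reps, hits = 0.0, 20, 10_000, 0
t_crit = stats.t.ppf(0.975, df=n - 1)  # two-sided 95% critical value

for _ in range(reps):
    x = rng.normal(true_mean, 1.0, size=n)
    half = t_crit * x.std(ddof=1) / np.sqrt(n)
    if x.mean() - half <= true_mean <= x.mean() + half:
        hits += 1

print(hits / reps)  # close to the nominal 0.95 coverage
```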
Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability, where probability expresses a degree of belief in an event. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other interpretations of probability, such as the frequentist interpretation, which views probability as the limit of the relative frequency of an event after many trials. More concretely, analysis in Bayesian methods codifies prior knowledge in the form of a prior distribution.
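A standard concrete instance of codifying prior knowledge is the Beta-Binomial conjugate update; in the sketch below the prior parameters and the coin-flip counts are hypothetical, and SciPy is assumed to be available.

```python
# Conjugate-update sketch: a Beta(2, 2) prior belief about a coin's
# heads probability, updated by 7 heads and 3 tails, yields a
# Beta(9, 5) posterior.
from scipy import stats

a_prior, b_prior = 2, 2   # hypothetical Beta prior parameters
heads, tails = 7, 3       # hypothetical observed data
a_post, b_post = a_prior + heads, b_prior + tails

posterior = stats.beta(a_post, b_post)
print(posterior.mean())          # posterior mean 9/14 ≈ 0.643
print(posterior.interval(0.95))  # central 95% credible interval
```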
Mathematical statistics is the application of probability theory, a branch of mathematics, to statistics, as opposed to techniques for collecting statistical data. Specific mathematical techniques which are used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure theory.
In the design of experiments, optimal experimental designs are a class of experimental designs that are optimal with respect to some statistical criterion. The creation of this field of statistics has been credited to Danish statistician Kirstine Smith.
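As a small illustration of one such criterion (not Smith's original derivation), the sketch below compares two hypothetical four-run designs for a straight-line model using the D-optimality criterion det(XᵀX); placing the runs at the endpoints of the design interval scores higher.

```python
# D-optimality sketch: for the model y = b0 + b1*x on [-1, 1] with four
# runs, compare det(X'X) for an endpoint design and an evenly spread one.
import numpy as np

def d_criterion(xs):
    X = np.column_stack([np.ones(len(xs)), xs])  # model matrix for b0 + b1*x
    return np.linalg.det(X.T @ X)

endpoints = [-1.0, -1.0, 1.0, 1.0]    # all runs at the interval ends
spread    = [-1.0, -1/3, 1/3, 1.0]    # runs spread evenly
print(d_criterion(endpoints), d_criterion(spread))  # 16.0 vs ≈ 8.9
```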
This glossary of statistics and probability is a list of definitions of terms and concepts used in the mathematical sciences of statistics and probability, their sub-disciplines, and related fields. For additional related terms, see Glossary of mathematics and Glossary of experimental design.
Statistics, in the modern sense of the word, began evolving in the 18th century in response to the novel needs of industrializing sovereign states.
The foundations of statistics are the mathematical and philosophical bases for statistical methods. These bases are the theoretical frameworks that ground and justify methods of statistical inference, estimation, hypothesis testing, uncertainty quantification, and the interpretation of statistical conclusions. Further, a foundation can be used to explain statistical paradoxes, provide descriptions of statistical laws, and guide the application of statistics to real-world problems.
Oscar Kempthorne was a British statistician and geneticist known for his research on randomization-analysis and the design of experiments, which had wide influence on research in agriculture, genetics, and other areas of science.