George Albert Milliken is emeritus professor of statistics at Kansas State University. He is a Fellow of the American Statistical Association [1] and has published many papers in statistical journals. Milliken is a co-author of the three-volume Analysis of Messy Data series (Volume 1: Designed Experiments; Volume 2: Nonreplicated Experiments; Volume 3: Analysis of Covariance) and a co-author of the book SAS System for Mixed Models.
Milliken's books are widely referenced in the statistical research community. [2] His professional research has emphasized the following areas:
Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures used to analyze the differences among means. ANOVA was developed by the statistician Ronald Fisher. ANOVA is based on the law of total variance, where the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal, and therefore generalizes the t-test beyond two means.
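The partition of variation described above can be sketched for the one-way case as follows. This is a minimal illustration using only the Python standard library; the function name `one_way_anova` and the sample data are assumptions made for the example, not taken from any of the works cited here.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ss_between, ss_within, ss_total, f_stat) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    # Total variation; equals ss_between + ss_within.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, ss_total, f_stat

# Hypothetical data: three treatment groups with three observations each.
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
ss_b, ss_w, ss_t, f = one_way_anova(groups)
print(ss_b, ss_w, ss_t, f)  # 24.0 6.0 30.0 12.0
```

The F statistic would then be compared with an F distribution on 2 and 6 degrees of freedom; that lookup is omitted here to keep the sketch free of third-party dependencies.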
Biostatistics is the development and application of statistical methods to a wide range of topics in biology. It encompasses the design of biological experiments, the collection and analysis of data from those experiments, and the interpretation of the results.
The design of experiments is the design of any task that aims to describe and explain the variation of information under conditions that are hypothesized to reflect the variation. The term is generally associated with experiments in which the design introduces conditions that directly affect the variation, but may also refer to the design of quasi-experiments, in which natural conditions that influence the variation are selected for observation.
The field of system identification uses statistical methods to build mathematical models of dynamical systems from measured data. System identification also includes the optimal design of experiments for efficiently generating informative data for fitting such models as well as model reduction. A common approach is to start from measurements of the behavior of the system and the external influences and try to determine a mathematical relation between them without going into many details of what is actually happening inside the system; this approach is called black box system identification.
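As a rough illustration of the black box approach, the sketch below fits a first-order model y[t] = a*y[t-1] + b*u[t-1] to input and output measurements by ordinary least squares, without using any knowledge of the system's internals. The model structure, the function name `fit_first_order`, and the simulated data are assumptions chosen for the example; real applications typically involve noise, model-order selection, and validation on held-out data.

```python
def fit_first_order(u, y):
    """Least-squares estimate of (a, b) in y[t] = a*y[t-1] + b*u[t-1]."""
    s_yy = s_yu = s_uu = s_yr = s_ur = 0.0
    for t in range(1, len(y)):
        prev_y, prev_u, target = y[t - 1], u[t - 1], y[t]
        s_yy += prev_y * prev_y
        s_yu += prev_y * prev_u
        s_uu += prev_u * prev_u
        s_yr += prev_y * target
        s_ur += prev_u * target
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s_yy * s_uu - s_yu * s_yu
    a = (s_yr * s_uu - s_ur * s_yu) / det
    b = (s_ur * s_yy - s_yr * s_yu) / det
    return a, b

# Simulate a hypothetical noise-free system with a = 0.5 and b = 2.0.
u = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0]
y = [0.0]
for t in range(1, len(u)):
    y.append(0.5 * y[t - 1] + 2.0 * u[t - 1])

a_hat, b_hat = fit_first_order(u, y)
```

Because the simulated data contain no noise, the estimates recover the generating coefficients up to floating-point rounding.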
Sensitivity analysis is the study of how the uncertainty in the output of a mathematical model or system can be divided and allocated to different sources of uncertainty in its inputs. A related practice is uncertainty analysis, which has a greater focus on uncertainty quantification and propagation of uncertainty; ideally, uncertainty and sensitivity analysis should be run in tandem.
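A small numerical sketch may help make the idea of allocating output uncertainty to inputs concrete. It assumes a hypothetical linear model Y = 3*X1 + X2 with independent inputs drawn uniformly from [0, 1]; in that special case the share of the output variance attributable to each input equals the squared correlation between that input and the output. This shortcut does not hold for nonlinear models or correlated inputs, where dedicated variance-based estimators are needed.

```python
import random

def squared_correlation(xs, ys):
    """Squared Pearson correlation between two equal-length samples."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
x1 = [rng.random() for _ in range(n)]
x2 = [rng.random() for _ in range(n)]

# Hypothetical model: Var(Y) = 9*Var(X1) + 1*Var(X2) for independent inputs.
y = [3.0 * a + 1.0 * b for a, b in zip(x1, x2)]

share_x1 = squared_correlation(x1, y)  # theoretical value 9/10
share_x2 = squared_correlation(x2, y)  # theoretical value 1/10
```

The estimated shares should land close to 0.9 and 0.1, indicating that most of the uncertainty in the output comes from the first input.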
In biology and other experimental sciences, an in silico experiment is one performed on a computer or via computer simulation. The phrase is pseudo-Latin for 'in silicon', referring to the silicon in computer chips. It was coined in 1987 as an allusion to the Latin phrases in vivo, in vitro, and in situ, which are commonly used in biology. The latter phrases refer, respectively, to experiments done in living organisms, outside living organisms, and where they are found in nature.
In the design of experiments, optimal designs are a class of experimental designs that are optimal with respect to some statistical criterion. The creation of this field of statistics has been credited to Danish statistician Kirstine Smith.
Genstat is a statistical software package with data analysis capabilities, particularly in the field of agriculture.
In medicine, a crossover study or crossover trial is a longitudinal study in which subjects receive a sequence of different treatments. While crossover studies can be observational studies, many important crossover studies are controlled experiments, which are discussed in this article. Crossover designs are common for experiments in many scientific disciplines, for example psychology, pharmaceutical science, and medicine.
Multilevel models are statistical models of parameters that vary at more than one level. An example could be a model of student performance that contains measures for individual students as well as measures for classrooms within which the students are grouped. These models can be seen as generalizations of linear models, although they can also extend to non-linear models. These models became much more popular after sufficient computing power and software became available.
A mixed model, mixed-effects model or mixed error-component model is a statistical model containing both fixed effects and random effects. These models are useful in a wide variety of disciplines in the physical, biological and social sciences. They are particularly useful in settings where repeated measurements are made on the same statistical units, or where measurements are made on clusters of related statistical units. Because of their advantage in dealing with missing values, mixed effects models are often preferred over more traditional approaches such as repeated measures analysis of variance.
In science, randomized experiments are the experiments that allow the greatest reliability and validity of statistical estimates of treatment effects. Randomization-based inference is especially important in experimental design and in survey sampling.
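Randomization-based inference can be illustrated with an exact permutation test: under the null hypothesis of no treatment effect, every assignment of units to treatment is equally likely, so the observed difference in means is compared with the differences produced by all possible reassignments. The sketch below is a minimal version with made-up data; the function name `permutation_p_value` is an assumption for the example.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(treated, control):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = treated + control
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(control))

    extreme = 0
    total = 0
    # Enumerate every way of choosing which units receive the treatment.
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        t = [pooled[i] for i in chosen]
        c = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point ties.
        if abs(mean(t) - mean(c)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

# Hypothetical responses from six units, three per arm.
treated = [12.0, 14.0, 15.0]
control = [8.0, 9.0, 10.0]
p_value = permutation_p_value(treated, control)
print(p_value)  # 0.1
```

Only 2 of the 20 possible assignments give a difference at least as large in absolute value as the one observed, hence the p-value of 0.1. Full enumeration is feasible only for small samples; larger experiments usually sample random reassignments instead.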
Repeated measures design is a research design that involves multiple measures of the same variable taken on the same or matched subjects either under different conditions or over two or more time periods. For instance, repeated measurements are collected in a longitudinal study in which change over time is assessed.
Oscar Kempthorne was a British statistician and geneticist known for his research on randomization-analysis and the design of experiments, which had wide influence on research in agriculture, genetics, and other areas of science.
In randomized statistical experiments, generalized randomized block designs (GRBDs) are used to study the interaction between blocks and treatments. For a GRBD, each treatment is replicated at least two times in each block; this replication allows the estimation and testing of an interaction term in the linear model.
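The role of within-block replication can be sketched numerically. With at least two observations per block-treatment cell, the variation among replicates gives an error term against which the block-by-treatment interaction can be tested. The example below assumes a balanced layout and uses invented data; the function name `interaction_f_statistic` is hypothetical.

```python
from statistics import mean

def interaction_f_statistic(cells):
    """F statistic for block-by-treatment interaction in a balanced layout.

    `cells` maps (block, treatment) to a list of replicate observations.
    """
    blocks = sorted({b for b, _ in cells})
    treatments = sorted({t for _, t in cells})
    n_rep = len(next(iter(cells.values())))

    grand = mean(x for obs in cells.values() for x in obs)
    block_mean = {b: mean(x for t in treatments for x in cells[(b, t)])
                  for b in blocks}
    treat_mean = {t: mean(x for b in blocks for x in cells[(b, t)])
                  for t in treatments}

    # Interaction: what remains of each cell mean after removing the
    # additive block and treatment effects.
    ss_inter = n_rep * sum(
        (mean(cells[(b, t)]) - block_mean[b] - treat_mean[t] + grand) ** 2
        for b in blocks for t in treatments
    )
    # Pure error: variation among replicates within each cell.
    ss_error = sum((x - mean(obs)) ** 2 for obs in cells.values() for x in obs)

    df_inter = (len(blocks) - 1) * (len(treatments) - 1)
    df_error = len(blocks) * len(treatments) * (n_rep - 1)
    return (ss_inter / df_inter) / (ss_error / df_error)

# Hypothetical data: 2 blocks, 2 treatments, 2 replicates per cell.
cells = {
    ("B1", "A"): [10.0, 12.0], ("B1", "B"): [14.0, 16.0],
    ("B2", "A"): [11.0, 13.0], ("B2", "B"): [19.0, 21.0],
}
f_inter = interaction_f_statistic(cells)
print(f_inter)  # 4.0
```

With a single observation per cell the error degrees of freedom would be zero and this ratio could not be formed, which is why the replication requirement matters.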
Jianqing Fan is a Chinese statistician, financial econometrician, and writer. He is currently the Frederick L. Moore '18 Professor of Finance and a Professor of Statistics at Princeton University, where he was Chairman of the Department of Operations Research and Financial Engineering from 2012 to 2015.
Jan de Leeuw is a Dutch statistician and psychometrician. He is distinguished professor emeritus of statistics and founding chair of the Department of Statistics, University of California, Los Angeles. In addition, he is the founding editor and former editor-in-chief of the Journal of Statistical Software, as well as the former editor-in-chief of the Journal of Multivariate Analysis and the Journal of Educational and Behavioral Statistics.
Causal inference is the process of determining the independent, actual effect of a particular phenomenon that is a component of a larger system. The main difference between causal inference and inference of association is that causal inference analyzes the response of an effect variable when a cause of the effect variable is changed. The science of why things occur is called etiology. Causal inference is said to provide the evidence of causality theorized by causal reasoning.
Ajit C. Tamhane is a Professor in the Department of Industrial Engineering and Management Sciences (IEMS) at Northwestern University and also holds a courtesy appointment in the Department of Statistics.