MaryAnn H. Hill is a retired American statistical software developer who contributed to statistics packages including BMDP, SYSTAT, and SPSS. She also published fundamental research on robust statistics, [1] as well as contributing statistical analyses to several medical research publications.
In the 1970s, Hill worked for the University of California, Los Angeles (UCLA) as a developer of the BMDP (Biomedical Data Processing) package, including developing robust regression and ridge regression methods for BMDP. [2] She was later also credited with writing most of the documentation of the BMDP system. [3]
In the 1980s, she was listed as a senior statistician in the UCLA biomathematics program, [4] also affiliated with the VA Medical Center in Los Angeles, [1] and later in the UCLA Department of Psychiatry and as an employee of BMDP Statistical Software, Inc. [5] By the early 1990s, she was working in the department of statistics at the University of Michigan and at Systat Software Inc., [6] working on the SYSTAT statistics package. SYSTAT was sold in 1995 to SPSS, [7] and in 1997 she authored a manual on missing data for SPSS, Inc. [8] She was also listed as a senior statistician at NORC at the University of Chicago in 1997. [9]
Hill was named as a Fellow of the American Statistical Association in 1995. [10]
Hill is the mother of biology professor Karlyn Mueller-Hill, who writes in her 1999 doctoral thesis that Hill's "idea of child day care was to bring [Karlyn] to her graduate classes in statistics". [11]
A data set is a collection of data. In the case of tabular data, a data set corresponds to one or more database tables, where every column of a table represents a particular variable, and each row corresponds to a given record of the data set in question. The data set lists values for each of the variables, such as for example height and weight of an object, for each member of the data set. Data sets can also consist of a collection of documents or files.
Multivariate statistics is a subdivision of statistics encompassing the simultaneous observation and analysis of more than one outcome variable, i.e., multivariate random variables. Multivariate statistics concerns understanding the different aims and background of each of the different forms of multivariate analysis, and how they relate to each other. The practical application of multivariate statistics to a particular problem may involve several types of univariate and multivariate analyses in order to understand the relationships between variables and their relevance to the problem being studied.
Psychological statistics is application of formulas, theorems, numbers and laws to psychology. Statistical methods for psychology include development and application statistical theory and methods for modeling psychological data. These methods include psychometrics, factor analysis, experimental designs, and Bayesian statistics. The article also discusses journals in the same field.
SPSS Statistics is a statistical software suite developed by IBM for data management, advanced analytics, multivariate analysis, business intelligence, and criminal investigation. Long produced by SPSS Inc., it was acquired by IBM in 2009. Versions of the software released since 2015 have the brand name IBM SPSS Statistics.
SAS is a statistical software suite developed by SAS Institute for data management, advanced analytics, multivariate analysis, business intelligence, criminal investigation, and predictive analytics. SAS' analytical software is built upon artificial intelligence and utilizes machine learning, deep learning and generative AI to manage and model data. The software is widely used in industries such as finance, insurance, health care and education.
SUDAAN is a proprietary statistical software package for the analysis of correlated data, including correlated data encountered in complex sample surveys. SUDAAN originated in 1972 at RTI International. Individual commercial licenses are sold for $1,460 a year, or $3,450 permanently.
Variable rules analysis is a set of statistical analysis methods in linguistics that are commonly used in sociolinguistics and historical linguistics to describe patterns of variation between alternative forms in language use. It is also sometimes known as Varbrul analysis, after the name of a software package dedicated to carrying out the relevant statistical computations. The method goes back to a theoretical approach developed by the sociolinguist William Labov in the late 1960s and early 1970s, and its mathematical implementation was developed by Henrietta Cedergren and David Sankoff in 1974.
Anscombe's quartet comprises four data sets that have nearly identical simple descriptive statistics, yet have very different distributions and appear very different when graphed. Each dataset consists of eleven (x, y) points. They were constructed in 1973 by the statistician Francis Anscombe to demonstrate both the importance of graphing data when analyzing it, and the effect of outliers and other influential observations on statistical properties. He described the article as being intended to counter the impression among statisticians that "numerical calculations are exact, but graphs are rough".
Computational statistics, or statistical computing, is the bond between statistics and computer science, and refers to the statistical methods that are enabled by using computational methods. It is the area of computational science specific to the mathematical science of statistics. This area is also developing rapidly, leading to calls that a broader concept of computing should be taught as part of general statistical education.
PSPP is a free software application for analysis of sampled data, intended as a free alternative for IBM SPSS Statistics. It has a graphical user interface and conventional command-line interface. It is written in C and uses GNU Scientific Library for its mathematical routines. The name has "no official acronymic expansion".
In statistics, a generalized estimating equation (GEE) is used to estimate the parameters of a generalized linear model with a possible unmeasured correlation between observations from different timepoints. Although some believe that Generalized estimating equations are robust in everything even with the wrong choice of working-correlation matrix, Generalized estimating equations are only robust to loss of consistency with the wrong choice.
Leland Wilkinson was an American statistician and computer scientist at H2O.ai and Adjunct Professor of Computer Science at University of Illinois at Chicago. Wilkinson developed the SYSTAT statistical package in the early 1980s, sold it to SPSS in 1995, and worked at SPSS for 10 years recruiting and managing the visualization team. He left SPSS in 2008 and became Executive VP of SYSTAT Software Inc. in Chicago. He then served as the VP of Data Visualization at Skytree, Inc and VP of Statistics at Tableau Software before joining H2O.ai. His research focused on scientific visualization and statistical graphics. In these communities he was well known for his book The Grammar of Graphics, which was the foundation for the R package ggplot2.
Peter J. Rousseeuw is a statistician known for his work on robust statistics and cluster analysis. He obtained his PhD in 1981 at the Vrije Universiteit Brussel, following research carried out at the ETH in Zurich, which led to a book on influence functions. Later he was professor at the Delft University of Technology, The Netherlands, at the University of Fribourg, Switzerland, and at the University of Antwerp, Belgium. Next he was a senior researcher at Renaissance Technologies. He then returned to Belgium as professor at KU Leuven, until becoming emeritus in 2022. His former PhD students include Annick Leroy, Hendrik Lopuhaä, Geert Molenberghs, Christophe Croux, Mia Hubert, Stefan Van Aelst, Tim Verdonck and Jakob Raymaekers.
Jacqueline Meulman is a Dutch statistician and professor emerita of Applied Statistics at the Mathematical Institute of Leiden University.
Wilfrid Joseph Dixon was an American mathematician and statistician. He made notable contributions to nonparametric statistics, statistical education and experimental design.
JASP is a free and open-source program for statistical analysis supported by the University of Amsterdam. It is designed to be easy to use, and familiar to users of SPSS. It offers standard analysis procedures in both their classical and Bayesian form. JASP generally produces APA style results tables and plots to ease publication. It promotes open science via integration with the Open Science Framework and reproducibility by integrating the analysis settings into the results. The development of JASP is financially supported by several universities and research funds. As the JASP GUI is developed in C++ using Qt framework, some of the team left to make a notable fork which is Jamovi which has its GUI developed in JavaScript and HTML5.
Imputation and Variance Estimation Software (IVEware) is a collection of routines written under various platforms and packaged to perform multiple imputations, variance estimation and, in general, draw inferences from incomplete data. It can also be used to perform analysis without any missing data. IVEware defaults to assuming a simple random sample, but uses the Jackknife Repeated Replication or Taylor Series Linearization techniques for analyzing data from complex surveys.
Mia Hubert is a Belgian mathematical statistician known for her research on topics in robust statistics including medoid-based clustering,[a] regression depth,[b] the medcouple for robustly measuring skewness,[c] box plots for skewed data,[f] and robust principal component analysis,[d] and for her implementations of robust statistical algorithms in the R statistical software system, MATLAB,[e] and S-PLUS.[a] She is a professor in the statistics and data science section of the department of mathematics at KU Leuven.