John A. Hartigan

John Anthony Hartigan
Born: July 2, 1937 (age 86)
Education: University of Sydney, Princeton University
Known for: Biclustering
Scientific career
Institutions: Princeton University, Yale University
Doctoral advisors: John Tukey, Frank Anscombe
Doctoral students: Bill Eddy, Michael D. Escobar
Website: www.stat.yale.edu/~jah49/

John Anthony Hartigan (born July 2, 1937) is an Australian-American statistician and the Eugene Higgins Professor of Statistics emeritus at Yale University. He made fundamental contributions to clustering algorithms, including the Hartigan-Wong method for k-means clustering and biclustering, [1] and to Bayesian statistics. [2]


Education and career

Hartigan was born in Sydney, Australia, and studied mathematics at the University of Sydney, where he obtained his BSc in 1959 and MSc in 1960. He then moved to Princeton University, where he studied under John Tukey and Frank Anscombe and obtained his PhD in statistics in 1962. He joined the Princeton faculty in 1964 as an assistant professor and moved to Yale University in 1969 as an associate professor. He became a full professor at Yale in 1972 and chaired Yale's department of statistics from 1973 to 1975 and again from 1988 to 1994. [3]

Bibliography

Books

Selected papers

Related Research Articles

John Tukey: American mathematician

John Wilder Tukey was an American mathematician and statistician, best known for the development of the fast Fourier transform (FFT) algorithm and the box plot. The Tukey range test, the Tukey lambda distribution, the Tukey test of additivity, and the Teichmüller–Tukey lemma all bear his name. He is also credited with coining the term 'bit' and with the first published use of the word 'software'.

In statistics, the theory of minimum norm quadratic unbiased estimation (MINQUE) was developed by C. R. Rao. Its application was originally to the problem of heteroscedasticity and the estimation of variance components in random effects models.

Biclustering, block clustering, co-clustering or two-mode clustering is a data mining technique that clusters the rows and columns of a matrix simultaneously. The term was first introduced by Boris Mirkin to name a technique introduced many years earlier, in 1972, by John A. Hartigan.
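To make the idea of clustering rows and columns together concrete, here is a minimal sketch that scores a given row/column partition by how well each block is summarized by its mean. It is an illustration only, not Hartigan's 1972 direct clustering algorithm; the function name `block_sse` and the toy matrix are assumptions introduced for this example.

```python
from collections import defaultdict


def block_sse(matrix, row_labels, col_labels):
    """Score a biclustering: return (block_means, sum of squared errors).

    Each cell (i, j) belongs to the block (row_labels[i], col_labels[j]).
    A good partition yields blocks whose values lie close to the block mean.
    """
    cells = defaultdict(list)
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            cells[(row_labels[i], col_labels[j])].append(value)

    block_means = {block: sum(vals) / len(vals) for block, vals in cells.items()}
    sse = sum(
        (value - block_means[block]) ** 2
        for block, vals in cells.items()
        for value in vals
    )
    return block_means, sse


# Hypothetical toy matrix with an obvious 2 x 2 block structure.
data = [
    [1.0, 1.0, 5.0, 5.0],
    [1.0, 1.0, 5.0, 5.0],
    [9.0, 9.0, 3.0, 3.0],
]

# Partition that matches the block structure.
good_means, good_sse = block_sse(data, [0, 0, 1], [0, 0, 1, 1])
# Partition that cuts across the blocks.
bad_means, bad_sse = block_sse(data, [0, 1, 1], [0, 1, 0, 1])

print(good_sse, bad_sse)
```

The partition that follows the block structure leaves no within-block variation, while the one that cuts across the blocks leaves a large residual. A search procedure over row and column partitions would use a criterion of this kind to compare candidates.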

In econometrics, the Frisch–Waugh–Lovell (FWL) theorem is named after the econometricians Ragnar Frisch, Frederick V. Waugh, and Michael C. Lovell.

In statistics, the Lilliefors test is a normality test based on the Kolmogorov–Smirnov test. It is used to test the null hypothesis that data come from a normally distributed population, when the null hypothesis does not specify which normal distribution; i.e., it does not specify the expected value and variance of the distribution. It is named after Hubert Lilliefors, professor of statistics at George Washington University.
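The description above can be sketched in code: the test statistic is the Kolmogorov–Smirnov distance between the empirical distribution and a normal distribution whose mean and standard deviation are estimated from the same data. The sketch below uses only the Python standard library and approximates the p-value by simulation rather than by published tables; the function names, the simulation size, and the sample values are assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def lilliefors_statistic(sample):
    """KS distance between the sample and a normal fitted to that sample."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    d = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        cdf = fitted.cdf(x)
        # Compare the fitted CDF with the empirical CDF just after and just before x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def lilliefors_p_value(sample, n_sim=2000, seed=0):
    """Monte Carlo p-value under the null hypothesis of normality.

    The null distribution of the statistic does not depend on the true mean
    or variance, so standard normal samples are simulated.
    """
    rng = random.Random(seed)
    n = len(sample)
    observed = lilliefors_statistic(sample)
    exceed = sum(
        lilliefors_statistic([rng.gauss(0.0, 1.0) for _ in range(n)]) >= observed
        for _ in range(n_sim)
    )
    return (exceed + 1) / (n_sim + 1)


# Hypothetical right-skewed sample.
values = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.7, 1.1, 2.5, 6.0]
stat = lilliefors_statistic(values)
p_value = lilliefors_p_value(values)
print(stat, p_value)
```

Because the parameters are estimated from the data, the ordinary Kolmogorov–Smirnov critical values do not apply, which is why the sketch simulates the null distribution of the statistic instead.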

Chi-square automatic interaction detection (CHAID) is a decision tree technique based on adjusted significance testing. The technique was developed in South Africa and was published in 1980 by Gordon V. Kass, who had completed a PhD thesis on this topic. CHAID can be used for prediction as well as classification, and for detection of interaction between variables. CHAID is based on a formal extension of AID and THAID procedures of the 1960s and 1970s, which in turn were extensions of earlier research, including that performed by Belson in the UK in the 1950s. A history of earlier supervised tree methods together with a detailed description of the original CHAID algorithm and the exhaustive CHAID extension by Biggs, De Ville, and Suen, can be found in Ritschard.

Computational statistics: Interface between statistics and computer science

Computational statistics, or statistical computing, is the interface between statistics and computer science. It refers to statistical methods that are enabled by computational methods. It is the area of computational science specific to the mathematical science of statistics. The area is developing rapidly, leading to calls for a broader concept of computing to be taught as part of general statistical education.

Theodore Wilbur Anderson: American mathematician and statistician

Theodore Wilbur Anderson was an American mathematician and statistician who specialized in the analysis of multivariate data. He was born in Minneapolis, Minnesota. He was on the faculty of Columbia University from 1946 until moving to Stanford University in 1967, becoming Emeritus Professor in 1988. He served as Editor of Annals of Mathematical Statistics from 1950 to 1952. He was elected President of the Institute of Mathematical Statistics in 1962.

Morris H. Hansen: American statistician (1910–1990)

Morris Howard Hansen (1910–1990) was an American statistician. While at the United States Census Bureau, he was one of the first to develop methods for statistical sampling and made contributions in many areas of surveys and censuses.

Peter Rousseeuw: Belgian statistician (born 1956)

Peter J. Rousseeuw is a statistician known for his work on robust statistics and cluster analysis. He obtained his PhD in 1981 at the Vrije Universiteit Brussel, following research carried out at the ETH in Zurich, which led to a book on influence functions. Later he was professor at the Delft University of Technology, The Netherlands, at the University of Fribourg, Switzerland, and at the University of Antwerp, Belgium. Next he was a senior researcher at Renaissance Technologies. He then returned to Belgium as professor at KU Leuven, until becoming emeritus in 2022. His former PhD students include Annick Leroy, Hendrik Lopuhaä, Geert Molenberghs, Christophe Croux, Mia Hubert, Stefan Van Aelst, Tim Verdonck and Jakob Raymaekers.

John Winsor Pratt is the Emeritus William Ziegler Professor of Business Administration at Harvard University. He was educated at Princeton and Stanford, where he specialized in mathematics and statistics. Pratt spent most of his academic career at Harvard University. He was an editor of the Journal of the American Statistical Association from 1965 to 1970. His research concerns risk aversion, risk-sharing incentives, and the nature and discovery of stochastic laws, that is, statistical relationships that describe the effects of decisions. He has made contributions to risk aversion theory, notably with Kenneth Arrow on measures of risk aversion.

Jerome Harold Friedman is an American statistician, consultant and Professor of Statistics at Stanford University, known for his contributions in the field of statistics and data mining.

Sudipto Banerjee is an Indian-American statistician best known for his work on Bayesian hierarchical modeling and inference for spatial data analysis. He is Professor and Chair of the Department of Biostatistics in the School of Public Health at the University of California, Los Angeles. He served as the 2022 President of the International Society for Bayesian Analysis.

Paul R. Rosenbaum is the Robert G. Putzel Professor Emeritus in the Department of Statistics and Data Science at Wharton School of the University of Pennsylvania, where he worked from 1986 through 2021. He has written extensively about causal inference in observational studies, including sensitivity analysis, optimal matching, design sensitivity, evidence factors, quasi-experimental devices, and the propensity score. With various coauthors, he has also written about health outcomes, racial disparities in health outcomes, instrumental variables, psychometrics and experimental design.

Alan Enoch Gelfand is an American statistician, and is currently the James B. Duke Professor of Statistics and Decision Sciences at Duke University. Gelfand’s research includes substantial contributions to the fields of Bayesian statistics, spatial statistics and hierarchical modeling.

Lawrence C. Rafsky is an American data scientist, inventor, and entrepreneur. Rafsky created search algorithms and methodologies for the financial and news information industries. He is co-inventor of the Friedman-Rafsky test, commonly used to test goodness-of-fit for the multivariate normal distribution. Rafsky founded and became chief scientist for Acquire Media, a news and information syndication company, now a subsidiary of Moody's.

Daniela Witten: American biostatistician

Daniela M. Witten is an American biostatistician. She is a professor and the Dorothy Gilford Endowed Chair of Mathematical Statistics at the University of Washington. Her research investigates the use of machine learning to understand high-dimensional data.

Babette Anne Brumback is an American biostatistician known for her work on causal inference. She is a professor of biostatistics at the University of Florida.

Charles "Chuck" Joel Stone was an American statistician and mathematician.

Michael David Escobar is an American biostatistician known for his work on Bayesian nonparametrics and mixture models.

References

  1. Hartigan, J. A. (1972). "Direct Clustering of a Data Matrix". Journal of the American Statistical Association. 67 (337): 123–129. doi:10.1080/01621459.1972.10481214. ISSN 0162-1459.
  2. "John Hartigan's contributions to clustering" (PDF).
  3. Barry, Daniel (2005). "A Conversation with John Hartigan". Statistical Science. 20 (4): 418–430. doi:10.1214/088342304000000242. ISSN 0883-4237.