William S. Cleveland

William Swain Cleveland II (born 1943) is an American computer scientist and Professor of Statistics and Professor of Computer Science at Purdue University, known for his work on data visualization, nonparametric regression [1] and local regression. [2] He was one of the developers of the S programming language. [3]

Biography

Cleveland obtained his AB in Mathematics from Princeton University in the mid-1960s, graduating under William Feller. For his PhD studies in Statistics he moved to Yale University, where he graduated in 1969 under Leonard Jimmie Savage. [4] [5]

After graduation Cleveland started at Bell Labs, where he was a staff member of the Statistics Research Department and Department Head for 12 years. While at Bell Labs, he helped to develop the S programming language, a precursor to R. [3] Eventually he moved to Purdue University, where he became Professor of Statistics and Courtesy Professor of Computer Science. In 1982 he was elected as a Fellow of the American Statistical Association. [6]

His research interests are in the fields of "data visualization, computer networking, machine learning, data mining, time series, statistical modeling, visual perception, environmental science, and seasonal adjustment." [7] Cleveland is credited with defining and naming the field of data science, which he did in a 2001 publication. [8]

Selected publications

Articles, a selection: [9]

Related Research Articles

Statistics: Study of the collection, analysis, interpretation, and presentation of data

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments.

John Tukey: American mathematician

John Wilder Tukey was an American mathematician and statistician, best known for the development of the fast Fourier transform (FFT) algorithm and the box plot. The Tukey range test, the Tukey lambda distribution, the Tukey test of additivity, and the Teichmüller–Tukey lemma all bear his name. He is also credited with coining the term bit and the first published use of the word software.

Herman Ole Andreas Wold was a Norwegian-born econometrician and statistician who had a long career in Sweden. Wold was known for his work in mathematical economics, in time series analysis, and in econometric statistics.

Data and information visualization: Visual representation of data

Data and information visualization is the practice of designing and creating easy-to-communicate and easy-to-understand graphic or visual representations of a large amount of complex quantitative and qualitative data and information with the help of static, dynamic or interactive visual items. Typically based on data and information collected from a certain domain of expertise, these visualizations are intended for a broader audience to help them visually explore and discover, quickly understand, interpret and gain important insights into otherwise difficult-to-identify structures, relationships, correlations, local and global patterns, trends, variations, constancy, clusters, outliers and unusual groupings within data. When intended for the general public to convey a concise version of known, specific information in a clear and engaging manner, it is typically called information graphics.

Local regression: Moving average and polynomial regression method for smoothing data

Local regression or local polynomial regression, also known as moving regression, is a generalization of the moving average and polynomial regression. Its most common methods, initially developed for scatterplot smoothing, are LOESS and LOWESS, both pronounced LOH-ess. They are two strongly related non-parametric regression methods that combine multiple regression models in a k-nearest-neighbor-based meta-model. In some fields, LOESS is known and commonly referred to as the Savitzky–Golay filter.

Bradley Efron: American statistician

Bradley Efron is an American statistician. Efron has been president of the American Statistical Association (2004) and of the Institute of Mathematical Statistics (1987–1988). He is a past editor of the Journal of the American Statistical Association, and he is the founding editor of the Annals of Applied Statistics. Efron is also the recipient of many awards.

Computational statistics: Interface between statistics and computer science

Computational statistics, or statistical computing, is the bond between statistics and computer science, and refers to the statistical methods that are enabled by using computational methods. It is the area of computational science specific to the mathematical science of statistics. This area is also developing rapidly, leading to calls that a broader concept of computing should be taught as part of general statistical education.

Statistical graphics, also known as statistical graphical techniques, are graphics used in the field of statistics for data visualization.

Michael Friendly

Michael Louis Friendly is an American-Canadian psychologist, Professor of Psychology at York University in Ontario, Canada, and director of its Statistical Consulting Service, especially known for his contributions to graphical methods for categorical and multivariate data, and to the history of data and information visualisation.

Nan McKenzie Laird is the Harvey V. Fineberg Professor of Public Health, Emerita in Biostatistics at the Harvard T.H. Chan School of Public Health. She served as Chair of the Department from 1990 to 1999. She was the Henry Pickering Walcott Professor of Biostatistics from 1991 to 1999. Laird is a Fellow of the American Statistical Association, as well as the Institute of Mathematical Statistics. She is a member of the International Statistical Institute.

Plot (graphics): Graphical technique for data sets

A plot is a graphical technique for representing a data set, usually as a graph showing the relationship between two or more variables. The plot can be drawn by hand or by a computer. In the past, sometimes mechanical or electronic plotters were used. Graphs are a visual representation of the relationship between variables, which are very useful for humans who can then quickly derive an understanding which may not have come from lists of values. Given a scale or ruler, graphs can also be used to read off the value of an unknown variable plotted as a function of a known one, but this can also be done with data presented in tabular form. Graphs of functions are used in mathematics, sciences, engineering, technology, finance, and other areas.

Jayanta Kumar Ghosh

Jayanta Kumar Ghosh was an Indian statistician, an emeritus professor at the Indian Statistical Institute and a professor of statistics at Purdue University.

Peter Rousseeuw: Belgian statistician (born 1956)

Peter J. Rousseeuw is a statistician known for his work on robust statistics and cluster analysis. He obtained his PhD in 1981 at the Vrije Universiteit Brussel, following research carried out at the ETH in Zurich, which led to a book on influence functions. Later he was professor at the Delft University of Technology, The Netherlands, at the University of Fribourg, Switzerland, and at the University of Antwerp, Belgium. Next he was a senior researcher at Renaissance Technologies. He then returned to Belgium as professor at KU Leuven, until becoming emeritus in 2022. His former PhD students include Annick Leroy, Hendrik Lopuhaä, Geert Molenberghs, Christophe Croux, Mia Hubert, Stefan Van Aelst, Tim Verdonck and Jakob Raymaekers.

Data science: Interdisciplinary field of study on deriving knowledge and insights from data

Data science is an interdisciplinary academic field that uses statistics, scientific computing, scientific methods, processes, algorithms and systems to extract or extrapolate knowledge and insights from potentially noisy, structured, or unstructured data.

Raymond James Carroll is an American statistician, and Distinguished Professor of statistics, nutrition and toxicology at Texas A&M University. He is a recipient of the 1988 COPSS Presidents' Award and the 2002 R. A. Fisher Lectureship. He has made fundamental contributions to measurement error models and to nonparametric and semiparametric modeling.

Howard G. Funkhouser: American mathematician and historian (1898–1984)

Howard Gray Funkhouser was an American mathematician, historian and associate professor of mathematics at Washington and Lee University, and later at the Phillips Exeter Academy, particularly known for his early work on the history of graphical methods.

Heike Hofmann is a statistician and Professor in the Department of Statistics at Iowa State University.

Graphical perception

Graphical perception is the human capacity for visually interpreting information on graphs and charts. Both quantitative and qualitative information can be said to be encoded into the image, and the human capacity to interpret it is sometimes called decoding. The importance of human graphical perception, what we discern easily versus what our brains have more difficulty decoding, is fundamental to good statistical graphics design, where clarity, transparency, accuracy and precision in data display and interpretation are essential for understanding the translation of data in a graph to clarify and interpret the science.

Peter Bühlmann: Swiss mathematician

Peter Lukas Bühlmann is a Swiss mathematician and statistician.

References

  1. Armitage, Peter, Geoffrey Berry, and John NS Matthews. Statistical methods in medical research. John Wiley & Sons, 2008.
  2. Venables, William N., and Brian D. Ripley. Modern applied statistics with S. Springer Science & Business Media, 2002.
  3. Berry, Kenneth J.; Johnston, Janis E.; Mielke, Paul W. Jr. (2014-04-11). A Chronicle of Permutation Statistical Methods: 1920–2000, and Beyond. Springer Science & Business Media. p. 207. ISBN 978-3-319-02744-9.
  4. William S. Cleveland, CV, at stat.purdue.edu. Accessed 10-04-2015.
  5. "William Swain Cleveland, II". The Mathematics Genealogy Project. Retrieved 2 July 2022.
  6. View/Search Fellows of the ASA, accessed 2016-10-15.
  7. William S. Cleveland: Bio, at stat.purdue.edu. Accessed 10-04-2015.
  8. Brady, Henry E. (2019-05-11). "The Challenge of Big Data and Data Science". Annual Review of Political Science. 22 (1): 297–323. doi:10.1146/annurev-polisci-090216-023229. ISSN 1094-2939.
  9. William S. Cleveland, Google scholar profile.