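Hadley Wickham | |
---|---|
Born | Hadley Alexander Wickham, 14 October 1979, Hamilton, New Zealand |
Alma mater | University of Auckland (BSc, MSc); Iowa State University (PhD) |
Known for | ggplot2 [1]; tidyverse; R packages |
Awards | John Chambers Award (2006); Fellow of the American Statistical Association (2015); COPSS Presidents' Award (2019) |
Scientific career | |
Fields | Statistics |
Institutions | Posit PBC; University of Auckland; Stanford University; Rice University |
Thesis | Practical tools for exploring data and models (2008) |
Doctoral advisors | Di Cook; Heike Hofmann |
Website | hadley |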
Hadley Alexander Wickham (born 14 October 1979) is a New Zealand statistician known for his work on open-source software for the R statistical programming environment. He is the chief scientist at Posit PBC and an adjunct professor of statistics at the University of Auckland, Stanford University, and Rice University. His work includes the data visualisation system ggplot2 and the tidyverse, a collection of R packages for data science based on the concept of tidy data.
Wickham was born in Hamilton, New Zealand. He received a bachelor's degree in human biology and a master's degree in statistics from the University of Auckland between 1999 and 2004, and a PhD from Iowa State University in 2008, supervised by Di Cook and Heike Hofmann. [3] [4] He is the chief scientist at Posit PBC (formerly RStudio PBC) [5] and an adjunct professor of statistics at the University of Auckland, Stanford University, and Rice University. [6] [7] [8]
Wickham is a prominent and active member of the R user community and has developed several notable and widely used packages including ggplot2, plyr, dplyr and reshape2. [8] [9] Wickham's data analysis packages for R are collectively known as the tidyverse. [10] According to Wickham's tidy data approach, each variable should be a column, each observation should be a row, and each type of observational unit should be a table. [11]
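The tidy data layout can be illustrated with a small reshaping example. The following is a minimal sketch in plain Python, not Wickham's own code or the API of any tidyverse package; the table contents and the helper name `to_tidy` are hypothetical. It starts from a "wide" table in which the years are spread across columns and produces a table in which each variable (country, year, cases) is a column and each observation is a row.

```python
# Hypothetical "wide" table: one row per country, one column per year.
# The year columns hold values of a single variable (cases), so the
# table is not tidy: the variable "year" is hidden in the column names.
wide_rows = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]


def to_tidy(rows, id_column, variable_name, value_name):
    """Reshape wide rows so that each output row is one observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is copied into every output row
            tidy.append(
                {
                    id_column: row[id_column],
                    variable_name: column,  # former column name becomes a value
                    value_name: value,
                }
            )
    return tidy


tidy_rows = to_tidy(wide_rows, "country", "year", "cases")
for row in tidy_rows:
    print(row)
```

In the reshaped table, every row describes one country in one year, which is the form that tidyverse tools are described as expecting.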
In 2006, he was awarded the John Chambers Award for Statistical Computing for his work developing tools for data reshaping and visualisation. [12] Wickham was named a Fellow of the American Statistical Association in 2015 for "pivotal contributions to statistical practice through innovative and pioneering research in statistical graphics and computing". [13] Wickham was awarded the international COPSS Presidents' Award in 2019 for "influential work in statistical computing, visualisation, graphics, and data analysis" including "making statistical thinking and computing accessible to a large audience". [14]
Wickham's sister Charlotte Wickham is also a statistician. [15]
Wickham's publications [2] include:
R is a programming language for statistical computing and data visualization. It has been adopted in the fields of data mining, bioinformatics, and data analysis.
The Statistics Online Computational Resource (SOCR) is an online multi-institutional research and education organization. SOCR designs, validates and broadly shares a suite of online tools for statistical computing, and interactive materials for hands-on learning and teaching concepts in data science, statistical analysis and probability theory. The SOCR resources are platform agnostic based on HTML, XML and Java, and all materials, tools and services are freely available over the Internet.
George Ross Ihaka is a New Zealand statistician who was an associate professor of statistics at the University of Auckland until his retirement in 2017. Alongside Robert Gentleman, he is one of the creators of the R programming language. In 2008, Ihaka received the Pickering Medal, awarded by the Royal Society of New Zealand, for his work on R.
ggplot2 is an open-source data visualization package for the statistical programming language R. Created by Hadley Wickham in 2005, ggplot2 is an implementation of Leland Wilkinson's Grammar of Graphics—a general scheme for data visualization which breaks up graphs into semantic components such as scales and layers. ggplot2 can serve as a replacement for the base graphics in R and contains a number of defaults for web and print display of common scales. Since 2005, ggplot2 has grown in use to become one of the most popular R packages.
RStudio IDE is an integrated development environment for R, a programming language for statistical computing and graphics. It is available in two formats: RStudio Desktop is a regular desktop application while RStudio Server runs on a remote server and allows accessing RStudio using a web browser. The RStudio IDE is a product of Posit PBC.
Dirk Eddelbuettel is a Canadian statistician, data scientist and researcher. He is the author of the open-source software package Rcpp for the R programming language, and has also written the textbook Seamless R and C++ Integration with Rcpp on the topic. He is co-founder of the R In Finance Conference. In addition, he has contributed to many R packages as well as to the Debian project. He is also a co-creator of the Rocker Project, which brings Docker to R.
Yihui Xie (谢益辉) is a Chinese software developer who previously worked for Posit PBC. He is the principal author of the open-source software package knitr for data analysis in the R programming language, and has also written the book Dynamic Documents with R and knitr.
Heike Hofmann is a statistician and Professor in the Department of Statistics at Iowa State University.
Dianne Helen Cook is an Australian statistician, the editor of the Journal of Computational and Graphical Statistics, and an expert on the visualization of high-dimensional data. She is Professor of Business Analytics in the Department of Econometrics and Business Statistics at Monash University and professor emeritus of statistics at Iowa State University. The emeritus status was chosen so that she could continue to supervise graduate students at Iowa State after moving to Australia.
Deborah F. Swayne is an American statistician who worked for AT&T Labs and chaired the Section on Statistical Graphics of the American Statistical Association. She is known for her work as coauthor of GGobi, a software tool for interactive data visualization, and is president of the GGobi Foundation. She retired in 2016.
Mine Çetinkaya-Rundel is a Turkish-American statistician and professor of the practice at Duke University, and a professional educator at RStudio. She is the author of several open source statistics textbooks and is an instructor for Coursera. She is the chair-elect of the Statistical Education Section of the American Statistical Association. Previously, she was a senior lecturer at University of Edinburgh.
Julia Silge is an American data scientist and software engineer. She has developed tools for statistical modelling in the R programming language, including the text mining package tidytext. Silge currently works for Posit PBC, formerly known as RStudio PBC.
Jennifer "Jenny" Bryan is a data scientist and an associate professor of statistics at the University of British Columbia, where she developed the Master in Data Science Program. Based in Vancouver, Canada, she is a statistician and software engineer at RStudio and is known for creating open-source tools that connect R to Google Sheets and Google Drive.
The tidyverse is a collection of open-source packages for the R programming language introduced by Hadley Wickham and his team that "share an underlying design philosophy, grammar, and data structures" of tidy data. Characteristic features of tidyverse packages include extensive use of non-standard evaluation and the encouragement of piping.
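Piping means passing the result of one step directly into the next, so that a sequence of transformations reads from left to right. The following is a minimal sketch of the idea in plain Python rather than R; the `pipe` helper, the step functions, and the data are all hypothetical and are not part of any tidyverse package.

```python
from functools import reduce


def pipe(value, *steps):
    """Pass value through each step in order, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)


# Hypothetical data: a small table stored as a list of row dictionaries.
rows = [
    {"name": "a", "score": 3},
    {"name": "b", "score": 8},
    {"name": "c", "score": 5},
]


def keep_high_scores(rows):
    """Keep only rows with a score of at least 5."""
    return [row for row in rows if row["score"] >= 5]


def sort_by_score(rows):
    """Order rows from highest to lowest score."""
    return sorted(rows, key=lambda row: row["score"], reverse=True)


def names_only(rows):
    """Reduce each row to its name."""
    return [row["name"] for row in rows]


# Reads as: take rows, then filter, then sort, then select names.
result = pipe(rows, keep_high_scores, sort_by_score, names_only)
print(result)
```

Written without the helper, the same computation would be the nested call `names_only(sort_by_score(keep_high_scores(rows)))`, which has to be read from the inside out.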
DataGraph is a graphing and data analysis software application for the macOS operating system, developed by Visual Data Tools in Chapel Hill, NC. DataGraph is used for creating publication quality graphics, particularly for research and science.
dplyr is an R package whose set of functions is designed to enable data frame manipulation in an intuitive, user-friendly way. It is one of the core packages of the popular tidyverse set of packages in the R programming language. Data analysts typically use dplyr to transform existing datasets into a format better suited to a particular type of analysis or data visualization.
R packages are extensions to the R statistical programming language. R packages contain code, data, and documentation in a standardised collection format that can be installed by users of R, typically via a centralised software repository such as CRAN. The large number of packages available for R, and the ease of installing and using them, have been cited as major factors driving the widespread adoption of the language in data science.
The easystats collection of open source R packages was created in 2019 and primarily includes tools dedicated to the post-processing of statistical models. As of May 2022, the 10 packages composing the easystats ecosystem have been downloaded more than 8 million times, and have been used in more than 1000 scientific publications. The ecosystem is the topic of several statistical courses, video tutorials and books.
Posit PBC is an open-source data science software company. It is a public-benefit corporation founded by J. J. Allaire, creator of the programming language ColdFusion.