Robert Tibshirani

Born: July 10, 1956, Niagara Falls, Ontario, Canada
Nationality: Canadian, American
Alma mater: University of Waterloo; University of Toronto; Stanford University
Known for: Lasso method
Spouse: Cheryl Denise Tibshirani

Scientific career
Fields: Statistics
Institutions: Stanford University
Doctoral advisor: Bradley Efron [1]
Doctoral students: Daniela Witten [2]
Website: tibshirani.su.domains

Robert Tibshirani FRS FRSC (born July 10, 1956) is a professor in the Departments of Statistics and Biomedical Data Science at Stanford University. He was a professor at the University of Toronto from 1985 to 1998. In his work, he develops statistical tools for the analysis of complex datasets, most recently in genomics and proteomics.


His best-known contributions are the lasso method, which uses L1 penalization for variable selection in regression and related problems, and Significance Analysis of Microarrays.

Education and early life

Tibshirani was born on July 10, 1956, in Niagara Falls, Ontario, Canada. He received his B.Math. in statistics and computer science from the University of Waterloo in 1979 and a master's degree in statistics from the University of Toronto in 1980. Tibshirani joined the doctoral program at Stanford University in 1981 and received his Ph.D. in 1984 under the supervision of Bradley Efron. His dissertation was entitled "Local likelihood estimation".[1]

Honors and awards

Tibshirani received the COPSS Presidents' Award in 1996. Given jointly by the world's leading statistical societies, the award recognizes outstanding contributions to statistics by a statistician under the age of 40. He is a fellow of the Institute of Mathematical Statistics and the American Statistical Association. He won an E.W.R. Steacie Memorial Fellowship from the Natural Sciences and Engineering Research Council of Canada in 1997. He was elected a Fellow of the Royal Society of Canada in 2001 and a member of the National Academy of Sciences in 2012.[3]

Tibshirani was awarded the Statistical Society of Canada's Gold Medal at its 2012 annual meeting in Guelph, Ontario, for "exceptional contributions to methodology and theory for the analysis of complex data sets, smoothing and regression methodology, statistical learning, and classification, and application areas that include public health, genomics, and proteomics".[4] He gave his Gold Medal Address at the 2013 meeting in Edmonton. He was elected a Fellow of the Royal Society in 2019. Tibshirani was named the 2021 recipient of the ISI Founders of Statistics Prize for his 1996 paper "Regression Shrinkage and Selection via the Lasso".

Personal life

His son, Ryan Tibshirani,[5] with whom he occasionally publishes scientific papers, is a professor in the Department of Statistics at UC Berkeley.

Publications

Tibshirani is a prolific author of scientific works on various topics in applied statistics, including statistical learning, data mining, statistical computing, and bioinformatics. He and his collaborators have authored about 250 scientific articles, many of them coauthored with his longtime collaborator Trevor Hastie. Tibshirani is listed as an ISI Highly Cited Author in Mathematics by the ISI Web of Knowledge.[6] He has coauthored the following books:

- The Elements of Statistical Learning: Data Mining, Inference, and Prediction (with Trevor Hastie and Jerome Friedman)[7]
- An Introduction to Statistical Learning with Applications in R (with Gareth James, Daniela Witten, and Trevor Hastie)[8]
- Statistical Learning with Sparsity: The Lasso and Generalizations (with Trevor Hastie and Martin Wainwright)[9]


Related Research Articles

Density estimation

In statistics, probability density estimation or simply density estimation is the construction of an estimate, based on observed data, of an unobservable underlying probability density function. The unobservable density function is thought of as the density according to which a large population is distributed; the data are usually thought of as a random sample from that population.
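
As a concrete illustration, the sketch below fits a kernel density estimate, one common nonparametric approach to density estimation; it assumes NumPy and scikit-learn are available, and the bandwidth value is an arbitrary choice for the example.

```python
# A minimal kernel density estimation sketch, assuming NumPy and
# scikit-learn; the bandwidth is a smoothing parameter that trades
# off bias against variance.
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=(500, 1))  # observed data

kde = KernelDensity(kernel="gaussian", bandwidth=0.4).fit(sample)

grid = np.linspace(-4, 4, 9).reshape(-1, 1)
density = np.exp(kde.score_samples(grid))  # score_samples returns log-density
print(density)
```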

In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a phenomenon. Choosing informative, discriminating and independent features is a crucial element of effective algorithms in pattern recognition, classification and regression. Features are usually numeric, but structural features such as strings and graphs are used in syntactic pattern recognition. The concept of "feature" is related to that of explanatory variable used in statistical techniques such as linear regression.
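
As an illustration, the sketch below converts raw records into numeric feature vectors; it assumes scikit-learn, and the record fields are invented for the example.

```python
# A minimal sketch of deriving numeric feature vectors from raw
# records, assuming scikit-learn; categorical values are one-hot
# encoded, numeric values pass through unchanged.
from sklearn.feature_extraction import DictVectorizer

records = [
    {"height_cm": 180, "eye_color": "brown"},
    {"height_cm": 165, "eye_color": "blue"},
]
vec = DictVectorizer(sparse=False)
X = vec.fit_transform(records)       # each row is one feature vector
print(vec.get_feature_names_out())   # names of the derived features
print(X)
```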

Mixed-data sampling (MIDAS) is an econometric regression framework for models involving data sampled at different frequencies, developed by Eric Ghysels with several co-authors. There is now a substantial literature on MIDAS regressions and their applications, including Ghysels, Santa-Clara and Valkanov (2006), Ghysels, Sinko and Valkanov, Andreou, Ghysels and Kourtellos (2010) and Andreou, Ghysels and Kourtellos (2013).

David Cox (statistician)

Sir David Roxbee Cox (1924–2022) was a British statistician and educator. His wide-ranging contributions to the field of statistics included introducing logistic regression, the proportional hazards model, and the Cox process, a point process named after him.

Model selection is the task of selecting a model from among various candidates on the basis of a performance criterion. In the context of machine learning and, more generally, statistical analysis, this may be the selection of a statistical model from a set of candidate models, given data. In the simplest cases, a pre-existing set of data is considered. However, the task can also involve the design of experiments such that the data collected are well-suited to the problem of model selection. Given candidate models of similar predictive or explanatory power, the simplest model is most likely to be the best choice.
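
A minimal sketch of one common approach, comparing candidate models by cross-validated performance; the candidates and penalty values below are arbitrary illustrative choices, assuming scikit-learn.

```python
# Model selection by 5-fold cross-validation: each candidate is scored
# on held-out folds, and the best (or, among near-ties, the simplest)
# model is chosen.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=20, noise=5.0,
                       random_state=0)

candidates = {
    "ols": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=1.0),
}
for name, model in candidates.items():
    score = cross_val_score(model, X, y, cv=5).mean()  # mean R^2 over folds
    print(f"{name}: {score:.3f}")
```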

Bradley Efron

Bradley Efron is an American statistician. Efron has been president of the American Statistical Association (2004) and of the Institute of Mathematical Statistics (1987–1988). He is a past editor of the Journal of the American Statistical Association, and he is the founding editor of the Annals of Applied Statistics. Efron is also the recipient of many awards.

Eric Ghysels

Eric Ghysels (born 1956) is a Belgian economist with interests in finance and time series econometrics, in particular the fields of financial econometrics and financial technology. He is the Edward M. Bernstein Distinguished Professor of Economics at the University of North Carolina and a Professor of Finance at the Kenan-Flagler Business School. He is also the Faculty Research Director of the Rethinc.Labs at the Frank Hawkins Kenan Institute of Private Enterprise.

Least-angle regression

In statistics, least-angle regression (LARS) is an algorithm for fitting linear regression models to high-dimensional data, developed by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani.
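
A minimal sketch using scikit-learn's lars_path, which traces the full LARS coefficient path; the synthetic data are for illustration only.

```python
# Least-angle regression path on synthetic data with a few informative
# predictors; method="lar" gives the LARS path, while method="lasso"
# gives the lasso path via the LARS modification of Efron et al.
from sklearn.datasets import make_regression
from sklearn.linear_model import lars_path

X, y = make_regression(n_samples=100, n_features=10, n_informative=3,
                       noise=1.0, random_state=0)

alphas, active, coefs = lars_path(X, y, method="lar")
print(active)        # order in which predictors enter the model
print(coefs.shape)   # (n_features, n_steps) coefficient path
```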

In machine learning and computational learning theory, LogitBoost is a boosting algorithm formulated by Jerome Friedman, Trevor Hastie, and Robert Tibshirani.
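
A minimal two-class LogitBoost sketch following the Newton-step view of the algorithm, with regression stumps as weak learners; this is an illustrative hand-rolled version assuming NumPy and scikit-learn, not a production implementation.

```python
# Two-class LogitBoost: each round fits a weighted regression stump to
# the Newton working response of the logistic loss and takes a half
# Newton step.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeRegressor

X, y = make_classification(n_samples=300, random_state=0)  # labels in {0, 1}

F = np.zeros(len(y))           # additive ensemble output, starts at zero
p = np.full(len(y), 0.5)       # current probability estimates
for _ in range(50):
    w = np.clip(p * (1 - p), 1e-5, None)   # Newton weights
    z = (y - p) / w                         # working response
    stump = DecisionTreeRegressor(max_depth=1).fit(X, z, sample_weight=w)
    F += 0.5 * stump.predict(X)             # half Newton step
    p = 1.0 / (1.0 + np.exp(-2.0 * F))      # updated probabilities

print("training accuracy:", np.mean((p > 0.5) == y))
```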

In statistical theory, the field of high-dimensional statistics studies data whose dimension is larger than typically considered in classical multivariate analysis. The area arose owing to the emergence of many modern data sets in which the dimension of the data vectors may be comparable to, or even larger than, the sample size, so that justification for the use of traditional techniques, often based on asymptotic arguments with the dimension held fixed as the sample size increased, was lacking.
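
The sketch below illustrates this setting with more predictors than observations (p > n), where ordinary least squares is ill-posed but a sparsity-inducing method such as the lasso can still recover signal; the data are synthetic and the penalty value is an arbitrary choice, assuming NumPy and scikit-learn.

```python
# High-dimensional regression with p = 200 features and only n = 50
# samples; the lasso exploits sparsity of the true coefficients.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 50, 200                      # far more features than observations
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 3.0                      # only 5 truly relevant predictors
y = X @ beta + rng.normal(scale=0.5, size=n)

fit = Lasso(alpha=0.1).fit(X, y)
print(np.flatnonzero(fit.coef_))    # lasso selects few coefficients
```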

In statistics, an additive model (AM) is a nonparametric regression method. It was suggested by Jerome H. Friedman and Werner Stuetzle (1981) and is an essential part of the ACE algorithm. The AM uses a one-dimensional smoother to build a restricted class of nonparametric regression models. Because of this, it is less affected by the curse of dimensionality than a p-dimensional smoother. Furthermore, the AM is more flexible than a standard linear model, while being more interpretable than a general regression surface, at the cost of approximation error. As with many other machine-learning methods, the AM is subject to problems of model selection, overfitting, and multicollinearity.
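
A minimal backfitting sketch for a two-term additive model, using a hand-rolled Nadaraya-Watson kernel smoother as the one-dimensional smoother; the functions and parameter values here are illustrative assumptions, not the original authors' code.

```python
# Backfitting for y = f1(x1) + f2(x2) + noise: each component is
# repeatedly re-estimated by smoothing the partial residuals against
# its own predictor, with centering for identifiability.
import numpy as np

def smooth(x, r, bandwidth=0.3):
    """Nadaraya-Watson kernel smoother of residuals r against x."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bandwidth) ** 2)
    return (w @ r) / w.sum(axis=1)

rng = np.random.default_rng(0)
n = 300
x1, x2 = rng.uniform(-2, 2, n), rng.uniform(-2, 2, n)
y = np.sin(x1) + x2**2 + rng.normal(scale=0.1, size=n)

f1 = np.zeros(n)
f2 = np.zeros(n)
for _ in range(20):                       # backfitting iterations
    f1 = smooth(x1, y - y.mean() - f2)
    f1 -= f1.mean()                       # center for identifiability
    f2 = smooth(x2, y - y.mean() - f1)
    f2 -= f2.mean()

print(np.corrcoef(f1, np.sin(x1))[0, 1])  # f1 should track sin(x1)
```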

In statistics and machine learning, lasso is a regression analysis method that performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the resulting statistical model. The lasso method assumes that the coefficients of the linear model are sparse, meaning that few of them are non-zero. It was originally introduced in geophysics, and later rediscovered and popularized by Robert Tibshirani, who coined the term.
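
A minimal lasso sketch assuming scikit-learn, with the penalty strength chosen by cross-validation via LassoCV; the synthetic data are for illustration only.

```python
# The L1 penalty shrinks coefficients and sets many exactly to zero;
# LassoCV selects the penalty strength by cross-validation.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

X, y = make_regression(n_samples=100, n_features=30, n_informative=5,
                       noise=1.0, random_state=0)

fit = LassoCV(cv=5).fit(X, y)
print(fit.alpha_)                  # penalty chosen by cross-validation
print(np.flatnonzero(fit.coef_))   # indices of the selected variables
```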

In statistics and, in particular, in the fitting of linear or logistic regression models, the elastic net is a regularized regression method that linearly combines the L1 and L2 penalties of the lasso and ridge methods.
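
A minimal elastic net sketch assuming scikit-learn, where l1_ratio mixes the two penalties (l1_ratio=1 recovers the lasso, l1_ratio=0 ridge); the parameter values are illustrative.

```python
# Elastic net: a linear combination of L1 (lasso) and L2 (ridge)
# penalties, controlled by alpha (overall strength) and l1_ratio (mix).
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

X, y = make_regression(n_samples=100, n_features=30, n_informative=5,
                       noise=1.0, random_state=0)

fit = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)
print(fit.coef_)   # sparse, but less aggressively so than the pure lasso
```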

Trevor Hastie

Trevor John Hastie (born 1953) is an American statistician and computer scientist. He is the John A. Overdeck Professor of Mathematical Sciences and Professor of Statistics at Stanford University. Hastie is known for his contributions to applied statistics, especially in the fields of machine learning, data mining, and bioinformatics. He has authored several popular books in statistical learning, including The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Hastie has been listed as an ISI Highly Cited Author in Mathematics by the ISI Web of Knowledge. He also contributed to the development of the S programming language.

In machine learning, a probabilistic classifier is a classifier that is able to predict, given an observation of an input, a probability distribution over a set of classes, rather than only outputting the most likely class that the observation should belong to. Probabilistic classifiers provide classification that can be useful in its own right or when combining classifiers into ensembles.
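
A minimal sketch assuming scikit-learn: logistic regression is a probabilistic classifier, returning a distribution over classes via predict_proba rather than only a hard label.

```python
# A probabilistic classifier outputs per-class probabilities, not just
# the most likely class.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)
clf = LogisticRegression().fit(X, y)

print(clf.predict(X[:3]))        # hard labels
print(clf.predict_proba(X[:3]))  # per-class probabilities; rows sum to 1
```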

In statistics, the graphical lasso is a sparse penalized maximum likelihood estimator for the concentration or precision matrix of a multivariate elliptical distribution. The original variant was formulated to solve Dempster's covariance selection problem for the multivariate Gaussian distribution when observations were limited. Subsequently, the optimization algorithms to solve this problem were improved and extended to other types of estimators and distributions.
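
A minimal sketch assuming scikit-learn's GraphicalLasso estimator; the covariance used to generate the synthetic data and the penalty value are arbitrary illustrative choices.

```python
# Graphical lasso: the L1 penalty on the precision matrix drives many
# entries to zero, corresponding to conditional independence in a
# Gaussian model.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=np.zeros(4),
                            cov=np.eye(4) + 0.3, size=200)

model = GraphicalLasso(alpha=0.2).fit(X)
print(np.round(model.precision_, 2))  # sparse estimated precision matrix
```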

Out-of-bag (OOB) error, also called the out-of-bag estimate, is a method of measuring the prediction error of random forests, boosted decision trees, and other machine learning models that use bootstrap aggregating (bagging). Bagging creates each model's training set by sampling from the data with replacement. The OOB error is the mean prediction error on each training sample x_i, using only the trees that did not include x_i in their bootstrap sample.
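
A minimal sketch assuming scikit-learn: with oob_score=True, a random forest scores each training sample using only the trees whose bootstrap samples excluded it, giving an error estimate without a separate test set.

```python
# Out-of-bag error estimation for a random forest.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)
forest = RandomForestClassifier(n_estimators=200, oob_score=True,
                                random_state=0).fit(X, y)

print(forest.oob_score_)          # out-of-bag accuracy
print(1 - forest.oob_score_)      # out-of-bag error estimate
```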


Hui Zou is currently a professor of statistics at the University of Minnesota.

Daniela Witten

Daniela M. Witten is an American biostatistician. She is a professor and the Dorothy Gilford Endowed Chair of Mathematical Statistics at the University of Washington. Her research investigates the use of machine learning to understand high-dimensional data.

References

  1. Robert Tibshirani at the Mathematics Genealogy Project
  2. Witten, Daniela (2010). A penalized matrix decomposition, and its applications (PDF). stanford.edu (PhD thesis). Stanford University. OCLC 667187274. Retrieved 2018-08-28.
  3. "National Academy of Sciences Members and Foreign Associates Elected". National Academy of Sciences. May 1, 2012. Archived from the original on May 4, 2012.
  4. "SSC Award Winners in 2012". Archived from the original on 16 July 2012. Retrieved 15 June 2012.
  5. Tibshirani, Robert (2020). Statistical Learning with Sparsity (PDF). Chapman & Hall. p. xv. ISBN 978-0367738334. Retrieved 5 October 2022.
  6. "H - Research Analytics". Thomson Reuters. Retrieved 8 April 2012.
  7. Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome H. "The Elements of Statistical Learning". Archived from the original on 10 November 2009. Retrieved 15 June 2012.
  8. James, Gareth; Witten, Daniela; Hastie, Trevor; Tibshirani, Robert. "An Introduction to Statistical Learning with Applications in R". Retrieved 3 July 2016.
  9. Hastie, Trevor; Tibshirani, Robert; Wainwright, Martin. "Statistical Learning with Sparsity: the Lasso and Generalizations". Retrieved 3 July 2016.