| Elad Hazan | |
| --- | --- |
| Nationality | Israeli-American |
| Occupation | Computer scientist, academic, author and researcher |
| Awards | Bell Labs Prize; Marie Curie Fellow, European Research Council; Google Research Award; Amazon Research Award |
| Academic background | |
| Education | B.Sc., Computer Science, Tel Aviv University; M.Sc., Computer Science, Tel Aviv University; Ph.D., Computer Science, Princeton University |
| Academic work | |
| Institutions | Princeton University |
| Website | https://www.ehazan.com/ |
Elad Hazan is an Israeli-American computer scientist, academic, author, and researcher. He is a Professor of Computer Science at Princeton University and the co-founder and director of Google AI Princeton. [1][2]
Hazan co-invented adaptive gradient methods and the AdaGrad algorithm. He has published over 150 articles and holds several patents. He has worked on machine learning and mathematical optimization, and more recently on control theory and reinforcement learning. [3] He is the author of the book Introduction to Online Convex Optimization. Hazan co-founded In8 Inc., which was acquired by Google in 2018. [4]
Hazan studied at Tel Aviv University, receiving his bachelor's and master's degrees in computer science in 2001 and 2002, respectively. He then moved to the United States and earned his Ph.D. in computer science from Princeton University in 2006 under the supervision of Sanjeev Arora. [1]
Upon receiving his doctorate, Hazan held an appointment as a Research Staff Member in the Theory Group at the IBM Almaden Research Center in 2006. He then joined the Technion - Israel Institute of Technology as an assistant professor in 2010 and was tenured and promoted to Associate Professor in 2013. [1] In 2015 he joined Princeton University as an Assistant Professor of Computer Science and became Professor of Computer Science in 2016. Since 2018, he has served as a Director of Google AI Princeton. [5]
Hazan's research primarily focuses on machine learning, mathematical optimization, control theory and reinforcement learning. He is the co-inventor of five US patents.
Hazan co-introduced adaptive subgradient methods, which dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning. The AdaGrad algorithm reshaped optimization for deep learning and underlies many of today's widely used adaptive optimizers. He has also made substantial contributions to the theory of online convex optimization, including the Online Newton Step and online Frank-Wolfe algorithms, projection-free methods, and adaptive-regret algorithms. [6]
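The heart of AdaGrad fits in a few lines: each coordinate's step size is scaled by the inverse square root of the sum of that coordinate's past squared gradients. Below is a minimal NumPy sketch of the diagonal variant; the quadratic test function, step-size constant, and iteration count are illustrative assumptions rather than tuned settings.

```python
import numpy as np

def adagrad(grad, x0, lr=1.0, eps=1e-8, steps=200):
    """Diagonal AdaGrad: per-coordinate steps shrink with accumulated gradient history."""
    x = x0.astype(float)
    g_sq_sum = np.zeros_like(x)                   # running sum of squared gradients
    for _ in range(steps):
        g = grad(x)
        g_sq_sum += g ** 2                        # accumulate per coordinate
        x -= lr * g / (np.sqrt(g_sq_sum) + eps)   # heavily-updated coordinates get smaller steps
    return x

# Hypothetical test problem: f(x) = x0^2 + 10*x1^2, gradient (2*x0, 20*x1).
# AdaGrad's per-coordinate scaling makes both coordinates decay at the same
# rate despite the 10x difference in curvature.
print(adagrad(lambda x: np.array([2 * x[0], 20 * x[1]]), np.array([5.0, 5.0])))
```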
In the area of mathematical optimization, Hazan proposed the first sublinear-time algorithms for linear classification and for semidefinite programming. He also gave the first linearly convergent Frank-Wolfe-type algorithm. [7]
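For context, Frank-Wolfe methods are projection-free: each iteration calls only a linear minimization oracle over the feasible set. The sketch below runs the classical method with the standard 2/(t+2) step schedule (not Hazan's linearly converging variant) over the probability simplex; the quadratic objective and target point are made up for the demo.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, steps=200):
    """Classical Frank-Wolfe over the probability simplex: no projections needed."""
    x = x0.copy()
    for t in range(1, steps + 1):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0              # linear minimization oracle: best simplex vertex
        gamma = 2.0 / (t + 2.0)            # classical step-size schedule
        x = (1 - gamma) * x + gamma * s    # convex combination keeps x feasible
    return x

b = np.array([0.1, 0.2, 0.7])              # hypothetical target inside the simplex
x = frank_wolfe_simplex(lambda x: 2 * (x - b), np.full(3, 1 / 3))
print(x)                                    # approaches b without any projection step
```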
More recently, Hazan and his group proposed a new paradigm for differentiable reinforcement learning called non-stochastic control, which applies online convex optimization to control. [8]
Artificial neural networks (ANNs), usually simply called neural networks (NNs), are computing systems inspired by the biological neural networks that constitute animal brains.
Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. Optimization problems arise in all quantitative disciplines, from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.
Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.
Machine learning (ML) is the study of computer algorithms that can improve automatically through experience and by the use of data. It is seen as a part of artificial intelligence. Machine learning algorithms build a model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as in medicine, email filtering, speech recognition, and computer vision, where it is difficult or unfeasible to develop conventional algorithms to perform the needed tasks.
Stochastic gradient descent is an iterative method for optimizing an objective function with suitable smoothness properties. It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient by an estimate thereof. Especially in high-dimensional optimization problems, this reduces the computational burden, achieving faster iterations in exchange for a lower convergence rate.
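A minimal sketch of stochastic gradient descent for least-squares linear regression: each update uses the gradient on a single randomly drawn example as a cheap, noisy estimate of the full gradient. The synthetic data, learning rate, and step count are assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data; the true weights are an assumption for the demo.
w_true = np.array([2.0, -3.0])
X = rng.normal(size=(1000, 2))
y = X @ w_true + 0.1 * rng.normal(size=1000)

w = np.zeros(2)
lr = 0.05
for _ in range(5000):
    i = rng.integers(len(X))              # one random example per step
    g = 2 * (X[i] @ w - y[i]) * X[i]      # gradient of squared error on that example alone
    w -= lr * g                           # step against the noisy gradient estimate
print(w)                                  # close to w_true
```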
Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.
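In the usual textbook formulation (a sketch following standard convention, not tied to any particular source), a convex optimization problem asks to

```latex
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & f(x) \\
\text{subject to} \quad & g_i(x) \le 0, \quad i = 1, \dots, m, \\
& a_j^{\top} x = b_j, \quad j = 1, \dots, p,
\end{aligned}
```

where f and each g_i are convex and the equality constraints are affine; convexity guarantees that every local minimum is also a global minimum.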
In probability theory and machine learning, the multi-armed bandit problem is a problem in which a fixed limited set of resources must be allocated between competing (alternative) choices in a way that maximizes their expected gain, when each choice's properties are only partially known at the time of allocation, and may become better understood as time passes or by allocating resources to the choice. This is a classic reinforcement learning problem that exemplifies the exploration–exploitation tradeoff dilemma. The name comes from imagining a gambler at a row of slot machines, who has to decide which machines to play, how many times to play each machine and in which order to play them, and whether to continue with the current machine or try a different machine. The multi-armed bandit problem also falls into the broad category of stochastic scheduling.
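The simplest strategy that exhibits the exploration–exploitation tradeoff is epsilon-greedy: with small probability pull a random arm, otherwise pull the empirically best one. A toy sketch with hypothetical Bernoulli arms follows.

```python
import numpy as np

rng = np.random.default_rng(1)
true_means = [0.2, 0.5, 0.8]              # hypothetical Bernoulli payoff per arm
counts = np.zeros(3)
values = np.zeros(3)                      # running average reward per arm
eps = 0.1

for _ in range(10000):
    if rng.random() < eps:
        a = rng.integers(3)               # explore: random arm
    else:
        a = int(np.argmax(values))        # exploit: best arm observed so far
    r = float(rng.random() < true_means[a])       # Bernoulli reward
    counts[a] += 1
    values[a] += (r - values[a]) / counts[a]      # incremental mean update
print(values, counts)                     # pulls concentrate on the best arm
```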
Stochastic optimization (SO) methods are optimization methods that generate and use random variables. For stochastic problems, the random variables appear in the formulation of the optimization problem itself, which involves random objective functions or random constraints. Stochastic optimization methods also include methods with random iterates. Some stochastic optimization methods use random iterates to solve stochastic problems, combining both meanings of stochastic optimization. Stochastic optimization methods generalize deterministic methods for deterministic problems.
Harold Joseph Kushner is an American applied mathematician and a Professor Emeritus of Applied Mathematics at Brown University. He is known for his work on the theory of stochastic stability, the theory of non-linear filtering, and for the development of numerical methods for stochastic control problems such as the Markov chain approximation method. He is commonly cited as the first person to study Bayesian optimization, based on work he published in 1964.
In computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques which generate the best predictor by learning on the entire training data set at once. Online learning is a common technique used in areas of machine learning where it is computationally infeasible to train over the entire dataset, necessitating out-of-core algorithms. It is also used in situations where it is necessary for the algorithm to dynamically adapt to new patterns in the data, or when the data itself is generated as a function of time, e.g., stock price prediction. Online learning algorithms may be prone to catastrophic interference, a problem that can be addressed by incremental learning approaches.
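A minimal sketch of online gradient descent, a standard online learning algorithm: the learner predicts before each label arrives, then updates on that single example. The simulated data stream and the decaying step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
w = np.zeros(2)
w_true = np.array([1.0, -1.0])            # hypothetical target the stream is generated from
total_loss = 0.0

for t in range(1, 10001):
    x = rng.normal(size=2)                # examples arrive one at a time
    y = w_true @ x + 0.1 * rng.normal()
    pred = w @ x                          # predict before the label is revealed
    total_loss += (pred - y) ** 2
    g = 2 * (pred - y) * x                # gradient on this example only
    w -= g / np.sqrt(t)                   # decaying step size, standard in online convex optimization
print(w, total_loss / 10000)
```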
Dimitri Panteli Bertsekas is an applied mathematician, electrical engineer, and computer scientist, a McAfee Professor at the Department of Electrical Engineering and Computer Science in the School of Engineering at the Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, and also a Fulton Professor of Computational Decision Making at Arizona State University, Tempe.
Ralph Tyrrell Rockafellar is an American mathematician and one of the leading scholars in optimization theory and related fields of analysis and combinatorics. He is the author of four major books including the landmark text "Convex Analysis" (1970), which has been cited more than 27,000 times according to Google Scholar and remains the standard reference on the subject, and "Variational Analysis", for which the authors received the Frederick W. Lanchester Prize from the Institute for Operations Research and the Management Sciences (INFORMS).
In machine learning, a hyperparameter is a parameter whose value is used to control the learning process. By contrast, the values of other parameters are derived via training.
Quantum machine learning is the integration of quantum algorithms within machine learning programs. The most common use of the term refers to machine learning algorithms for the analysis of classical data executed on a quantum computer, i.e. quantum-enhanced machine learning. While machine learning algorithms are used to compute immense quantities of data, quantum machine learning utilizes qubits and quantum operations or specialized quantum systems to improve computational speed and data storage done by algorithms in a program. This includes hybrid methods that involve both classical and quantum processing, where computationally difficult subroutines are outsourced to a quantum device. These routines can be more complex in nature and executed faster on a quantum computer. Furthermore, quantum algorithms can be used to analyze quantum states instead of classical data. Beyond quantum computing, the term "quantum machine learning" is also associated with classical machine learning methods applied to data generated from quantum experiments, such as learning the phase transitions of a quantum system or creating new quantum experiments. Quantum machine learning also extends to a branch of research that explores methodological and structural similarities between certain physical systems and learning systems, in particular neural networks. For example, some mathematical and numerical techniques from quantum physics are applicable to classical deep learning and vice versa. Furthermore, researchers investigate more abstract notions of learning theory with respect to quantum information, sometimes referred to as "quantum learning theory".
This glossary of artificial intelligence is a list of definitions of terms and concepts relevant to the study of artificial intelligence, its sub-disciplines, and related fields. Related glossaries include Glossary of computer science, Glossary of robotics, and Glossary of machine vision.
Peter Richtarik is a Slovak mathematician and computer scientist working in the area of big data optimization and machine learning, known for his work on randomized coordinate descent algorithms, stochastic gradient descent and federated learning. He is currently a Professor of Computer Science at the King Abdullah University of Science and Technology.
The following outline is provided as an overview of and topical guide to machine learning. Machine learning is a subfield of soft computing within computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from an example training set of input observations in order to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.
In machine learning, hyperparameter optimization or tuning is the problem of choosing a set of optimal hyperparameters for a learning algorithm. A hyperparameter is a parameter whose value is used to control the learning process. By contrast, the values of other parameters are learned.
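A minimal sketch of the most basic tuning strategy, grid search: fit the model once per candidate value and keep the hyperparameter with the lowest validation error. Here the hyperparameter is the ridge penalty in regularized least squares; the synthetic data and grid values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic data, split into training and validation sets.
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.5 * rng.normal(size=200)
X_tr, y_tr, X_val, y_val = X[:150], y[:150], X[150:], y[150:]

def ridge_fit(lam):
    """Closed-form ridge regression: (X^T X + lam*I)^{-1} X^T y."""
    n = X_tr.shape[1]
    return np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(n), X_tr.T @ y_tr)

grid = [0.001, 0.01, 0.1, 1.0, 10.0]               # candidate hyperparameter values
val_err = [np.mean((X_val @ ridge_fit(lam) - y_val) ** 2) for lam in grid]
best = grid[int(np.argmin(val_err))]               # chosen on held-out data, never on training error
print(best, val_err)
```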
In machine learning and statistics, the learning rate is a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward a minimum of a loss function. Since it influences to what extent newly acquired information overrides old information, it metaphorically represents the speed at which a machine learning model "learns". In the adaptive control literature, the learning rate is commonly referred to as gain.
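A toy illustration of how the learning rate governs gradient descent on the one-dimensional quadratic f(x) = x^2, whose gradient is 2x: a small rate converges slowly, a moderate rate converges quickly, and too large a rate makes the iterates diverge. The specific values are arbitrary demo choices.

```python
# Gradient descent on f(x) = x^2 with three learning rates.
for lr in (0.01, 0.4, 1.1):
    x = 5.0
    for _ in range(50):
        x -= lr * 2 * x               # each step scales x by (1 - 2*lr)
    print(f"lr={lr}: x after 50 steps = {x:.3g}")
# lr=0.01 shrinks slowly, lr=0.4 converges fast, lr=1.1 blows up.
```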
James Milton Renegar Jr. is an American mathematician, specializing in optimization algorithms for linear programming and nonlinear programming.