Yurii Nesterov | |
---|---|
Citizenship | Belgium |
Alma mater | Moscow State University (1977) |
Fields | Convex optimization |
Institutions | University of Louvain (UCLouvain); Central Economic Mathematical Institute |
Doctoral advisor | Boris Polyak |
Awards | WLA Prize in Computer Science or Mathematics, 2023 [1] |
Yurii Nesterov is a Russian mathematician and an internationally recognized expert in convex optimization, especially the development of efficient algorithms and the analysis of numerical optimization methods. He is currently a professor at the University of Louvain (UCLouvain).
In 1977, Yurii Nesterov graduated in applied mathematics from Moscow State University. From 1977 to 1992 he was a researcher at the Central Economic Mathematical Institute of the Russian Academy of Sciences. Since 1993 he has been working at UCLouvain, specifically in the Department of Mathematical Engineering of the Louvain School of Engineering and at the Center for Operations Research and Econometrics (CORE).
In 2000, Nesterov received the Dantzig Prize. [2]
In 2009, Nesterov won the John von Neumann Theory Prize. [3]
In 2016, Nesterov received the EURO Gold Medal. [4]
In 2023, Yurii Nesterov and Arkadi Nemirovski received the WLA Prize in Computer Science or Mathematics, "for their seminal work in convex optimization theory". [5]
Nesterov is most famous for his work in convex optimization, including his 2004 book, considered a canonical reference on the subject. [6] His main novel contribution is an accelerated version of gradient descent that converges considerably faster than ordinary gradient descent (commonly referred to as Nesterov momentum, Nesterov acceleration or Nesterov accelerated gradient, in short NAG). [7] [8] [9] [10] [11] This method was later extended to the proximal setting by Beck and Teboulle in their 2009 paper "A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems", where the accelerated variant is known as FISTA. [12]
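A minimal sketch of an accelerated gradient iteration of this type, applied to a smooth convex quadratic, is shown below. It assumes NumPy, uses the standard step size 1/L (L a Lipschitz constant of the gradient) and a textbook momentum sequence; it is an illustration, not a reproduction of any particular published variant.

```python
import numpy as np

def nesterov_accelerated_gradient(grad, x0, L, n_iters=500):
    """Minimize a smooth convex function given its gradient `grad` and a
    Lipschitz constant L of that gradient (standard accelerated scheme)."""
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(n_iters):
        x_next = y - grad(y) / L                           # gradient step from the extrapolated point
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)   # momentum / extrapolation step
        x, t = x_next, t_next
    return x

# Example: f(x) = 0.5 * x^T A x - b^T x with A symmetric positive definite.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])
L = np.linalg.eigvalsh(A).max()            # Lipschitz constant of the gradient
x_star = nesterov_accelerated_gradient(lambda x: A @ x - b, np.zeros(2), L)
print(x_star, np.linalg.solve(A, b))       # the two should be close
```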
His 1994 book with Arkadi Nemirovski [13] was the first to point out that interior-point methods can solve convex optimization problems, and the first to make a systematic study of semidefinite programming (SDP). In the same book they also introduced self-concordant functions, which are useful in the analysis of Newton's method. [14]
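The role of self-concordance can be illustrated with the damped Newton step, whose 1/(1 + λ) step length is the quantity analyzed in the self-concordant framework. The sketch below (assuming NumPy) applies it to a linear function plus the logarithmic barrier for the positive orthant, a standard self-concordant example chosen here purely for illustration.

```python
import numpy as np

def damped_newton(grad, hess, x0, n_iters=50):
    """Damped Newton method: step length 1/(1 + lambda(x)), where lambda(x)
    is the Newton decrement; this is the step size whose convergence is
    analyzed via self-concordance."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g, H = grad(x), hess(x)
        newton_dir = np.linalg.solve(H, g)
        lam = np.sqrt(g @ newton_dir)         # Newton decrement
        x = x - newton_dir / (1.0 + lam)      # damped step keeps x inside the domain
    return x

# Self-concordant example: f(x) = c^T x - sum_i log(x_i) on the positive orthant.
c = np.array([2.0, 0.5, 1.0])
grad = lambda x: c - 1.0 / x
hess = lambda x: np.diag(1.0 / x**2)
x_min = damped_newton(grad, hess, np.ones(3))
print(x_min, 1.0 / c)   # the minimizer is x_i = 1 / c_i
```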
The John von Neumann Theory Prize of the Institute for Operations Research and the Management Sciences (INFORMS) is awarded annually to an individual who has made fundamental and sustained contributions to theory in operations research and the management sciences.
Linear programming (LP), also called linear optimization, is a method to achieve the best outcome in a mathematical model whose requirements and objective are represented by linear relationships. Linear programming is a special case of mathematical programming.
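As a small worked example (assuming SciPy is available; the numbers are arbitrary), a two-variable LP can be solved with `scipy.optimize.linprog`:

```python
from scipy.optimize import linprog

# Maximize 3x + 2y subject to x + y <= 4, x + 3y <= 6, x >= 0, y >= 0.
# linprog minimizes, so the objective is negated.
result = linprog(c=[-3, -2],
                 A_ub=[[1, 1], [1, 3]],
                 b_ub=[4, 6],
                 bounds=[(0, None), (0, None)],
                 method="highs")
print(result.x, -result.fun)   # optimal point (4, 0) and maximized value 12
```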
Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.
Gradient descent is a method for unconstrained mathematical optimization. It is a first-order iterative algorithm for minimizing a differentiable multivariate function.
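A minimal gradient descent loop on a simple differentiable function is sketched below (illustrative only; the fixed step size is assumed small enough for convergence, and NumPy is assumed).

```python
import numpy as np

def gradient_descent(grad, x0, step, n_iters=1000):
    """Plain gradient descent with a fixed step size."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = x - step * grad(x)
    return x

# Minimize f(x, y) = (x - 3)^2 + 2 * (y + 1)^2, whose minimizer is (3, -1).
grad = lambda v: np.array([2 * (v[0] - 3), 4 * (v[1] + 1)])
print(gradient_descent(grad, [0.0, 0.0], step=0.1))
```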
Stochastic gradient descent is an iterative method for optimizing an objective function with suitable smoothness properties. It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient by an estimate thereof. Especially in high-dimensional optimization problems this reduces the very high computational burden, achieving faster iterations in exchange for a lower convergence rate.
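A sketch of mini-batch stochastic gradient descent on a synthetic least-squares problem follows (NumPy assumed; data, batch size and step size are illustrative choices, not prescriptions).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem: recover w_true from noisy observations.
n, d = 1000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=n)

w = np.zeros(d)
step = 0.05
for epoch in range(20):
    for idx in rng.permutation(n).reshape(-1, 10):    # mini-batches of size 10
        Xb, yb = X[idx], y[idx]
        grad = Xb.T @ (Xb @ w - yb) / len(idx)         # stochastic gradient estimate
        w -= step * grad
print(np.linalg.norm(w - w_true))                      # should be small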
Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.
In convex optimization, a linear matrix inequality (LMI) is an expression of the form $A(y) := A_0 + y_1 A_1 + y_2 A_2 + \cdots + y_m A_m \succeq 0$, where $y = (y_1, \ldots, y_m)$ is a real vector, $A_0, A_1, \ldots, A_m$ are symmetric matrices, and $\succeq 0$ means that $A(y)$ is positive semidefinite.
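As a simple illustration (using NumPy eigenvalues rather than a dedicated SDP solver), one can check whether a given y satisfies such an LMI:

```python
import numpy as np

def lmi_satisfied(A0, As, y, tol=1e-10):
    """Check whether A(y) = A0 + sum_i y_i * A_i is positive semidefinite."""
    Ay = A0 + sum(yi * Ai for yi, Ai in zip(y, As))
    return np.linalg.eigvalsh(Ay).min() >= -tol

A0 = np.array([[2.0, 0.0], [0.0, 2.0]])
A1 = np.array([[0.0, 1.0], [1.0, 0.0]])
print(lmi_satisfied(A0, [A1], [1.0]))   # True: eigenvalues of A(y) are 1 and 3
print(lmi_satisfied(A0, [A1], [3.0]))   # False: smallest eigenvalue is -1
```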
The Bregman method is an iterative algorithm to solve certain convex optimization problems involving regularization. The original version is due to Lev M. Bregman, who published it in 1967.
Arkadi Nemirovski is a professor at the H. Milton Stewart School of Industrial and Systems Engineering at the Georgia Institute of Technology. He has been a leader in continuous optimization and is best known for his work on the ellipsoid method, modern interior-point methods and robust optimization.
Coordinate descent is an optimization algorithm that successively minimizes along coordinate directions to find the minimum of a function. At each iteration, the algorithm determines a coordinate or coordinate block via a coordinate selection rule, then exactly or inexactly minimizes over the corresponding coordinate hyperplane while fixing all other coordinates or coordinate blocks. A line search along the coordinate direction can be performed at the current iterate to determine the appropriate step size. Coordinate descent is applicable in both differentiable and derivative-free contexts.
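A sketch of cyclic coordinate descent with exact coordinate minimization, applied to a strongly convex quadratic, is given below (NumPy assumed; the problem data are illustrative, and this Gauss–Seidel-style update is one of several possible coordinate selection rules).

```python
import numpy as np

def cyclic_coordinate_descent(A, b, n_sweeps=100):
    """Minimize f(x) = 0.5 * x^T A x - b^T x (A symmetric positive definite)
    by exactly minimizing over one coordinate at a time."""
    x = np.zeros(len(b))
    for _ in range(n_sweeps):
        for i in range(len(b)):
            # Exact minimizer over coordinate i with all other coordinates fixed.
            x[i] = (b[i] - A[i] @ x + A[i, i] * x[i]) / A[i, i]
    return x

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
print(cyclic_coordinate_descent(A, b), np.linalg.solve(A, b))  # should agree
```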
Randomized (Block) Coordinate Descent Method is an optimization algorithm popularized by Nesterov (2010) and Richtárik and Takáč (2011). The first analysis of this method, when applied to the problem of minimizing a smooth convex function, was performed by Nesterov (2010). In Nesterov's analysis the method needs to be applied to a quadratic perturbation of the original function with an unknown scaling factor. Richtárik and Takáč (2011) give iteration complexity bounds which do not require this, i.e., the method is applied to the objective function directly. Furthermore, they generalize the setting to the problem of minimizing a composite function, i.e., the sum of a smooth convex function and a convex block-separable function: $F(x) = f(x) + \Psi(x)$, where $\Psi(x) = \sum_{i=1}^{n} \Psi_i(x^{(i)})$.
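A sketch of a randomized coordinate descent step for a composite objective of this form is shown below, with f the least-squares loss and the separable term an ℓ1 penalty, so that each coordinate update is a soft-thresholding step with the per-coordinate Lipschitz constant as step size. This is an illustration of the setting (NumPy assumed, data arbitrary), not the exact algorithm of either paper.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * max(abs(v) - t, 0.0)

def randomized_cd_lasso(A, b, lam, n_iters=20000, seed=0):
    """Randomized coordinate descent for F(x) = 0.5*||Ax - b||^2 + lam*||x||_1."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    L = (A ** 2).sum(axis=0)            # per-coordinate Lipschitz constants ||A[:, i]||^2
    x = np.zeros(n)
    r = A @ x - b                       # residual, maintained incrementally
    for _ in range(n_iters):
        i = rng.integers(n)             # uniformly random coordinate
        g_i = A[:, i] @ r               # partial derivative of the smooth part
        new_xi = soft_threshold(x[i] - g_i / L[i], lam / L[i])
        r += (new_xi - x[i]) * A[:, i]  # update the residual for the coordinate change
        x[i] = new_xi
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 10))
b = A @ np.array([1.0, -2.0] + [0.0] * 8) + 0.01 * rng.normal(size=50)
print(randomized_cd_lasso(A, b, lam=0.1))   # roughly sparse, close to the true vector
```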
Laurence Alexander Wolsey is a Belgian-English mathematician working in the field of integer programming. His mother Anna Wolsey-Mautner was the daughter of the Viennese industrialist Konrad David Mautner. He is a former president and research director of the Center for Operations Research and Econometrics (CORE) at Université catholique de Louvain in Belgium. He is professor emeritus of applied mathematics at the engineering school of the same university.
Proximal gradient methods are a generalization of projected gradient methods used to solve non-differentiable convex optimization problems; the projection step is replaced by a proximal operator.
Proximal gradient methods for learning is an area of research in optimization and statistical learning theory which studies algorithms for a general class of convex regularization problems where the regularization penalty may not be differentiable. One such example is $\ell_1$ (lasso) regularization of the form $R(w) = \lambda \|w\|_1$.
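For the ℓ1 penalty, the proximal operator is coordinate-wise soft-thresholding, and a basic proximal gradient (ISTA-style) iteration looks as follows. This is a minimal sketch assuming NumPy, with step size 1/L for the smooth part; the data are arbitrary.

```python
import numpy as np

def proximal_gradient_l1(A, b, lam, n_iters=500):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 by alternating a gradient step
    on the smooth term with the proximal operator of the l1 penalty."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        z = x - (A.T @ (A @ x - b)) / L                          # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)    # soft-thresholding (prox step)
    return x

rng = np.random.default_rng(2)
A = rng.normal(size=(40, 8))
x_true = np.array([3.0, 0.0, -1.5, 0.0, 0.0, 0.0, 0.0, 0.0])
b = A @ x_true + 0.01 * rng.normal(size=40)
print(proximal_gradient_l1(A, b, lam=0.1))   # approximately sparse, near x_true
```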
Alfredo Noel Iusem is an Argentine-born Brazilian mathematician working on mathematical optimization.
The Center for Operations Research and Econometrics (CORE) is an interdisciplinary research institute of the University of Louvain (UCLouvain) located in Louvain-la-Neuve, Belgium. Since 2010, it has been part of the Louvain Institute of Data Analysis and Modeling in economics and statistics (LIDAM), along with the Institute for Economic and Social Research (IRES), Louvain Finance (LFIN) and the Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
Peter Richtarik is a Slovak mathematician and computer scientist working in the area of big data optimization and machine learning, known for his work on randomized coordinate descent algorithms, stochastic gradient descent and federated learning. He is currently a Professor of Computer Science at the King Abdullah University of Science and Technology.
In mathematical optimization, oracle complexity is a standard theoretical framework to study the computational requirements for solving classes of optimization problems. It is suitable for analyzing iterative algorithms which proceed by computing local information about the objective function at various points. The framework has been used to provide tight worst-case guarantees on the number of required iterations, for several important classes of optimization problems.
In mathematics, mirror descent is an iterative optimization algorithm for finding a local minimum of a differentiable function.
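A sketch of mirror descent with the negative-entropy mirror map on the probability simplex (the exponentiated-gradient update) is given below; the toy problem of minimizing a linear function over the simplex is purely illustrative, and NumPy is assumed.

```python
import numpy as np

def mirror_descent_simplex(grad, x0, step, n_iters=500):
    """Mirror descent with the negative-entropy mirror map: the update is
    multiplicative (exponentiated gradient) followed by renormalization,
    which keeps the iterate on the probability simplex."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = x * np.exp(-step * grad(x))
        x = x / x.sum()
    return x

# Toy problem: minimize c^T x over the probability simplex; the minimum puts
# all mass on the coordinate with the smallest cost.
c = np.array([0.7, 0.2, 0.9, 0.4])
print(mirror_descent_simplex(lambda x: c, np.full(4, 0.25), step=0.5))
```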
Sébastien Bubeck is a French-American computer scientist and mathematician. He is currently Microsoft's Vice President of Applied Research and leads the Machine Learning Foundations group at Microsoft Research Redmond. Bubeck was formerly a professor at Princeton University and a researcher at the University of California, Berkeley. He is known for his contributions to online learning and optimization and, more recently, for the study of deep neural networks, in particular transformer models.