| Eugene A. Feinberg | |
| --- | --- |
| Born | 1954, Moscow, Russia |
| Alma mater | Vilnius University |
| Known for | Markov decision processes, stochastic models |
| Awards | IEEE Charles Hirsch Award (2012), IBM Faculty Award (2012) |
| Scientific career | |
| Fields | Operations research |
| Thesis | (1979) |
| Doctoral advisor | Alexander Yushkevich |
Eugene A. Feinberg is an American mathematician and distinguished professor of applied mathematics and statistics at Stony Brook University. He is noted for his work in probability theory, real analysis, and Markov decision processes.
Feinberg was born in Moscow, Russia in 1954. He received his master's degree in applied mathematics from the Russian University of Transport (MIIT). He completed his PhD thesis at Vilnius University in 1979 under Alexander Yushkevich [1] and held research and faculty positions from 1976 to 1988. [2] Feinberg immigrated to the United States in 1988, working as a visiting faculty member of Yale University's operations research group. In 1989, he joined the faculty of Stony Brook University's Department of Applied Mathematics and Statistics. [3]
Feinberg's research interests include applied probability and its applications to operations research, Markov decision processes, and industrial applications of operations research. His work includes the theory of MDPs and solutions to Kolmogorov's forward equations for jump Markov processes. He also contributed to real analysis by developing generalizations of Fatou's lemma and Berge's maximum theorem. Feinberg has also worked on applications including electric grid forecasting.
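For background on the object named above: in the time-homogeneous case, Kolmogorov's forward equations for a jump Markov process with transition-rate (generator) matrix $Q$ take the standard form

$$\frac{d}{dt} P(t) = P(t)\,Q, \qquad P(0) = I,$$

where $P(t) = (p_{ij}(t))$ is the matrix of transition probabilities over a time interval of length $t$.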
Feinberg was awarded an honorary doctorate from the Institute of Applied System Analysis at the National Technical University of Ukraine.
Feinberg is a member of the INFORMS Fellows class of 2011, elected "for his fundamental contributions to the theory and practice of operations research in the areas of Markov decision processes and dynamic programming." [17]
He was among the recipients of the 2012 IBM Faculty Award. [18]
Feinberg received the IEEE Charles Hirsch Award in 2012 for "developing and implementing, on Long Island, electric load forecasting methods and smart grid technologies." [19]
He is a member of the editorial boards of Mathematics of Operations Research, [20] Operations Research Letters, [21] Stochastic Models, [22] and Applied Mathematics Letters. [23]
Reinforcement learning (RL) is an interdisciplinary area of machine learning and optimal control concerned with how an intelligent agent ought to take actions in a dynamic environment in order to maximize the cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.
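As a concrete illustration of an agent learning to maximize cumulative reward, here is a minimal tabular Q-learning sketch in Python; the two-state toy environment, its rewards, and all parameter values are invented for this example.

```python
import random

# Hypothetical toy environment: states 0-1, actions 0-1.
# TRANSITIONS[(state, action)] = (next_state, reward); values are invented.
TRANSITIONS = {
    (0, 0): (0, 0.0),
    (0, 1): (1, 1.0),
    (1, 0): (0, 0.0),
    (1, 1): (1, 2.0),
}

alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration
Q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}

state = 0
for _ in range(10_000):
    # Epsilon-greedy: explore occasionally, otherwise act greedily.
    if random.random() < epsilon:
        action = random.choice((0, 1))
    else:
        action = max((0, 1), key=lambda a: Q[(state, a)])
    next_state, reward = TRANSITIONS[(state, action)]
    # Move the estimate toward reward plus discounted best future value.
    best_next = max(Q[(next_state, a)] for a in (0, 1))
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = next_state

print(Q)  # action 1 should dominate in both states
```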
In mathematics, a stochastic matrix is a square matrix used to describe the transitions of a Markov chain. Each of its entries is a nonnegative real number representing a probability. It is also called a probability matrix, transition matrix, substitution matrix, or Markov matrix. The stochastic matrix was first developed by Andrey Markov at the beginning of the 20th century, and has found use throughout a wide variety of scientific fields, including probability theory, statistics, mathematical finance and linear algebra, as well as computer science and population genetics. There are several different definitions and types of stochastic matrices: a right stochastic matrix has each row summing to 1, a left stochastic matrix has each column summing to 1, and a doubly stochastic matrix has every row and every column summing to 1.
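A small worked example with invented numbers: the matrix below is right stochastic, since each row consists of nonnegative entries summing to 1, and multiplying a probability row vector by it gives the distribution one step later:

$$P = \begin{pmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{pmatrix}, \qquad \pi_{t+1} = \pi_t P, \qquad (1,\, 0)\, P = (0.9,\, 0.1).$$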
In mathematics, a Markov decision process (MDP) is a discrete-time stochastic control process. It provides a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying optimization problems solved via dynamic programming. MDPs were known at least as early as the 1950s; a core body of research on Markov decision processes resulted from Ronald Howard's 1960 book, Dynamic Programming and Markov Processes. They are used in many disciplines, including robotics, automatic control, economics and manufacturing. The name of MDPs comes from the Russian mathematician Andrey Markov, as they are an extension of Markov chains.
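Since MDPs are solved via dynamic programming, a minimal value-iteration sketch may help; the two-state MDP below, including its transition probabilities and rewards, is invented purely for illustration.

```python
# Minimal value iteration for a tiny, invented MDP.
# P[state][action] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)],                  # stay put, no reward
        1: [(0.8, 1, 5.0), (0.2, 0, 0.0)]},  # risky move toward state 1
    1: {0: [(1.0, 0, 0.0)],
        1: [(1.0, 1, 1.0)]},                 # keep collecting a small reward
}
gamma = 0.9                     # discount factor
V = {s: 0.0 for s in P}         # value-function estimates

for _ in range(200):
    # Bellman optimality update, applied synchronously to all states.
    V = {s: max(sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in P[s].values())
         for s in P}

# Greedy policy with respect to the converged values.
policy = {s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                         for p, s2, r in P[s][a]))
          for s in P}
print(V, policy)
```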
In probability theory and machine learning, the multi-armed bandit problem is a problem in which a decision maker iteratively selects one of multiple fixed choices when the properties of each choice are only partially known at the time of allocation, and may become better understood as time passes. A fundamental aspect of bandit problems is that choosing an arm does not affect the properties of the arm or other arms.
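A minimal sketch of one standard bandit strategy, epsilon-greedy selection with sample-average value estimates; the Bernoulli arm probabilities below are invented for this example.

```python
import random

true_means = [0.3, 0.5, 0.7]   # invented Bernoulli reward probabilities
counts = [0] * 3               # pulls per arm
estimates = [0.0] * 3          # running estimate of each arm's mean
epsilon = 0.1

for _ in range(10_000):
    if random.random() < epsilon:
        arm = random.randrange(3)                        # explore
    else:
        arm = max(range(3), key=lambda a: estimates[a])  # exploit
    reward = 1.0 if random.random() < true_means[arm] else 0.0
    counts[arm] += 1
    # Incremental update of the arm's sample-average estimate.
    estimates[arm] += (reward - estimates[arm]) / counts[arm]

print(estimates)  # should approach [0.3, 0.5, 0.7], mostly pulling arm 2
```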
The Gittins index is a measure of the reward that can be achieved through a given stochastic process with certain properties, namely: the process has an ultimate termination state and evolves with an option, at each intermediate state, of terminating. Upon terminating at a given state, the reward achieved is the sum of the probabilistic expected rewards associated with every state from the actual terminating state to the ultimate terminal state, inclusive. The index is a real scalar.
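One common formulation, given here for reference: for a discounted reward process $R(x_t)$ started at state $i$, the Gittins index is the supremum, over stopping times $\tau > 0$, of expected discounted reward per unit of expected discounted time:

$$\nu(i) = \sup_{\tau > 0} \frac{\mathbb{E}\!\left[\sum_{t=0}^{\tau-1} \beta^{t} R(x_t) \,\middle|\, x_0 = i\right]}{\mathbb{E}\!\left[\sum_{t=0}^{\tau-1} \beta^{t} \,\middle|\, x_0 = i\right]},$$

where $\beta \in (0, 1)$ is the discount factor.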
Multiscale decision-making, also referred to as multiscale decision theory (MSDT), is an approach in operations research that combines game theory, multi-agent influence diagrams, in particular dependency graphs, and Markov decision processes to solve multiscale challenges in sociotechnical systems. MSDT considers interdependencies within and between the following scales: system level, time and information.
In queueing theory, a discipline within the mathematical theory of probability, Burke's theorem is a theorem asserting that, for an M/M/1 queue, M/M/c queue or M/M/∞ queue in the steady state with arrivals forming a Poisson process with rate parameter λ, the departure process is also a Poisson process with rate parameter λ, and at any time t the number of customers in the queue is independent of the departure process prior to t.
In queueing theory, a discipline within the mathematical theory of probability, quasireversibility is a property of some queues. The concept was first identified by Richard R. Muntz and further developed by Frank Kelly. Quasireversibility differs from reversibility in that a stronger condition is imposed on arrival rates and a weaker condition is applied on probability fluxes. For example, an M/M/1 queue with state-dependent arrival rates and state-dependent service times is reversible, but not quasireversible.
In applied probability, a regenerative process is a class of stochastic process with the property that certain portions of the process can be treated as being statistically independent of each other. This property can be used in the derivation of theoretical properties of such processes.
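The independence of cycles is what drives the main limit theorem for regenerative processes: in one standard form, for a regenerative process $X$ with regeneration times $0 \le T_0 < T_1 < \cdots$, finite expected cycle length, and a suitable function $f$,

$$\lim_{t \to \infty} \frac{1}{t} \int_0^t f(X(s))\, ds = \frac{\mathbb{E}\!\left[\int_{T_0}^{T_1} f(X(s))\, ds\right]}{\mathbb{E}[T_1 - T_0]} \quad \text{almost surely}.$$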
In queueing theory, a discipline within the mathematical theory of probability, a fluid queue is a mathematical model used to describe the fluid level in a reservoir subject to randomly determined periods of filling and emptying. The term dam theory was used in earlier literature for these models. The model has been used to approximate discrete models, model the spread of wildfires, in ruin theory and to model high speed data networks. The model applies the leaky bucket algorithm to a stochastic source.
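In the classical Markov-modulated formulation, the buffer level $X(t)$ rises or falls at a rate determined by the state of a background continuous-time Markov chain $Z(t)$ with generator matrix $Q$. Writing $R = \operatorname{diag}(r_i)$ for the net input rates and $F_i(x) = \Pr(X \le x,\, Z = i)$ for the stationary joint distribution, the row vector $\mathbf{F}(x)$ satisfies the first-order system

$$\frac{d\mathbf{F}(x)}{dx}\, R = \mathbf{F}(x)\, Q.$$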
John Winsor Pratt is Emeritus William Ziegler Professor of Business Administration at Harvard University. He was educated at Princeton University and Stanford University, where he specialized in mathematics and statistics. Pratt spent most of his academic career at Harvard University. He was an editor of the Journal of the American Statistical Association from 1965 to 1970. His research concerns risk aversion, risk sharing incentives, and the nature and discovery of stochastic laws, the statistical relationships that describe the effects of decisions. He has made contributions to research in risk aversion theory, notably with Kenneth Arrow on measures of risk aversion.
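The Arrow–Pratt measures referred to here have a compact standard form: for a utility function $u$, the absolute and relative risk aversion at wealth $x$ are

$$A(x) = -\frac{u''(x)}{u'(x)}, \qquad R(x) = -x\, \frac{u''(x)}{u'(x)}.$$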
In queueing theory, a discipline within the mathematical theory of probability, an M/G/k queue is a queue model where arrivals are Markovian, service times have a general distribution and there are k servers. The model name is written in Kendall's notation, and is an extension of the M/M/c queue, where service times must be exponentially distributed, and of the M/G/1 queue with a single server. Most performance metrics for this queueing system are not known and remain an open problem.
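Because exact performance metrics are open, practitioners often rely on approximations; one widely used heuristic, due to Lee and Longton, scales the M/M/k mean queueing delay by the variability of the service times:

$$\mathbb{E}\!\left[W^{\mathrm{M/G/}k}\right] \approx \frac{C^2 + 1}{2}\, \mathbb{E}\!\left[W^{\mathrm{M/M/}k}\right],$$

where $C^2$ is the squared coefficient of variation of the service-time distribution.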
Jean-François Mertens was a Belgian game theorist and mathematical economist.
Cyrus Derman was an American mathematician and amateur musician who did research in Markov decision process, stochastic processes, operations research, statistics and a variety of other fields.
Andrzej Piotr Ruszczyński is a Polish-American applied mathematician, noted for his contributions to mathematical optimization, in particular, stochastic programming and risk-averse optimization.
Bruce Edward Hajek is a Professor in the Coordinated Science Laboratory, the head of the Department of Electrical and Computer Engineering, and the Leonard C. and Mary Lou Hoeft Chair in Engineering at the University of Illinois Urbana–Champaign. He does research in communication networking, auction theory, stochastic analysis, combinatorial optimization, machine learning, information theory, and bioinformatics.
The decentralized partially observable Markov decision process (Dec-POMDP) is a model for coordination and decision-making among multiple agents. It is a probabilistic model that can consider uncertainty in outcomes, sensors and communication.
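Formally, a Dec-POMDP is usually written as a tuple

$$\langle S, \{A_i\}, T, R, \{\Omega_i\}, O, \gamma \rangle,$$

with states $S$, per-agent action sets $A_i$, transition function $T(s' \mid s, \vec{a})$, joint reward $R(s, \vec{a})$, per-agent observation sets $\Omega_i$, observation function $O(\vec{o} \mid s', \vec{a})$, and discount factor $\gamma$.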
Richard H. Stockbridge is a Distinguished Professor of Mathematics at the University of Wisconsin-Milwaukee. His contributions to research primarily involve stochastic control theory, optimal stopping and mathematical finance. Most notably, alongside Professors Thomas G. Kurtz, Kurt Helmes, and Chao Zhu, he developed the methodology of using linear programming to solve stochastic control problems.
Amber Lynn Puha is an American mathematician and educator at California State University San Marcos. Her research concerns probability theory and stochastic processes.
A minimum-cost spanning-tree game (MCST game) is a kind of cooperative game. In an MCST game, each player is a node in a complete graph. The graph contains an additional node, the supply node, denoted by s. The goal of the players is for all of them to be connected by a path to s. To this end, they need to construct a spanning tree. Each edge in the graph has a cost, and the players build the minimum-cost spanning tree. The question then arises of how to allocate the cost of this MCST among the players.
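One classical answer is Bird's rule: run Prim's algorithm from the supply node s and charge each player the cost of the edge through which it first joins the tree. A minimal Python sketch, with an invented three-player cost graph:

```python
import heapq

def bird_allocation(costs, source):
    """Prim's algorithm from `source`; under Bird's rule, each node pays
    the cost of the edge through which it joins the growing tree."""
    in_tree = {source}
    charge = {}
    # Heap of (edge_cost, node_outside_tree) candidates incident to `source`.
    heap = [(c, v if u == source else u)
            for (u, v), c in costs.items() if source in (u, v)]
    heapq.heapify(heap)
    while heap:
        c, v = heapq.heappop(heap)
        if v in in_tree:
            continue
        in_tree.add(v)
        charge[v] = c  # v pays for the edge that connected it
        for (a, b), w in costs.items():
            if a == v and b not in in_tree:
                heapq.heappush(heap, (w, b))
            elif b == v and a not in in_tree:
                heapq.heappush(heap, (w, a))
    return charge

# Invented example: players 1..3 plus supply node 's'.
costs = {('s', 1): 3, ('s', 2): 4, ('s', 3): 5,
         (1, 2): 1, (2, 3): 2, (1, 3): 6}
print(bird_allocation(costs, 's'))  # -> {1: 3, 2: 1, 3: 2}
```

In this invented example the MST connects 1 to s (cost 3), 2 to 1 (cost 1), and 3 to 2 (cost 2), so Bird's rule charges the players 3, 1, and 2 respectively, covering the total MST cost of 6.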