Stochastic control

Last updated

Stochastic control or stochastic optimal control is a sub field of control theory that deals with the existence of uncertainty either in observations or in the noise that drives the evolution of the system. The system designer assumes, in a Bayesian probability-driven fashion, that random noise with known probability distribution affects the evolution and observation of the state variables. Stochastic control aims to design the time path of the controlled variables that performs the desired control task with minimum cost, somehow defined, despite the presence of this noise. [1] The context may be either discrete time or continuous time.

Contents

Certainty equivalence

An extremely well-studied formulation in stochastic control is that of linear quadratic Gaussian control. Here the model is linear, the objective function is the expected value of a quadratic form, and the disturbances are purely additive. A basic result for discrete-time centralized systems with only additive uncertainty is the certainty equivalence property: [2] that the optimal control solution in this case is the same as would be obtained in the absence of the additive disturbances. This property is applicable to all centralized systems with linear equations of evolution, quadratic cost function, and noise entering the model only additively; the quadratic assumption allows for the optimal control laws, which follow the certainty-equivalence property, to be linear functions of the observations of the controllers.

Any deviation from the above assumptions—a nonlinear state equation, a non-quadratic objective function, noise in the multiplicative parameters of the model, or decentralization of control—causes the certainty equivalence property not to hold. For example, its failure to hold for decentralized control was demonstrated in Witsenhausen's counterexample.

Discrete time

In a discrete-time context, the decision-maker observes the state variable, possibly with observational noise, in each time period. The objective may be to optimize the sum of expected values of a nonlinear (possibly quadratic) objective function over all the time periods from the present to the final period of concern, or to optimize the value of the objective function as of the final period only. At each time period new observations are made, and the control variables are to be adjusted optimally. Finding the optimal solution for the present time may involve iterating a matrix Riccati equation backwards in time from the last period to the present period.

In the discrete-time case with uncertainty about the parameter values in the transition matrix (giving the effect of current values of the state variables on their own evolution) and/or the control response matrix of the state equation, but still with a linear state equation and quadratic objective function, a Riccati equation can still be obtained for iterating backward to each period's solution even though certainty equivalence does not apply. [2] ch.13 [3] The discrete-time case of a non-quadratic loss function but only additive disturbances can also be handled, albeit with more complications. [4]

Example

A typical specification of the discrete-time stochastic linear quadratic control problem is to minimize [2] :ch. 13, [3] [5]

where E1 is the expected value operator conditional on y0, superscript T indicates a matrix transpose, and S is the time horizon, subject to the state equation

where y is an n × 1 vector of observable state variables, u is a k × 1 vector of control variables, At is the time t realization of the stochastic n × n state transition matrix, Bt is the time t realization of the stochastic n × k matrix of control multipliers, and Q (n × n) and R (k × k) are known symmetric positive definite cost matrices. We assume that each element of A and B is jointly independently and identically distributed through time, so the expected value operations need not be time-conditional.

Induction backwards in time can be used to obtain the optimal control solution at each time, [2] :ch. 13

with the symmetric positive definite cost-to-go matrix X evolving backwards in time from according to

which is known as the discrete-time dynamic Riccati equation of this problem. The only information needed regarding the unknown parameters in the A and B matrices is the expected value and variance of each element of each matrix and the covariances among elements of the same matrix and among elements across matrices.

The optimal control solution is unaffected if zero-mean, i.i.d. additive shocks also appear in the state equation, so long as they are uncorrelated with the parameters in the A and B matrices. But if they are so correlated, then the optimal control solution for each period contains an additional additive constant vector. If an additive constant vector appears in the state equation, then again the optimal control solution for each period contains an additional additive constant vector.

The steady-state characterization of X (if it exists), relevant for the infinite-horizon problem in which S goes to infinity, can be found by iterating the dynamic equation for X repeatedly until it converges; then X is characterized by removing the time subscripts from its dynamic equation.

Continuous time

If the model is in continuous time, the controller knows the state of the system at each instant of time. The objective is to maximize either an integral of, for example, a concave function of a state variable over a horizon from time zero (the present) to a terminal time T, or a concave function of a state variable at some future date T. As time evolves, new observations are continuously made and the control variables are continuously adjusted in optimal fashion.

Stochastic model predictive control

In the literature, there are two types of MPCs for stochastic systems; Robust model predictive control and Stochastic Model Predictive Control (SMPC). Robust model predictive control is a more conservative method which considers the worst scenario in the optimization procedure. However, this method, similar to other robust controls, deteriorates the overall controller's performance and also is applicable only for systems with bounded uncertainties. The alternative method, SMPC, considers soft constraints which limit the risk of violation by a probabilistic inequality. [6]

In finance

In a continuous time approach in a finance context, the state variable in the stochastic differential equation is usually wealth or net worth, and the controls are the shares placed at each time in the various assets. Given the asset allocation chosen at any time, the determinants of the change in wealth are usually the stochastic returns to assets and the interest rate on the risk-free asset. The field of stochastic control has developed greatly since the 1970s, particularly in its applications to finance. Robert Merton used stochastic control to study optimal portfolios of safe and risky assets. [7] His work and that of Black–Scholes changed the nature of the finance literature. Influential mathematical textbook treatments were by Fleming and Rishel, [8] and by Fleming and Soner. [9] These techniques were applied by Stein to the financial crisis of 2007–08. [10]

The maximization, say of the expected logarithm of net worth at a terminal date T, is subject to stochastic processes on the components of wealth. [11] In this case, in continuous time Itô's equation is the main tool of analysis. In the case where the maximization is an integral of a concave function of utility over an horizon (0,T), dynamic programming is used. There is no certainty equivalence as in the older literature, because the coefficients of the control variables—that is, the returns received by the chosen shares of assets—are stochastic.

See also

Related Research Articles

<span class="mw-page-title-main">Mathematical optimization</span> Study of mathematical algorithms for optimization problems

Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems of sorts arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.

<span class="mw-page-title-main">Least squares</span> Approximation method in statistics

The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems by minimizing the sum of the squares of the residuals made in the results of each individual equation.

Controllability is an important property of a control system, and the controllability property plays a crucial role in many control problems, such as stabilization of unstable systems by feedback, or optimal control.

An inverse problem in science is the process of calculating from a set of observations the causal factors that produced them: for example, calculating an image in X-ray computed tomography, source reconstruction in acoustics, or calculating the density of the Earth from measurements of its gravity field. It is called an inverse problem because it starts with the effects and then calculates the causes. It is the inverse of a forward problem, which starts with the causes and then calculates the effects.

In mathematics, a Gaussian function, often simply referred to as a Gaussian, is a function of the base form

<span class="mw-page-title-main">Optimal control</span> Mathematical way of attaining a desired output from a dynamic system

Optimal control theory is a branch of mathematical optimization that deals with finding a control for a dynamical system over a period of time such that an objective function is optimized. It has numerous applications in science, engineering and operations research. For example, the dynamical system might be a spacecraft with controls corresponding to rocket thrusters, and the objective might be to reach the moon with minimum fuel expenditure. Or the dynamical system could be a nation's economy, with the objective to minimize unemployment; the controls in this case could be fiscal and monetary policy. A dynamical system may also be introduced to embed operations research problems within the framework of optimal control theory.

In optimal control theory, the Hamilton-Jacobi-Bellman (HJB) equation gives a necessary and sufficient condition for optimality of a control with respect to a loss function. It is, in general, a nonlinear partial differential equation in the value function, which means its solution is the value function itself. Once this solution is known, it can be used to obtain the optimal control by taking the maximizer of the Hamiltonian involved in the HJB equation.

In control theory, the discrete Lyapunov equation is of the form

In statistics, ordinary least squares (OLS) is a type of linear least squares method for choosing the unknown parameters in a linear regression model by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable in the input dataset and the output of the (linear) function of the independent variable.

In control theory, the linear–quadratic–Gaussian (LQG) control problem is one of the most fundamental optimal control problems, and it can also be operated repeatedly for model predictive control. It concerns linear systems driven by additive white Gaussian noise. The problem is to determine an output feedback law that is optimal in the sense of minimizing the expected value of a quadratic cost criterion. Output measurements are assumed to be corrupted by Gaussian noise and the initial state, likewise, is assumed to be a Gaussian random vector.

In the theory of stochastic processes, filtering describes the problem of determining the state of a system from an incomplete and potentially noisy set of observations. While originally motivated by problems in engineering, filtering found applications in many fields from signal processing to finance.

Non-linear least squares is the form of least squares analysis used to fit a set of m observations with a model that is non-linear in n unknown parameters (m ≥ n). It is used in some forms of nonlinear regression. The basis of the method is to approximate the model by a linear one and to refine the parameters by successive iterations. There are many similarities to linear least squares, but also some significant differences. In economic theory, the non-linear least squares method is applied in (i) the probit regression, (ii) threshold regression, (iii) smooth regression, (iv) logistic link regression, (v) Box–Cox transformed regressors ().

An algebraic Riccati equation is a type of nonlinear equation that arises in the context of infinite-horizon optimal control problems in continuous time or discrete time.

In control theory, optimal projection equations constitute necessary and sufficient conditions for a locally optimal reduced-order LQG controller.

Linear least squares (LLS) is the least squares approximation of linear functions to data. It is a set of formulations for solving statistical problems involved in linear regression, including variants for ordinary (unweighted), weighted, and generalized (correlated) residuals. Numerical methods for linear least squares include inverting the matrix of the normal equations and orthogonal decomposition methods.

A matrix difference equation is a difference equation in which the value of a vector of variables at one point in time is related to its own value at one or more previous points in time, using matrices. The order of the equation is the maximum time gap between any two indicated values of the variable vector. For example,

In macroeconomics, multiplier uncertainty is lack of perfect knowledge of the multiplier effect of a particular policy action, such as a monetary or fiscal policy change, upon the intended target of the policy. For example, a fiscal policy maker may have a prediction as to the value of the fiscal multiplier—the ratio of the effect of a government spending change on GDP to the size of the government spending change—but is not likely to know the exact value of this ratio. Similar uncertainty may surround the magnitude of effect of a change in the monetary base or its growth rate upon some target variable, which could be the money supply, the exchange rate, the inflation rate, or GDP.

In statistics and in machine learning, a linear predictor function is a linear function of a set of coefficients and explanatory variables, whose value is used to predict the outcome of a dependent variable. This sort of function usually comes in linear regression, where the coefficients are called regression coefficients. However, they also occur in various types of linear classifiers, as well as in various other models, such as principal component analysis and factor analysis. In many of these models, the coefficients are referred to as "weights".

<span class="mw-page-title-main">Simulation-based optimization</span> Form of optimization

Simulation-based optimization integrates optimization techniques into simulation modeling and analysis. Because of the complexity of the simulation, the objective function may become difficult and expensive to evaluate. Usually, the underlying simulation model is stochastic, so that the objective function must be estimated using statistical estimation techniques.

References

  1. Definition from Answers.com
  2. 1 2 3 4 Chow, Gregory P. (1976). Analysis and Control of Dynamic Economic Systems. New York: Wiley. ISBN   0-471-15616-7.
  3. 1 2 Turnovsky, Stephen (1976). "Optimal Stabilization Policies for Stochastic Linear Systems: The Case of Correlated Multiplicative and Additive disturbances". Review of Economic Studies . 43 (1): 191–94. doi:10.2307/2296614. JSTOR   2296614.
  4. Mitchell, Douglas W. (1990). "Tractable Risk Sensitive Control Based on Approximate Expected Utility". Economic Modelling. 7 (2): 161–164. doi:10.1016/0264-9993(90)90018-Y.
  5. Turnovsky, Stephen (1974). "The stability properties of optimal economic policies". American Economic Review . 64 (1): 136–148. JSTOR   1814888.
  6. Hashemian; Armaou (2017). "Stochastic MPC Design for a Two-Component Granulation Process". IEEE Proceedings: 4386–4391. arXiv: 1704.04710 . Bibcode:2017arXiv170404710H.
  7. Merton, Robert (1990). Continuous Time Finance. Blackwell.
  8. Fleming, W.; Rishel, R. (1975). Deterministic and Stochastic Optimal Control. ISBN   0-387-90155-8.
  9. Fleming, W.; Soner, M. (2006). Controlled Markov Processes and Viscosity Solutions. Springer.
  10. Stein, J. L. (2012). Stochastic Optimal Control and the US Financial Crisis. Springer-Science.
  11. Barreiro-Gomez, J.; Tembine, H. (2019). "Blockchain Token Economics: A Mean-Field-Type Game Perspective". IEEE Access. 7: 64603–64613. doi: 10.1109/ACCESS.2019.2917517 . ISSN   2169-3536.

Further reading