Separation principle in stochastic control

The separation principle is one of the fundamental principles of stochastic control theory. It states that, under certain conditions, the problems of optimal control and state estimation can be decoupled. In its most basic formulation it deals with a linear stochastic system

$$
\begin{aligned}
dx &= A(t)x\,dt + B_1(t)u\,dt + B_2(t)\,dw \\
dy &= C(t)x\,dt + D(t)\,dw
\end{aligned}
$$

with a state process $x$, an output process $y$ and a control $u$, where $w$ is a vector-valued Wiener process, $x(0)$ is a zero-mean Gaussian random vector independent of $w$, $y(0)=0$, and $A$, $B_1$, $B_2$, $C$, $D$ are matrix-valued functions which generally are taken to be continuous of bounded variation. Moreover, $DD'$ is nonsingular on some interval $[0,T]$. The problem is to design an output feedback law $\pi:\, y \mapsto u$ which maps the observed process $y$ to the control input $u$ in a nonanticipatory manner so as to minimize the functional

$$J(u) = E\left\{\int_0^T x(t)'Q(t)x(t)\,dt + \int_0^T u(t)'R(t)u(t)\,dt\right\},$$

where $E$ denotes expected value, prime ($'$) denotes transpose, and $Q$ and $R$ are continuous matrix functions of bounded variation, $Q(t)$ is positive semi-definite and $R(t)$ is positive definite for all $t$. Under suitable conditions, which need to be properly stated, the optimal policy $\pi$ can be chosen in the form

$$u(t) = K(t)\hat{x}(t),$$

where $\hat{x}(t)$ is the linear least-squares estimate of the state vector $x(t)$ obtained from the Kalman filter

$$d\hat{x} = A(t)\hat{x}\,dt + B_1(t)u\,dt + L(t)\big(dy - C(t)\hat{x}\,dt\big), \quad \hat{x}(0) = 0,$$

where $K$ is the gain of the optimal linear-quadratic regulator obtained by taking $B_2 = D = 0$ and $x(0)$ deterministic, and where $L$ is the Kalman gain. There is also a non-Gaussian version of this problem (to be discussed below) where the Wiener process $w$ is replaced by a more general square-integrable martingale with possible jumps. [1] In this case, the Kalman filter needs to be replaced by a nonlinear filter providing an estimate of the (strict sense) conditional mean

$$\hat{x}(t) = E\{x(t) \mid \mathcal{Y}_t\},$$

where

$$\mathcal{Y}_t := \sigma\{y(\tau),\ \tau \in [0,t]\}, \quad 0 \le t \le T,$$

is the filtration generated by the output process; i.e., the family of increasing sigma fields representing the data as it is produced.

In the early literature on the separation principle it was common to allow as admissible controls $u$ all processes that are adapted to the filtration $\{\mathcal{Y}_t,\ 0 \le t \le T\}$. This is equivalent to allowing all non-anticipatory Borel functions as feedback laws, which raises the question of existence of a unique solution to the equations of the feedback loop. Moreover, one needs to exclude the possibility that a nonlinear controller extracts more information from the data than what is possible with a linear control law. [2]
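As a rough illustration of the decoupling, the regulator gain and the Kalman gain in the policy above can be computed from two unrelated Riccati equations. The sketch below is a minimal scalar, time-invariant example and is not taken from the references. The coefficient values, the function names `lqr_gain` and `kalman_gain`, the Euler discretization, and the assumption that state and output are driven by independent components of the Wiener process (so that no cross term enters the Kalman gain) are all choices made here for illustration.

```python
# Scalar, time-invariant illustration (all numbers are hypothetical):
#   dx = a x dt + b1 u dt + b2 dw1,   dy = c x dt + d dw2,
#   J(u) = E{ int_0^T (q x^2 + r u^2) dt },
# with dw1, dw2 independent components of the Wiener process (assumption).

def lqr_gain(a, b1, q, r, T, n):
    """Integrate the control Riccati equation backward from P(T) = 0.

    Returns K(t_k) = -b1 * P(t_k) / r on the grid t_k = k*T/n, so that
    the control is u = K * x_hat (sign convention of the text).
    """
    dt = T / n
    P = [0.0] * (n + 1)
    for k in range(n, 0, -1):
        # -dP/dt = 2 a P + q - (b1 P)^2 / r
        P[k - 1] = P[k] + dt * (2 * a * P[k] + q - (b1 * P[k]) ** 2 / r)
    return [-b1 * p / r for p in P]


def kalman_gain(a, b2, c, d, sigma0, T, n):
    """Integrate the filter Riccati equation forward from Sigma(0) = sigma0.

    Returns L(t_k) = Sigma(t_k) * c / d^2 on the same grid.
    """
    dt = T / n
    sigma = sigma0
    gains = []
    for _ in range(n + 1):
        L = sigma * c / d ** 2
        gains.append(L)
        # dSigma/dt = 2 a Sigma + b2^2 - L^2 d^2
        sigma += dt * (2 * a * sigma + b2 ** 2 - (L * d) ** 2)
    return gains


K = lqr_gain(a=-1.0, b1=1.0, q=1.0, r=1.0, T=5.0, n=5000)
L = kalman_gain(a=-1.0, b2=1.0, c=1.0, d=1.0, sigma0=1.0, T=5.0, n=5000)
```

In this sketch `lqr_gain` never sees the noise intensities and `kalman_gain` never sees the cost weights, which is the practical content of the decoupling.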

Choices of the class of admissible control laws

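Linear-quadratic control problems are often solved by a completion-of-squares argument. In our present context we have

$$J(u) = E\left\{\int_0^T (u - Kx)'R(u - Kx)\,dt\right\} + \text{terms that do not depend on } u,$$

in which the first term takes the form [3]

$$E\left\{\int_0^T (u - Kx)'R(u - Kx)\,dt\right\} = E\left\{\int_0^T \left[(u - K\hat{x})'R(u - K\hat{x}) + \operatorname{tr}(K'RK\Sigma)\right]dt\right\},$$

where $\Sigma$ is the covariance matrix

$$\Sigma(t) := E\{[x(t) - \hat{x}(t)][x(t) - \hat{x}(t)]'\}.$$

The separation principle would now follow immediately if $\Sigma$ were independent of the control. However, this needs to be established.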
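The splitting of the first term can be seen in one step. The sketch below assumes that $u(t)$ and $\hat{x}(t)$ are $\mathcal{Y}_t$-measurable and square-integrable, that $R$ is symmetric, and that $\hat{x}(t)$ is the conditional mean, so that the error $\tilde{x} := x - \hat{x}$ (a symbol introduced here for convenience) satisfies $E\{\tilde{x}(t) \mid \mathcal{Y}_t\} = 0$. At a fixed time $t$:

```latex
\begin{aligned}
E\{(u-Kx)'R(u-Kx)\}
 &= E\{(u-K\hat{x}-K\tilde{x})'R(u-K\hat{x}-K\tilde{x})\} \\
 &= E\{(u-K\hat{x})'R(u-K\hat{x})\}
    - 2\,E\{(u-K\hat{x})'RK\,E\{\tilde{x}\mid\mathcal{Y}_t\}\}
    + E\{\tilde{x}'K'RK\tilde{x}\} \\
 &= E\{(u-K\hat{x})'R(u-K\hat{x})\} + \operatorname{tr}(K'RK\Sigma).
\end{aligned}
```

The control enters the last term only through the error covariance, which is why its independence from the control is the crux of the argument.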

The state equation can be integrated to take the form

$$x(t) = x_0(t) + \int_0^t \Phi(t,s)B_1(s)u(s)\,ds,$$

where $x_0$ is the state process obtained by setting $u = 0$ and $\Phi$ is the transition matrix function. By linearity, $\hat{x}(t) = E\{x(t) \mid \mathcal{Y}_t\}$ equals

$$\hat{x}(t) = \hat{x}_0(t) + \int_0^t \Phi(t,s)B_1(s)u(s)\,ds,$$

where $\hat{x}_0(t) = E\{x_0(t) \mid \mathcal{Y}_t\}$. Consequently,

$$\Sigma(t) = E\{[x_0(t) - \hat{x}_0(t)][x_0(t) - \hat{x}_0(t)]'\},$$

but we need to establish that $\hat{x}_0$ does not depend on the control. This would be the case if

$$\mathcal{Y}_t = \mathcal{Y}_t^0 := \sigma\{y_0(\tau),\ \tau \in [0,t]\}, \quad 0 \le t \le T,$$

where $y_0$ is the output process obtained by setting $u = 0$. This issue was discussed in detail by Lindquist. [2] In fact, since the control process $u$ is in general a nonlinear function of the data and thus non-Gaussian, so is the output process $y$. To avoid these problems one might begin by uncoupling the feedback loop and determine an optimal control process in the class of stochastic processes $u$ that are adapted to the family $\{\mathcal{Y}_t^0\}$ of sigma fields. This problem, where one optimizes over the class of all control processes adapted to a fixed filtration, is called a stochastic open loop (SOL) problem. [2] It is not uncommon in the literature to assume from the outset that the control is adapted to $\{\mathcal{Y}_t^0\}$; see, e.g., Section 2.3 in Bensoussan, [4] also van Handel [5] and Willems. [6]
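The superposition of the uncontrolled state and the forced response, on which the argument above rests, can be checked numerically. The following is a minimal sketch under assumptions made here: a scalar system, an Euler-Maruyama discretization, hypothetical coefficient values, and an arbitrary control path. In this discrete setting the transition function becomes a power of `1 + a*dt`.

```python
import random


def simulate(a, b1, b2, x_init, controls, increments, dt):
    """Euler-Maruyama path of dx = a x dt + b1 u dt + b2 dw (scalar)."""
    x = [x_init]
    for u, dw in zip(controls, increments):
        x.append(x[-1] + (a * x[-1] + b1 * u) * dt + b2 * dw)
    return x


rng = random.Random(0)
n, dt = 200, 0.01
a, b1, b2 = -0.5, 1.0, 0.3          # hypothetical coefficients
dw = [rng.gauss(0.0, dt ** 0.5) for _ in range(n)]
u = [rng.uniform(-1.0, 1.0) for _ in range(n)]   # any control path works here

x = simulate(a, b1, b2, 1.0, u, dw, dt)
x0 = simulate(a, b1, b2, 1.0, [0.0] * n, dw, dt)  # same noise, u = 0

# Discrete transition function: Phi(k, j) = (1 + a*dt) ** (k - j).
phi = 1.0 + a * dt
forced = [sum(phi ** (k - 1 - j) * b1 * u[j] * dt for j in range(k))
          for k in range(n + 1)]

# Largest deviation from x = x0 + (forced response) along the path.
max_gap = max(abs(x[k] - x0[k] - forced[k]) for k in range(n + 1))
```

The quantity `max_gap` is at the level of floating-point rounding. The check says nothing about the filtrations, which is the delicate part of the argument.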

In Lindquist 1973 [2] a procedure was proposed for how to embed the class of admissible controls in various SOL classes in a problem-dependent manner, and then construct the corresponding feedback law. The largest class of admissible feedback laws $\pi$ consists of the non-anticipatory functions $u = \pi(y)$ such that the feedback equation has a unique solution and the corresponding control process $u_\pi$ is adapted to $\{\mathcal{Y}_t^0\}$. Next, we give a few examples of specific classes of feedback laws that belong to this general class, as well as some other strategies in the literature to overcome the problems described above.

Linear control laws

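The admissible class of control laws could be restricted to contain only certain linear ones as in Davis. [7] More generally, the linear class

$$(\mathcal{L})\qquad u(t) = \bar{u}(t) + \int_0^t F(t,\tau)\,dy(\tau),$$

where $\bar{u}$ is a deterministic function and $F$ is an $L_2$ kernel, ensures that $\Sigma$ is independent of the control. [8] [2] In fact, the Gaussian property will then be preserved, and $\hat{x}$ will be generated by the Kalman filter. Then the error process $\tilde{x} := x - \hat{x}$ is generated by

$$d\tilde{x} = (A - LC)\tilde{x}\,dt + (B_2 - LD)\,dw, \quad \tilde{x}(0) = x(0),$$

which is clearly independent of the choice of control, and thus so is $\Sigma$.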
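The cancellation of the control in the error dynamics can be observed in a simulation. The sketch below is an illustration under assumptions made here, not a proof: a scalar system with hypothetical coefficients, independent noise components for state and output, an Euler-Maruyama discretization, and linear laws of the form `u = gain * x_hat`. The helper name `closed_loop_error` is likewise introduced only for this example.

```python
import random


def closed_loop_error(feedback_gain, seed=1, n=2000, dt=0.001):
    """Return the path of x - x_hat under the linear law u = gain * x_hat."""
    a, b1, b2, c, d = -1.0, 1.0, 1.0, 1.0, 1.0   # hypothetical coefficients
    rng = random.Random(seed)
    x, x_hat, sigma = rng.gauss(0.0, 1.0), 0.0, 1.0
    errors = [x - x_hat]
    for _ in range(n):
        # Independent noise components drive the state and the output.
        dw1 = rng.gauss(0.0, dt ** 0.5)
        dw2 = rng.gauss(0.0, dt ** 0.5)
        u = feedback_gain * x_hat
        L = sigma * c / d ** 2                    # Kalman gain
        dy = c * x * dt + d * dw2                 # observation increment
        x_next = x + (a * x + b1 * u) * dt + b2 * dw1
        x_hat = x_hat + (a * x_hat + b1 * u) * dt + L * (dy - c * x_hat * dt)
        sigma = sigma + dt * (2 * a * sigma + b2 ** 2 - (L * d) ** 2)
        x = x_next
        errors.append(x - x_hat)
    return errors


err_free = closed_loop_error(0.0)    # no control
err_ctrl = closed_loop_error(-2.0)   # stabilizing linear feedback
gap = max(abs(e0 - e1) for e0, e1 in zip(err_free, err_ctrl))
```

With the same noise sample, the two error paths agree up to rounding, although the state paths themselves differ. The simulation only shows the algebraic cancellation in the filter recursion; whether the filter output is the conditional mean for a given class of control laws is the question addressed in the text.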

Lipschitz-continuous control laws

Wonham proved a separation theorem for controls in the class $\pi:\, u(t) = \psi(t, \hat{x}(t))$, even for a more general cost functional than $J(u)$. [9] However, the proof is far from simple and there are many technical assumptions. For example, $C(t)$ must be square and have a determinant bounded away from zero, which is a serious restriction. A later proof by Fleming and Rishel [10] is considerably simpler. They also prove the separation theorem with quadratic cost functional $J(u)$ for a class of Lipschitz continuous feedback laws, namely $u(t) = \phi(t, y)$, where $\phi$ is a non-anticipatory function of $y$ which is Lipschitz continuous in this argument. Kushner [11] proposed a more restricted class $u(t) = \psi(t, \hat{\xi}(t))$, where the modified state process $\hat{\xi}$ is given by

$$\hat{\xi}(t) = E\{x_0(t) \mid \mathcal{Y}_t^0\} + \int_0^t \Phi(t,s)B_1(s)u(s)\,ds,$$

leading to the identity $\hat{x} = \hat{\xi}$.

Imposing delay

If there is a delay in the processing of the observed data so that, for each $t$, $u(t)$ is a function of $y(\tau)$, $0 \le \tau \le t - \varepsilon$, then $\mathcal{Y}_t = \mathcal{Y}_t^0$, $0 \le t \le T$; see Example 3 in Georgiou and Lindquist. [1] Consequently, $\Sigma$ is independent of the control. Nevertheless, the control policy $\pi$ must be such that the feedback equations have a unique solution.

Consequently, the problem with possibly control-dependent sigma fields does not occur in the usual discrete-time formulation. However, a procedure used in several textbooks to construct the continuous-time $\Sigma$ as the limit of finite difference quotients of the discrete-time $\Sigma$, which does not depend on the control, is circular or at best incomplete; see Remark 4 in Georgiou and Lindquist. [1]

Weak solutions

An approach introduced by Duncan and Varaiya [12] and Davis and Varaiya [13] (see also Section 2.4 in Bensoussan [4]) is based on weak solutions of the stochastic differential equation. Considering such solutions of

$$dx = A(t)x\,dt + B_1(t)u\,dt + B_2(t)\,dw$$

we can change the probability measure (that depends on $u$) via a Girsanov transformation so that

$$d\tilde{w} := B_1(t)u\,dt + B_2(t)\,dw$$

becomes a new Wiener process, which (under the new probability measure) can be assumed to be unaffected by the control. The question of how this could be implemented in an engineering system is left open.

Nonlinear filtering solutions

Although a nonlinear control law will produce a non-Gaussian state process, it can be shown, using nonlinear filtering theory (Chapter 16.1 in Liptser and Shiryayev [14]), that the state process is conditionally Gaussian given the filtration $\{\mathcal{Y}_t\}$. This fact can be used to show that $\hat{x}$ is actually generated by a Kalman filter (see Chapters 11 and 12 in Liptser and Shiryayev [14]). However, this requires quite a sophisticated analysis and is restricted to the case where the driving noise $w$ is a Wiener process.

Additional historical perspective can be found in Mitter. [15]

Issues on feedback in linear stochastic systems

At this point it is suitable to consider a more general class of controlled linear stochastic systems that also covers systems with time delays, namely

$$
\begin{aligned}
z(t) &= z_0(t) + \int_0^t G(t,s)u(s)\,ds \\
y(t) &= Hz(t)
\end{aligned}
$$

with $z_0$ a stochastic vector process which does not depend on the control. [2] The standard stochastic system is then obtained as a special case where $z = [x', y']'$, $z_0 = [x_0', y_0']'$ and $H = [0, I]$. We shall use the short-hand notation

$$z = z_0 + g\pi Hz$$

for the feedback system, where

$$(gu)(t) = \int_0^t G(t,s)u(s)\,ds$$

is a Volterra operator.

In this more general formulation the embedding procedure of Lindquist [2] defines the class of admissible feedback laws $\pi$ as the class of non-anticipatory functions $u := \pi(y)$ such that the feedback equation $z = z_0 + g\pi Hz$ has a unique solution $z_\pi$ and $u = \pi(Hz_\pi)$ is adapted to $\{\mathcal{Y}_t^0\}$.

In Georgiou and Lindquist [1] a new framework for the separation principle was proposed. This approach considers stochastic systems as well-defined maps between sample paths rather than between stochastic processes and allows us to extend the separation principle to systems driven by martingales with possible jumps. The approach is motivated by engineering thinking where systems and feedback loops process signals, and not stochastic processes per se or transformations of probability measures. Hence the purpose is to create a natural class of admissible control laws that make engineering sense, including those that are nonlinear and discontinuous.

The feedback equation $z = z_0 + g\pi Hz$ has a unique strong solution if there exists a non-anticipating function $F$ such that $z = F(z_0)$ satisfies the equation with probability one and all other solutions coincide with $z$ with probability one. However, in the sample-wise setting, more is required, namely that such a unique solution exists and that $z = z_0 + g\pi Hz$ holds for all $z_0$, not just almost all. The resulting feedback loop is deterministically well-posed in the sense that the feedback equations admit a unique solution that causally depends on the input for each input sample path.

In this context, a signal is defined to be a sample path of a stochastic process with possible discontinuities. More precisely, signals will belong to the Skorohod space $D$, i.e., the space of functions which are continuous on the right and have a left limit at all points (càdlàg functions). In particular, the space $C$ of continuous functions is a proper subspace of $D$. Hence the response of a typical nonlinear operation that involves thresholding and switching can be modeled as a signal. The same goes for sample paths of counting processes and other martingales. A system is defined to be a measurable non-anticipatory map $D \to D$ sending sample paths to sample paths so that its output at any time is a measurable function of past values of the input and time. For example, stochastic differential equations with Lipschitz coefficients driven by a Wiener process induce maps between corresponding path spaces; see page 127 in Rogers and Williams, [16] and pages 126-128 in Klebaner. [17] Also, under fairly general conditions (see, e.g., Chapter V in Protter [18]), stochastic differential equations driven by martingales with sample paths in $D$ have strong solutions that are semi-martingales.

Setting $f := g\pi H$, the feedback system $z = z_0 + g\pi Hz$ can be written $z = z_0 + f(z)$, where $z_0$ can be interpreted as an input.

Definition. A feedback loop $z = z_0 + f(z)$ is deterministically well-posed if it has a unique solution $z \in D$ for all inputs $z_0 \in D$ and $(1 - f)^{-1}$ is a system.
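The sample-wise point of view can be made concrete on a time grid. The sketch below is a discrete-time illustration under assumptions made here: the loop operator is a one-step-delayed running integral of a feedback law, and the law is a hypothetical relay (`relay`), chosen because it is discontinuous. It solves the loop for a given input path, without any probability, and compares two inputs that agree up to a certain time.

```python
def solve_loop(z0, law, dt):
    """Solve z = z0 + f(z) on a grid, with f(z)(t_k) = sum_{j<k} law(z_j) * dt.

    The value at step k uses only z_0, ..., z_{k-1}, so the recursion has
    exactly one solution for every input path z0.
    """
    z, integral = [], 0.0
    for k, z0_k in enumerate(z0):
        z.append(z0_k + integral)
        integral += law(z[k]) * dt
    return z


def relay(v):
    """A discontinuous (switching) feedback law, chosen for illustration."""
    return -1.0 if v > 0 else 1.0


dt = 0.01
z0 = [0.5 - 0.002 * k for k in range(300)]
# Second input: identical up to step 199, then shifted by a jump.
z0_late = z0[:200] + [v + 1.0 for v in z0[200:]]

z = solve_loop(z0, relay, dt)
z_late = solve_loop(z0_late, relay, dt)
```

The two solutions coincide on the first 200 grid points and differ afterwards, so the map from input path to solution path is non-anticipatory in this discretized setting. Whether a continuous-time loop with such a switching law is deterministically well-posed is a separate question that the sketch does not settle.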

This implies that the processes $z$ and $z_0$ define identical filtrations. [1] Consequently, no new information is created by the loop. However, what we need is that $\mathcal{Y}_t = \mathcal{Y}_t^0$ for $0 \le t \le T$. This is ensured by the following lemma (Lemma 8 in Georgiou and Lindquist [1]).

Key Lemma. If the feedback loop $z = z_0 + g\pi Hz$ is deterministically well-posed, $g\pi$ is a system, and $H$ is a linear system having a right inverse $H^{-R}$ that is also a system, then $(1 - Hg\pi)^{-1}$ is a system and $\mathcal{Y}_t = \mathcal{Y}_t^0$ for $t \in [0,T]$.

The condition on $H$ in this lemma is clearly satisfied in the standard linear stochastic system, for which $H = [0, I]$, and hence $H^{-R} = H'$. The remaining conditions are collected in the following definition.
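That the transpose serves as a right inverse of a selection matrix of the form $[0, I]$ is a one-line computation. A minimal check, assuming hypothetical dimensions `n_x = 2` and `n_y = 2` and plain nested lists for matrices:

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    """Transpose of a matrix stored as nested lists."""
    return [list(col) for col in zip(*m)]


n_x, n_y = 2, 2   # hypothetical state and output dimensions

# H = [0, I]: picks the y-block out of the stacked vector z = (x, y).
H = [[0] * n_x + [1 if i == j else 0 for j in range(n_y)]
     for i in range(n_y)]

product = matmul(H, transpose(H))   # H H' should be the identity
```

Here `product` is the identity matrix of size `n_y`, so the transpose is a right inverse. It is a constant matrix and hence a non-anticipatory map in this finite-dimensional reading.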

Definition. A feedback law $\pi$ is deterministically well-posed for the system $z = z_0 + g\pi Hz$ if $g\pi$ is a system and the feedback system $z = z_0 + g\pi Hz$ is deterministically well-posed.

Examples of simple systems that are not deterministically well-posed are given in Remark 12 in Georgiou and Lindquist. [1]

A separation principle for physically realizable control laws

By only considering feedback laws that are deterministically well-posed, all admissible control laws are physically realizable in the engineering sense that they induce a signal that travels through the feedback loop. The proof of the following theorem can be found in Georgiou and Lindquist 2013. [1]

Separation theorem. Given the linear stochastic system

$$
\begin{aligned}
dx &= A(t)x\,dt + B_1(t)u\,dt + B_2(t)\,dw \\
dy &= C(t)x\,dt + D(t)\,dw
\end{aligned}
$$

where $w$ is a vector-valued Wiener process and $x(0)$ is a zero-mean Gaussian random vector independent of $w$, consider the problem of minimizing the quadratic functional $J(u)$ over the class of all deterministically well-posed feedback laws $\pi$. Then the unique optimal control law is given by $u(t) = K(t)\hat{x}(t)$, where $K$ is defined as above and $\hat{x}$ is given by the Kalman filter. More generally, if $w$ is a square-integrable martingale and $x(0)$ is an arbitrary zero-mean random vector, then $u(t) = K(t)\hat{x}(t)$, where $\hat{x}(t) = E\{x(t) \mid \mathcal{Y}_t\}$, is the optimal control law provided it is deterministically well-posed.

In the general non-Gaussian case, which may involve counting processes, the Kalman filter needs to be replaced by a nonlinear filter.

A separation principle for delay-differential systems

Stochastic control for time-delay systems was first studied in Lindquist, [19] [20] [8] [2] and Brooks, [21] although Brooks relies on the strong assumption that the observation $y$ is functionally independent of the control $u$, thus avoiding the key question of feedback.

Consider the delay-differential system [8]

where is now a (square-integrable) Gaussian (vector) martingale, and where and are of bounded variation in the first argument and continuous on the right in the second, is deterministic for , and . More precisely, for , for , and the total variation of is bounded by an integrable function in the variable , and the same holds for .

We want to determine a control law which minimizes

where is a positive Stieltjes measure. The corresponding deterministic problem obtained by setting is given by

with [8] .

The following separation principle for the delay system above can be found in Georgiou and Lindquist 2013 [1] and generalizes the corresponding result in Lindquist 1973. [8]

Theorem. There is a unique feedback law in the class of deterministically well-posed control laws that minimizes , and it is given by

where is the deterministic control gain and is given by the linear (distributed) filter

where is the innovation process

and the gain is as defined on page 120 in Lindquist. [8]

References

  1. Tryphon T. Georgiou and Anders Lindquist (2013). "The Separation Principle in Stochastic Control, Redux". IEEE Transactions on Automatic Control. 58 (10): 2481–2494. arXiv:1103.3005. doi:10.1109/TAC.2013.2259207.
  2. Anders Lindquist (1973). "On Feedback Control of Linear Stochastic Systems". SIAM Journal on Control. 11 (2): 323–343. doi:10.1137/0311025.
  3. Karl Johan Astrom (1970). Introduction to Stochastic Control Theory. Vol. 58. Academic Press. ISBN 978-0-486-44531-1.
  4. A. Bensoussan (1992). Stochastic Control of Partially Observable Systems. Cambridge University Press.
  5. Ramon van Handel (2007). Stochastic Calculus, Filtering, and Stochastic Control (PDF). Unpublished notes.
  6. Jan C. Willems (1978). "Recursive filtering". Statistica Neerlandica. 32 (1): 1–39. doi:10.1111/j.1467-9574.1978.tb01382.x.
  7. M.H.A. Davis (1978). Linear Estimation and Stochastic Control. Chapman and Hall.
  8. Anders Lindquist (1973). "Optimal control of linear stochastic systems with applications to time lag systems". Information Sciences. 5: 81–126. doi:10.1016/0020-0255(73)90005-4.
  9. Murray Wonham (1968). "On the separation theorem of stochastic control". SIAM J. Control. 6 (2): 312–326. doi:10.1137/0306023.
  10. W.H. Fleming and R.W. Rishel (1968). Deterministic and Stochastic Optimal Control. Springer-Verlag.
  11. H. Kushner (1971). Introduction to Stochastic Control. Holt, Rinehart and Winston.
  12. Tyrone Duncan and Pravin Varaiya (1971). "On the solutions of a stochastic control system" (PDF). SIAM J. Control. 9 (3): 354–371. doi:10.1137/0309026. hdl:1808/16692.
  13. M.H.A. Davis and P. Varaiya (1972). "Information states for stochastic systems". J. Math. Anal. Applications. 37: 384–402. doi:10.1016/0022-247X(72)90281-8.
  14. R.S. Liptser and A.N. Shiryayev (1978). Statistics of Random Processes II, Applications. Springer-Verlag.
  15. S. Mitter (1996). "Filtering and stochastic control: A historical perspective". IEEE Control Systems Magazine. 13 (3): 67–76.
  16. L. Chris G. Rogers and David Williams (2000). Diffusions, Markov Processes and Martingales: Volume 2, Itô Calculus. Cambridge University Press.
  17. Fima C. Klebaner (2012). Introduction to Stochastic Calculus with Applications. Imperial College Press via World Scientific Publishing Company.
  18. P. E. Protter (2004). Stochastic Integration and Differential Equations. Springer.
  19. Anders Lindquist (1968). "On optimal stochastic control with smoothed information". Information Sciences. 1: 55–85. doi:10.1016/0020-0255(68)90007-8.
  20. Anders Lindquist (1969). "An innovations approach to optimal control of linear stochastic systems with time delay". Information Sciences. 1 (3): 279–295. doi:10.1016/S0020-0255(69)80014-9.
  21. R. Brooks (1972). "Linear Stochastic Control: An extended separation principle". J. Math. Anal. Appl. 38 (3): 569–587. doi:10.1016/0022-247X(72)90069-8.