Unscented optimal control

In mathematics, unscented optimal control combines the notion of the unscented transform with deterministic optimal control to address a class of uncertain optimal control problems. [1] [2] [3] It is a specific application of Riemann–Stieltjes optimal control theory, [4] [5] a concept introduced by Ross and his coworkers.

Mathematical description

Suppose that the initial state $x_0$ of a dynamical system,

$$\dot{x} = f(x, u, t), \qquad x(t_0) = x_0,$$

is an uncertain quantity. Let $\mathcal{X}^i,\ i = 1, \ldots, N$, be the sigma points generated from the distribution of $x_0$. Then the sigma-copies of the dynamical system are given by

$$\dot{\mathcal{X}}^i = f(\mathcal{X}^i, u, t), \qquad \mathcal{X}^i(t_0) = \mathcal{X}^i_0, \qquad i = 1, \ldots, N.$$

Applying standard deterministic optimal control principles to this ensemble generates an unscented optimal control. [6] [7] [8] Unscented optimal control is a special case of tychastic optimal control theory. [1] [9] [10] According to Aubin [10] and Ross, [1] tychastic processes differ from stochastic processes in that a tychastic process is conditionally deterministic.
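As an illustrative numerical sketch (not drawn from the cited references; the function names and tuning constants below are hypothetical), the ensemble can be formed by generating the standard 2n+1 sigma points of the uncertain initial state and integrating every sigma-copy under the same open-loop control:

```python
import numpy as np

def sigma_points(mean, cov, alpha=0.1, kappa=0.0):
    """Standard 2n+1 scaled sigma points for an uncertain initial state."""
    n = len(mean)
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * cov)   # matrix square root of scaled covariance
    pts = [mean] + [mean + S[:, i] for i in range(n)] + [mean - S[:, i] for i in range(n)]
    return np.array(pts)

def propagate_sigma_copies(f, X0, u_of_t, t0, tf, steps=100):
    """Euler-integrate every sigma-copy under the SAME control u(t)."""
    dt = (tf - t0) / steps
    X = X0.copy()
    for k in range(steps):
        t = t0 + k * dt
        u = u_of_t(t)
        X = X + dt * np.array([f(x, u, t) for x in X])
    return X

# Hypothetical example: a double integrator with an uncertain initial state.
f = lambda x, u, t: np.array([x[1], u])
X0 = sigma_points(np.array([0.0, 0.0]), np.diag([0.1, 0.01]))
Xf = propagate_sigma_copies(f, X0, lambda t: 1.0, 0.0, 1.0)
```

A deterministic cost averaged over the propagated sigma-copies would then be minimized with respect to the single shared control, yielding an unscented optimal control.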

Applications

Unscented optimal control theory has been applied to UAV guidance, [8] [11] spacecraft attitude control, [12] air-traffic control, [13] and low-thrust trajectory optimization. [2] [6]

Related Research Articles

Kalman filter: algorithm that estimates unknowns from a series of measurements over time

In statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe. The filter is named after Rudolf E. Kálmán, who was one of the primary developers of its theory.

Optimal control theory is a branch of mathematical optimization that deals with finding a control for a dynamical system over a period of time such that an objective function is optimized. It has numerous applications in science, engineering and operations research. For example, the dynamical system might be a spacecraft with controls corresponding to rocket thrusters, and the objective might be to reach the moon with minimum fuel expenditure. Or the dynamical system could be a nation's economy, with the objective to minimize unemployment; the controls in this case could be fiscal and monetary policy. A dynamical system may also be introduced to embed operations research problems within the framework of optimal control theory.

Model predictive control (MPC) is an advanced method of process control that is used to control a process while satisfying a set of constraints. It has been in use in the process industries in chemical plants and oil refineries since the 1980s. In recent years it has also been used in power system balancing models and in power electronics. Model predictive controllers rely on dynamic models of the process, most often linear empirical models obtained by system identification. The main advantage of MPC is the fact that it allows the current timeslot to be optimized, while keeping future timeslots in account. This is achieved by optimizing a finite time-horizon, but only implementing the current timeslot and then optimizing again, repeatedly, thus differing from Linear-Quadratic Regulator (LQR). Also MPC has the ability to anticipate future events and can take control actions accordingly. PID controllers do not have this predictive ability. MPC is nearly universally implemented as a digital control, although there is research into achieving faster response times with specially designed analog circuitry.
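The receding-horizon idea described above can be sketched in a few lines: solve a finite-horizon problem, apply only the first control, then re-solve from the new state. The sketch below (an illustration of the principle, not a production MPC implementation; in particular it handles no constraints, and the plant matrices are hypothetical) uses a finite-horizon discrete-time LQR solved by backward Riccati recursion:

```python
import numpy as np

def mpc_first_control(A, B, Q, R, x, horizon=10):
    """Solve a finite-horizon LQR by backward Riccati recursion and
    return only the first control of the optimal sequence."""
    P = Q.copy()
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # feedback gain
        P = Q + A.T @ P @ (A - B @ K)                      # Riccati update
    return -K @ x  # after the loop, K is the gain at the current time step

# Hypothetical plant: discrete double integrator with dt = 0.1
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.array([[1.0]])

x = np.array([1.0, 0.0])
for _ in range(100):                    # receding-horizon loop
    u = mpc_first_control(A, B, Q, R, x)
    x = A @ x + B @ u                   # apply the first control only, then re-optimize
```

Each pass through the loop re-optimizes over a shifted ten-step window, which is exactly the "optimize, implement the current timeslot, repeat" pattern that distinguishes MPC from a one-shot LQR design.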

Trajectory optimization is the process of designing a trajectory that minimizes some measure of performance while satisfying a set of constraints. Generally speaking, trajectory optimization is a technique for computing an open-loop solution to an optimal control problem. It is often used for systems where computing the full closed-loop solution is not required, impractical or impossible. If a trajectory optimization problem can be solved at a rate given by the inverse of the Lipschitz constant, then it can be used iteratively to generate a closed-loop solution in the sense of Caratheodory. If only the first step of the trajectory is executed for an infinite-horizon problem, then this is known as Model Predictive Control (MPC).

In control theory, the linear–quadratic–Gaussian (LQG) control problem is one of the most fundamental optimal control problems. It concerns linear systems driven by additive white Gaussian noise. The problem is to determine an output feedback law that is optimal in the sense of minimizing the expected value of a quadratic cost criterion. Output measurements are assumed to be corrupted by Gaussian noise and the initial state, likewise, is assumed to be a Gaussian random vector.

Boolean network: discrete set of Boolean variables

A Boolean network consists of a discrete set of Boolean variables, each of which is assigned a Boolean function that takes inputs from a subset of those variables and produces an output determining the state of the variable it is assigned to. This set of functions in effect determines a topology (connectivity) on the set of variables, which then become nodes in a network. Usually, the dynamics of the system is taken as a discrete time series in which the state of the entire network at time t+1 is determined by evaluating each variable's function on the state of the network at time t. This may be done synchronously or asynchronously.

The value function of an optimization problem gives the value attained by the objective function at a solution, while only depending on the parameters of the problem. In a controlled dynamical system, the value function represents the optimal payoff of the system over the interval [t, t1] when started at the time-t state variable x(t)=x. If the objective function represents some cost that is to be minimized, the value function can be interpreted as the cost to finish the optimal program, and is thus referred to as "cost-to-go function." In an economic context, where the objective function usually represents utility, the value function is conceptually equivalent to the indirect utility function.

GPS/INS is the use of GPS satellite signals to correct or calibrate a solution from an inertial navigation system (INS). The method is applicable for any GNSS/INS system.

Pseudospectral optimal control is a joint theoretical-computational method for solving optimal control problems. It combines pseudospectral (PS) theory with optimal control theory to produce PS optimal control theory. PS optimal control theory has been used in ground and flight systems in military and industrial applications. The techniques have been extensively used to solve a wide range of problems such as those arising in UAV trajectory generation, missile guidance, control of robotic arms, vibration damping, lunar guidance, magnetic control, swing-up and stabilization of an inverted pendulum, orbit transfers, tether libration control, ascent guidance and quantum control.

DIDO is a software product for solving general-purpose optimal control problems. It is widely used in academia, industry, and NASA. Hailed as breakthrough software, DIDO is based on the pseudospectral optimal control theory of Ross and Fahroo. The latest enhancements to DIDO are described in Ross.

Stochastic control or stochastic optimal control is a subfield of control theory that deals with the existence of uncertainty either in observations or in the noise that drives the evolution of the system. The system designer assumes, in a Bayesian probability-driven fashion, that random noise with known probability distribution affects the evolution and observation of the state variables. Stochastic control aims to design the time path of the controlled variables that performs the desired control task at minimum cost, in some defined sense, despite the presence of this noise. The context may be either discrete time or continuous time.

The Sethi model was developed by Suresh P. Sethi and describes the process of how sales evolve over time in response to advertising. The model assumes that the rate of change in sales depends on three effects: response to advertising that acts positively on the unsold portion of the market, the loss due to forgetting or possibly due to competitive factors that act negatively on the sold portion of the market, and a random effect that can go either way.

The Legendre pseudospectral method for optimal control problems is based on Legendre polynomials. It is part of the larger theory of pseudospectral optimal control, a term coined by Ross. A basic version of the Legendre pseudospectral method was originally proposed by Elnagar and his coworkers in 1995. Since then, Ross, Fahroo and their coworkers have extended, generalized and applied the method to a large range of problems. An application that has received wide publicity is the use of their method for generating real-time trajectories for the International Space Station.

The Chebyshev pseudospectral method for optimal control problems is based on Chebyshev polynomials of the first kind. It is part of the larger theory of pseudospectral optimal control, a term coined by Ross. Unlike the Legendre pseudospectral method, the Chebyshev pseudospectral (PS) method does not immediately offer high-accuracy quadrature solutions. Consequently, two different versions of the method have been proposed: one by Elnagar et al., and another by Fahroo and Ross. The two versions differ in their quadrature techniques. The Fahroo–Ross method is more commonly used today due to the ease in implementation of the Clenshaw–Curtis quadrature technique. In 2008, Trefethen showed that the Clenshaw–Curtis method was nearly as accurate as Gauss quadrature. This breakthrough result opened the door for a covector mapping theorem for Chebyshev PS methods. A complete mathematical theory for Chebyshev PS methods was finally developed in 2009 by Gong, Ross and Fahroo.

The unscented transform (UT) is a mathematical function used to estimate the result of applying a given nonlinear transformation to a probability distribution that is characterized only in terms of a finite set of statistics. The most common use of the unscented transform is in the nonlinear projection of mean and covariance estimates in the context of nonlinear extensions of the Kalman filter. Its creator Jeffrey Uhlmann explained that "unscented" was an arbitrary name that he adopted to avoid it being referred to as the “Uhlmann filter.”
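A minimal numerical sketch of the transform (names and tuning parameters are illustrative, following the commonly used scaled formulation): propagate sigma points through the nonlinearity and recombine them with weights to estimate the output mean and covariance.

```python
import numpy as np

def unscented_transform(f, mean, cov, alpha=0.1, beta=2.0, kappa=0.0):
    """Estimate the mean and covariance of f(x) for x ~ (mean, cov)."""
    n = len(mean)
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * cov)
    pts = np.vstack([mean, mean + S.T, mean - S.T])   # 2n+1 sigma points
    Wm = np.full(2 * n + 1, 0.5 / (n + lam))          # mean weights
    Wc = Wm.copy()                                    # covariance weights
    Wm[0] = lam / (n + lam)
    Wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    Y = np.array([f(p) for p in pts])
    y_mean = Wm @ Y
    d = Y - y_mean
    y_cov = (Wc[:, None] * d).T @ d
    return y_mean, y_cov

# Sanity check: for a linear map the transform is exact,
# recovering A @ mean and A @ cov @ A.T.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
y_mean, y_cov = unscented_transform(lambda x: A @ x, np.array([1.0, -1.0]), np.eye(2))
```

For genuinely nonlinear maps the recombined statistics are approximations, but they capture second-order effects that a first-order linearization (as in the extended Kalman filter) misses.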

Introduced by I. Michael Ross and F. Fahroo, the Ross–Fahroo pseudospectral methods are a broad collection of pseudospectral methods for optimal control. Examples of the Ross–Fahroo pseudospectral methods are the pseudospectral knotting method, the flat pseudospectral method, the Legendre-Gauss-Radau pseudospectral method and pseudospectral methods for infinite-horizon optimal control.

Named after I. Michael Ross and F. Fahroo, the Ross–Fahroo lemma is a fundamental result in optimal control theory.

The Bellman pseudospectral method is a pseudospectral method for optimal control based on Bellman's principle of optimality. It is part of the larger theory of pseudospectral optimal control, a term coined by Ross. The method is named after Richard E. Bellman. It was introduced by Ross et al. first as a means to solve multiscale optimal control problems, and later expanded to obtain suboptimal solutions for general optimal control problems.

Isaac Michael Ross is a Distinguished Professor and Program Director of Control and Optimization at the Naval Postgraduate School in Monterey, CA. He has published papers on pseudospectral optimal control theory, energy-sink theory, the optimization and deflection of near-Earth asteroids and comets, robotics, attitude dynamics and control, real-time optimal control, and unscented optimal control, as well as a textbook on optimal control. The Kang-Ross-Gong theorem, Ross' π lemma, Ross' time constant, the Ross–Fahroo lemma, and the Ross–Fahroo pseudospectral method are all named after him.

Moving horizon estimation (MHE) is an optimization approach that uses a series of measurements observed over time, containing noise and other inaccuracies, and produces estimates of unknown variables or parameters. Unlike deterministic approaches, MHE requires an iterative approach that relies on linear programming or nonlinear programming solvers to find a solution.

References

  1. Ross, Isaac (2015). A Primer on Pontryagin's Principle in Optimal Control. San Francisco: Collegiate Publishers. pp. 75–82. ISBN 978-0-9843571-1-6.
  2. Ross, I. Michael; Proulx, Ronald; Karpenko, Mark (August 2014). "Unscented Optimal Control for Orbital and Proximity Operations in an Uncertain Environment: A New Zermelo Problem". American Institute of Aeronautics and Astronautics (AIAA). doi:10.2514/6.2014-4423.
  3. Ross et al., "Unscented Control for Uncertain Dynamical Systems", US Patent 9,727,034 B1. Issued Aug 8, 2017. https://calhoun.nps.edu/bitstream/handle/10945/55812/USPN%209727034.pdf?sequence=1&isAllowed=y
  4. Ross, I. Michael; Karpenko, Mark; Proulx, Ronald J. (2015). "Riemann–Stieltjes Optimal Control Problems for Uncertain Dynamic Systems". Journal of Guidance, Control, and Dynamics. AIAA. 38 (7): 1251–1263. doi:10.2514/1.G000505.
  5. Karpenko, Mark; Proulx, Ronald J. (2016). "Experimental Implementation of Riemann–Stieltjes Optimal Control for Agile Imaging Satellites". Journal of Guidance, Control, and Dynamics. 39 (1): 144–150. doi:10.2514/1.g001325. ISSN 0731-5090.
  6. Ozaki, Naoya; Funase, Ryu (2018). "Tube Stochastic Differential Dynamic Programming for Robust Low-Thrust Trajectory Optimization Problems". 2018 AIAA Guidance, Navigation, and Control Conference, AIAA SciTech Forum (AIAA 2018-0861). doi:10.2514/6.2018-0861.
  7. "Robust Differential Dynamic Programming for Low-Thrust Trajectory Design: Approach with Robust Model Predictive Control Technique" (PDF).
  8. Shaffer, R.; Karpenko, M.; Gong, Q. (July 2016). "Unscented guidance for waypoint navigation of a fixed-wing UAV". 2016 American Control Conference (ACC): 473–478. doi:10.1109/acc.2016.7524959. ISBN 978-1-4673-8682-1.
  9. Ross, I. Michael; Karpenko, Mark; Proulx, Ronald J. (July 2016). "Path constraints in tychastic and unscented optimal control: Theory, application and experimental results". 2016 American Control Conference (ACC). IEEE: 2918–2923. doi:10.1109/acc.2016.7525362. ISBN 978-1-4673-8682-1.
  10. Aubin, Jean-Pierre; Saint-Pierre, Patrick (2008). "A Tychastic Approach to Guaranteed Pricing and Management of Portfolios under Transaction Constraints". Progress in Probability, 59. Basel: Birkhäuser. pp. 411–433. doi:10.1007/978-3-7643-8458-6_22. ISBN 978-3-7643-8457-9. Retrieved 2020-12-23.
  11. Ross, I. M.; Proulx, R. J.; Karpenko, M. (July 2015). "Unscented guidance". 2015 American Control Conference (ACC): 5605–5610. doi:10.1109/acc.2015.7172217. ISBN 978-1-4799-8684-2.
  12. Ross, I. M.; Karpenko, M.; Proulx, R. J. (July 2016). "Path constraints in tychastic and unscented optimal control: Theory, application and experimental results". 2016 American Control Conference (ACC): 2918–2923. doi:10.1109/acc.2016.7525362. ISBN 978-1-4673-8682-1.
  13. Ng, Hok Kwan (2020-06-08). "Strategic Planning with Unscented Optimal Guidance for Urban Air Mobility". AIAA AVIATION 2020 FORUM. American Institute of Aeronautics and Astronautics. doi:10.2514/6.2020-2904. ISBN 978-1-62410-598-2. Retrieved 2020-12-23.