Unscented optimal control

In mathematics, unscented optimal control combines the notion of the unscented transform with deterministic optimal control to address a class of uncertain optimal control problems. [1] [2] [3] [4] It is a specific application of tychastic optimal control theory, [1] [5] [6] [7] which is a generalization of Riemann–Stieltjes optimal control theory, [8] [9] a concept introduced by Ross and his coworkers.

Mathematical description

Suppose that the initial state $x^0$ of a dynamical system,

$$\dot{x} = f(x, u, t)$$

is an uncertain quantity. Let $\chi^i$ be the sigma points. Then sigma-copies of the dynamical system are given by,

$$\dot{\chi}^i = f(\chi^i, u, t)$$

Applying standard deterministic optimal control principles to this ensemble generates an unscented optimal control. [10] [11] [12] Unscented optimal control is a special case of tychastic optimal control theory. [1] [5] [13] According to Aubin [13] and Ross, [1] tychastic processes differ from stochastic processes in that a tychastic process is conditionally deterministic.
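
The ensemble construction can be illustrated with a minimal Python sketch. It generates symmetric sigma points for an uncertain initial state and propagates every sigma-copy under one shared open-loop control. Everything specific here is an assumption made for illustration and is not taken from the cited references: the double-integrator dynamics, the forward-Euler integrator, the scaling parameter `kappa`, the quadratic miss-distance cost, and all function names.

```python
import numpy as np


def sigma_points(mean, cov, kappa=1.0):
    """Return 2n+1 symmetric sigma points and weights (basic unscented transform)."""
    n = mean.size
    # Columns of the Cholesky factor give the offsets from the mean.
    root = np.linalg.cholesky((n + kappa) * cov)
    points = [mean]
    for i in range(n):
        points.append(mean + root[:, i])
        points.append(mean - root[:, i])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    return np.array(points), weights


def f(x, u, t):
    """Assumed dynamics: a double integrator (position, velocity) with scalar control."""
    return np.array([x[1], u])


def propagate_ensemble(points, control, t_grid):
    """Integrate every sigma-copy with forward Euler under the same control u(t)."""
    X = np.array(points, dtype=float)
    for k in range(len(t_grid) - 1):
        dt = t_grid[k + 1] - t_grid[k]
        u = control(t_grid[k])  # one control, shared by all copies
        X = np.array([x + dt * f(x, u, t_grid[k]) for x in X])
    return X


def ensemble_cost(points, weights, control, t_grid, target):
    """Assumed cost: squared distance of the weighted final mean from a target state."""
    X_final = propagate_ensemble(points, control, t_grid)
    mean_final = weights @ X_final
    return float(np.sum((mean_final - target) ** 2))


if __name__ == "__main__":
    mean0 = np.array([1.0, 0.0])
    cov0 = np.diag([0.04, 0.01])
    pts, w = sigma_points(mean0, cov0)
    t_grid = np.linspace(0.0, 1.0, 11)
    cost = ensemble_cost(pts, w, lambda t: -1.0, t_grid, np.zeros(2))
    print(pts.shape, cost)
```

In a full treatment, a deterministic optimal control solver would minimize a cost such as `ensemble_cost` over the shared control, subject to any constraints on the sigma-copies. The sketch only evaluates the cost for a fixed candidate control.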

Applications

Unscented optimal control theory has been applied to UAV guidance, [12] [14] spacecraft attitude control, [6] air-traffic control, [15] and low-thrust trajectory optimization. [2] [10]

References

  1. Ross, Isaac (2015). A primer on Pontryagin's principle in optimal control. San Francisco: Collegiate Publishers. pp. 75–82. ISBN 978-0-9843571-1-6.
  2. Ross, I. Michael; Proulx, Ronald; Karpenko, Mark (August 4–7, 2014). Unscented Optimal Control for Orbital and Proximity Operations in an Uncertain Environment: A New Zermelo Problem. AIAA/AAS Astrodynamics Specialist Conference. San Diego, CA: American Institute of Aeronautics and Astronautics. doi:10.2514/6.2014-4423. Retrieved August 23, 2024.
  3. Ross et al., Unscented Control for Uncertain Dynamical Systems, US Patent US 9,727,034 B1. Issued Aug 8, 2017. https://calhoun.nps.edu/bitstream/handle/10945/55812/USPN%209727034.pdf?sequence=1&isAllowed=y
  4. Manchester, Zachary; Kuindersma, Scott (December 2016). "Derivative-free trajectory optimization with unscented dynamic programming". 2016 IEEE 55th Conference on Decision and Control (CDC). IEEE. pp. 3642–3647. doi:10.1109/cdc.2016.7798817. ISBN 978-1-5090-1837-6.
  5. Ross, I. Michael; Karpenko, Mark; Proulx, Ronald J. (July 2016). "Path constraints in tychastic and unscented optimal control: Theory, application and experimental results". 2016 American Control Conference (ACC). IEEE. pp. 2918–2923. doi:10.1109/acc.2016.7525362. ISBN 978-1-4673-8682-1. S2CID 1123147.
  6. Ross, I. M.; Karpenko, M.; Proulx, R. J. (July 2016). "Path constraints in tychastic and unscented optimal control: Theory, application and experimental results". 2016 American Control Conference (ACC). pp. 2918–2923. doi:10.1109/acc.2016.7525362. ISBN 978-1-4673-8682-1. S2CID 1123147.
  7. Ross, I. M.; Proulx, R. J.; Karpenko, M. (2024-05-04). "Unscented Trajectory Optimization". arXiv:2405.02753 [math.OC].
  8. Ross, I. Michael; Karpenko, Mark; Proulx, Ronald J. (2015). "Riemann-Stieltjes Optimal Control Problems for Uncertain Dynamic Systems". Journal of Guidance, Control, and Dynamics. 38 (7). AIAA: 1251–1263. Bibcode:2015JGCD...38.1251R. doi:10.2514/1.G000505. hdl:10945/48189. S2CID 121424228.
  9. Karpenko, Mark; Proulx, Ronald J. (2016). "Experimental Implementation of Riemann–Stieltjes Optimal Control for Agile Imaging Satellites". Journal of Guidance, Control, and Dynamics. 39 (1): 144–150. Bibcode:2016JGCD...39..144K. doi:10.2514/1.g001325. hdl:10945/50355. ISSN 0731-5090. S2CID 116887441.
  10. Ozaki, Naoya; Funase, Ryu (January 8–12, 2018). Tube Stochastic Differential Dynamic Programming for Robust Low-Thrust Trajectory Optimization Problems. 2018 AIAA Guidance, Navigation, and Control Conference. Kissimmee, Florida. doi:10.2514/6.2018-0861.
  11. "Robust Differential Dynamic Programming for Low-Thrust Trajectory Design: Approach with Robust Model Predictive Control Technique" (PDF).
  12. Shaffer, R.; Karpenko, M.; Gong, Q. (July 2016). "Unscented guidance for waypoint navigation of a fixed-wing UAV". 2016 American Control Conference (ACC). pp. 473–478. doi:10.1109/acc.2016.7524959. ISBN 978-1-4673-8682-1. S2CID 11741951.
  13. Aubin, Jean-Pierre; Saint-Pierre, Patrick (2008). "A Tychastic Approach to Guaranteed Pricing and Management of Portfolios under Transaction Constraints". Seminar on Stochastic Analysis, Random Fields and Applications V. Progress in Probability. Vol. 59. Basel: Birkhäuser Basel. pp. 411–433. doi:10.1007/978-3-7643-8458-6_22. ISBN 978-3-7643-8457-9. Retrieved 2020-12-23.
  14. Ross, I. M.; Proulx, R. J.; Karpenko, M. (July 2015). "Unscented guidance". 2015 American Control Conference (ACC). pp. 5605–5610. doi:10.1109/acc.2015.7172217. ISBN 978-1-4799-8684-2. S2CID 28136418.
  15. Ng, Hok Kwan (2020-06-08). "Strategic Planning with Unscented Optimal Guidance for Urban Air Mobility". AIAA Aviation 2020 Forum. American Institute of Aeronautics and Astronautics. doi:10.2514/6.2020-2904. ISBN 978-1-62410-598-2. S2CID 225658104. Retrieved 2020-12-23.