Bellman pseudospectral method

Last updated

The Bellman pseudospectral method is a pseudospectral method for optimal control based on Bellman's principle of optimality. It is part of the larger theory of pseudospectral optimal control, a term coined by Ross. [1] The method is named after Richard E. Bellman. It was introduced by Ross et al. [2] [3] first as a means to solve multiscale optimal control problems, and later expanded to obtain suboptimal solutions for general optimal control problems.

Contents

Theoretical foundations

The multiscale version of the Bellman pseudospectral method is based on the spectral convergence property of the Ross–Fahroo pseudospectral methods. That is, because the Ross–Fahroo pseudospectral method converges at an exponentially fast rate, pointwise convergence to a solution is obtained at very low number of nodes even when the solution has high-frequency components. This aliasing phenomenon in optimal control was first discovered by Ross et al. [2] Rather than use signal processing techniques to anti-alias the solution, Ross et al. proposed that Bellman's principle of optimality can be applied to the converged solution to extract information between the nodes. Because the Gauss–Lobatto nodes cluster at the boundary points, Ross et al. suggested that if the node density around the initial conditions satisfy the Nyquist–Shannon sampling theorem, then the complete solution can be recovered by solving the optimal control problem in a recursive fashion over piecewise segments known as Bellman segments. [2]

In an expanded version of the method, Ross et al., [3] proposed that method could also be used to generate feasible solutions that were not necessarily optimal. In this version, one can apply the Bellman pseudospectral method at even lower number of nodes even under the knowledge that the solution may not have converged to the optimal one. In this situation, one obtains a feasible solution.

A remarkable feature of the Bellman pseudospectral method is that it automatically determines several measures of suboptimality based on the original pseudospectral cost and the cost generated by the sum of the Bellman segments. [2] [3]

Computational efficiency

One of the computational advantages of the Bellman pseudospectral method is that it allows one to escape Gaussian rules in the distribution of node points. That is, in a standard pseudospectral method, the distribution of node points are Gaussian (typically Gauss-Lobatto for finite horizon and Gauss-Radau for infinite horizon). The Gaussian points are sparse in the middle of the interval (middle is defined in a shifted sense for infinite-horizon problems) and dense at the boundaries. The second-order accumulation of points near the boundaries have the effect of wasting nodes. The Bellman pseudospectral method takes advantage of the node accumulation at the initial point to anti-alias the solution and discards the remainder of the nodes. Thus the final distribution of nodes is non-Gaussian and dense while the computational method retains a sparse structure.

Applications

The Bellman pseudospectral method was first applied by Ross et al. [2] to solve the challenging problem of very low thrust trajectory optimization. It has been successfully applied to solve a practical problem of generating very high accuracy solutions to a trans-Earth-injection problem of bringing a space capsule from a lunar orbit to a pin-pointed Earth-interface condition for successful reentry. [4] [5]

The Bellman pseudospectral method is most commonly used as an additional check on the optimality of a pseudospectral solution generated by the Ross–Fahroo pseudospectral methods. That is, in addition to the use of Pontryagin's minimum principle in conjunction with the solutions obtained by the Ross–Fahroo pseudospectral methods, the Bellman pseudospectral method is used as a primal-only test on the optimality of the computed solution. [6] [7]

See also

Related Research Articles

Optimal control theory is a branch of applied mathematics that deals with finding a control law for a dynamical system over a period of time such that an objective function is optimized. It has numerous applications in both science and engineering. For example, the dynamical system might be a spacecraft with controls corresponding to rocket thrusters, and the objective might be to reach the moon with minimum fuel expenditure. Or the dynamical system could be a nation's economy, with the objective to minimize unemployment; the controls in this case could be fiscal and monetary policy.

In optimal control theory, the Hamilton–Jacobi–Bellman (HJB) equation gives a necessary and sufficient condition for optimality of a control with respect to a loss function. It is, in general, a nonlinear partial differential equation in the value function, which means its solution is the value function itself. Once the solution is known, it can be used to obtain the optimal control by taking the maximizer/minimizer of the Hamiltonian involved in the HJB equation.

Trajectory optimization is the process of designing a trajectory that minimizes some measure of performance while satisfying a set of constraints. Generally speaking, trajectory optimization is a technique for computing an open-loop solution to an optimal control problem. It is often used for systems where computing the full closed-loop solution is either impossible or impractical.

In the numerical solution of partial differential equations, a topic in mathematics, the spectral element method (SEM) is a formulation of the finite element method (FEM) that uses high degree piecewise polynomials as basis functions. The spectral element method was introduced in a 1984 paper by A. T. Patera.

The Gauss pseudospectral method (GPM), one of many topics named after Carl Friedrich Gauss, is a direct transcription method for discretizing a continuous optimal control problem into a nonlinear program (NLP). The Gauss pseudospectral method differs from several other pseudospectral methods in that the dynamics are not collocated at either endpoint of the time interval. This collocation, in conjunction with the proper approximation to the costate, leads to a set of KKT conditions that are identical to the discretized form of the first-order optimality conditions. This equivalence between the KKT conditions and the discretized first-order optimality conditions leads to an accurate costate estimate using the KKT multipliers of the NLP.

Pseudospectral optimal control is a joint theoretical-computational method for solving optimal control problems. It combines pseudospectral (PS) theory with optimal control theory to produce PS optimal control theory. PS optimal control theory has been used in ground and flight systems in military and industrial applications. The techniques have been extensively used to solve a wide range of problems such as those arising in UAV trajectory generation, missile guidance, control of robotic arms, vibration damping, lunar guidance, magnetic control, swing-up and stabilization of an inverted pendulum, orbit transfers, tether libration control, ascent guidance and quantum control.

DIDO is a software product for solving general-purpose optimal control problems. It is widely used in academia, industry, and NASA. Hailed as a breakthrough software, DIDO is based on the pseudospectral optimal control theory of Ross and Fahroo.

In applied mathematics, the pseudospectral knotting method is a generalization and enhancement of a standard pseudospectral method for optimal control. The concept was introduced by I. Michael Ross and F. Fahroo in 2004, and forms part of the collection of the Ross–Fahroo pseudospectral methods.

The Legendre pseudospectral method for optimal control problems is based on Legendre polynomials. It is part of the larger theory of pseudospectral optimal control, a term coined by Ross. A basic version of the Legendre pseudospectral was originally proposed by Elnagar and his coworkers in 1995. Since then, Ross, Fahroo and their coworkers have extended, generalized and applied the method for a large range of problems. An application that has received wide publicity is the use of their method for generating real time trajectories for the International Space Station.

The Chebyshev pseudospectral method for optimal control problems is based on Chebyshev polynomials of the first kind. It is part of the larger theory of pseudospectral optimal control, a term coined by Ross. Unlike the Legendre pseudospectral method, the Chebyshev pseudospectral (PS) method does not immediately offer high-accuracy quadrature solutions. Consequently, two different versions of the method have been proposed: one by Elnagar et al., and another by Fahroo and Ross. The two versions differ in their quadrature techniques. The Fahroo–Ross method is more commonly used today due to the ease in implementation of the Clenshaw–Curtis quadrature technique. In 2008, Trefethen showed that the Clenshaw–Curtis method was nearly as accurate as Gauss quadrature. This breakthrough result opened the door for a covector mapping theorem for Chebyshev PS methods. A complete mathematical theory for Chebyshev PS methods was finally developed in 2009 by Gong, Ross and Fahroo.

Introduced by I. Michael Ross and F. Fahroo, the Ross–Fahroo pseudospectral methods are a broad collection of pseudospectral methods for optimal control. Examples of the Ross–Fahroo pseudospectral methods are the pseudospectral knotting method, the flat pseudospectral method, the Legendre-Gauss-Radau pseudospectral method and pseudospectral methods for infinite-horizon optimal control.

Named after I. Michael Ross and F. Fahroo, the Ross–Fahroo lemma is a fundamental result in optimal control theory.

The covector mapping principle is a special case of Riesz' representation theorem, which is a fundamental theorem in functional analysis. The name was coined by Ross and co-workers, It provides conditions under which dualization can be commuted with discretization in the case of computational optimal control.

Ross' π lemma, named after I. Michael Ross, is a result in computational optimal control. Based on generating Carathéodory-π solutions for feedback control, Ross' π-lemma states that there is fundamental time constant within which a control solution must be computed for controllability and stability. This time constant, known as Ross' time constant, is proportional to the inverse of the Lipschitz constant of the vector field that governs the dynamics of a nonlinear control system.

Isaac Michael Ross is a Distinguished Professor and Program Director of Control and Optimization at the Naval Postgraduate School in Monterey, CA. He has published papers in pseudospectral optimal control theory, energy-sink theory, the optimization and deflection of near-Earth asteroids and comets, robotics, attitude dynamics and control, real-time optimal control unscented optimal control and a textbook on optimal control. The Kang-Ross-Gong theorem, Ross' π lemma, Ross' time constant, the Ross–Fahroo lemma, and the Ross–Fahroo pseudospectral method are all named after him.

Fariba Fahroo is an American Persian mathematician, a program manager at the Air Force Office of Scientific Research, and a former program manager at the Defense Sciences Office. Along with I. M. Ross, she has published papers in pseudospectral optimal control theory. The Ross–Fahroo lemma and the Ross–Fahroo pseudospectral method are named after her. In 2010, she received, the AIAA Mechanics and Control of Flight Award for fundamental contributions to flight mechanics.

The flat pseudospectral method is part of the family of the Ross–Fahroo pseudospectral methods introduced by Ross and Fahroo. The method combines the concept of differential flatness with pseudospectral optimal control to generate outputs in the so-called flat space.

A Carathéodory-π solution is a generalized solution to an ordinary differential equation. The concept is due to I. Michael Ross and named in honor of Constantin Carathéodory. Its practicality was demonstrated in 2008 by Ross et al. in a laboratory implementation of the concept. The concept is most useful for implementing feedback controls, particularly those generated by an application of Ross' pseudospectral optimal control theory.

GPOPS-II is a general-purpose MATLAB software for solving continuous optimal control problems using hp-adaptive Gaussian quadrature collocation and sparse nonlinear programming. The acronym GPOPS stands for "General Purpose OPtimal Control Software", and the Roman numeral "II" refers to the fact that GPOPS-II is the second software of its type.

References

  1. Ross, I. M.; Karpenko, M. (2012). "A Review of Pseudospectral Optimal Control: From Theory to Flight". Annual Reviews in Control. 36 (2): 182–197. doi:10.1016/j.arcontrol.2012.09.002.
  2. 1 2 3 4 5 Ross, I. M.; Gong, Q.; Sekhavat, P. (2007). "Low-Thrust, High-Accuracy Trajectory Optimization". Journal of Guidance, Control and Dynamics. 30 (4): 921–933. Bibcode:2007JGCD...30..921R. doi:10.2514/1.23181. hdl:10945/49785.
  3. 1 2 3 I. M. Ross, Q. Gong and P. Sekhavat, The Bellman pseudospectral method, AIAA/AAS Astrodynamics Specialist Conference and Exhibit, Honolulu, Hawaii, AIAA-2008-6448, August 18–21, 2008.
  4. Yan, H.; Gong, Q.; Park, C.; Ross, I. M.; D'Souza, C. N. (2011). "High Accuracy Trajectory Optimization for a Trans-Earth Lunar Mission". Journal of Guidance, Control and Dynamics. 34 (4): 1219–1227. Bibcode:2011JGCD...34.1219Y. doi:10.2514/1.49237.
  5. H. Yan, Q. Gong, C. D. Park, I. M. Ross and C. N. D'Souza, High-Accuracy Moon to Earth trajectory optimization, AIAA Guidance, Navigation, and Control Conference, 2010.
  6. Fleming, A.; Sekhavat, P.; Ross, I. M. (2010). "Minimum-Time Reorientation of a Rigid Body". Journal of Guidance, Control and Dynamics. 33 (1): 160–170. Bibcode:2010JGCD...33..160F. doi:10.2514/1.43549.
  7. Ross, I. M.; Sekhavat, P.; Fleming, A.; Gong, Q. (2008). "Optimal feedback control: foundations, examples, and experimental results for a new approach". Journal of Guidance, Control, and Dynamics. 31 (2): 307–321. Bibcode:2008JGCD...31..307R. doi:10.2514/1.29532.