Trajectory optimization is the process of designing a trajectory that minimizes (or maximizes) some measure of performance while satisfying a set of constraints. Generally speaking, trajectory optimization is a technique for computing an open-loop solution to an optimal control problem. It is often used for systems where computing the full closed-loop solution is not required, is impractical, or is impossible. If a trajectory optimization problem can be solved at a rate given by the inverse of the Lipschitz constant, then it can be used iteratively to generate a closed-loop solution in the sense of Caratheodory. If only the first step of the trajectory is executed for an infinite-horizon problem, then this is known as Model Predictive Control (MPC).
Although the idea of trajectory optimization has been around for hundreds of years (calculus of variations, brachystochrone problem), it only became practical for real-world problems with the advent of the computer. Many of the original applications of trajectory optimization were in the aerospace industry, computing rocket and missile launch trajectories. More recently, trajectory optimization has also been used in a wide variety of industrial process and robotics applications. [1]
Trajectory optimization first appeared in 1697 with the introduction of the brachystochrone problem: find the shape of a wire such that a bead sliding along it moves between two points in minimum time. [2] What makes this problem notable is that it optimizes over a curve (the shape of the wire), rather than over a single number. The most famous of the solutions was computed using the calculus of variations.
In the 1950s, the digital computer started to make trajectory optimization practical for solving real-world problems. The first optimal control approaches grew out of the calculus of variations, based on the research of Gilbert Ames Bliss and Bryson [3] in America, and Pontryagin [4] in Russia. Pontryagin's maximum principle is of particular note. These early researchers created the foundation of what we now call indirect methods for trajectory optimization.
Much of the early work in trajectory optimization was focused on computing rocket thrust profiles, both in a vacuum and in the atmosphere. This early research discovered many basic principles that are still used today. Another successful application was the climb-to-altitude trajectories of early jet aircraft. Because of the high drag in the transonic regime and the low thrust of early jet aircraft, trajectory optimization was the key to maximizing climb-to-altitude performance. Optimal-control-based trajectories were responsible for some of the world records; in these cases, the pilot followed a Mach-versus-altitude schedule derived from optimal control solutions.
One of the important early problems in trajectory optimization was that of the singular arc, where Pontryagin's maximum principle fails to yield a complete solution. An example of a problem with singular control is the optimization of the thrust of a missile flying at a constant altitude and launched at low speed. Here the problem is one of bang-bang control at maximum possible thrust until the singular arc is reached. Then the solution to the singular control provides a lower, variable thrust until burnout. At that point the bang-bang structure dictates that the control, the thrust, go to its minimum value of zero. This solution is the foundation of the boost-sustain rocket motor profile widely used today to maximize missile performance.
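The structure of such solutions can be sketched in generic Hamiltonian notation (a minimal statement for a minimization problem with control-affine dynamics; the symbols below are illustrative, not taken from the references). With Hamiltonian H = L(x) + \lambda^\top (f(x) + g(x)u) and bounded control, the sign of the switching function determines the bang-bang arcs, and a singular arc occurs where the switching function vanishes over an interval:

    \sigma(t) = \frac{\partial H}{\partial u} = \lambda^\top g(x), \qquad
    u^*(t) = \begin{cases}
        u_{\max}, & \sigma(t) < 0 \\
        u_{\min}, & \sigma(t) > 0 \\
        \text{singular}, & \sigma(t) \equiv 0
    \end{cases}

On the singular arc the control is recovered by differentiating \sigma(t) along the trajectory until the control appears explicitly.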
There is a wide variety of applications for trajectory optimization, primarily in robotics (industrial manipulation, walking, and path planning) and in aerospace. It can also be used for modeling and estimation.
Depending on the configuration, open-chain robotic manipulators require a degree of trajectory optimization. For instance, a robotic arm with 7 joints and 7 links (7-DOF) is a redundant system, where a single Cartesian position of the end-effector can correspond to an infinite number of joint-angle configurations; this redundancy can be used to optimize a trajectory that, for example, avoids obstacles in the workspace or minimizes the torque in the joints. [5]
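As a concrete sketch of redundancy resolution, the following example (assuming a hypothetical planar 3-link arm with made-up link lengths and target; an illustration, not the method of [5]) selects, among the infinitely many joint configurations reaching a 2-D target, the one closest to a rest pose:

    import numpy as np
    from scipy.optimize import minimize

    # Link lengths of a planar 3-link arm: redundant, since 3 joints
    # control only a 2-D end-effector position.
    L = np.array([1.0, 0.8, 0.6])

    def forward_kinematics(q):
        """End-effector (x, y) position for relative joint angles q."""
        angles = np.cumsum(q)
        return np.array([np.sum(L * np.cos(angles)),
                         np.sum(L * np.sin(angles))])

    target = np.array([1.2, 0.9])   # illustrative workspace goal
    q_rest = np.zeros(3)            # preferred rest configuration

    # Use the redundancy: reach the target while staying close to rest.
    res = minimize(
        lambda q: np.sum((q - q_rest) ** 2),
        x0=np.array([0.3, 0.3, 0.3]),
        constraints={"type": "eq",
                     "fun": lambda q: forward_kinematics(q) - target},
    )
    print(res.x, forward_kinematics(res.x))

The same pattern extends to torque minimization or obstacle avoidance by swapping the objective or adding inequality constraints.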
Trajectory optimization is often used to compute trajectories for quadrotor helicopters. These applications typically use highly specialized algorithms. [6] [7] One interesting application, demonstrated by the U.Penn GRASP Lab, is computing a trajectory that allows a quadrotor to fly through a hoop as it is thrown. Another, by the ETH Zurich Flying Machine Arena, involves two quadrotors tossing a pole back and forth between them, balancing it like an inverted pendulum. The problem of computing minimum-energy trajectories for a quadcopter has also been studied recently. [8]
Trajectory optimization is used in manufacturing, particularly for controlling chemical processes [9] or computing the desired path for robotic manipulators. [10]
There are a variety of applications for trajectory optimization within the field of walking robotics. For example, one paper used trajectory optimization of bipedal gaits on a simple model to show that walking is energetically favorable at low speeds and running is energetically favorable at high speeds. [11] As in many other applications, trajectory optimization can be used to compute a nominal trajectory, around which a stabilizing controller is built. [12] Trajectory optimization can also be applied to detailed motion planning of complex humanoid robots, such as Atlas. [13] Finally, trajectory optimization can be used for path planning of robots with complicated dynamics constraints, using reduced-complexity models. [14]
For tactical missiles, the flight profiles are determined by the thrust and lift histories. These histories can be controlled by a number of means, including an angle-of-attack command history or an altitude/downrange schedule that the missile must follow. Each combination of missile design factors, desired missile performance, and system constraints results in a new set of optimal control parameters. [15]
Techniques for solving trajectory optimization problems fall into two categories: indirect and direct. An indirect method works by analytically constructing the necessary conditions for optimality, which are then solved numerically. A direct method attempts a direct numerical solution by constructing a sequence of continually improving approximations to the optimal solution. [16]
The optimal control problem is an infinite-dimensional optimization problem, since the decision variables are functions, rather than real numbers. All solution techniques perform transcription, a process by which the trajectory optimization problem (optimizing over functions) is converted into a constrained parameter optimization problem (optimizing over real numbers). Generally, this constrained parameter optimization problem is a non-linear program, although in special cases it can be reduced to a quadratic program or linear program.
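Before transcription, the continuous-time problem is typically posed in Bolza form (a generic statement; the symbols below are standard but not tied to a specific reference):

    \min_{x(\cdot),\,u(\cdot)} \;\; \phi\big(x(t_f)\big) + \int_{t_0}^{t_f} L\big(x(t), u(t), t\big)\,dt

    \text{subject to} \quad \dot{x}(t) = f\big(x(t), u(t), t\big), \quad
    g\big(x(t), u(t), t\big) \le 0, \quad x(t_0) = x_0

Transcription replaces the functions x(·) and u(·) with a finite set of real-valued parameters, producing the constrained parameter optimization problem described above.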
Single shooting is the simplest type of trajectory optimization technique. The basic idea is similar to how you would aim a cannon: pick a set of parameters for the trajectory, simulate the entire thing, and then check whether you hit the target. The entire trajectory is represented as a single segment, with a single constraint, known as a defect constraint, requiring that the final state of the simulation match the desired final state of the system. Single shooting is effective for problems that are either simple or have an extremely good initialization. Both the indirect and direct formulations tend to have difficulties otherwise. [16] [19] [20]
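A minimal direct single-shooting sketch for a double integrator (the horizon, discretization, and goal state below are illustrative assumptions, not from the references): the decision variables are piecewise-constant accelerations, the whole trajectory is simulated as one segment, and a single defect constraint pins the simulated final state to the target.

    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize

    T, N = 1.0, 10                        # horizon and control intervals
    x0 = np.array([0.0, 0.0])             # initial [position, velocity]
    x_goal = np.array([1.0, 0.0])         # desired final state

    def simulate(u):
        """Roll out the entire trajectory from x0 under controls u."""
        u_of_t = lambda t: u[min(int(t / T * N), N - 1)]
        sol = solve_ivp(lambda t, x: [x[1], u_of_t(t)],
                        (0.0, T), x0, max_step=T / N)
        return sol.y[:, -1]               # final state of the simulation

    # Minimize control effort subject to one defect constraint:
    # simulated final state minus desired final state equals zero.
    res = minimize(
        lambda u: (T / N) * np.sum(u ** 2),
        x0=np.zeros(N),
        constraints={"type": "eq", "fun": lambda u: simulate(u) - x_goal},
    )
    print(res.x)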
Multiple shooting is a simple extension of single shooting that renders it far more effective. Rather than representing the entire trajectory as a single simulation (segment), the algorithm breaks the trajectory into many shorter segments, with a defect constraint added between each pair. The result is a large, sparse non-linear program, which tends to be easier to solve than the small, dense program produced by single shooting. [19] [20]
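Extending the sketch above to multiple shooting (same hypothetical double integrator; the segment count is illustrative), the intermediate node states join the decision vector and every segment gets its own defect constraint, which is what makes the resulting non-linear program large but sparse:

    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize

    S, T = 4, 1.0                          # segments and horizon
    h = T / S
    x0 = np.array([0.0, 0.0])
    x_goal = np.array([1.0, 0.0])

    def step(x, u):
        """Simulate one short segment with constant control u."""
        sol = solve_ivp(lambda t, y: [y[1], u], (0.0, h), x)
        return sol.y[:, -1]

    def unpack(z):
        """z holds the intermediate node states, then the controls."""
        xs = [x0] + list(z[:2 * (S - 1)].reshape(S - 1, 2)) + [x_goal]
        return xs, z[2 * (S - 1):]

    def defects(z):
        """Per-segment defect: simulated segment end minus next node."""
        xs, us = unpack(z)
        return np.concatenate([step(xs[k], us[k]) - xs[k + 1]
                               for k in range(S)])

    z0 = np.zeros(2 * (S - 1) + S)
    res = minimize(lambda z: h * np.sum(unpack(z)[1] ** 2), z0,
                   constraints={"type": "eq", "fun": defects})
    print(res.x)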
Direct collocation methods work by approximating the state and control trajectories using polynomial splines. These methods are sometimes referred to as direct transcription. Trapezoidal collocation is a commonly used low-order direct collocation method. The dynamics, path objective, and control are all represented using linear splines, and the dynamics are satisfied using trapezoidal quadrature. Hermite-Simpson Collocation is a common medium-order direct collocation method. The state is represented by a cubic-Hermite spline, and the dynamics are satisfied using Simpson quadrature. [16] [20]
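For trapezoidal collocation on a grid t_0 < t_1 < ... < t_N with steps h_k = t_{k+1} - t_k, the dynamics and the running cost are both handled by the trapezoid rule (a generic statement of the method, with f_k and L_k shorthand for f(x_k, u_k) and L(x_k, u_k)):

    x_{k+1} - x_k = \frac{h_k}{2}\big(f_k + f_{k+1}\big), \qquad
    \int_{t_0}^{t_N} L\,dt \;\approx\; \sum_{k=0}^{N-1} \frac{h_k}{2}\big(L_k + L_{k+1}\big)

Each of the N defect equations becomes an equality constraint in the resulting non-linear program.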
Orthogonal collocation is technically a subset of direct collocation, but the implementation details are so different that it can reasonably be considered its own set of methods. Orthogonal collocation differs from direct collocation in that it typically uses high-order splines, and each segment of the trajectory might be represented by a spline of a different order. The name comes from the use of orthogonal polynomials in the state and control splines. [20] [21]
In pseudospectral discretization the entire trajectory is represented by a collection of basis functions in the time domain (independent variable). The basis functions need not be polynomials. Pseudospectral discretization is also known as spectral collocation. [22] [23] [24] When used to solve a trajectory optimization problem whose solution is smooth, a pseudospectral method will achieve spectral (exponential) convergence. [25] If the trajectory is not smooth, the convergence is still very fast, faster than Runge-Kutta methods. [26] [27]
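A generic form of the representation, with \phi_k the basis functions (often Lagrange polynomials anchored at Gauss-type nodes), is

    x(t) \approx \sum_{k=0}^{N} x_k\,\phi_k(t), \qquad
    \dot{x}(t_i) \approx \sum_{k=0}^{N} D_{ik}\,x_k,

where D is the differentiation matrix associated with the chosen nodes; the dynamics are then enforced at the collocation points t_i.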
In 1990, Dewey H. Hodges and Robert R. Bless [28] proposed a weak Hamiltonian finite element method for optimal control problems. The idea was to derive a weak variational form of the first-order necessary conditions for optimality, discretize the time domain into finite intervals, and use a simple zeroth-order polynomial representation of the states, controls, and adjoints over each interval.
Differential dynamic programming is a bit different from the other techniques described here. In particular, it does not cleanly separate the transcription from the optimization. Instead, it performs a sequence of iterative forward and backward passes along the trajectory. Each forward pass satisfies the system dynamics, and each backward pass satisfies the optimality conditions for the control. Eventually, this iteration converges to a trajectory that is both feasible and optimal. [29]
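The flavor of the backward and forward passes is easiest to see in the special case of linear dynamics and quadratic cost, where a single sweep already yields the optimal time-varying feedback. The sketch below (with made-up system matrices) shows that special case only; full differential dynamic programming additionally expands the dynamics and cost to second order around the current trajectory and iterates.

    import numpy as np

    # Illustrative discrete-time double integrator and quadratic cost.
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    Q, R, N = np.eye(2), 0.01 * np.eye(1), 50

    # Backward pass: propagate the cost-to-go and collect feedback gains.
    V, gains = Q.copy(), []
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ V @ B, B.T @ V @ A)
        V = Q + A.T @ V @ (A - B @ K)
        gains.append(K)
    gains.reverse()                      # gains now run forward in time

    # Forward pass: roll the dynamics out under the new control law.
    x = np.array([1.0, 0.0])
    for K in gains:
        u = -K @ x
        x = A @ x + B @ u
    print(x)                             # state driven toward the origin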
In contrast to the aforementioned classical methods, generative machine-learning methods may be used to generate a desirable trajectory. In particular, diffusion models learn to iteratively reverse a destructive forward process, in which noise is added to data until it becomes pure noise, by estimating the noise to remove at every time step. Thus, given easily sampled random noise as input, the diffusion process recovers a plausible noise-free data point. Recent methods [30] [31] have parameterized trajectories as matrices of state-action pairs at consecutive time steps and trained a diffusion model to generate such a matrix. To address the controllability of the generated samples, the Diffuser method [30] proposes two techniques to steer the generated sample, thereby reducing the optimization problem to a sampling problem. First, guided diffusion [32] [33] can be used to incorporate a cost (or reward) function into the generation process: the gradient of the cost function modifies the mean of the estimated noise at every time step. Second, for motion-planning problems in which the start and end states of the trajectory are known, and the trajectory needs to satisfy constraints to yield a viable path, an inpainting approach can be used. Similar to the first technique, a prior modifies the distribution of trajectories, in this case assigning high probability to trajectories satisfying the constraints (e.g., arriving at a specified state at a specified time step) and zero probability to all other trajectories. As a result, sampling from this distribution produces trajectories that satisfy the constraints.
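The first steering technique follows the classifier-guidance recipe: at each denoising step, the mean of the Gaussian reverse transition is shifted along the gradient of the return. In generic notation (s is a guidance scale and \Sigma_t the step's covariance; this is a schematic form, not the exact expression from [30]):

    \tilde{\mu}_t = \mu_\theta(\tau_t, t) + s\,\Sigma_t\,\nabla_{\tau}\,\mathcal{J}\big(\mu_\theta(\tau_t, t)\big)

Sampling then proceeds as usual with \tilde{\mu}_t in place of \mu_\theta.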
There are many techniques to choose from when solving a trajectory optimization problem. There is no best method, but some methods might do a better job on specific problems. This section provides a rough understanding of the trade-offs between methods.
When solving a trajectory optimization problem with an indirect method, you must explicitly construct the adjoint equations and their gradients. This is often difficult to do, but it gives an excellent accuracy metric for the solution. Direct methods are much easier to set up and solve, but do not have a built-in accuracy metric. [16] As a result, direct methods are more widely used, especially in non-critical applications. Indirect methods still have a place in specialized applications, particularly aerospace, where accuracy is critical.
One place where indirect methods have particular difficulty is on problems with path inequality constraints. These problems tend to have solutions for which the constraint is partially active. When constructing the adjoint equations for an indirect method, the user must explicitly write down when the constraint is active in the solution, which is difficult to know a priori. One solution is to use a direct method to compute an initial guess, which is then used to construct a multi-phase problem where the constraint is prescribed. The resulting problem can then be solved accurately using an indirect method. [16]
Single shooting methods are best used for problems where the control is very simple (or there is an extremely good initial guess). For example, a satellite mission planning problem where the only control is the magnitude and direction of an initial impulse from the engines. [19]
Multiple shooting tends to be good for problems with relatively simple control, but complicated dynamics. Although path constraints can be used, they make the resulting nonlinear program relatively difficult to solve.
Direct collocation methods are good for problems where the accuracy of the control and the state are similar. These methods tend to be less accurate than others (due to their low order), but are particularly robust for problems with difficult path constraints.
Orthogonal collocation methods are best for obtaining high-accuracy solutions to problems where the accuracy of the control trajectory is important. Some implementations have trouble with path constraints. These methods are particularly good when the solution is smooth.
Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.
Optimal control theory is a branch of control theory that deals with finding a control for a dynamical system over a period of time such that an objective function is optimized. It has numerous applications in science, engineering and operations research. For example, the dynamical system might be a spacecraft with controls corresponding to rocket thrusters, and the objective might be to reach the Moon with minimum fuel expenditure. Or the dynamical system could be a nation's economy, with the objective to minimize unemployment; the controls in this case could be fiscal and monetary policy. A dynamical system may also be introduced to embed operations research problems within the framework of optimal control theory.
In computer science and operations research, the ant colony optimization algorithm (ACO) is a probabilistic technique for solving computational problems that can be reduced to finding good paths through graphs. Artificial ants represent multi-agent methods inspired by the behavior of real ants. The pheromone-based communication of biological ants is often the predominant paradigm used. Combinations of artificial ants and local search algorithms have become a preferred method for numerous optimization tasks involving some sort of graph, e.g., vehicle routing and internet routing.
Model predictive control (MPC) is an advanced method of process control that is used to control a process while satisfying a set of constraints. It has been in use in the process industries in chemical plants and oil refineries since the 1980s. In recent years it has also been used in power system balancing models and in power electronics. Model predictive controllers rely on dynamic models of the process, most often linear empirical models obtained by system identification. The main advantage of MPC is that it allows the current timeslot to be optimized while taking future timeslots into account. This is achieved by optimizing over a finite time horizon, but implementing only the current timeslot and then optimizing again, repeatedly, thus differing from a linear–quadratic regulator (LQR). MPC also has the ability to anticipate future events and can take control actions accordingly; PID controllers do not have this predictive ability. MPC is nearly universally implemented as a digital control, although there is research into achieving faster response times with specially designed analog circuitry.
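The receding-horizon loop itself is short; the sketch below (a hypothetical discrete double integrator with illustrative weights, not any particular MPC package) optimizes a finite horizon each step but applies only the first input:

    import numpy as np
    from scipy.optimize import minimize

    A = np.array([[1.0, 0.1], [0.0, 1.0]])   # plant model
    B = np.array([0.005, 0.1])
    H = 10                                   # prediction horizon

    def horizon_cost(u, x):
        """Quadratic cost of applying control sequence u from state x."""
        cost = 0.0
        for uk in u:
            x = A @ x + B * uk
            cost += x @ x + 0.01 * uk ** 2
        return cost

    x = np.array([1.0, 0.0])
    for _ in range(50):                      # closed-loop simulation
        # Optimize the whole horizon, implement only the first timeslot.
        u = minimize(horizon_cost, np.zeros(H), args=(x,)).x
        x = A @ x + B * u[0]
    print(x)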
Multi-objective optimization or Pareto optimization is an area of multiple-criteria decision making that is concerned with mathematical optimization problems involving more than one objective function to be optimized simultaneously. Multi-objective optimization is a type of vector optimization that has been applied in many fields of science, including engineering, economics and logistics, where optimal decisions need to be taken in the presence of trade-offs between two or more conflicting objectives. Minimizing cost while maximizing comfort when buying a car, and maximizing performance while minimizing fuel consumption and emission of pollutants of a vehicle, are examples of multi-objective optimization problems involving two and three objectives, respectively. In practical problems, there can be more than three objectives.
The Gauss pseudospectral method (GPM), one of many topics named after Carl Friedrich Gauss, is a direct transcription method for discretizing a continuous optimal control problem into a nonlinear program (NLP). The Gauss pseudospectral method differs from several other pseudospectral methods in that the dynamics are not collocated at either endpoint of the time interval. This collocation, in conjunction with the proper approximation to the costate, leads to a set of KKT conditions that are identical to the discretized form of the first-order optimality conditions. This equivalence between the KKT conditions and the discretized first-order optimality conditions leads to an accurate costate estimate using the KKT multipliers of the NLP.
Pseudospectral optimal control is a joint theoretical-computational method for solving optimal control problems. It combines pseudospectral (PS) theory with optimal control theory to produce a PS optimal control theory. PS optimal control theory has been used in ground and flight systems in military and industrial applications. The techniques have been extensively used to solve a wide range of problems such as those arising in UAV trajectory generation, missile guidance, control of robotic arms, vibration damping, lunar guidance, magnetic control, swing-up and stabilization of an inverted pendulum, orbit transfers, tether libration control, ascent guidance and quantum control.
DIDO is a MATLAB optimal control toolbox for solving general-purpose optimal control problems. It is widely used in academia, industry, and NASA. Hailed as breakthrough software, DIDO is based on the pseudospectral optimal control theory of Ross and Fahroo. The latest enhancements to DIDO are described in Ross.
The PROPT MATLAB Optimal Control Software is a new-generation platform for solving applied optimal control and parameter estimation problems.
In applied mathematics, the pseudospectral knotting method is a generalization and enhancement of the standard pseudospectral method for optimal control. Introduced by I. Michael Ross and F. Fahroo in 2004, it forms part of the collection of the Ross–Fahroo pseudospectral methods.
The Legendre pseudospectral method for optimal control problems is based on Legendre polynomials. It is part of the larger theory of pseudospectral optimal control, a term coined by Ross. A basic version of the Legendre pseudospectral method was originally proposed by Elnagar and his coworkers in 1995. Since then, Ross, Fahroo and their coworkers have extended, generalized and applied the method to a large range of problems. An application that has received wide publicity is the use of their method for generating real-time trajectories for the International Space Station.
The Chebyshev pseudospectral method for optimal control problems is based on Chebyshev polynomials of the first kind. It is part of the larger theory of pseudospectral optimal control, a term coined by Ross. Unlike the Legendre pseudospectral method, the Chebyshev pseudospectral (PS) method does not immediately offer high-accuracy quadrature solutions. Consequently, two different versions of the method have been proposed: one by Elnagar et al., and another by Fahroo and Ross. The two versions differ in their quadrature techniques. The Fahroo–Ross method is more commonly used today due to the ease in implementation of the Clenshaw–Curtis quadrature technique. In 2008, Trefethen showed that the Clenshaw–Curtis method was nearly as accurate as Gauss quadrature. This breakthrough result opened the door for a covector mapping theorem for Chebyshev PS methods. A complete mathematical theory for Chebyshev PS methods was finally developed in 2009 by Gong, Ross and Fahroo.
JModelica.org is a commercial software platform based on the Modelica modeling language for modeling, simulating, optimizing and analyzing complex dynamic systems. The platform is maintained and developed by Modelon AB in collaboration with academic and industrial institutions, notably Lund University and the Lund Center for Control of Complex Systems (LCCC). The platform has been used in industrial projects with applications in robotics, vehicle systems, energy systems, CO2 separation and polyethylene production.
Introduced by I. Michael Ross and F. Fahroo, the Ross–Fahroo pseudospectral methods are a broad collection of pseudospectral methods for optimal control. Examples of the Ross–Fahroo pseudospectral methods are the pseudospectral knotting method, the flat pseudospectral method, the Legendre-Gauss-Radau pseudospectral method and pseudospectral methods for infinite-horizon optimal control.
Named after I. Michael Ross and F. Fahroo, the Ross–Fahroo lemma is a fundamental result in optimal control theory.
The Bellman pseudospectral method is a pseudospectral method for optimal control based on Bellman's principle of optimality. It is part of the larger theory of pseudospectral optimal control, a term coined by Ross. The method is named after Richard E. Bellman. It was introduced by Ross et al. first as a means to solve multiscale optimal control problems, and later expanded to obtain suboptimal solutions for general optimal control problems.
Isaac Michael Ross is a Distinguished Professor and Program Director of Control and Optimization at the Naval Postgraduate School in Monterey, CA. He has published a highly-regarded textbook on optimal control theory and seminal papers in pseudospectral optimal control theory, energy-sink theory, the optimization and deflection of near-Earth asteroids and comets, robotics, attitude dynamics and control, orbital mechanics, real-time optimal control, unscented optimal control and continuous optimization. The Kang–Ross–Gong theorem, Ross' π lemma, Ross' time constant, the Ross–Fahroo lemma, and the Ross–Fahroo pseudospectral method are all named after him. According to a report published by Stanford University, Ross is one of the world's top 2% of scientists.
Fariba Fahroo is an American Persian mathematician, a program manager at the Air Force Office of Scientific Research, and a former program manager at the Defense Sciences Office. Along with I. M. Ross, she has published papers in pseudospectral optimal control theory. The Ross–Fahroo lemma and the Ross–Fahroo pseudospectral method are named after her. In 2010, she received the AIAA Mechanics and Control of Flight Award for fundamental contributions to flight mechanics.
GPOPS-II is a general-purpose MATLAB software for solving continuous optimal control problems using hp-adaptive Gaussian quadrature collocation and sparse nonlinear programming. The acronym GPOPS stands for "General Purpose OPtimal Control Software", and the Roman numeral "II" refers to the fact that GPOPS-II is the second software of its type.