Affine scaling

[Figure (Karmarkar.svg): The affine scaling method is an interior point method, meaning that it forms a trajectory of points strictly inside the feasible region of a linear program (as opposed to the simplex algorithm, which walks the corners of the feasible region).]

In mathematical optimization, affine scaling is an algorithm for solving linear programming problems. Specifically, it is an interior point method, discovered by Soviet mathematician I. I. Dikin in 1967 and reinvented in the U.S. in the mid-1980s.

History

Affine scaling has a history of multiple discovery. It was first published by I. I. Dikin of the Energy Systems Institute of the Russian Academy of Sciences (at the time the Siberian Energy Institute of the USSR Academy of Sciences) in a 1967 paper in Doklady Akademii Nauk SSSR, followed by a proof of its convergence in 1974. [1] Dikin's work went largely unnoticed until the 1984 discovery of Karmarkar's algorithm, the first practical polynomial-time algorithm for linear programming. The importance and complexity of Karmarkar's method prompted mathematicians to search for a simpler version.

Several groups then independently came up with a variant of Karmarkar's algorithm. E. R. Barnes at IBM, [2] a team led by R. J. Vanderbei at AT&T, [3] and several others replaced the projective transformations that Karmarkar used with affine ones. After a few years, it was realized that the "new" affine scaling algorithms were in fact reinventions of the decades-old results of Dikin. [1] [4] Of the re-discoverers, only Barnes and Vanderbei et al. managed to produce an analysis of affine scaling's convergence properties. Karmarkar, who had also arrived at affine scaling in this timeframe, mistakenly believed that it converged as quickly as his own algorithm. [5]:346

Algorithm

Affine scaling works in two phases: the first finds a feasible point from which to start optimizing, and the second performs the actual optimization while staying strictly inside the feasible region.

Both phases solve linear programs in equality form, viz.

minimize cx
subject to Ax = b, x ≥ 0.
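As a purely illustrative example (not taken from the source), the inequality-constrained program "minimize −x1 − 2x2 subject to x1 + x2 ≤ 4, x1, x2 ≥ 0" can be put into this equality form by adding a slack variable s ≥ 0, so that x = (x1, x2, s):

    import numpy as np

    # Hypothetical toy data: minimize -x1 - 2*x2 subject to x1 + x2 <= 4,
    # x1, x2 >= 0, rewritten as x1 + x2 + s = 4 with a slack variable s >= 0.
    A = np.array([[1.0, 1.0, 1.0]])   # single equality constraint
    b = np.array([4.0])
    c = np.array([-1.0, -2.0, 0.0])   # the slack variable has zero cost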

These problems are solved using an iterative method, which conceptually proceeds by plotting a trajectory of points strictly inside the feasible region of a problem, computing projected gradient descent steps in a re-scaled version of the problem, then scaling the step back to the original problem. The scaling ensures that the algorithm can continue to take large steps even when the point under consideration is close to the feasible region's boundary. [5]:337

Formally, the iterative method at the heart of affine scaling takes as inputs A, b, c, an initial guess x0 > 0 that is strictly feasible (i.e., Ax0 = b), a tolerance ε and a stepsize β. It then proceeds by iterating a rescale-and-step update of the kind described above; a sketch is given below. [1]:111
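The precise update rules vary between presentations; the NumPy sketch below shows one common form of the iteration, with D_k = diag(x_k), a dual estimate w_k, reduced costs r_k = c − Aᵀw_k, and a scaled step. The function name, the dense linear solve, and the particular stopping and unboundedness tests are assumptions made for the sake of a compact example, not the exact formulation of the cited references.

    import numpy as np

    def affine_scaling(A, b, c, x0, beta=0.5, eps=1e-8, max_iter=1000):
        """Illustrative sketch of the affine scaling iteration (phase II).

        Assumes x0 > 0 is strictly feasible (A @ x0 == b) and 0 < beta < 1.
        """
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            D2 = np.diag(x ** 2)                      # D_k^2, with D_k = diag(x_k)
            # Dual estimate: solve (A D^2 A^T) w = A D^2 c
            w = np.linalg.solve(A @ D2 @ A.T, A @ D2 @ c)
            r = c - A.T @ w                           # reduced costs
            if np.all(r >= -eps) and x @ r < eps:     # duality-gap-style stopping test
                return x
            dx = -D2 @ r                              # step direction; satisfies A @ dx == 0
            if np.all(dx >= 0):                       # cost decreases along a feasible ray
                raise ValueError("problem appears unbounded")
            # Normalizing by ||D_k r|| keeps the next iterate strictly positive for beta < 1.
            x = x + beta * dx / np.linalg.norm(x * r)
        return x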

Initialization

Phase I, the initialization, solves an auxiliary problem with an additional variable u and uses the result to derive an initial point for the original problem. Let x0 be an arbitrary, strictly positive point; it need not be feasible for the original problem. The infeasibility of x0 is measured by the vector

v = b − Ax0.

If v = 0, x0 is feasible. If it is not, phase I solves the auxiliary problem

minimize u
subject to Ax + uv = b, x ≥ 0, u ≥ 0.

This problem has the right form for solution by the above iterative algorithm, [note 1] and

(x, u) = (x0, 1)

is a feasible initial point for it. Solving the auxiliary problem gives

(x*, u*).

If u* = 0, then x* ≥ 0 is feasible in the original problem (though not necessarily strictly interior), while if u* > 0, the original problem is infeasible. [5]:343
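Continuing the illustrative sketch from above (and reusing the hypothetical affine_scaling routine), phase I might be organized as follows; the all-ones starting guess, the tolerance used to decide whether u* is zero, and the function name are assumptions of this sketch.

    import numpy as np

    def affine_scaling_phase1(A, b, beta=0.5, eps=1e-8):
        """Illustrative sketch of phase I: find a feasible point of A x = b, x >= 0."""
        n = A.shape[1]
        x0 = np.ones(n)                          # arbitrary strictly positive guess
        v = b - A @ x0                           # infeasibility of x0
        if np.allclose(v, 0.0):
            return x0                            # x0 is already feasible
        # Auxiliary problem: minimize u subject to A x + u v = b, x >= 0, u >= 0.
        A_aux = np.hstack([A, v.reshape(-1, 1)])
        c_aux = np.zeros(n + 1)
        c_aux[-1] = 1.0                          # the objective is just u
        z0 = np.append(x0, 1.0)                  # (x, u) = (x0, 1) satisfies A x0 + 1*v = b
        z = affine_scaling(A_aux, b, c_aux, z0, beta=beta, eps=eps)
        x_star, u_star = z[:n], z[n]
        if u_star > eps:
            raise ValueError("original problem appears infeasible")
        return x_star                            # feasible, though possibly on the boundary

As the text notes, the point returned by phase I need not be strictly interior, so a practical implementation would still have to handle components of x* that have reached zero before starting phase II.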

Analysis

While easy to state, affine scaling was found hard to analyze. Its convergence depends on the step size β. For step sizes β ≤ 2/3, Vanderbei's variant of affine scaling has been proven to converge, while for β > 0.995, an example problem is known that converges to a suboptimal value. [5]:342 Other variants of the algorithm have been shown to exhibit chaotic behavior even on small problems when β > 2/3. [6] [7]

Notes

  1. The structure in the auxiliary problem permits some simplification of the formulas. [5] :344

References

  1. Vanderbei, R. J.; Lagarias, J. C. (1990). "I. I. Dikin's convergence result for the affine-scaling algorithm". Mathematical developments arising from linear programming (Brunswick, ME, 1988). Contemporary Mathematics. Vol. 114. Providence, RI: American Mathematical Society. pp. 109–119. doi:10.1090/conm/114/1097868. MR 1097868.
  2. Barnes, Earl R. (1986). "A variation on Karmarkar's algorithm for solving linear programming problems". Mathematical Programming. 36 (2): 174–182. doi:10.1007/BF02592024.
  3. Vanderbei, Robert J.; Meketon, Marc S.; Freedman, Barry A. (1986). "A Modification of Karmarkar's Linear Programming Algorithm" (PDF). Algorithmica. 1 (1–4): 395–407. doi:10.1007/BF01840454.
  4. Bayer, D. A.; Lagarias, J. C. (1989). "The nonlinear geometry of linear programming I: Affine and projective scaling trajectories" (PDF). Transactions of the American Mathematical Society. 314 (2): 499. doi:10.1090/S0002-9947-1989-1005525-6.
  5. Vanderbei, Robert J. (2001). Linear Programming: Foundations and Extensions. Springer Verlag. pp. 333–347.
  6. Bruin, H.; Fokkink, R. J.; Gu, G.; Roos, C. (2014). "On the chaotic behavior of the primal–dual affine–scaling algorithm for linear optimization" (PDF). Chaos. 24 (4): 043132. arXiv:1409.6108. Bibcode:2014Chaos..24d3132B. doi:10.1063/1.4902900. PMID 25554052.
  7. Castillo, Ileana; Barnes, Earl R. (2006). "Chaotic Behavior of the Affine Scaling Algorithm for Linear Programming". SIAM J. Optim. 11 (3): 781–795. doi:10.1137/S1052623496314070.

Further reading