In mathematical optimization, Dantzig's simplex algorithm (or simplex method) is a popular algorithm for linear programming. [1]
The name of the algorithm is derived from the concept of a simplex and was suggested by T. S. Motzkin. [2] Simplices are not actually used in the method, but one interpretation of it is that it operates on simplicial cones , and these become proper simplices with an additional constraint. [3] [4] [5] [6] The simplicial cones in question are the corners (i.e., the neighborhoods of the vertices) of a geometric object called a polytope. The shape of this polytope is defined by the constraints applied to the objective function.
George Dantzig worked on planning methods for the US Army Air Force during World War II using a desk calculator. During 1946, his colleague challenged him to mechanize the planning process to distract him from taking another job. Dantzig formulated the problem as linear inequalities inspired by the work of Wassily Leontief, however, at that time he didn't include an objective as part of his formulation. Without an objective, a vast number of solutions can be feasible, and therefore to find the "best" feasible solution, military-specified "ground rules" must be used that describe how goals can be achieved as opposed to specifying a goal itself. Dantzig's core insight was to realize that most such ground rules can be translated into a linear objective function that needs to be maximized. [7] Development of the simplex method was evolutionary and happened over a period of about a year. [8]
After Dantzig included an objective function as part of his formulation during mid-1947, the problem was mathematically more tractable. Dantzig realized that one of the unsolved problems that he had mistaken as homework in his professor Jerzy Neyman's class (and actually later solved), was applicable to finding an algorithm for linear programs. This problem involved finding the existence of Lagrange multipliers for general linear programs over a continuum of variables, each bounded between zero and one, and satisfying linear constraints expressed in the form of Lebesgue integrals. Dantzig later published his "homework" as a thesis to earn his doctorate. The column geometry used in this thesis gave Dantzig insight that made him believe that the Simplex method would be very efficient. [9]
The simplex algorithm operates on linear programs in the canonical form
with the coefficients of the objective function, is the matrix transpose, and are the variables of the problem, is a p×n matrix, and . There is a straightforward process to convert any linear program into one in standard form, so using this form of linear programs results in no loss of generality.
In geometric terms, the feasible region defined by all values of such that and is a (possibly unbounded) convex polytope. An extreme point or vertex of this polytope is known as basic feasible solution (BFS).
It can be shown that for a linear program in standard form, if the objective function has a maximum value on the feasible region, then it has this value on (at least) one of the extreme points. [10] This in itself reduces the problem to a finite computation since there is a finite number of extreme points, but the number of extreme points is unmanageably large for all but the smallest linear programs. [11]
It can also be shown that, if an extreme point is not a maximum point of the objective function, then there is an edge containing the point so that the value of the objective function is strictly increasing on the edge moving away from the point. [12] If the edge is finite, then the edge connects to another extreme point where the objective function has a greater value, otherwise the objective function is unbounded above on the edge and the linear program has no solution. The simplex algorithm applies this insight by walking along edges of the polytope to extreme points with greater and greater objective values. This continues until the maximum value is reached, or an unbounded edge is visited (concluding that the problem has no solution). The algorithm always terminates because the number of vertices in the polytope is finite; moreover since we jump between vertices always in the same direction (that of the objective function), we hope that the number of vertices visited will be small. [12]
The solution of a linear program is accomplished in two steps. In the first step, known as Phase I, a starting extreme point is found. Depending on the nature of the program this may be trivial, but in general it can be solved by applying the simplex algorithm to a modified version of the original program. The possible results of Phase I are either that a basic feasible solution is found or that the feasible region is empty. In the latter case the linear program is called infeasible. In the second step, Phase II, the simplex algorithm is applied using the basic feasible solution found in Phase I as a starting point. The possible results from Phase II are either an optimum basic feasible solution or an infinite edge on which the objective function is unbounded above. [13] [14] [15]
The transformation of a linear program to one in standard form may be accomplished as follows. [16] First, for each variable with a lower bound other than 0, a new variable is introduced representing the difference between the variable and bound. The original variable can then be eliminated by substitution. For example, given the constraint
a new variable, , is introduced with
The second equation may be used to eliminate from the linear program. In this way, all lower bound constraints may be changed to non-negativity restrictions.
Second, for each remaining inequality constraint, a new variable, called a slack variable , is introduced to change the constraint to an equality constraint. This variable represents the difference between the two sides of the inequality and is assumed to be non-negative. For example, the inequalities
are replaced with
It is much easier to perform algebraic manipulation on inequalities in this form. In inequalities where ≥ appears such as the second one, some authors refer to the variable introduced as a surplus variable.
Third, each unrestricted variable is eliminated from the linear program. This can be done in two ways, one is by solving for the variable in one of the equations in which it appears and then eliminating the variable by substitution. The other is to replace the variable with the difference of two restricted variables. For example, if is unrestricted then write
The equation may be used to eliminate from the linear program.
When this process is complete the feasible region will be in the form
It is also useful to assume that the rank of is the number of rows. This results in no loss of generality since otherwise either the system has redundant equations which can be dropped, or the system is inconsistent and the linear program has no solution. [17]
A linear program in standard form can be represented as a tableau of the form
The first row defines the objective function and the remaining rows specify the constraints. The zero in the first column represents the zero vector of the same dimension as the vector (different authors use different conventions as to the exact layout). If the columns of can be rearranged so that it contains the identity matrix of order (the number of rows in ) then the tableau is said to be in canonical form. [18] The variables corresponding to the columns of the identity matrix are called basic variables while the remaining variables are called nonbasic or free variables. If the values of the nonbasic variables are set to 0, then the values of the basic variables are easily obtained as entries in and this solution is a basic feasible solution. The algebraic interpretation here is that the coefficients of the linear equation represented by each row are either , , or some other number. Each row will have column with value , columns with coefficients , and the remaining columns with some other coefficients (these other variables represent our non-basic variables). By setting the values of the non-basic variables to zero we ensure in each row that the value of the variable represented by a in its column is equal to the value at that row.
Conversely, given a basic feasible solution, the columns corresponding to the nonzero variables can be expanded to a nonsingular matrix. If the corresponding tableau is multiplied by the inverse of this matrix then the result is a tableau in canonical form. [19]
Let
be a tableau in canonical form. Additional row-addition transformations can be applied to remove the coefficients cT
B from the objective function. This process is called pricing out and results in a canonical tableau
where zB is the value of the objective function at the corresponding basic feasible solution. The updated coefficients, also known as relative cost coefficients, are the rates of change of the objective function with respect to the nonbasic variables. [14]
The geometrical operation of moving from a basic feasible solution to an adjacent basic feasible solution is implemented as a pivot operation. First, a nonzero pivot element is selected in a nonbasic column. The row containing this element is multiplied by its reciprocal to change this element to 1, and then multiples of the row are added to the other rows to change the other entries in the column to 0. The result is that, if the pivot element is in a row r, then the column becomes the r-th column of the identity matrix. The variable for this column is now a basic variable, replacing the variable which corresponded to the r-th column of the identity matrix before the operation. In effect, the variable corresponding to the pivot column enters the set of basic variables and is called the entering variable, and the variable being replaced leaves the set of basic variables and is called the leaving variable. The tableau is still in canonical form but with the set of basic variables changed by one element. [13] [14]
Let a linear program be given by a canonical tableau. The simplex algorithm proceeds by performing successive pivot operations each of which give an improved basic feasible solution; the choice of pivot element at each step is largely determined by the requirement that this pivot improves the solution.
Since the entering variable will, in general, increase from 0 to a positive number, the value of the objective function will decrease if the derivative of the objective function with respect to this variable is negative. Equivalently, the value of the objective function is increased if the pivot column is selected so that the corresponding entry in the objective row of the tableau is positive.
If there is more than one column so that the entry in the objective row is positive then the choice of which one to add to the set of basic variables is somewhat arbitrary and several entering variable choice rules [20] such as Devex algorithm [21] have been developed.
If all the entries in the objective row are less than or equal to 0 then no choice of entering variable can be made and the solution is in fact optimal. It is easily seen to be optimal since the objective row now corresponds to an equation of the form
By changing the entering variable choice rule so that it selects a column where the entry in the objective row is negative, the algorithm is changed so that it finds the minimum of the objective function rather than the maximum.
Once the pivot column has been selected, the choice of pivot row is largely determined by the requirement that the resulting solution be feasible. First, only positive entries in the pivot column are considered since this guarantees that the value of the entering variable will be nonnegative. If there are no positive entries in the pivot column then the entering variable can take any non-negative value with the solution remaining feasible. In this case the objective function is unbounded below and there is no minimum.
Next, the pivot row must be selected so that all the other basic variables remain positive. A calculation shows that this occurs when the resulting value of the entering variable is at a minimum. In other words, if the pivot column is c, then the pivot row r is chosen so that
is the minimum over all r so that arc > 0. This is called the minimum ratio test. [20] If there is more than one row for which the minimum is achieved then a dropping variable choice rule [22] can be used to make the determination.
Consider the linear program
With the addition of slack variables s and t, this is represented by the canonical tableau
where columns 5 and 6 represent the basic variables s and t and the corresponding basic feasible solution is
Columns 2, 3, and 4 can be selected as pivot columns, for this example column 4 is selected. The values of z resulting from the choice of rows 2 and 3 as pivot rows are 10/1 = 10 and 15/3 = 5 respectively. Of these the minimum is 5, so row 3 must be the pivot row. Performing the pivot produces
Now columns 4 and 5 represent the basic variables z and s and the corresponding basic feasible solution is
For the next step, there are no positive entries in the objective row and in fact
so the minimum value of Z is −20.
In general, a linear program will not be given in the canonical form and an equivalent canonical tableau must be found before the simplex algorithm can start. This can be accomplished by the introduction of artificial variables. Columns of the identity matrix are added as column vectors for these variables. If the b value for a constraint equation is negative, the equation is negated before adding the identity matrix columns. This does not change the set of feasible solutions or the optimal solution, and it ensures that the slack variables will constitute an initial feasible solution. The new tableau is in canonical form but it is not equivalent to the original problem. So a new objective function, equal to the sum of the artificial variables, is introduced and the simplex algorithm is applied to find the minimum; the modified linear program is called the Phase I problem. [23]
The simplex algorithm applied to the Phase I problem must terminate with a minimum value for the new objective function since, being the sum of nonnegative variables, its value is bounded below by 0. If the minimum is 0 then the artificial variables can be eliminated from the resulting canonical tableau producing a canonical tableau equivalent to the original problem. The simplex algorithm can then be applied to find the solution; this step is called Phase II. If the minimum is positive then there is no feasible solution for the Phase I problem where the artificial variables are all zero. This implies that the feasible region for the original problem is empty, and so the original problem has no solution. [13] [14] [24]
Consider the linear program
This is represented by the (non-canonical) tableau
Introduce artificial variables u and v and objective function W = u + v, giving a new tableau
The equation defining the original objective function is retained in anticipation of Phase II.
By construction, u and v are both basic variables since they are part of the initial identity matrix. However, the objective function W currently assumes that u and v are both 0. In order to adjust the objective function to be the correct value where u = 10 and v = 15, add the third and fourth rows to the first row giving
Select column 5 as a pivot column, so the pivot row must be row 4, and the updated tableau is
Now select column 3 as a pivot column, for which row 3 must be the pivot row, to get
The artificial variables are now 0 and they may be dropped giving a canonical tableau equivalent to the original problem:
This is, fortuitously, already optimal and the optimum value for the original linear program is −130/7.
The tableau form used above to describe the algorithm lends itself to an immediate implementation in which the tableau is maintained as a rectangular (m + 1)-by-(m + n + 1) array. It is straightforward to avoid storing the m explicit columns of the identity matrix that will occur within the tableau by virtue of B being a subset of the columns of [A, I]. This implementation is referred to as the "standard simplex algorithm". The storage and computation overhead is such that the standard simplex method is a prohibitively expensive approach to solving large linear programming problems.
In each simplex iteration, the only data required are the first row of the tableau, the (pivotal) column of the tableau corresponding to the entering variable and the right-hand-side. The latter can be updated using the pivotal column and the first row of the tableau can be updated using the (pivotal) row corresponding to the leaving variable. Both the pivotal column and pivotal row may be computed directly using the solutions of linear systems of equations involving the matrix B and a matrix-vector product using A. These observations motivate the "revised simplex algorithm", for which implementations are distinguished by their invertible representation of B. [25]
In large linear-programming problems A is typically a sparse matrix and, when the resulting sparsity of B is exploited when maintaining its invertible representation, the revised simplex algorithm is much more efficient than the standard simplex method. Commercial simplex solvers are based on the revised simplex algorithm. [24] [25] [26] [27] [28]
If the values of all basic variables are strictly positive, then a pivot must result in an improvement in the objective value. When this is always the case no set of basic variables occurs twice and the simplex algorithm must terminate after a finite number of steps. Basic feasible solutions where at least one of the basic variables is zero are called degenerate and may result in pivots for which there is no improvement in the objective value. In this case there is no actual change in the solution but only a change in the set of basic variables. When several such pivots occur in succession, there is no improvement; in large industrial applications, degeneracy is common and such "stalling" is notable. Worse than stalling is the possibility the same set of basic variables occurs twice, in which case, the deterministic pivoting rules of the simplex algorithm will produce an infinite loop, or "cycle". While degeneracy is the rule in practice and stalling is common, cycling is rare in practice. A discussion of an example of practical cycling occurs in Padberg. [24] Bland's rule prevents cycling and thus guarantees that the simplex algorithm always terminates. [24] [29] [30] Another pivoting algorithm, the criss-cross algorithm never cycles on linear programs. [31]
History-based pivot rules such as Zadeh's rule and Cunningham's rule also try to circumvent the issue of stalling and cycling by keeping track of how often particular variables are being used and then favor such variables that have been used least often.
The simplex method is remarkably efficient in practice and was a great improvement over earlier methods such as Fourier–Motzkin elimination. However, in 1972, Klee and Minty [32] gave an example, the Klee–Minty cube, showing that the worst-case complexity of simplex method as formulated by Dantzig is exponential time. Since then, for almost every variation on the method, it has been shown that there is a family of linear programs for which it performs badly. It is an open question if there is a variation with polynomial time, although sub-exponential pivot rules are known. [33]
In 2014, it was proved[ citation needed ] that a particular variant of the simplex method is NP-mighty, i.e., it can be used to solve, with polynomial overhead, any problem in NP implicitly during the algorithm's execution. Moreover, deciding whether a given variable ever enters the basis during the algorithm's execution on a given input, and determining the number of iterations needed for solving a given problem, are both NP-hard problems. [34] At about the same time it was shown that there exists an artificial pivot rule for which computing its output is PSPACE-complete. [35] In 2015, this was strengthened to show that computing the output of Dantzig's pivot rule is PSPACE-complete. [36]
Analyzing and quantifying the observation that the simplex algorithm is efficient in practice despite its exponential worst-case complexity has led to the development of other measures of complexity. The simplex algorithm has polynomial-time average-case complexity under various probability distributions, with the precise average-case performance of the simplex algorithm depending on the choice of a probability distribution for the random matrices. [37] [38] Another approach to studying "typical phenomena" uses Baire category theory from general topology, and to show that (topologically) "most" matrices can be solved by the simplex algorithm in a polynomial number of steps.[ citation needed ]
Another method to analyze the performance of the simplex algorithm studies the behavior of worst-case scenarios under small perturbation – are worst-case scenarios stable under a small change (in the sense of structural stability), or do they become tractable? This area of research, called smoothed analysis, was introduced specifically to study the simplex method. Indeed, the running time of the simplex method on input with noise is polynomial in the number of variables and the magnitude of the perturbations. [39] [40]
Other algorithms for solving linear-programming problems are described in the linear-programming article. Another basis-exchange pivoting algorithm is the criss-cross algorithm. [41] [42] There are polynomial-time algorithms for linear programming that use interior point methods: these include Khachiyan's ellipsoidal algorithm, Karmarkar's projective algorithm, and path-following algorithms. [15] The Big-M method is an alternative strategy for solving a linear program, using a single-phase simplex.
Linear–fractional programming (LFP) is a generalization of linear programming (LP). In LP the objective function is a linear function, while the objective function of a linear–fractional program is a ratio of two linear functions. In other words, a linear program is a fractional–linear program in which the denominator is the constant function having the value one everywhere. A linear–fractional program can be solved by a variant of the simplex algorithm [43] [44] [45] [46] or by the criss-cross algorithm. [47]
In mathematics, Gaussian elimination, also known as row reduction, is an algorithm for solving systems of linear equations. It consists of a sequence of row-wise operations performed on the corresponding matrix of coefficients. This method can also be used to compute the rank of a matrix, the determinant of a square matrix, and the inverse of an invertible matrix. The method is named after Carl Friedrich Gauss (1777–1855). To perform row reduction on a matrix, one uses a sequence of elementary row operations to modify the matrix until the lower left-hand corner of the matrix is filled with zeros, as much as possible. There are three types of elementary row operations:
Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize a multivariate quadratic function subject to linear constraints on the variables. Quadratic programming is a type of nonlinear programming.
Linear programming (LP), also called linear optimization, is a method to achieve the best outcome in a mathematical model whose requirements and objective are represented by linear relationships. Linear programming is a special case of mathematical programming.
Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.
In mathematics, a system of linear equations is a collection of two or more linear equations involving the same variables. For example,
An integer programming problem is a mathematical optimization or feasibility program in which some or all of the variables are restricted to be integers. In many settings the term refers to integer linear programming (ILP), in which the objective function and the constraints are linear.
Branch and bound is a method for solving optimization problems by breaking them down into smaller sub-problems and using a bounding function to eliminate sub-problems that cannot contain the optimal solution. It is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root. The algorithm explores branches of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm.
In mathematics, the kernel of a linear map, also known as the null space or nullspace, is the part of the domain which is mapped to the zero vector of the co-domain; the kernel is always a linear subspace of the domain. That is, given a linear map L : V → W between two vector spaces V and W, the kernel of L is the vector space of all elements v of V such that L(v) = 0, where 0 denotes the zero vector in W, or more symbolically:
In mathematical optimization theory, the linear complementarity problem (LCP) arises frequently in computational mechanics and encompasses the well-known quadratic programming as a special case. It was proposed by Cottle and Dantzig in 1968.
In mathematical optimization, the cutting-plane method is any of a variety of optimization methods that iteratively refine a feasible set or objective function by means of linear inequalities, termed cuts. Such procedures are commonly used to find integer solutions to mixed integer linear programming (MILP) problems, as well as to solve general, not necessarily differentiable convex optimization problems. The use of cutting planes to solve MILP was introduced by Ralph E. Gomory.
Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.
In mathematical optimization, Bland's rule is an algorithmic refinement of the simplex method for linear optimization.
The dual of a given linear program (LP) is another LP that is derived from the original LP in the following schematic way:
Dantzig–Wolfe decomposition is an algorithm for solving linear programming problems with special structure. It was originally developed by George Dantzig and Philip Wolfe and initially published in 1960. Many texts on linear programming have sections dedicated to discussing this decomposition algorithm.
In mathematical optimization, linear-fractional programming (LFP) is a generalization of linear programming (LP). Whereas the objective function in a linear program is a linear function, the objective function in a linear-fractional program is a ratio of two linear functions. A linear program can be regarded as a special case of a linear-fractional program in which the denominator is the constant function 1.
Linear least squares (LLS) is the least squares approximation of linear functions to data. It is a set of formulations for solving statistical problems involved in linear regression, including variants for ordinary (unweighted), weighted, and generalized (correlated) residuals. Numerical methods for linear least squares include inverting the matrix of the normal equations and orthogonal decomposition methods.
In mathematical optimization, the criss-cross algorithm is any of a family of algorithms for linear programming. Variants of the criss-cross algorithm also solve more general problems with linear inequality constraints and nonlinear objective functions; there are criss-cross algorithms for linear-fractional programming problems, quadratic-programming problems, and linear complementarity problems.
The Klee–Minty cube or Klee–Minty polytope is a unit hypercube of variable dimension whose corners have been perturbed. Klee and Minty demonstrated that George Dantzig's simplex algorithm has poor worst-case performance when initialized at one corner of their "squashed cube". On the three-dimensional version, the simplex algorithm and the criss-cross algorithm visit all 8 corners in the worst case.
In mathematical optimization, the revised simplex method is a variant of George Dantzig's simplex method for linear programming.
In the theory of linear programming, a basic feasible solution (BFS) is a solution with a minimal set of non-zero variables. Geometrically, each BFS corresponds to a vertex of the polyhedron of feasible solutions. If there exists an optimal solution, then there exists an optimal BFS. Hence, to find an optimal solution, it is sufficient to consider the BFS-s. This fact is used by the simplex algorithm, which essentially travels from one BFS to another until an optimal solution is found.
These introductions are written for students of computer science and operations research: