Benders decomposition

Last updated November 03, 2024

Benders decomposition (or Benders' decomposition) is a technique in mathematical programming that allows the solution of very large linear programming problems that have a special block structure. This block structure often occurs in applications such as stochastic programming as the uncertainty is usually represented with scenarios. The technique is named after Jacques F. Benders.

The strategy behind Benders decomposition can be summarized as divide-and-conquer. The variables of the original problem are divided into two subsets: a first-stage master problem is solved over the first set of variables, and the values of the second set are determined in a second-stage subproblem for a given first-stage solution. If the subproblem shows that the fixed first-stage decisions are infeasible or suboptimal, so-called Benders cuts are generated and added to the master problem, which is then re-solved. This repeats until no further cuts can be generated. Because Benders decomposition adds new constraints as it progresses towards a solution, the approach is called "row generation". In contrast, Dantzig–Wolfe decomposition uses "column generation".

Methodology

Assume a problem that occurs in two or more stages, where the decisions for the later stages depend on the results of the earlier ones. A first-stage decision can be attempted without knowing in advance whether it is optimal with respect to the later-stage decisions. The problem of choosing this first-stage decision is the master problem. Further stages can then be analyzed as separate subproblems, and information from these subproblems is passed back to the master problem. If the constraints of a subproblem are violated, they are added back to the master problem, which is then re-solved.

The master problem represents an initial convex set which is further constrained by information gathered from the subproblems. Because the feasible space only shrinks as information is added, the objective value of the master problem provides a lower bound on the objective function of the overall problem.

Benders decomposition is applicable to problems with a largely block-diagonal structure.

Mathematical Formulation

Assume a problem of the following structure:

$${\begin{aligned}&{\text{minimize}}&&\mathbf {c} ^{\mathrm {T} }\mathbf {x} +\mathbf {d} ^{\mathrm {T} }\mathbf {y} \\&{\text{subject to}}&&A\mathbf {x} +B\mathbf {y} \geq \mathbf {b} \\&&&\mathbf {y} \in Y\\&&&\mathbf {x} \geq \mathbf {0} \end{aligned}}$$

Here $A$ and $B$ are the constraint matrices shared by both stages of variables and $Y$ is the feasible set for $\mathbf {y}$. Notice that for any fixed $\mathbf {\bar {y}} \in Y$, the residual problem is

$${\begin{aligned}&{\text{minimize}}&&\mathbf {c} ^{\mathrm {T} }\mathbf {x} +\mathbf {d} ^{\mathrm {T} }\mathbf {\bar {y}} \\&{\text{subject to}}&&A\mathbf {x} \geq \mathbf {b} -B\mathbf {\bar {y}} \\&&&\mathbf {x} \geq \mathbf {0} \end{aligned}}$$

The dual of the residual problem is

$${\begin{aligned}&{\text{maximize}}&&(\mathbf {b} -B\mathbf {\bar {y}} )^{\mathrm {T} }\mathbf {u} +\mathbf {d} ^{\mathrm {T} }\mathbf {\bar {y}} \\&{\text{subject to}}&&A^{\mathrm {T} }\mathbf {u} \leq \mathbf {c} \\&&&\mathbf {u} \geq \mathbf {0} \end{aligned}}$$

Using the dual representation of the residual problem, the original problem can be rewritten as an equivalent minimax problem

$$\min _{\mathbf {y} \in Y}\left[\mathbf {d} ^{\mathrm {T} }\mathbf {y} +\max _{\mathbf {u} \geq \mathbf {0} }\left\{(\mathbf {b} -B\mathbf {y} )^{\mathrm {T} }\mathbf {u} \mid A^{\mathrm {T} }\mathbf {u} \leq \mathbf {c} \right\}\right].$$
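As a sanity check on the minimax reformulation, the following sketch evaluates both forms on a tiny instance. The instance (minimize $2x+3y$ subject to $x+y\geq 4$, $-x\geq -2$, $x\geq 0$, $y\in \{0,1,2,3\}$) is hypothetical and not taken from the text. The extreme points and rays of the dual feasible set were worked out by hand for this instance only; a practical implementation would obtain them from an LP solver instead.

```python
# Toy instance (hypothetical): minimize 2x + 3y
# subject to x + y >= 4, -x >= -2, x >= 0, y in Y = {0, 1, 2, 3}.
c, d = 2.0, 3.0
B = [1.0, 0.0]      # column of B for the single y variable
b = [4.0, -2.0]
Y = [0, 1, 2, 3]

# Dual feasible set {u >= 0 : u1 - u2 <= c}, described by hand for this instance.
EXTREME_POINTS = [(0.0, 0.0), (2.0, 0.0)]
EXTREME_RAYS = [(1.0, 1.0), (0.0, 1.0)]


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def inner_max(y):
    """Inner maximum of (b - B y)^T u, or +inf if a ray improves it forever."""
    r = [bi - Bi * y for bi, Bi in zip(b, B)]
    if any(dot(r, ray) > 0 for ray in EXTREME_RAYS):
        return float("inf")
    return max(dot(r, u) for u in EXTREME_POINTS)


def primal_residual(y):
    """min c x s.t. x >= 4 - y, x <= 2, x >= 0, solved by inspection."""
    x = max(0.0, 4.0 - y)
    return c * x if x <= 2.0 else float("inf")  # +inf marks infeasibility


minimax = {y: d * y + inner_max(y) for y in Y}
direct = {y: d * y + primal_residual(y) for y in Y}
best_y = min(minimax, key=minimax.get)
```

On this instance the two dictionaries agree for every $y$: the choices $y=0$ and $y=1$ make the residual problem infeasible (value $+\infty$ in both forms), and the minimum is attained at $y=2$.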

Benders decomposition relies on an iterative procedure that chooses successive values of $\mathbf {y}$ without considering the inner problem except through a set of cut constraints that are created through a pass-back mechanism from the maximization problem. Although the minimax formulation is written in terms of $(\mathbf {u} ,\mathbf {y} )$ , for an optimal $\mathbf {\bar {y}}$ the corresponding $\mathbf {\bar {x}}$ can be found by solving the original problem with $\mathbf {\bar {y}}$ fixed.

Master Problem Formulation

The decisions for the first stage problem can be described by the smaller minimization problem

$${\begin{aligned}&{\text{minimize}}&&z\\&{\text{subject to}}&&\{{\text{cuts}}\}\\&&&\mathbf {y} \in Y\end{aligned}}$$

Initially the set of cuts is empty. Solving this master problem constitutes a "first guess" at an optimal solution to the overall problem, with the value of $z$ unbounded below and $\mathbf {y}$ taking any value in $Y$.

The set of cuts is filled in over a sequence of iterations by solving the inner maximization problem of the minimax formulation. The cuts both guide the master problem towards an optimal $\mathbf {y}$, if one exists, and ensure that $\mathbf {y}$ is feasible for the full problem. Together, the cuts define the relationship between $\mathbf {y}$, $z$, and implicitly $\mathbf {x}$.

Because $z$ starts unconstrained and each iteration only adds constraints, the feasible region of the master problem can only shrink. The optimal value of the master problem at any iteration is therefore a lower bound on the optimal value of the overall problem. If for some $\mathbf {\bar {y}}$ the objective value of the master problem equals the optimal value of the inner problem, then by duality theory the solution is optimal.
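A minimal sketch of such a master problem is given below, assuming a single scalar $y$ and a small finite set $Y$ so that the problem can be solved by enumeration rather than by a solver. The function name `solve_master`, the `(const, slope)` cut representation, and the cut values in the usage lines are illustrative assumptions, not part of the method itself.

```python
def solve_master(Y, optimality_cuts, feasibility_cuts, z_floor=float("-inf")):
    """Minimise z over y in a finite set Y subject to the cuts gathered so far.

    optimality_cuts:  list of (const, slope) meaning z >= const + slope * y
    feasibility_cuts: list of (const, slope) meaning const + slope * y <= 0
    Returns (y, z), or None when the cuts exclude every y in Y.
    """
    best = None
    for y in Y:
        if any(k + s * y > 0 for k, s in feasibility_cuts):
            continue  # y is ruled out by a feasibility cut
        # z only has to satisfy the optimality cuts, so take the tightest one.
        z = max([z_floor] + [k + s * y for k, s in optimality_cuts])
        if best is None or z < best[1]:
            best = (y, z)
    return best


Y = [0, 1, 2, 3]
# No cuts yet: z is unbounded below and any y in Y is acceptable.
first_guess = solve_master(Y, [], [])
# One optimality cut z >= 8 + y and one feasibility cut 2 - y <= 0.
after_cuts = solve_master(Y, [(8.0, 1.0)], [(2.0, -1.0)])
```

With no cuts the sketch returns the first element of $Y$ with $z=-\infty$; after the two cuts it returns $y=2$ with $z=10$. The master value never decreases as cuts are added, which is the lower-bound property described above.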

Subproblem Formulation

The subproblem considers the suggested solution $\mathbf {\bar {y}}$ to the master problem and solves the inner maximization problem from the minimax formulation. The inner problem is formulated using the dual representation

$${\begin{aligned}&{\text{maximize}}&&(\mathbf {b} -B\mathbf {\bar {y}} )^{\mathrm {T} }\mathbf {u} +\mathbf {d} ^{\mathrm {T} }\mathbf {\bar {y}} \\&{\text{subject to}}&&A^{\mathrm {T} }\mathbf {u} \leq \mathbf {c} \\&&&\mathbf {u} \geq \mathbf {0} \end{aligned}}$$

While the master problem provides a lower bound on the value of the problem, the subproblem is used to get an upper bound. The result of solving the subproblem for any given $\mathbf {\bar {y}}$ can either be a finite optimal value for which an extreme point $\mathbf {\bar {u}}$ can be found, an unbounded solution for which an extreme ray $\mathbf {\bar {u}}$ in the recession cone can be found, or a finding that the subproblem is infeasible.
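The three outcomes can be illustrated with a small sketch. It assumes a scalar $y$ and that the extreme points and extreme rays of the dual feasible set $\{\mathbf {u} \geq \mathbf {0} : A^{\mathrm {T} }\mathbf {u} \leq \mathbf {c} \}$ are already listed, which is only realistic for toy problems; in practice an LP solver reports the status and a certificate. The function name `solve_subproblem` and the data in the usage lines are hypothetical.

```python
def solve_subproblem(y_bar, b, B, d, extreme_points, extreme_rays):
    """Classify the dual subproblem for a fixed scalar y_bar.

    Returns ("infeasible", None, None) if the dual feasible set is empty,
    ("unbounded", ray, None) if some extreme ray has positive objective, or
    ("optimal", point, value) otherwise.
    """
    if not extreme_points:
        # A nonempty set inside u >= 0 always has at least one extreme point.
        return ("infeasible", None, None)
    r = [bi - Bi * y_bar for bi, Bi in zip(b, B)]  # b - B * y_bar

    def objective(u):
        return sum(ri * ui for ri, ui in zip(r, u))

    for ray in extreme_rays:
        if objective(ray) > 0:
            return ("unbounded", ray, None)
    point = max(extreme_points, key=objective)
    return ("optimal", point, objective(point) + d * y_bar)


# Hypothetical data: minimize 2x + 3y s.t. x + y >= 4, -x >= -2, x >= 0.
b, B, d = [4.0, -2.0], [1.0, 0.0], 3.0
POINTS = [(0.0, 0.0), (2.0, 0.0)]
RAYS = [(1.0, 1.0), (0.0, 1.0)]

at_zero = solve_subproblem(0, b, B, d, POINTS, RAYS)
at_three = solve_subproblem(3, b, B, d, POINTS, RAYS)
```

Here `at_zero` is the unbounded case with the ray $(1,1)$, reflecting that $y=0$ leaves no feasible $x$, while `at_three` is the finite case with extreme point $(2,0)$ and value $11$.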

Procedure

At a high level, the procedure will iteratively consider the master problem and subproblem. Each iteration provides an updated upper and lower bound on the optimal objective value. The result of the subproblem either provides a new constraint to add to the master problem or a certificate that no finite optimal solution exists for the problem. The procedure terminates when it is shown that no finite optimal solution exists or when the gap between the upper and lower bound is sufficiently small. In such a case, the value of $\mathbf {\bar {x}}$ is determined by solving the primal residual problem fixing $\mathbf {\bar {y}}$ .

Formally, the procedure begins with the lower bound set to $-\infty$, the upper bound set to $+\infty$, and the set of cuts in the master problem empty. An initial solution is produced by selecting any $\mathbf {\bar {y}} \in Y$. Then the iterative procedure begins and continues until the gap between the upper and lower bound is at most $\epsilon$ or it is shown that no finite optimal solution exists.

The first step of each iteration begins by updating the upper bound by solving the subproblem given the most recent value of $\mathbf {\bar {y}}$ . There are three possible outcomes from solving the subproblem.

In the first case, the objective value of the subproblem is unbounded above. By duality theory, when a dual problem has an unbounded objective, the corresponding primal problem is infeasible. This means that the choice of $\mathbf {\bar {y}}$ does not satisfy $A\mathbf {x} +B\mathbf {\bar {y}} \geq \mathbf {b}$ for any $\mathbf {x} \geq \mathbf {0}$. This solution can be removed from the master problem by taking an extreme ray $\mathbf {\bar {u}}$ that certifies that the subproblem has an unbounded objective and adding a constraint to the master problem asserting that $(\mathbf {b} -B\mathbf {y} )^{\mathrm {T} }\mathbf {\bar {u}} \leq 0$.
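To make the feasibility cut concrete, the sketch below expands $(\mathbf {b} -B\mathbf {y} )^{\mathrm {T} }\mathbf {\bar {u}}$ into a constant plus a linear term in $\mathbf {y}$, which is the form a master problem would store. The helper names `feasibility_cut` and `violates`, and the numbers in the usage lines, are assumptions made for illustration.

```python
def feasibility_cut(b, B, ray):
    """Return (const, coeffs) such that the cut reads const + coeffs . y <= 0.

    Uses (b - B y)^T u = b^T u - (B^T u)^T y for a ray u.
    b: list of m numbers, B: m rows of n numbers, ray: list of m numbers.
    """
    m, n = len(B), len(B[0])
    const = sum(b[i] * ray[i] for i in range(m))
    coeffs = [-sum(B[i][j] * ray[i] for i in range(m)) for j in range(n)]
    return const, coeffs


def violates(cut, y):
    """True if the point y is cut off by the feasibility cut."""
    const, coeffs = cut
    return const + sum(cj * yj for cj, yj in zip(coeffs, y)) > 0


# Hypothetical data: constraints x + y >= 4 and -x >= -2, ray u = (1, 1).
cut = feasibility_cut([4.0, -2.0], [[1.0], [0.0]], [1.0, 1.0])
```

For this data the cut is $2-y\leq 0$, that is $y\geq 2$: adding the two constraints eliminates $x$ and shows that no $x$ can satisfy both when $y<2$. The point $y=0$ violates the cut, while $y=2$ satisfies it.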

In the second case, the subproblem is infeasible. Since the dual feasible space to the problem is empty, either the original problem is not feasible or there is a ray in the primal problem that certifies the objective value is unbounded below. In either case, the procedure terminates.

In the third case, the subproblem has a finite optimal solution. By duality theory for linear programs, the optimal value of the subproblem equals the optimal value of the original problem with $\mathbf {y}$ fixed to $\mathbf {\bar {y}}$. The upper bound can therefore be updated to the optimal value of the subproblem if that value is lower than the current upper bound. Given an optimal extreme point $\mathbf {\bar {u}}$, the subproblem also yields a new constraint that requires the master problem to account for the objective value under this particular solution: $z\geq (\mathbf {b} -B\mathbf {y} )^{\mathrm {T} }\mathbf {\bar {u}} +\mathbf {d} ^{\mathrm {T} }\mathbf {y}$. This strictly increases the value of $z$ at the solution $\mathbf {\bar {y}}$ in the master problem if the choice of $\mathbf {\bar {y}}$ was suboptimal.
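The optimality cut can be expanded in the same spirit. The sketch below rewrites $(\mathbf {b} -B\mathbf {y} )^{\mathrm {T} }\mathbf {\bar {u}} +\mathbf {d} ^{\mathrm {T} }\mathbf {y}$ as a constant plus a linear term in $\mathbf {y}$. The helper names `optimality_cut` and `cut_value` and the usage data are hypothetical.

```python
def optimality_cut(b, B, d, u):
    """Return (const, coeffs) such that the cut reads z >= const + coeffs . y.

    Uses (b - B y)^T u + d^T y = b^T u + (d - B^T u)^T y.
    b: list of m numbers, B: m rows of n numbers, d: list of n numbers.
    """
    m, n = len(B), len(d)
    const = sum(b[i] * u[i] for i in range(m))
    coeffs = [d[j] - sum(B[i][j] * u[i] for i in range(m)) for j in range(n)]
    return const, coeffs


def cut_value(cut, y):
    """Right-hand side of the cut at the point y."""
    const, coeffs = cut
    return const + sum(cj * yj for cj, yj in zip(coeffs, y))


# Hypothetical data: minimize 2x + 3y s.t. x + y >= 4, -x >= -2, x >= 0,
# with dual extreme point u = (2, 0) obtained at y_bar = 3.
cut = optimality_cut([4.0, -2.0], [[1.0], [0.0]], [3.0], [2.0, 0.0])
```

For this data the cut is $z\geq 8+y$. At $\bar {y}=3$ it evaluates to $11$, the subproblem value there, and at any other $y$ it gives a lower bound on the true cost, which is why the master problem can use it to discard $y=3$ in favour of $y=2$.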

Finally, the last part of each iteration creates a new solution to the master problem by solving it with the new constraint. The new solution $(\mathbf {\bar {y}} ,z)$ is used to update the lower bound. If the gap between the best upper and lower bounds is at most $\epsilon$, the procedure terminates and the value of $\mathbf {\bar {x}}$ is determined by solving the primal residual problem with $\mathbf {\bar {y}}$ fixed. Otherwise, the procedure continues to the next iteration.
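The whole procedure can be sketched end to end under strong simplifying assumptions: $y$ is a single scalar, $Y$ is a small finite set so the master problem is solved by enumeration, and the extreme points and rays of the dual feasible set are listed by hand instead of being returned by an LP solver. The function `benders` and the toy data are hypothetical and meant only to show how the bounds and cuts interact.

```python
INF = float("inf")


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def benders(b, B, d, Y, points, rays, eps=1e-9, max_iter=100):
    """Benders loop for a scalar y in a finite set Y.

    B is the column of the constraint matrix for y. points / rays are the
    extreme points and rays of {u >= 0 : A^T u <= c}, assumed known here.
    Returns (best_y, lower, upper); raises if no finite optimum exists.
    """
    if not points:
        raise ValueError("dual subproblem infeasible: no finite optimum")
    opt_cuts, feas_cuts = [], []   # (const, slope) pairs
    lower, upper, best_y = -INF, INF, None
    y_bar = Y[0]                   # any element of Y as the initial guess

    for _ in range(max_iter):
        # Step 1: subproblem for the current y_bar.
        r = [bi - Bi * y_bar for bi, Bi in zip(b, B)]
        ray = next((w for w in rays if dot(r, w) > 0), None)
        if ray is not None:
            # Unbounded dual: feasibility cut (b - B y)^T ray <= 0.
            feas_cuts.append((dot(b, ray), -dot(B, ray)))
        else:
            u = max(points, key=lambda p: dot(r, p))
            value = dot(r, u) + d * y_bar
            if value < upper:
                upper, best_y = value, y_bar
            # Optimality cut z >= (b - B y)^T u + d y.
            opt_cuts.append((dot(b, u), d - dot(B, u)))

        # Step 2: master problem, solved by enumeration over Y.
        candidates = [
            (max([-INF] + [k + s * y for k, s in opt_cuts]), y)
            for y in Y
            if all(k + s * y <= 0 for k, s in feas_cuts)
        ]
        if not candidates:
            raise ValueError("every y in Y is cut off: problem infeasible")
        lower, y_bar = min(candidates)
        if upper - lower <= eps:
            break
    return best_y, lower, upper


# Toy instance: minimize 2x + 3y s.t. x + y >= 4, -x >= -2, x >= 0.
best_y, lower, upper = benders(
    b=[4.0, -2.0], B=[1.0, 0.0], d=3.0, Y=[0, 1, 2, 3],
    points=[(0.0, 0.0), (2.0, 0.0)], rays=[(1.0, 1.0), (0.0, 1.0)],
)
```

On this instance the first iteration starts at $y=0$ and produces the feasibility cut $y\geq 2$. The second iteration evaluates $y=2$, sets the upper bound to $10$, and adds the optimality cut $z\geq 8+y$, after which the master value is also $10$ and the loop stops. The matching $x=2$ then follows from the primal residual problem with $y=2$ fixed.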

References

Benders, J. F. (Sept. 1962), "Partitioning procedures for solving mixed-variables programming problems", Numerische Mathematik 4(3): 238–252.
Lasdon, Leon S. (2002), Optimization Theory for Large Systems (reprint of the 1970 Macmillan ed.), Mineola, New York: Dover Publications, pp. xiii+523, MR 1888251 .
