Dykstra's projection algorithm

Dykstra's algorithm is a method that computes a point in the intersection of convex sets, and is a variant of the alternating projection method (also called the projections onto convex sets method). In its simplest form, the method finds a point in the intersection of two convex sets by iteratively projecting onto each of the convex sets; it differs from the alternating projection method in that it performs intermediate correction steps using auxiliary variables. A parallel version of the algorithm was developed by Gaffke and Mathar.

The method is named after Richard L. Dykstra, who proposed it in the 1980s.

A key difference between Dykstra's algorithm and the standard alternating projection method occurs when there is more than one point in the intersection of the two sets. In this case, the alternating projection method gives some arbitrary point in this intersection, whereas Dykstra's algorithm gives a specific point: the projection of r onto the intersection, where r is the initial point used in the algorithm.

Algorithm

[Figure: illustration of Dykstra's algorithm]

Dykstra's algorithm finds, for each $r$, the unique point $x^* \in C \cap D$ such that:

$\|x^* - r\| = \min_{x \in C \cap D} \|x - r\|,$

where $C$ and $D$ are convex sets. This problem is equivalent to finding the projection of $r$ onto the set $C \cap D$, which we denote by $\mathcal{P}_{C \cap D}$.

To use Dykstra's algorithm, one must know how to project onto the sets $C$ and $D$ separately.
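
As a purely illustrative sketch (not from the original article), the following Python code defines projection operators for two simple convex sets, a Euclidean ball and a halfspace; the particular sets, function names, and default parameters are assumptions made for the example, since Dykstra's algorithm only requires that each individual projection be computable.

```python
import numpy as np

def project_ball(x, center=np.zeros(2), radius=1.0):
    """Euclidean projection onto the ball {y : ||y - center|| <= radius}."""
    d = x - center
    norm = np.linalg.norm(d)
    return x.copy() if norm <= radius else center + radius * d / norm

def project_halfspace(x, a=np.array([1.0, 1.0]), b=1.0):
    """Euclidean projection onto the halfspace {y : <a, y> <= b}."""
    violation = a @ x - b
    return x.copy() if violation <= 0 else x - (violation / (a @ a)) * a
```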

First, consider the basic alternating projection (aka POCS) method (first studied, in the case when the sets were linear subspaces, by John von Neumann [1]), which initializes $x_0 = r$ and then generates the sequence

$x_{k+1} = \mathcal{P}_C(\mathcal{P}_D(x_k)).$
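
A minimal sketch of this iteration, assuming projection routines such as the illustrative ones above and a fixed iteration count in place of a proper stopping criterion:

```python
import numpy as np

def alternating_projections(r, proj_C, proj_D, iterations=1000):
    """POCS / alternating projections: x_0 = r, x_{k+1} = P_C(P_D(x_k))."""
    x = np.asarray(r, dtype=float)
    for _ in range(iterations):
        x = proj_C(proj_D(x))
    return x

# e.g. alternating_projections(np.array([2.0, 2.0]), project_ball, project_halfspace)
# using the illustrative projections sketched above.
```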

Dykstra's algorithm is of a similar form, but uses additional auxiliary variables. Start with $x_0 = r$, $p_0 = q_0 = 0$ and update by

$y_k = \mathcal{P}_C(x_k + p_k),$
$p_{k+1} = x_k + p_k - y_k,$
$x_{k+1} = \mathcal{P}_D(y_k + q_k),$
$q_{k+1} = y_k + q_k - x_{k+1}.$

Then the sequence $(x_k)$ converges to the solution of the original problem, that is, to the projection of $r$ onto $C \cap D$. For convergence results and a modern perspective on the literature, see [2].
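
The update above translates directly into code. The sketch below is again only an illustration, assuming projection routines are supplied (for instance the ones sketched earlier) and using a fixed iteration count rather than a convergence test; the auxiliary variables p and q are what distinguish it from plain alternating projections.

```python
import numpy as np

def dykstra(r, proj_C, proj_D, iterations=1000):
    """Dykstra's algorithm for two convex sets C and D, given their projections.

    Starts from x_0 = r, p_0 = q_0 = 0 and repeats
        y_k     = P_C(x_k + p_k)
        p_{k+1} = x_k + p_k - y_k
        x_{k+1} = P_D(y_k + q_k)
        q_{k+1} = y_k + q_k - x_{k+1}
    so that x_k converges to the projection of r onto the intersection of C and D.
    """
    x = np.asarray(r, dtype=float)
    p = np.zeros_like(x)
    q = np.zeros_like(x)
    for _ in range(iterations):
        y = proj_C(x + p)
        p = x + p - y
        x = proj_D(y + q)
        q = y + q - x
    return x

# With the illustrative sets above, dykstra(np.array([2.0, 2.0]), project_ball,
# project_halfspace) approximates the projection of (2, 2) onto the intersection
# of the unit ball and the halfspace x + y <= 1, which is (0.5, 0.5).
```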

Related Research Articles

In mathematics, a contraction mapping, or contraction or contractor, on a metric space (M, d) is a function f from M to itself, with the property that there is some real number $0 \leq k < 1$ such that for all x and y in M, $d(f(x), f(y)) \leq k \, d(x, y)$.

Convex set: In geometry, a set whose intersection with every line is a single line segment.

In geometry, a subset of a Euclidean space, or more generally an affine space over the reals, is convex if, given any two points in the subset, the subset contains the whole line segment that joins them. Equivalently, a convex set or a convex region is a subset that intersects every line in a single line segment. For example, a solid cube is a convex set, but anything that is hollow or has an indent, for example, a crescent shape, is not convex.

Affine transformation: Geometric transformation that preserves lines but not necessarily angles or the origin.

In Euclidean geometry, an affine transformation or affinity is a geometric transformation that preserves lines and parallelism, but not necessarily Euclidean distances and angles.

In functional analysis and related areas of mathematics, locally convex topological vector spaces (LCTVS) or locally convex spaces are examples of topological vector spaces (TVS) that generalize normed spaces. They can be defined as topological vector spaces whose topology is generated by translations of balanced, absorbent, convex sets. Alternatively they can be defined as a vector space with a family of seminorms, and a topology can be defined in terms of that family. Although in general such spaces are not necessarily normable, the existence of a convex local base for the zero vector is strong enough for the Hahn–Banach theorem to hold, yielding a sufficiently rich theory of continuous linear functionals.

Multi-task learning (MTL) is a subfield of machine learning in which multiple learning tasks are solved at the same time, while exploiting commonalities and differences across tasks. This can result in improved learning efficiency and prediction accuracy for the task-specific models, when compared to training the models separately. Early versions of MTL were called "hints".

Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.

The Frank–Wolfe algorithm is an iterative first-order optimization algorithm for constrained convex optimization. Also known as the conditional gradient method, reduced gradient algorithm and the convex combination algorithm, the method was originally proposed by Marguerite Frank and Philip Wolfe in 1956. In each iteration, the Frank–Wolfe algorithm considers a linear approximation of the objective function, and moves towards a minimizer of this linear function.

Krein–Milman theorem: On when a space equals the closed convex hull of its extreme points.

In the mathematical theory of functional analysis, the Krein–Milman theorem is a proposition about compact convex sets in locally convex topological vector spaces (TVSs).

In mathematical optimization, the ellipsoid method is an iterative method for minimizing convex functions. When specialized to solving feasible linear optimization problems with rational data, the ellipsoid method is an algorithm which finds an optimal solution in a number of steps that is polynomial in the input size.

Chan's algorithm: Algorithm for finding the convex hull of a set of points in the plane.

In computational geometry, Chan's algorithm, named after Timothy M. Chan, is an optimal output-sensitive algorithm to compute the convex hull of a set of points, in 2- or 3-dimensional space. The algorithm takes $O(n \log h)$ time, where $h$ is the number of vertices of the output (the convex hull). In the planar case, the algorithm combines an $O(n \log n)$ algorithm (such as Graham scan) with Jarvis march ($O(nh)$), in order to obtain an optimal $O(n \log h)$ time. Chan's algorithm is notable because it is much simpler than the Kirkpatrick–Seidel algorithm, and it naturally extends to 3-dimensional space. This paradigm has been independently developed by Frank Nielsen in his Ph.D. thesis.

Subgradient methods are iterative methods for solving convex minimization problems. Originally developed by Naum Z. Shor and others in the 1960s and 1970s, subgradient methods are convergent when applied even to a non-differentiable objective function. When the objective function is differentiable, sub-gradient methods for unconstrained problems use the same search direction as the method of steepest descent.

The difference-map algorithm is a search algorithm for general constraint satisfaction problems. It is a meta-algorithm in the sense that it is built from more basic algorithms that perform projections onto constraint sets. From a mathematical perspective, the difference-map algorithm is a dynamical system based on a mapping of Euclidean space. Solutions are encoded as fixed points of the mapping.

In computer science, locality-sensitive hashing (LSH) is an algorithmic technique that hashes similar input items into the same "buckets" with high probability. Since similar items end up in the same buckets, this technique can be used for data clustering and nearest neighbor search. It differs from conventional hashing techniques in that hash collisions are maximized, not minimized. Alternatively, the technique can be seen as a way to reduce the dimensionality of high-dimensional data; high-dimensional input items can be reduced to low-dimensional versions while preserving relative distances between items.

Augmented Lagrangian methods are a certain class of algorithms for solving constrained optimization problems. They have similarities to penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems and add a penalty term to the objective; the difference is that the augmented Lagrangian method adds yet another term designed to mimic a Lagrange multiplier. The augmented Lagrangian is related to, but not identical with the method of Lagrange multipliers.

The Landweber iteration or Landweber algorithm is an algorithm to solve ill-posed linear inverse problems, and it has been extended to solve non-linear problems that involve constraints. The method was first proposed in the 1950s by Louis Landweber, and it can be now viewed as a special case of many other more general methods.

In mathematics, projections onto convex sets (POCS), sometimes known as the alternating projection method, is a method to find a point in the intersection of two closed convex sets. It is a very simple algorithm and has been rediscovered many times. The simplest case, when the sets are affine spaces, was analyzed by John von Neumann. The case when the sets are affine spaces is special, since the iterates not only converge to a point in the intersection but to the orthogonal projection of the point onto the intersection. For general closed convex sets, the limit point need not be the projection. Classical work on the case of two closed convex sets shows that the rate of convergence of the iterates is linear. There are now extensions that consider cases when there are more than two sets, or when the sets are not convex, or that give faster convergence rates. Analysis of POCS and related methods attempts to show that the algorithm converges, and whether it converges to the projection of the original point. The answers are largely known for simple cases, but remain a topic of active research for the extensions. There are also variants of the algorithm, such as Dykstra's projection algorithm. See the references in the further reading section for an overview of the variants, extensions and applications of the POCS method; a good historical background can be found in section III of.

Proximal gradient method

Proximal gradient methods are a generalized form of projection used to solve non-differentiable convex optimization problems.

Proximal gradient methods for learning is an area of research in optimization and statistical learning theory which studies algorithms for a general class of convex regularization problems where the regularization penalty may not be differentiable. One such example is $\ell_1$ regularization (also known as the lasso) of the form

$\min_{w \in \mathbb{R}^d} \ \frac{1}{n} \sum_{i=1}^{n} (y_i - \langle w, x_i \rangle)^2 + \lambda \|w\|_1,$ where $x_i \in \mathbb{R}^d$ and $y_i \in \mathbb{R}$.

In mathematical optimization, the proximal operator is an operator associated with a proper, lower semi-continuous convex function $f$ from a Hilbert space $\mathcal{X}$ to $(-\infty, +\infty]$, and is defined by:

$\operatorname{prox}_f(x) = \arg\min_{u \in \mathcal{X}} \left( f(u) + \tfrac{1}{2}\|u - x\|^2 \right).$
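
As a small worked illustration added here (not part of the excerpt), the proximal operator of the scaled absolute value $f(u) = \lambda |u|$ has the well-known closed form called soft-thresholding:

```python
import numpy as np

def prox_abs(x, lam=1.0):
    """Proximal operator of f(u) = lam * |u|, applied elementwise:
    argmin_u ( lam*|u| + 0.5*(u - x)**2 ) = sign(x) * max(|x| - lam, 0)
    (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# e.g. prox_abs(np.array([-3.0, 0.4, 2.0]), lam=1.0) -> [-2., 0., 1.]
```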

Chambolle-Pock algorithm: A primal-dual algorithm for convex optimization problems.

In mathematics, the Chambolle-Pock algorithm is an algorithm used to solve convex optimization problems. It was introduced by Antonin Chambolle and Thomas Pock in 2011 and has since become a widely used method in various fields, including image processing, computer vision, and signal processing.

References

Citations

  1. J. von Neumann, On rings of operators. Reduction theory, Ann. of Math. 50 (1949) 401–485 (a reprint of lecture notes first distributed in 1933).
  2. P. L. Combettes and J.-C. Pesquet, "Proximal splitting methods in signal processing," in: Fixed-Point Algorithms for Inverse Problems in Science and Engineering (H. H. Bauschke, R. S. Burachik, P. L. Combettes, V. Elser, D. R. Luke, and H. Wolkowicz, eds.), pp. 185–212, Springer, New York, 2011.