Assignment problem

[Figure: Worked example of assigning tasks to an unequal number of workers using the Hungarian method]

The assignment problem is a fundamental combinatorial optimization problem. In its most general form, the problem is as follows:

The problem instance has a number of agents and a number of tasks. Any agent can be assigned to perform any task, incurring some cost that may vary depending on the agent-task assignment. It is required to perform as many tasks as possible by assigning at most one agent to each task and at most one task to each agent, in such a way that the total cost of the assignment is minimized.

Alternatively, describing the problem using graph theory:

The assignment problem consists of finding, in a weighted bipartite graph, a matching of a given size, in which the sum of weights of the edges is minimum.

If the numbers of agents and tasks are equal, then the problem is called balanced assignment. Otherwise, it is called unbalanced assignment. [1] If the total cost of the assignment for all tasks is equal to the sum of the costs for each agent (or the sum of the costs for each task, which is the same thing in this case), then the problem is called linear assignment. Commonly, when the assignment problem is mentioned without any additional qualification, the linear balanced assignment problem is meant.

Examples

Suppose that a taxi firm has three taxis (the agents) available, and three customers (the tasks) wishing to be picked up as soon as possible. The firm prides itself on speedy pickups, so for each taxi the "cost" of picking up a particular customer will depend on the time taken for the taxi to reach the pickup point. This is a balanced assignment problem. Its solution is whichever combination of taxis and customers results in the least total cost.

Now, suppose that there are four taxis available, but still only three customers. This is an unbalanced assignment problem. One way to solve it is to invent a fourth dummy task, perhaps called "sitting still doing nothing", with a cost of 0 for the taxi assigned to it. This reduces the problem to a balanced assignment problem, which can then be solved in the usual way and still give the best solution to the problem.
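
A rough sketch of this padding trick is below (the pickup times are made up, and SciPy's linear_sum_assignment is used only as a convenient off-the-shelf solver):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Rows are the four taxis, columns the three customers; entries are pickup times.
times = np.array([[9, 6, 4],
                  [5, 8, 7],
                  [6, 2, 3],
                  [7, 5, 9]])

# Add a dummy customer ("sitting still doing nothing") with cost 0 for every taxi,
# turning the 4x3 unbalanced instance into a 4x4 balanced one.
padded = np.hstack([times, np.zeros((4, 1))])

taxis, customers = linear_sum_assignment(padded)   # minimum-cost assignment
for taxi, customer in zip(taxis, customers):
    if customer < times.shape[1]:                  # ignore the dummy column
        print(f"taxi {taxi} picks up customer {customer}")
print("total pickup time:", int(padded[taxis, customers].sum()))
```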

Similar adjustments can be done in order to allow more tasks than agents, tasks to which multiple agents must be assigned (for instance, a group of more customers than will fit in one taxi), or maximizing profit rather than minimizing cost.

Formal definition

The formal definition of the assignment problem (or linear assignment problem) is

Given two sets, A and T, together with a weight function $C : A \times T \to \mathbb{R}$, find a bijection $f : A \to T$ such that the cost function

\[ \sum_{a \in A} C(a, f(a)) \]

is minimized.

Usually the weight function is viewed as a square real-valued matrix C, so that the cost function is written down as $\sum_{a \in A} C_{a, f(a)}$.

The problem is "linear" because the cost function to be optimized as well as all the constraints contain only linear terms.

Algorithms

A naive solution for the assignment problem is to check all the assignments and calculate the cost of each one. This may be very inefficient since, with n agents and n tasks, there are n! (factorial of n) different assignments.
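
A direct transcription of this exhaustive search (a sketch, with a made-up 3×3 cost matrix):

```python
from itertools import permutations

# cost[a][t] = cost of agent a performing task t (illustrative values)
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
n = len(cost)

# Enumerate all n! bijections agent -> task and keep the cheapest one.
best = min(permutations(range(n)),
           key=lambda f: sum(cost[a][f[a]] for a in range(n)))
print(best, sum(cost[a][best[a]] for a in range(n)))   # (1, 0, 2) with cost 5
```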

Another naive solution is to greedily assign the pair with the smallest cost first, and remove the vertices; then, among the remaining vertices, assign the pair with the smallest cost; and so on. This algorithm may yield a non-optimal solution. For example, suppose there are two tasks and two agents with costs as follows:

- Alice: Task 1 = 1, Task 2 = 2.
- George: Task 1 = 5, Task 2 = 8.

The greedy algorithm would assign Task 1 to Alice and Task 2 to George, for a total cost of 9; but the reverse assignment has a total cost of 7.
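
This greedy heuristic can be reproduced in a few lines (a minimal sketch; the 2×2 matrix encodes the costs above, with rows Alice and George and columns Task 1 and Task 2):

```python
# rows: Alice, George; columns: Task 1, Task 2
cost = [[1, 2],
        [5, 8]]

agents, tasks = {0, 1}, {0, 1}
total = 0
while agents:
    # Pick the cheapest remaining agent-task pair and remove both endpoints.
    a, t = min(((a, t) for a in agents for t in tasks),
               key=lambda p: cost[p[0]][p[1]])
    total += cost[a][t]
    agents.remove(a)
    tasks.remove(t)

print(total)   # 9, while the optimal assignment (Alice -> Task 2, George -> Task 1) costs 7
```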

Fortunately, there are many algorithms for finding the optimal assignment in time polynomial in n. The assignment problem is a special case of the transportation problem, which is a special case of the minimum cost flow problem, which in turn is a special case of a linear program. While it is possible to solve any of these problems using the simplex algorithm, or in worst-case polynomial time using the ellipsoid method, each specialization has a smaller solution space and thus more efficient algorithms designed to take advantage of its special structure.
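
For instance, the reduction to minimum-cost flow can be sketched with NetworkX (a hypothetical two-agent, two-task instance with made-up costs; nx.min_cost_flow is used here as a generic solver, not a specialized assignment algorithm):

```python
import networkx as nx

# Costs of assigning agents a1, a2 to tasks t1, t2 (made-up numbers).
costs = {("a1", "t1"): 4, ("a1", "t2"): 2,
         ("a2", "t1"): 3, ("a2", "t2"): 5}
agents = sorted({a for a, _ in costs})
tasks = sorted({t for _, t in costs})

G = nx.DiGraph()
G.add_node("s", demand=-len(agents))      # source supplies one unit per agent
G.add_node("d", demand=len(tasks))        # sink absorbs one unit per task
for a in agents:
    G.add_edge("s", a, capacity=1, weight=0)
for t in tasks:
    G.add_edge(t, "d", capacity=1, weight=0)
for (a, t), c in costs.items():
    G.add_edge(a, t, capacity=1, weight=c)   # unit-capacity assignment edges

flow = nx.min_cost_flow(G)
matching = [(a, t) for (a, t) in costs if flow[a][t] == 1]
print(matching, nx.cost_of_flow(G, flow))    # [('a1', 't2'), ('a2', 't1')], 5
```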

Balanced assignment

In the balanced assignment problem, both parts of the bipartite graph have the same number of vertices, denoted by n.

One of the first polynomial-time algorithms for balanced assignment was the Hungarian algorithm. It is a global algorithm – it is based on improving a matching along augmenting paths (alternating paths between unmatched vertices). Its run-time complexity, when using Fibonacci heaps, is $O(mn + n^2 \log n)$, [2] where m is the number of edges. This is currently the fastest run-time of a strongly polynomial algorithm for this problem. If all weights are integers, then the run-time can be improved to $O(mn + n^2 \log\log n)$, but the resulting algorithm is only weakly-polynomial. [3] If the weights are integers, and all weights are at most C (where C > 1 is some integer), then the problem can be solved in weakly-polynomial time $O(m \sqrt{n} \log(nC))$ by a method called weight scaling. [4] [5] [6]
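
A compact sketch of this augmenting-path idea is below. It is the textbook O(n³) potential-based variant (in the Hungarian/Jonker–Volgenant family), not the Fibonacci-heap version whose bound is quoted above; the cost matrix is assumed square, and the function and variable names are illustrative.

```python
def hungarian(cost):
    """Minimum-cost perfect matching on a square cost matrix.
    Returns (total_cost, assignment) with assignment[row] = column."""
    n = len(cost)
    INF = float("inf")
    u = [0.0] * (n + 1)        # row (agent) potentials
    v = [0.0] * (n + 1)        # column (task) potentials
    match = [0] * (n + 1)      # match[j] = row matched to column j (1-based; 0 = free)
    way = [0] * (n + 1)        # predecessor column on the augmenting path
    for i in range(1, n + 1):  # insert rows one by one
        match[0] = i
        j0 = 0
        minv = [INF] * (n + 1)         # minimal reduced cost to reach each column
        used = [False] * (n + 1)
        while True:                    # Dijkstra-like search for the cheapest augmenting path
            used[j0] = True
            i0, delta, j1 = match[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):     # update potentials so reduced costs stay non-negative
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:         # reached a free column: augment
                break
        while j0:                      # flip the matching along the path found
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[match[j] - 1] = j - 1
    return sum(cost[i][assignment[i]] for i in range(n)), assignment


print(hungarian([[1, 2], [5, 8]]))   # (7, [1, 0]): Alice -> Task 2, George -> Task 1
```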

In addition to the global methods, there are local methods which are based on finding local updates (rather than full augmenting paths). These methods have worse asymptotic runtime guarantees, but they often work better in practice. These algorithms are called auction algorithms, push-relabel algorithms, or preflow-push algorithms. Some of these algorithms were shown to be equivalent. [7]

Some of the local methods assume that the graph admits a perfect matching; if this is not the case, then some of these methods might run forever. [1] :3 A simple technical way to solve this problem is to extend the input graph to a complete bipartite graph, by adding artificial edges with very large weights. These weights should exceed the weights of all existing matchings, to prevent the artificial edges from appearing in an optimal solution.

As shown by Mulmuley, Vazirani and Vazirani, [8] the problem of minimum weight perfect matching is converted to finding minors in the adjacency matrix of a graph. Using the isolation lemma, a minimum weight perfect matching in a graph can be found with probability at least 1/2. For a graph with n vertices, it requires $O(\log^2 n)$ time.

Unbalanced assignment

In the unbalanced assignment problem, the larger part of the bipartite graph has n vertices and the smaller part has r<n vertices. There is also a constant s which is at most the cardinality of a maximum matching in the graph. The goal is to find a minimum-cost matching of size exactly s. The most common case is the case in which the graph admits a one-sided-perfect matching (i.e., a matching of size r), and s=r.

Unbalanced assignment can be reduced to a balanced assignment. The naive reduction is to add n − r new vertices to the smaller part and connect them to the larger part using edges of cost 0. However, this requires n(n − r) new edges. A more efficient reduction is called the doubling technique. Here, a new graph G' is built from two copies of the original graph G: a forward copy Gf and a backward copy Gb. The backward copy is "flipped", so that, in each side of G', there are now n+r vertices. Between the copies, we need to add two kinds of linking edges: [1] :4–6

- Large-to-large: from each vertex in the larger part of Gf, a zero-cost edge to the corresponding vertex in Gb.
- Small-to-small: if the original graph does not admit a one-sided-perfect matching, from each vertex in the smaller part of Gf, a very-high-cost edge to the corresponding vertex in Gb.

All in all, at most n + r new edges are required. The resulting graph always has a perfect matching of size n + r. A minimum-cost perfect matching in this graph must consist of minimum-cost maximum-cardinality matchings in Gf and Gb. The main problem with this doubling technique is that there is no speed gain when r ≪ n.

Instead of using reduction, the unbalanced assignment problem can be solved by directly generalizing existing algorithms for balanced assignment. The Hungarian algorithm can be generalized to solve the problem in strongly-polynomial time. In particular, if s = r then the runtime is $O(mr + r^2 \log r)$. If the weights are integers, then Thorup's method can be used to get a runtime of $O(mr + r^2 \log\log r)$. [1] :6
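
In practice, library solvers typically handle the unbalanced case directly. For example, SciPy's linear_sum_assignment accepts a rectangular cost matrix and returns a minimum-cost matching of size min(n, r); the numbers below are made up:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# 4 agents (rows), 3 tasks (columns): the matching will have size 3.
cost = np.array([[9, 6, 4],
                 [5, 8, 7],
                 [6, 2, 3],
                 [7, 5, 9]])
rows, cols = linear_sum_assignment(cost)
print(list(zip(rows, cols)), cost[rows, cols].sum())   # three pairs, total cost 11
```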

Solution by linear programming

The assignment problem can be solved by presenting it as a linear program. For convenience we will present the maximization problem. Each edge (i,j), where i is in A and j is in T, has a weight $w_{ij}$. For each edge (i,j) we have a variable $x_{ij}$. The variable is 1 if the edge is contained in the matching and 0 otherwise, so we set the domain constraints:

\[ 0 \le x_{ij} \le 1 \quad\text{and}\quad x_{ij} \in \mathbb{Z} \quad \text{for } i \in A,\ j \in T. \]

The total weight of the matching is $\sum_{(i,j) \in A \times T} w_{ij} x_{ij}$. The goal is to find a maximum-weight perfect matching.

To guarantee that the variables indeed represent a perfect matching, we add constraints saying that each vertex is adjacent to exactly one edge in the matching, i.e.,

\[ \sum_{j \in T} x_{ij} = 1 \ \text{for all } i \in A, \qquad \sum_{i \in A} x_{ij} = 1 \ \text{for all } j \in T. \]

All in all we have the following LP:

\[
\begin{aligned}
\text{maximize} \quad & \sum_{(i,j) \in A \times T} w_{ij} x_{ij} \\
\text{subject to} \quad & \sum_{j \in T} x_{ij} = 1 \quad \text{for all } i \in A, \\
& \sum_{i \in A} x_{ij} = 1 \quad \text{for all } j \in T, \\
& 0 \le x_{ij} \le 1, \quad x_{ij} \in \mathbb{Z} \quad \text{for all } i \in A,\ j \in T.
\end{aligned}
\]

This is an integer linear program. However, we can solve it without the integrality constraints (i.e., drop the last constraint), using standard methods for solving continuous linear programs. While this formulation allows also fractional variable values, in this special case, the LP always has an optimal solution where the variables take integer values. This is because the constraint matrix of the fractional LP is totally unimodular  – it satisfies the four conditions of Hoffman and Gale.
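
A small numerical check of this fact (a sketch assuming SciPy, with made-up weights): the general-purpose linprog solver is applied to the relaxation with no integrality constraints, and the optimum still comes out as a 0/1 permutation matrix.

```python
import numpy as np
from scipy.optimize import linprog

w = np.array([[4, 1, 3],          # w[i][j] = weight of edge (i, j)
              [2, 0, 5],
              [3, 2, 2]])
n = w.shape[0]

c = -w.flatten()                  # maximize sum w_ij x_ij == minimize -sum w_ij x_ij

# One equality row per agent (sum over j of x_ij = 1) and per task (sum over i of x_ij = 1).
A_eq = np.zeros((2 * n, n * n))
for i in range(n):
    A_eq[i, i * n:(i + 1) * n] = 1
for j in range(n):
    A_eq[n + j, j::n] = 1
b_eq = np.ones(2 * n)

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
print(res.x.reshape(n, n).round(3))       # an integral permutation matrix
print("maximum total weight:", -res.fun)  # 11.0
```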

Other methods and approximation algorithms

Other approaches for the assignment problem exist and are reviewed by Duan and Pettie [9] (see Table II). Their work proposes an approximation algorithm for the assignment problem (and the more general maximum weight matching problem), which runs in linear time for any fixed error bound.

Generalization

When phrased as a graph theory problem, the assignment problem can be extended from bipartite graphs to arbitrary graphs. The corresponding problem, of finding a matching in a weighted graph where the sum of weights is maximized, is called the maximum weight matching problem.
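
For illustration, NetworkX exposes this generalization directly (a small made-up non-bipartite example):

```python
import networkx as nx

# A 4-cycle a-b-c-d plus the chord a-c, so the graph is not bipartite.
G = nx.Graph()
G.add_weighted_edges_from([("a", "b", 3), ("b", "c", 4),
                           ("c", "d", 2), ("d", "a", 5), ("a", "c", 1)])

print(nx.max_weight_matching(G))   # e.g. {('d', 'a'), ('b', 'c')}, total weight 9
```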

Another generalization of the assignment problem is extending the number of sets to be matched from two to many. So, rather than matching agents to tasks, the problem can be extended to matching agents to tasks to time intervals to locations. This results in the multidimensional assignment problem (MAP).

References and further reading

  1. Ramshaw, Lyle; Tarjan, Robert E. (2012). "On minimum-cost assignments in unbalanced bipartite graphs" (PDF). HP research labs.
  2. Fredman, Michael L.; Tarjan, Robert Endre (1987-07-01). "Fibonacci Heaps and Their Uses in Improved Network Optimization Algorithms". J. ACM. 34 (3): 596–615. doi:10.1145/28869.28874. ISSN 0004-5411. S2CID 7904683.
  3. Thorup, Mikkel (2004-11-01). "Integer priority queues with decrease key in constant time and the single source shortest paths problem". Journal of Computer and System Sciences. Special Issue on STOC 2003. 69 (3): 330–353. doi:10.1016/j.jcss.2004.04.003. ISSN 0022-0000.
  4. Gabow, H.; Tarjan, R. (1989-10-01). "Faster Scaling Algorithms for Network Problems". SIAM Journal on Computing. 18 (5): 1013–1036. doi:10.1137/0218069. ISSN 0097-5397.
  5. Goldberg, A.; Kennedy, R. (1997-11-01). "Global Price Updates Help". SIAM Journal on Discrete Mathematics. 10 (4): 551–572. doi:10.1137/S0895480194281185. ISSN 0895-4801.
  6. Orlin, James B.; Ahuja, Ravindra K. (1992-02-01). "New scaling algorithms for the assignment and minimum mean cycle problems". Mathematical Programming. 54 (1–3): 41–56. doi:10.1007/BF01586040. ISSN 0025-5610. S2CID 18213947.
  7. Alfaro, Carlos A.; Perez, Sergio L.; Valencia, Carlos E.; Vargas, Marcos C. (2022-06-01). "The assignment problem revisited". Optimization Letters. 16 (5): 1531–1548. doi:10.1007/s11590-021-01791-4. ISSN 1862-4480. S2CID 238644205.
  8. Mulmuley, Ketan; Vazirani, Umesh; Vazirani, Vijay (1987). "Matching is as easy as matrix inversion". Combinatorica. 7 (1): 105–113. doi:10.1007/BF02579206. S2CID 47370049.
  9. Duan, Ran; Pettie, Seth (2014-01-01). "Linear-Time Approximation for Maximum Weight Matching" (PDF). Journal of the ACM. 61: 1–23. doi:10.1145/2529989. S2CID 207208641.
