Maximum coverage problem

Last updated December 28, 2024

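The maximum coverage problem is a classical question in computer science, computational complexity theory, and operations research, and is widely taught in courses on approximation algorithms. As input, one is given a collection of sets $S_{1},S_{2},\ldots ,S_{m}$ over a ground set $E$ and a number $k$. The task is to select at most $k$ of these sets so that the number of covered elements, that is, the size of the union of the selected sets, is as large as possible.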

The maximum coverage problem is NP-hard, and under standard complexity assumptions no polynomial-time algorithm can approximate it to within a factor better than $1-{\frac {1}{e}}+o(1)\approx 0.632$. This result essentially matches the approximation ratio achieved by the generic greedy algorithm used for maximization of submodular functions with a cardinality constraint.^[1]

ILP formulation

The maximum coverage problem can be formulated as the following integer linear program.

maximize	$\sum _{e_{j}\in E}y_{j}$	(maximizing the sum of covered elements)
subject to	$\sum _{i}x_{i}\leq k$	(no more than $k$ sets are selected)
	$\sum _{e_{j}\in S_{i}}x_{i}\geq y_{j}$	(if $y_{j}>0$ then at least one set $S_{i}$ containing $e_{j}$ is selected)
	$y_{j}\in \{0,1\}$	(if $y_{j}=1$ then $e_{j}$ is covered)
	$x_{i}\in \{0,1\}$	(if $x_{i}=1$ then $S_{i}$ is selected for the cover)
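For very small instances, the program above can be checked by exhaustive search. The following is a minimal sketch, not a practical solver: it enumerates every choice of at most $k$ sets (every feasible assignment of the $x_{i}$) and keeps the one that covers the most elements. The function name `max_coverage_exact` and the sample data are illustrative assumptions.

```python
from itertools import combinations

def max_coverage_exact(sets, k):
    """Return (indices, covered) for a best choice of at most k sets.

    Exhaustive search: exponential in len(sets), for tiny inputs only.
    """
    best_indices, best_covered = (), set()
    # Each combination corresponds to one 0/1 vector x with sum(x) <= k.
    for size in range(1, min(k, len(sets)) + 1):
        for indices in combinations(range(len(sets)), size):
            # y_j = 1 exactly for the elements of the selected sets.
            covered = set().union(*(sets[i] for i in indices))
            if len(covered) > len(best_covered):
                best_indices, best_covered = indices, covered
    return best_indices, best_covered

# Hypothetical instance with four sets over the elements 1..6.
sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
chosen, covered = max_coverage_exact(sets, k=2)
print(chosen, sorted(covered))  # (0, 2) [1, 2, 3, 4, 5, 6]
```

The search visits a number of candidate solutions that grows exponentially with the number of sets, which is consistent with the problem being NP-hard.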

Greedy algorithm

The greedy algorithm for maximum coverage chooses sets according to one rule: at each stage, choose a set which contains the largest number of uncovered elements. It can be shown that this algorithm achieves an approximation ratio of $1-{\frac {1}{e}}$ .^[2] Inapproximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for maximum coverage unless $P=NP$ .^[3]
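The greedy rule can be sketched in a few lines of Python. This is an illustrative sketch under simple assumptions: sets are given as Python `set` objects, ties are broken by lowest index, and the name `greedy_max_coverage` is hypothetical.

```python
def greedy_max_coverage(sets, k):
    """Pick up to k sets, each time the one covering most uncovered elements."""
    chosen, covered = [], set()
    for _ in range(min(k, len(sets))):
        # Marginal gain of set i is the number of its uncovered elements.
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

# Hypothetical instance on which greedy is not optimal.
sets = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}]
chosen, covered = greedy_max_coverage(sets, k=2)
print(chosen, len(covered))  # [0, 1] 5
```

In this instance the greedy choice covers 5 elements, while selecting the second and third sets would cover all 6. The ratio 5/6 is above the guaranteed $1-{\frac {1}{e}}\approx 0.632$.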

Known extensions

The inapproximability results apply to all extensions of the maximum coverage problem, since each of them contains the maximum coverage problem as a special case.

The maximum coverage problem can be applied to road traffic situations; one such example is selecting which bus routes in a public transportation network should be fitted with pothole detectors to maximise coverage when only a limited number of sensors is available. This problem is a known extension of the maximum coverage problem and was first explored in the literature by Junade Ali and Vladimir Dyo.^[4]

Weighted version

In the weighted version every element $e_{j}$ has a weight $w(e_{j})$ . The task is to select at most $k$ sets so that the total weight of the covered elements is maximized. The basic version is the special case in which all weights are $1$ .

maximize	$\sum _{e_{j}\in E}w(e_{j})\cdot y_{j}$	(maximizing the weighted sum of covered elements)
subject to	$\sum _{i}x_{i}\leq k$	(no more than $k$ sets are selected)
	$\sum _{e_{j}\in S_{i}}x_{i}\geq y_{j}$	(if $y_{j}>0$ then at least one set $S_{i}$ containing $e_{j}$ is selected)
	$y_{j}\in \{0,1\}$	(if $y_{j}=1$ then $e_{j}$ is covered)
	$x_{i}\in \{0,1\}$	(if $x_{i}=1$ then $S_{i}$ is selected for the cover)

The greedy algorithm for the weighted maximum coverage at each stage chooses a set that contains the maximum weight of uncovered elements. This algorithm achieves an approximation ratio of $1-{\frac {1}{e}}$ .^[1]
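The only change from the unweighted rule is the gain function, as the following sketch suggests. It assumes non-negative weights stored in a dictionary keyed by element; the name `greedy_weighted_coverage` and the sample data are illustrative.

```python
def greedy_weighted_coverage(sets, weights, k):
    """Pick up to k sets, each time the one adding the most uncovered weight."""
    chosen, covered = [], set()

    def gain(i):
        # Total weight of the elements of set i that are still uncovered.
        return sum(weights[e] for e in sets[i] - covered)

    for _ in range(min(k, len(sets))):
        best = max(range(len(sets)), key=gain)
        if gain(best) <= 0:
            break  # nothing of positive weight is left to cover
        chosen.append(best)
        covered |= sets[best]
    return chosen, sum(weights[e] for e in covered)

# Hypothetical instance: the largest set is not the most valuable one.
sets = [{"a", "b", "c"}, {"d"}, {"c", "e"}]
weights = {"a": 1, "b": 1, "c": 1, "d": 5, "e": 3}
chosen, total = greedy_weighted_coverage(sets, weights, k=2)
print(chosen, total)  # [1, 2] 9
```

With all weights equal to 1 the function behaves like the unweighted greedy algorithm.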

Budgeted maximum coverage

In the budgeted maximum coverage version, not only does every element $e_{j}$ have a weight $w(e_{j})$ , but every set $S_{i}$ also has a cost $c(S_{i})$ . Instead of a number $k$ limiting the number of sets in the cover, a budget $B$ is given that limits the total cost of the chosen cover.

maximize	$\sum _{e_{j}\in E}w(e_{j})\cdot y_{j}$	(maximizing the weighted sum of covered elements)
subject to	$\sum _{i}c(S_{i})\cdot x_{i}\leq B$	(the cost of the selected sets cannot exceed $B$)
	$\sum _{e_{j}\in S_{i}}x_{i}\geq y_{j}$	(if $y_{j}>0$ then at least one set $S_{i}$ containing $e_{j}$ is selected)
	$y_{j}\in \{0,1\}$	(if $y_{j}=1$ then $e_{j}$ is covered)
	$x_{i}\in \{0,1\}$	(if $x_{i}=1$ then $S_{i}$ is selected for the cover)

A plain greedy algorithm no longer produces solutions with a performance guarantee: its worst-case behavior can be very far from the optimal solution. The approximation algorithm is therefore extended in the following way, using a fixed integer $k$ that is a parameter of the algorithm rather than a limit on the number of sets. First, define a modified greedy algorithm that selects the set $S_{i}$ with the best ratio of weighted uncovered elements to cost. Second, among covers of cardinality $1,2,\ldots ,k-1$ , find the best cover that does not violate the budget. Call this cover $H_{1}$ . Third, find all covers of cardinality $k$ that do not violate the budget. Using each of these covers of cardinality $k$ as a starting point, apply the modified greedy algorithm, maintaining the best cover found so far. Call this cover $H_{2}$ . At the end of the process, the approximate best cover is the better of $H_{1}$ and $H_{2}$ . This algorithm achieves an approximation ratio of $1-{1 \over e}$ for values of $k\geq 3$ . This is the best possible approximation ratio unless $NP\subseteq DTIME(n^{O(\log \log n)})$ .^[5]
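The three steps can be sketched as follows. This is a rough illustration of the scheme described above rather than a reference implementation: it assumes strictly positive set costs, takes the enumeration parameter as `k` (default 3), and all function names are hypothetical. In the modified greedy step, a set that no longer fits the remaining budget is skipped and the next best ratio is tried.

```python
from itertools import combinations

def covered_weight(indices, sets, w):
    """Total weight of the elements covered by the sets at `indices`."""
    covered = set().union(*(sets[i] for i in indices))
    return sum(w[e] for e in covered)

def modified_greedy(start, sets, w, cost, budget):
    """Extend `start` by best ratio of newly covered weight to cost."""
    chosen = list(start)
    covered = set().union(*(sets[i] for i in chosen))
    spent = sum(cost[i] for i in chosen)
    remaining = [i for i in range(len(sets)) if i not in chosen]
    while remaining:
        best = max(remaining,
                   key=lambda i: sum(w[e] for e in sets[i] - covered) / cost[i])
        remaining.remove(best)
        if spent + cost[best] <= budget:  # skip sets that no longer fit
            chosen.append(best)
            covered |= sets[best]
            spent += cost[best]
    return chosen

def budgeted_max_coverage(sets, w, cost, budget, k=3):
    n = len(sets)
    score = lambda c: covered_weight(c, sets, w)
    fits = lambda c: sum(cost[i] for i in c) <= budget
    # H1: best feasible cover using fewer than k sets.
    small = [c for size in range(1, k)
             for c in combinations(range(n), size) if fits(c)]
    h1 = max(small, key=score, default=())
    # H2: best greedy completion of a feasible cover with exactly k sets.
    starts = [c for c in combinations(range(n), k) if fits(c)]
    h2 = max((modified_greedy(c, sets, w, cost, budget) for c in starts),
             key=score, default=())
    return sorted(max(h1, h2, key=score))

# Hypothetical instance where the ratio rule alone is misled by a cheap set.
sets = [{"a"}, {"b"}]
w = {"a": 2, "b": 10}
cost = [1, 10]
budget = 10
print(modified_greedy((), sets, w, cost, budget))    # [0]
print(budgeted_max_coverage(sets, w, cost, budget))  # [1]
```

On this instance the modified greedy algorithm alone spends part of the budget on the cheap set and can then no longer afford the valuable one, collecting weight 2 instead of 10. The enumeration of small covers recovers the better solution.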

Generalized maximum coverage

In the generalized maximum coverage version every set $S_{i}$ has a cost $c(S_{i})$ , and every element $e_{j}$ has a weight and a cost that depend on which set covers it. Namely, if $e_{j}$ is covered by set $S_{i}$ , the weight of $e_{j}$ is $w_{i}(e_{j})$ and its cost is $c_{i}(e_{j})$ . A budget $B$ is given for the total cost of the solution.

maximize	$\sum _{e_{j}\in E,S_{i}}w_{i}(e_{j})\cdot y_{ij}$	(maximizing the weighted sum of covered elements in the sets in which they are covered)
subject to	$\sum {c_{i}(e_{j})\cdot y_{ij}}+\sum {c(S_{i})\cdot x_{i}}\leq B$	(the total cost of the covered elements and the selected sets cannot exceed $B$)
	$\sum _{i}y_{ij}\leq 1$	(element $e_{j}$ can be covered by at most one set)
	$x_{i}\geq y_{ij}$	(if $y_{ij}>0$ then set $S_{i}$ is selected)
	$y_{ij}\in \{0,1\}$	(if $y_{ij}=1$ then $e_{j}$ is covered by set $S_{i}$ )
	$x_{i}\in \{0,1\}$	(if $x_{i}=1$ then $S_{i}$ is selected for the cover)
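To make the roles of the set-dependent weights and costs concrete, the sketch below evaluates a candidate solution: a collection of selected sets together with an assignment of each covered element to exactly one selected set. It only scores a given solution and does not search for one. The data layout (dictionaries keyed by `(set index, element)`) and the name `evaluate_gmc` are assumptions made for this illustration.

```python
def evaluate_gmc(selected, assignment, set_cost, elem_weight, elem_cost, budget):
    """Score a generalized maximum coverage solution.

    selected   -- indices of the selected sets
    assignment -- dict mapping each covered element to the one set covering it
    elem_weight, elem_cost -- dicts keyed by (set index, element), defined
                              only when the element belongs to that set
    Returns (total weight, total cost, within budget).
    """
    for element, i in assignment.items():
        if i not in selected:
            raise ValueError(f"{element!r} is assigned to unselected set {i}")
        if (i, element) not in elem_weight:
            raise ValueError(f"{element!r} is not an element of set {i}")
    pairs = [(i, element) for element, i in assignment.items()]
    total_weight = sum(elem_weight[p] for p in pairs)
    # Both the selected sets and the covered elements consume budget.
    total_cost = sum(set_cost[i] for i in selected) + sum(elem_cost[p] for p in pairs)
    return total_weight, total_cost, total_cost <= budget

# Hypothetical instance: element "b" is worth more when covered by set 1.
set_cost = {0: 2, 1: 3}
elem_weight = {(0, "a"): 5, (0, "b"): 1, (1, "b"): 4}
elem_cost = {(0, "a"): 1, (0, "b"): 1, (1, "b"): 2}
result = evaluate_gmc({0, 1}, {"a": 0, "b": 1}, set_cost, elem_weight, elem_cost, budget=8)
print(result)  # (9, 8, True)
```

Because the assignment is a dictionary from elements to sets, each element is covered by at most one set by construction.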

Generalized maximum coverage algorithm

The algorithm uses the concept of residual cost and residual weight. Both are measured against a tentative solution: the residual cost (or weight) is the difference between the cost (or weight) and the cost (or weight) already incurred (or gained) by the tentative solution.

The algorithm has several stages. First, find a solution using a greedy algorithm. In each iteration, the greedy algorithm adds to the tentative solution the set that maximizes the ratio of the residual weight of its elements to the sum of the residual cost of these elements and the residual cost of the set. Second, compare the solution obtained in the first step with the best solution that uses a small number of sets. Third, return the best of all examined solutions. This algorithm achieves an approximation ratio of $1-1/e-o(1)$ .^[6]

Notes

[1] Nemhauser, G. L.; Wolsey, L. A.; Fisher, M. L. (1978). "An analysis of approximations for maximizing submodular set functions I". Mathematical Programming. 14: 265–294.
[2] Hochbaum, Dorit S. (1997). "Approximating Covering and Packing Problems: Set Cover, Vertex Cover, Independent Set, and Related Problems". In Hochbaum, Dorit S. (ed.). Approximation Algorithms for NP-Hard Problems. Boston: PWS Publishing Company. pp. 94–143. ISBN 978-053494968-6.
[3] Feige, Uriel (July 1998). "A Threshold of ln n for Approximating Set Cover". Journal of the ACM. 45 (4). New York, NY, USA: Association for Computing Machinery: 634–652. doi:10.1145/285055.285059. ISSN 0004-5411. S2CID 52827488.
[4] Ali, Junade; Dyo, Vladimir (2017). "Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach". Proceedings of the 14th International Joint Conference on e-Business and Telecommunications. Vol. 2: WINSYS. pp. 83–88. doi:10.5220/0006469800830088. ISBN 978-989-758-261-5.
[5] Khuller, Samir; Moss, Anna; Naor, Joseph (Seffi) (1999). "The budgeted maximum coverage problem". Information Processing Letters. 70: 39–45. CiteSeerX 10.1.1.49.5784. doi:10.1016/S0020-0190(99)00031-9.
[6] Cohen, Reuven; Katzir, Liran (2008). "The Generalized Maximum Coverage Problem". Information Processing Letters. 108: 15–22. CiteSeerX 10.1.1.156.2073. doi:10.1016/j.ipl.2008.03.017.

References

Vazirani, Vijay V. (2001). Approximation Algorithms. Springer-Verlag. ISBN 978-3-540-65367-7.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
