Set packing

Set packing is a classical NP-complete problem in computational complexity theory and combinatorics, and was one of Karp's 21 NP-complete problems. Suppose one has a finite set S and a list of subsets of S. Then, the set packing problem asks whether some t subsets in the list are pairwise disjoint (in other words, no two of them share an element).

More formally, given a universe $\mathcal{U}$ and a family $\mathcal{S}$ of subsets of $\mathcal{U}$, a packing is a subfamily $\mathcal{C} \subseteq \mathcal{S}$ of sets such that all sets in $\mathcal{C}$ are pairwise disjoint. The size of the packing is $|\mathcal{C}|$. In the set packing decision problem, the input is a pair $(\mathcal{U}, \mathcal{S})$ and an integer $t$; the question is whether there is a set packing of size $t$ or more. In the set packing optimization problem, the input is a pair $(\mathcal{U}, \mathcal{S})$, and the task is to find a set packing that uses the most sets.

The problem is clearly in NP since, given t subsets, we can verify in polynomial time that they are pairwise disjoint.
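The verification step behind the NP membership argument can be sketched as follows. This is a minimal illustration, assuming each candidate subset is given as a Python `set`; the function names `is_packing` and `is_packing_linear` are hypothetical and not taken from the article.

```python
from itertools import combinations


def is_packing(subsets):
    """Return True if the given subsets are pairwise disjoint.

    Checks every pair once, so the number of disjointness tests is
    quadratic in the number of subsets.
    """
    return all(a.isdisjoint(b) for a, b in combinations(subsets, 2))


def is_packing_linear(subsets):
    """Same check, in time roughly linear in the total size of the subsets.

    Keeps a running set of elements already seen; a repeated element
    means two of the subsets overlap.
    """
    seen = set()
    for s in subsets:
        if not seen.isdisjoint(s):
            return False
        seen.update(s)
    return True
```

Either function serves as a polynomial-time certificate checker: a claimed packing of t subsets is accepted only if no element appears in two of them.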

The optimization version of the problem, maximum set packing, asks for the maximum number of pairwise disjoint sets in the list. It is a maximization problem that can be formulated naturally as an integer linear program, belonging to the class of packing problems.

Integer linear program formulation

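The maximum set packing problem can be formulated as the following integer linear program, with one binary variable $x_S$ for each set $S$ in the family $\mathcal{S}$ over the universe $\mathcal{U}$.

maximize $\sum_{S \in \mathcal{S}} x_S$ (maximize the total number of subsets)
subject to $\sum_{S \colon e \in S} x_S \leq 1$ for all $e \in \mathcal{U}$ (selected sets have to be pairwise disjoint)
$x_S \in \{0, 1\}$ for all $S \in \mathcal{S}$ (every set is either in the set packing or not)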
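To make the formulation concrete, the sketch below enumerates every 0/1 assignment to the variables and keeps the best one that satisfies the disjointness constraints. It is only an exponential-time illustration for tiny instances, not a practical solver; the function name `max_set_packing_bruteforce` and the representation of the family as a list of Python sets are assumptions made for this example.

```python
from itertools import product


def max_set_packing_bruteforce(universe, family):
    """Solve the 0/1 program by exhaustive enumeration (tiny inputs only).

    x[i] = 1 means family[i] is selected. An assignment is feasible when
    every element of the universe lies in at most one selected set.
    """
    best = []
    for x in product((0, 1), repeat=len(family)):
        feasible = all(
            sum(x_s for x_s, s in zip(x, family) if e in s) <= 1
            for e in universe
        )
        # The objective is the number of selected sets, i.e. sum(x).
        if feasible and sum(x) > len(best):
            best = [s for x_s, s in zip(x, family) if x_s]
    return best
```

For example, with universe `{1, 2, 3, 4, 5}` and family `[{1, 2}, {2, 3}, {3, 4}, {4, 5}, {5}]`, the function returns a packing of three sets, which is the optimum for that instance.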

Complexity

The set packing problem is not only NP-complete, but its optimization version (the general maximum set packing problem) has been proven to be as difficult to approximate as the maximum clique problem; in particular, it cannot be approximated within any constant factor. [1] The best known algorithm approximates it within a factor of $O(\sqrt{|\mathcal{U}|})$, where $\mathcal{U}$ is the universe. [2] The weighted variant can be approximated as well. [3]

Packing sets with a bounded size

The problem does have a variant that is more tractable. Given any positive integer k, the k-set packing problem is a variant of set packing in which each set contains at most k elements.

When k=1, the problem is trivial. When k=2, the problem is equivalent to finding a maximum cardinality matching, which can be solved in polynomial time.

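For any k≥3, the problem is NP-hard, as it is more general than 3-dimensional matching. However, there are constant-factor approximation algorithms based on local search: for any ε>0, a (k+1+ε)/3 approximation can be attained in time polynomial in the number of sets and elements. [4] [5]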
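A much simpler baseline than the algorithms referred to above is a greedy maximal packing. It is not the method of the cited papers; it is shown here only because its guarantee is easy to verify. If every set has at most k elements, each chosen set can intersect at most k sets of an optimal packing, and every optimal set is either chosen or intersects a chosen set, so any maximal packing is within a factor of k of the optimum. The function name `greedy_packing` and the smallest-first ordering are assumptions of this sketch; the factor-k bound holds for any processing order.

```python
def greedy_packing(family):
    """Build a maximal packing by scanning the sets once.

    A set is added whenever it is disjoint from everything chosen so far.
    Sorting by size is only a heuristic; the result is maximal, and thus
    a k-approximation for k-set packing, regardless of the order.
    """
    used = set()
    packing = []
    for s in sorted(family, key=len):
        if used.isdisjoint(s):
            packing.append(s)
            used.update(s)
    return packing
```

On the family `[{1, 2}, {2, 3}, {3, 4}]` this picks `{1, 2}` and `{3, 4}`, which happens to be optimal for that instance.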

Packing sets with a bounded degree

In another more tractable variant, if no element occurs in more than d of the subsets, the answer can be approximated within a factor of d. This is also true for the weighted version.

Equivalent problems

Hypergraph matching is equivalent to set packing: the sets correspond to the hyperedges.

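The independent set problem is also equivalent to set packing: there is a one-to-one polynomial-time reduction in each direction. Given a set packing instance on a family of sets, build a graph with one vertex per set and an edge between two vertices whenever the corresponding sets intersect (the intersection graph of the family); every independent set in this graph corresponds to a set packing in the family. Conversely, given an independent set instance on a graph, build a family with one set per vertex, containing all edges incident to that vertex; every set packing in this family corresponds to an independent set in the graph.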

This is also a bidirectional PTAS reduction, and it shows that the two problems are equally difficult to approximate.
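The two directions of the reduction between set packing and independent set can be sketched in a few lines. The sketch assumes the standard constructions: the intersection graph of the family in one direction, and one set of incident edges per vertex in the other. The function names `packing_to_graph` and `graph_to_packing`, and the choice to represent edges as `frozenset` pairs, are illustrative assumptions.

```python
from itertools import combinations


def packing_to_graph(family):
    """Set packing -> independent set.

    Vertex i stands for family[i]; vertices i and j are adjacent exactly
    when the two sets share an element. Independent sets in this graph
    are the packings of the family.
    """
    vertices = list(range(len(family)))
    edges = {
        frozenset((i, j))
        for i, j in combinations(vertices, 2)
        if not family[i].isdisjoint(family[j])
    }
    return vertices, edges


def graph_to_packing(vertices, edges):
    """Independent set -> set packing.

    Each vertex v becomes the set of edges incident to v. Two such sets
    intersect exactly when the vertices are adjacent, so packings of the
    resulting family are the independent sets of the graph.
    """
    return {v: {e for e in edges if v in e} for v in vertices}
```

Both constructions run in polynomial time and preserve solution sizes exactly, which is why approximation guarantees carry over between the two problems.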

In the special case when each set contains at most k elements (the k-set packing problem), the intersection graph is (k+1)-claw-free. This is because, if a set with at most k elements intersects some k+1 other sets, then by the pigeonhole principle at least two of these sets contain the same element of it and therefore intersect, so there cannot be a (k+1)-claw. So Maximum Independent Set in claw-free graphs [6] can be seen as a generalization of Maximum k-Set Packing.

Special cases

Graph matching is a special case of set packing in which the size of all sets is 2 (the sets correspond to the edges). In this special case, a maximum-size matching can be found in polynomial time.

3-dimensional matching is a special case in which the size of all sets is 3, and in addition, the elements are partitioned into 3 colors and each set contains exactly one element of each color. This special case is still NP-hard, though it has better constant-factor approximation algorithms than the general case.

In the set cover problem, we are given a family $\mathcal{S}$ of subsets of a universe $\mathcal{U}$, and the goal is to determine whether we can choose t sets that together contain every element of $\mathcal{U}$. These sets may overlap. The optimization version finds the minimum number of such sets. The maximum set packing need not cover every possible element.

In the exact cover problem, every element of the universe $\mathcal{U}$ should be contained in exactly one of the chosen subsets. Finding such an exact cover is an NP-complete problem, even in the special case in which the size of all sets is 3 (this special case is called exact 3 cover or X3C). However, if we create a singleton set for each element of the universe and add these to the list, the resulting problem is about as easy as set packing.
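One way to read the remark about singleton sets: once every singleton is available, any packing can be completed to an exact cover by adding the singletons of the elements it leaves uncovered. The sketch below illustrates that completion step only; the function name `extend_to_exact_cover` is hypothetical, and the packing is assumed to be a list of pairwise disjoint Python sets.

```python
def extend_to_exact_cover(universe, packing):
    """Complete a packing to an exact cover using singleton sets.

    Assumes the sets in `packing` are pairwise disjoint subsets of
    `universe`. Every element not covered by the packing is covered by
    its own singleton, so each element ends up in exactly one set.
    """
    covered = set().union(*packing)
    return list(packing) + [{e} for e in universe if e not in covered]
```

With universe `{1, 2, 3, 4}` and packing `[{1, 2}]`, the result consists of `{1, 2}` together with the singletons `{3}` and `{4}`.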

Karp originally showed set packing NP-complete via a reduction from the clique problem.

Notes

  1. Hazan, Elad; Safra, Shmuel; Schwartz, Oded (2006), "On the complexity of approximating k-set packing", Computational Complexity, 15 (1): 20–39, CiteSeerX 10.1.1.352.5754, doi:10.1007/s00037-006-0205-6, MR 2226068, S2CID 1858087. See in particular p. 21: "Maximum clique (and therefore also maximum independent set and maximum set packing) cannot be approximated to within $O(n^{1-\epsilon})$ unless NP ⊂ ZPP."
  2. Halldórsson, Magnus M.; Kratochvíl, Jan; Telle, Jan Arne (1998). Independent sets with domination constraints. 25th International Colloquium on Automata, Languages and Programming. Lecture Notes in Computer Science. Vol. 1443. Springer-Verlag. pp. 176–185.
  3. Halldórsson, Magnus M. (1999). Approximations of weighted independent set and hereditary subset problems. 5th Annual International Conference on Computing and Combinatorics. Lecture Notes in Computer Science. Vol. 1627. Springer-Verlag. pp. 261–270.
  4. Cygan, Marek (October 2013). "Improved Approximation for 3-Dimensional Matching via Bounded Pathwidth Local Search". 2013 IEEE 54th Annual Symposium on Foundations of Computer Science. pp. 509–518. arXiv:1304.1424. doi:10.1109/FOCS.2013.61. ISBN 978-0-7695-5135-7. S2CID 14160646.
  5. Fürer, Martin; Yu, Huiwen (2014). "Approximating the k-set packing problem by local improvements". In Fouilhoux, Pierre; Gouveia, Luis Eduardo Neves; Mahjoub, A. Ridha; Paschos, Vangelis T. (eds.). Combinatorial Optimization. Lecture Notes in Computer Science. Vol. 8596. Cham: Springer International Publishing. pp. 408–420. doi:10.1007/978-3-319-09174-7_35. ISBN 978-3-319-09174-7. S2CID 15815885.
  6. Neuwohner, Meike (2021). "An improved approximation algorithm for the maximum weight independent set problem in d-claw free graphs". In Bläser, Markus; Monmege, Benjamin (eds.). 38th International Symposium on Theoretical Aspects of Computer Science, STACS 2021, March 16–19, 2021, Saarbrücken, Germany (Virtual Conference). LIPIcs. Vol. 187. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. pp. 53:1–53:20. arXiv:2106.03545. doi:10.4230/LIPICS.STACS.2021.53.
