K-minimum spanning tree

Last updated June 24, 2023

The $k$ -minimum spanning tree problem, studied in theoretical computer science, asks for a tree of minimum cost that has exactly $k$ vertices and forms a subgraph of a larger graph. It is also called the $k$ -MST or edge-weighted $k$ -cardinality tree. Finding this tree is NP-hard, but it can be approximated to within a constant approximation ratio in polynomial time.

Problem statement

The input to the problem consists of an undirected graph with weights on its edges, and a number $k$ . The output is a tree with $k$ vertices and $k - 1$ edges, with all of the edges of the output tree belonging to the input graph. The cost of the output is the sum of the weights of its edges, and the goal is to find the tree that has minimum cost. The problem was formulated by Lozovanu & Zelikovsky (1993) ^[1] and by Ravi et al. (1996).

Ravi et al. also considered a geometric version of the problem, which can be seen as a special case of the graph problem. In the geometric $k$ -minimum spanning tree problem, the input is a set of points in the plane. Again, the output should be a tree with $k$ of the points as its vertices, minimizing the total Euclidean length of its edges. That is, it is a graph $k$ -minimum spanning tree on a complete graph with Euclidean distances as weights.^[2]

Computational complexity

When $k$ is a fixed constant, the $k$ -minimum spanning tree problem can be solved in polynomial time by a brute-force search algorithm that tries all $k$ -tuples of vertices. However, for variable $k$ , the $k$ -minimum spanning tree problem has been shown to be NP-hard by a reduction from the Steiner tree problem.^[1]^[2]

The reduction takes as input an instance of the Steiner tree problem: a weighted graph, with a subset of its vertices selected as terminals. The goal of the Steiner tree problem is to connect these terminals by a tree whose weight is as small as possible. To transform this problem into an instance of the $k$ -minimum spanning tree problem, Ravi et al. (1996) attach to each terminal a tree of zero-weight edges with a large number $t$ of vertices per tree. (For a graph with $n$ vertices and $r$ terminals, they use $t = n - r - 1$ added vertices per tree.) Then, they ask for the $k$ -minimum spanning tree in this augmented graph with $k = rt$ . The only way to include this many vertices in a $k$ -spanning tree is to use at least one vertex from each added tree, for there are not enough vertices remaining if even one of the added trees is missed. However, for this choice of $k$ , it is possible for $k$ -spanning tree to include only as few edges of the original graph as are needed to connect all the terminals. Therefore, the $k$ -minimum spanning tree must be formed by combining the optimal Steiner tree with enough of the zero-weight edges of the added trees to make the total tree size large enough.^[2]

Even for a graph whose edge weights belong to the set ${1, 2, 3$ }, testing whether the optimal solution value is less than a given threshold is NP-complete. It remains NP-complete for planar graphs. The geometric version of the problem is also NP-hard, but not known to belong to NP, because of the difficulty of comparing sums of square roots; instead it lies in the class of problems reducible to the existential theory of the reals.^[2]

The $k$ -minimum spanning tree may be found in polynomial time for graphs of bounded treewidth, and for graphs with only two distinct edge weights.^[2]

Approximation algorithms

Because of the high computational complexity of finding an optimal solution to the $k$ -minimum spanning tree, much of the research on the problem has instead concentrated on approximation algorithms for the problem. The goal of such algorithms is to find an approximate solution in polynomial time with a small approximation ratio. The approximation ratio is defined as the ratio of the computed solution length to the optimal length for a worst-case instance, one that maximizes this ratio. Because the NP-hardness reduction for the $k$ -minimum spanning tree problem preserves the weight of all solutions, it also preserves the hardness of approximation of the problem. In particular, because the Steiner tree problem is NP-hard to approximate to an approximation ratio better than 96/95,^[3] the same is true for the $k$ -minimum spanning tree problem.

The best approximation known for the general problem achieves an approximation ratio of 2, and is by Garg (2005).^[4] This approximation relies heavily on the primal-dual schema of Goemans & Williamson (1992).^[5] When the input consists of points in the Euclidean plane (any two of which can be connected in the tree with cost equal to their distance) there exists a polynomial time approximation scheme devised by Arora (1998).^[6]

Related Research Articles

The travelling salesman problem (TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?" It is an NP-hard problem in combinatorial optimization, important in theoretical computer science and operations research.

A minimum spanning tree (MST) or minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight. That is, it is a spanning tree whose sum of edge weights is as small as possible. More generally, any edge-weighted undirected graph has a minimum spanning forest, which is a union of the minimum spanning trees for its connected components.

The Bottleneck traveling salesman problem is a problem in discrete or combinatorial optimization. The problem is to find the Hamiltonian cycle in a weighted graph which minimizes the weight of the highest-weight edge of the cycle. It was first formulated by Gilmore & Gomory (1964) with some additional constraints, and in its full generality by Garfinkel & Gilbert (1978).

In the mathematical field of graph theory, a spanning treeT of an undirected graph G is a subgraph that is a tree which includes all of the vertices of G. In general, a graph may have several spanning trees, but a graph that is not connected will not contain a spanning tree. If all of the edges of G are also edges of a spanning tree T of G, then G is a tree and is identical to T.

In combinatorial mathematics, the Steiner tree problem, or minimum Steiner tree problem, named after Jakob Steiner, is an umbrella term for a class of problems in combinatorial optimization. While Steiner tree problems may be formulated in a number of settings, they all require an optimal interconnect for a given set of objects and a predefined objective function. One well-known variant, which is often used synonymously with the term Steiner tree problem, is the Steiner tree problem in graphs. Given an undirected graph with non-negative edge weights and a subset of vertices, usually referred to as terminals, the Steiner tree problem in graphs requires a tree of minimum weight that contains all terminals and minimizes the total weight of its edges. Further well-known variants are the Euclidean Steiner tree problem and the rectilinear minimum Steiner tree problem.

In graph theory, an independent set, stable set, coclique or anticlique is a set of vertices in a graph, no two of which are adjacent. That is, it is a set $of vertices such that for every two vertices in, there is no edge connecting the two. Equivalently, each edge in the graph has at most one endpoint in . A set is independent if and only if it is a clique in the graph's complement. The size of an independent set is the number of vertices it contains. Independent sets have also been called "internally stable sets", of which "stable set" is a shortening.$

In graph theory, a vertex cover of a graph is a set of vertices that includes at least one endpoint of every edge of the graph.

In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems with provable guarantees on the distance of the returned solution to the optimal one. Approximation algorithms naturally arise in the field of theoretical computer science as a consequence of the widely believed P ≠ NP conjecture. Under this conjecture, a wide class of optimization problems cannot be solved exactly in polynomial time. The field of approximation algorithms, therefore, tries to understand how closely it is possible to approximate optimal solutions to such problems in polynomial time. In an overwhelming majority of the cases, the guarantee of such algorithms is a multiplicative one expressed as an approximation ratio or approximation factor i.e., the optimal solution is always guaranteed to be within a (predetermined) multiplicative factor of the returned solution. However, there are also many approximation algorithms that provide an additive guarantee on the quality of the returned solution. A notable example of an approximation algorithm that provides both is the classic approximation algorithm of Lenstra, Shmoys and Tardos for scheduling on unrelated parallel machines.

The set cover problem is a classical question in combinatorics, computer science, operations research, and complexity theory. It is one of Karp's 21 NP-complete problems shown to be NP-complete in 1972.

A Euclidean minimum spanning tree of a finite set of points in the Euclidean plane or higher-dimensional Euclidean space connects the points by a system of line segments with the points as endpoints, minimizing the total length of the segments. In it, any two points can reach each other along a path through the line segments. It can be found as the minimum spanning tree of a complete graph with the points as vertices and the Euclidean distances between points as edge weights.

In graph theory and graph algorithms, a feedback arc set or feedback edge set in a directed graph is a subset of the edges of the graph that contains at least one edge out of every cycle in the graph. Removing these edges from the graph breaks all of the cycles, producing a directed acyclic graph, an acyclic subgraph of the given graph. The feedback arc set with the fewest possible edges is the minimum feedback arc set and its removal leaves the maximum acyclic subgraph; weighted versions of these optimization problems are also used. If a feedback arc set is minimal, meaning that removing any edge from it produces a subset that is not a feedback arc set, then it has an additional property: reversing all of its edges, rather than removing them, produces a directed acyclic graph.

In graph theory, a cut is a partition of the vertices of a graph into two disjoint subsets. Any cut determines a cut-set, the set of edges that have one endpoint in each subset of the partition. These edges are said to cross the cut. In a connected graph, each cut-set determines a unique cut, and in some cases cuts are identified with their cut-sets rather than with their vertex partitions.

The Christofides algorithm or Christofides–Serdyukov algorithm is an algorithm for finding approximate solutions to the travelling salesman problem, on instances where the distances form a metric space . It is an approximation algorithm that guarantees that its solutions will be within a factor of 3/2 of the optimal solution length, and is named after Nicos Christofides and Anatoliy I. Serdyukov, who discovered it independently in 1976.

<span class="mw-page-title-main">Maximum cut</span> Problem of finding a maximum cut in a graph

For a graph, a maximum cut is a cut whose size is at least the size of any other cut. That is, it is a partition of the graph's vertices into two complementary sets $S$ and $T$ , such that the number of edges between $S$ and $T$ is as large as possible. Finding such a cut is known as the max-cut problem.

In computational geometry and computer science, the minimum-weight triangulation problem is the problem of finding a triangulation of minimal total edge length. That is, an input polygon or the convex hull of an input point set must be subdivided into triangles that meet edge-to-edge and vertex-to-vertex, in such a way as to minimize the sum of the perimeters of the triangles. The problem is NP-hard for point set inputs, but may be approximated to any desired degree of accuracy. For polygon inputs, it may be solved exactly in polynomial time. The minimum weight triangulation has also sometimes been called the optimal triangulation.

The rectilinear Steiner tree problem, minimum rectilinear Steiner tree problem (MRST), or rectilinear Steiner minimum tree problem (RSMT) is a variant of the geometric Steiner tree problem in the plane, in which the Euclidean distance is replaced with the rectilinear distance. The problem may be formally stated as follows: given n points in the plane, it is required to interconnect them all by a shortest network which consists only of vertical and horizontal line segments. It can be shown that such a network is a tree whose vertices are the input points plus some extra points.

In computer science, the minimum routing cost spanning tree of a weighted graph is a spanning tree minimizing the sum of pairwise distances between vertices in the tree. It is also called the optimum distance spanning tree, shortest total path length spanning tree, minimum total distance spanning tree, or minimum average distance spanning tree. In an unweighted graph, this is the spanning tree of minimum Wiener index. Hu (1974) writes that the problem of constructing these trees was proposed by Francesco Maffioli.

In network theory, the Wiener connector is a means of maximizing efficiency in connecting specified "query vertices" in a network. Given a connected, undirected graph and a set of query vertices in a graph, the minimum Wiener connector is an induced subgraph that connects the query vertices and minimizes the sum of shortest path distances among all pairs of vertices in the subgraph. In combinatorial optimization, the minimum Wiener connector problem is the problem of finding the minimum Wiener connector. It can be thought of as a version of the classic Steiner tree problem, where instead of minimizing the size of the tree, the objective is to minimize the distances in the subgraph.

In the study of hierarchical clustering, Dasgupta's objective is a measure of the quality of a clustering, defined from a similarity measure on the elements to be clustered. It is named after Sanjoy Dasgupta, who formulated it in 2016. Its key property is that, when the similarity comes from an ultrametric space, the optimal clustering for this quality measure follows the underlying structure of the ultrametric space. In this sense, clustering methods that produce good clusterings for this objective can be expected to approximate the ground truth underlying the given similarity measure.

A parameterized approximation algorithm is a type of algorithm that aims to find approximate solutions to NP-hard optimization problems in polynomial time in the input size and a function of a specific parameter. These algorithms are designed to combine the best aspects of both traditional approximation algorithms and fixed-parameter tractability.

References

1 2 Lozovanu, D.; Zelikovsky, A. (1993), "Minimal and bounded tree problems", Tezele Congresului XVIII al Academiei Romano-Americane, Kishniev, p. 25. As cited by Ravi et al. (1996).
1 2 3 4 5 Ravi, R.; Sundaram, R.; Marathe, M.; Rosenkrantz, D.; Ravi, S. (1996), "Spanning trees short or small", SIAM Journal on Discrete Mathematics , 9 (2): 178–200, arXiv: math/9409222 , doi:10.1137/S0895480194266331, S2CID 8253322 . A preliminary version of this work was presented earlier, at the 5th Annual ACM-SIAM Symposium on Discrete Algorithms, 1994, pp. 546–555.
↑ Chlebík, Miroslav; Chlebíková, Janka (2008), "The Steiner tree problem on graphs: Inapproximability results", Theoretical Computer Science , 406 (3): 207–214, doi: 10.1016/j.tcs.2008.06.046 .
↑ Garg, Naveen (2005), "Saving an epsilon: a 2-approximation for the k-MST problem in graphs", Proceedings of the 37th Annual ACM Symposium on Theory of Computing, pp. 396–402, doi:10.1145/1060590.1060650, S2CID 17089806 .
↑ Goemans, M.; Williamson, P. (1992), "A general approximation technique for constrained forest problems", SIAM Journal on Computing , 24 (2): 296–317, CiteSeerX 10.1.1.55.7342 , doi:10.1137/S0097539793242618, S2CID 1796896 .
↑ Arora, Sanjeev (1998), "Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems", Journal of the ACM , 45 (5): 753–782, doi: 10.1145/290179.290180 , S2CID 3023351 .

External links

Minimum k-spanning tree in "A compendium of NP optimization problems"
KCTLIB, KCTLIB -- A Library for the Edge-Weighted K-Cardinality Tree Problem

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[lz-1] 1 2 Lozovanu, D.; Zelikovsky, A. (1993), "Minimal and bounded tree problems", Tezele Congresului XVIII al Academiei Romano-Americane, Kishniev, p. 25. As cited by Ravi et al. (1996).

[rsmr-2] 1 2 3 4 5 Ravi, R.; Sundaram, R.; Marathe, M.; Rosenkrantz, D.; Ravi, S. (1996), "Spanning trees short or small", SIAM Journal on Discrete Mathematics , 9 (2): 178–200, arXiv: math/9409222 , doi:10.1137/S0895480194266331, S2CID 8253322 . A preliminary version of this work was presented earlier, at the 5th Annual ACM-SIAM Symposium on Discrete Algorithms, 1994, pp. 546–555.

[3] Chlebík, Miroslav; Chlebíková, Janka (2008), "The Steiner tree problem on graphs: Inapproximability results", Theoretical Computer Science , 406 (3): 207–214, doi: 10.1016/j.tcs.2008.06.046 .

[4] Garg, Naveen (2005), "Saving an epsilon: a 2-approximation for the k-MST problem in graphs", Proceedings of the 37th Annual ACM Symposium on Theory of Computing, pp. 396–402, doi:10.1145/1060590.1060650, S2CID 17089806 .

[5] Goemans, M.; Williamson, P. (1992), "A general approximation technique for constrained forest problems", SIAM Journal on Computing , 24 (2): 296–317, CiteSeerX 10.1.1.55.7342 , doi:10.1137/S0097539793242618, S2CID 1796896 .

[6] Arora, Sanjeev (1998), "Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems", Journal of the ACM , 45 (5): 753–782, doi: 10.1145/290179.290180 , S2CID 3023351 .

[1]

[2]

[3]

[4]

[5]

[6]