Longest path problem

Last updated

In graph theory and theoretical computer science, the longest path problem is the problem of finding a simple path of maximum length in a given graph. A path is called simple if it does not have any repeated vertices; the length of a path may either be measured by its number of edges, or (in weighted graphs) by the sum of the weights of its edges. In contrast to the shortest path problem, which can be solved in polynomial time in graphs without negative-weight cycles, the longest path problem is NP-hard and the decision version of the problem, which asks whether a path exists of at least some given length, is NP-complete. This means that the decision problem cannot be solved in polynomial time for arbitrary graphs unless P = NP. Stronger hardness results are also known showing that it is difficult to approximate. However, it has a linear time solution for directed acyclic graphs, which has important applications in finding the critical path in scheduling problems.

Contents

NP-hardness

The NP-hardness of the unweighted longest path problem can be shown using a reduction from the Hamiltonian path problem: a graph G has a Hamiltonian path if and only if its longest path has length n  1, where n is the number of vertices in G. Because the Hamiltonian path problem is NP-complete, this reduction shows that the decision version of the longest path problem is also NP-complete. In this decision problem, the input is a graph G and a number k; the desired output is yes if G contains a path of k or more edges, and no otherwise. [1]

If the longest path problem could be solved in polynomial time, it could be used to solve this decision problem, by finding a longest path and then comparing its length to the number k. Therefore, the longest path problem is NP-hard. The question "does there exist a simple path in a given graph with at least k edges" is NP-complete. [2]

In weighted complete graphs with non-negative edge weights, the weighted longest path problem is the same as the Travelling salesman path problem, because the longest path always includes all vertices. [3]

Acyclic graphs

A longest path between two given vertices s and t in a weighted graph G is the same thing as a shortest path in a graph −G derived from G by changing every weight to its negation. Therefore, if shortest paths can be found in −G, then longest paths can also be found in G. [4]

For most graphs, this transformation is not useful because it creates cycles of negative length in −G. But if G is a directed acyclic graph (DAG), then no negative cycles can be created, and a longest path in G can be found in linear time by applying a linear time algorithm for shortest paths in −G, which is also a directed acyclic graph. [4] For a DAG, the longest path from a source vertex to all other vertices can be obtained by running the shortest-path algorithm on −G.

Similarly, for each vertex v in a given DAG, the length of the longest path ending at v may be obtained by the following steps:

  1. Find a topological ordering of the given DAG.
  2. For each vertex v of the DAG, in the topological ordering, compute the length of the longest path ending at v by looking at its incoming neighbors and adding one to the maximum length recorded for those neighbors. If v has no incoming neighbors, set the length of the longest path ending at v to zero. In either case, record this number so that later steps of the algorithm can access it.

Once this has been done, the longest path in the whole DAG may be obtained by starting at the vertex v with the largest recorded value, then repeatedly stepping backwards to its incoming neighbor with the largest recorded value, and reversing the sequence of vertices found in this way.

This is equivalent to running the shortest-path algorithm on −G.

Critical paths

The critical path method for scheduling a set of activities involves the construction of a directed acyclic graph in which the vertices represent project milestones and the edges represent activities that must be performed after one milestone and before another; each edge is weighted by an estimate of the amount of time the corresponding activity will take to complete. In such a graph, the longest path from the first milestone to the last one is the critical path, which describes the total time for completing the project. [4]

Longest paths of directed acyclic graphs may also be applied in layered graph drawing: assigning each vertex v of a directed acyclic graph G to the layer whose number is the length of the longest path ending at v results in a layer assignment for G with the minimum possible number of layers. [5]

Approximation

Björklund, Husfeldt & Khanna (2004) write that the longest path problem in unweighted undirected graphs "is notorious for the difficulty of understanding its approximation hardness". [6] The best polynomial time approximation algorithm known for this case achieves only a very weak approximation ratio, . [7] For all , it is not possible to approximate the longest path to within a factor of unless NP is contained within quasi-polynomial deterministic time; however, there is a big gap between this inapproximability result and the known approximation algorithms for this problem. [8]

In the case of unweighted but directed graphs, strong inapproximability results are known. For every the problem cannot be approximated to within a factor of unless P = NP, and with stronger complexity-theoretic assumptions it cannot be approximated to within a factor of . [6] The color-coding technique can be used to find paths of logarithmic length, if they exist, but this gives an approximation ratio of only . [9]

Parameterized complexity

The longest path problem is fixed-parameter tractable when parameterized by the length of the path. For instance, it can be solved in time linear in the size of the input graph (but exponential in the length of the path), by an algorithm that performs the following steps:

  1. Perform a depth-first search of the graph. Let be the height of the resulting depth-first search tree.
  2. Use the sequence of root-to-leaf paths of the depth-first search tree, in the order in which they were traversed by the search, to construct a path decomposition of the graph, with pathwidth .
  3. Apply dynamic programming to this path decomposition to find a longest path in time , where is the number of vertices in the graph.

Since the output path has length at least as large as , the running time is also bounded by , where is the length of the longest path. [10] Using color-coding, the dependence on path length can be reduced to singly exponential. [9] [11] [12] [13] A similar dynamic programming technique shows that the longest path problem is also fixed-parameter tractable when parameterized by the treewidth of the graph.

For graphs of bounded clique-width, the longest path can also be solved by a polynomial time dynamic programming algorithm. However, the exponent of the polynomial depends on the clique-width of the graph, so this algorithms is not fixed-parameter tractable. The longest path problem, parameterized by clique-width, is hard for the parameterized complexity class , showing that a fixed-parameter tractable algorithm is unlikely to exist. [14]

Special classes of graphs

A linear-time algorithm for finding a longest path in a tree was proposed by Edsger Dijkstra around 1960, while a formal proof of this algorithm was published in 2002. [15] Furthermore, a longest path can be computed in polynomial time on weighted trees, on block graphs, on cacti, [16] on bipartite permutation graphs, [17] and on Ptolemaic graphs. [18]

For the class of interval graphs, an -time algorithm is known, which uses a dynamic programming approach. [19] This dynamic programming approach has been exploited to obtain polynomial-time algorithms on the greater classes of circular-arc graphs [20] and of co-comparability graphs (i.e. of the complements of comparability graphs, which also contain permutation graphs), [21] both having the same running time . The latter algorithm is based on special properties of the lexicographic depth first search (LDFS) vertex ordering [22] of co-comparability graphs. For co-comparability graphs also an alternative polynomial-time algorithm with higher running time is known, which is based on the Hasse diagram of the partially ordered set defined by the complement of the input co-comparability graph. [23]

Furthermore, the longest path problem is solvable in polynomial time on any class of graphs with bounded treewidth or bounded clique-width, such as the distance-hereditary graphs. Finally, it is clearly NP-hard on all graph classes on which the Hamiltonian path problem is NP-hard, such as on split graphs, circle graphs, and planar graphs.

A simple model of a directed acyclic graph is the Price model, developed by Derek J. de Solla Price to represent citation networks. This is simple enough to allow for analytic results to be found for some properties. For instance, the length of the longest path, from the n-th node added to the network to the first node in the network, scales as [24] .

See also

Related Research Articles

<span class="mw-page-title-main">Directed acyclic graph</span> Directed graph with no directed cycles

In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges, with each edge directed from one vertex to another, such that following those directions will never form a closed loop. A directed graph is a DAG if and only if it can be topologically ordered, by arranging the vertices as a linear ordering that is consistent with all edge directions. DAGs have numerous scientific and computational applications, ranging from biology to information science to computation (scheduling).

<span class="mw-page-title-main">Graph coloring</span> Methodic assignment of colors to elements of a graph

In graph theory, graph coloring is a special case of graph labeling; it is an assignment of labels traditionally called "colors" to elements of a graph subject to certain constraints. In its simplest form, it is a way of coloring the vertices of a graph such that no two adjacent vertices are of the same color; this is called a vertex coloring. Similarly, an edge coloring assigns a color to each edge so that no two adjacent edges are of the same color, and a face coloring of a planar graph assigns a color to each face or region so that no two faces that share a boundary have the same color.

<span class="mw-page-title-main">Steiner tree problem</span> On short connecting networks with added vertices

In combinatorial mathematics, the Steiner tree problem, or minimum Steiner tree problem, named after Jakob Steiner, is an umbrella term for a class of problems in combinatorial optimization. While Steiner tree problems may be formulated in a number of settings, they all require an optimal interconnect for a given set of objects and a predefined objective function. One well-known variant, which is often used synonymously with the term Steiner tree problem, is the Steiner tree problem in graphs. Given an undirected graph with non-negative edge weights and a subset of vertices, usually referred to as terminals, the Steiner tree problem in graphs requires a tree of minimum weight that contains all terminals and minimizes the total weight of its edges. Further well-known variants are the Euclidean Steiner tree problem and the rectilinear minimum Steiner tree problem.

<span class="mw-page-title-main">Independent set (graph theory)</span> Unrelated vertices in graphs

In graph theory, an independent set, stable set, coclique or anticlique is a set of vertices in a graph, no two of which are adjacent. That is, it is a set of vertices such that for every two vertices in , there is no edge connecting the two. Equivalently, each edge in the graph has at most one endpoint in . A set is independent if and only if it is a clique in the graph's complement. The size of an independent set is the number of vertices it contains. Independent sets have also been called "internally stable sets", of which "stable set" is a shortening.

In the mathematical discipline of graph theory, a matching or independent edge set in an undirected graph is a set of edges without common vertices. In other words, a subset of the edges is a matching if each vertex appears in at most one edge of that matching. Finding a matching in a bipartite graph can be treated as a network flow problem.

In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge (u,v) from vertex u to vertex v, u comes before v in the ordering. For instance, the vertices of the graph may represent tasks to be performed, and the edges may represent constraints that one task must be performed before another; in this application, a topological ordering is just a valid sequence for the tasks. Precisely, a topological sort is a graph traversal in which each node v is visited only after all its dependencies are visited. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG). Any DAG has at least one topological ordering, and algorithms are known for constructing a topological ordering of any DAG in linear time. Topological sorting has many applications, especially in ranking problems such as feedback arc set. Topological sorting is possible even when the DAG has disconnected components.

<span class="mw-page-title-main">Euclidean minimum spanning tree</span> Shortest network connecting points

A Euclidean minimum spanning tree of a finite set of points in the Euclidean plane or higher-dimensional Euclidean space connects the points by a system of line segments with the points as endpoints, minimizing the total length of the segments. In it, any two points can reach each other along a path through the line segments. It can be found as the minimum spanning tree of a complete graph with the points as vertices and the Euclidean distances between points as edge weights.

In graph theory, the metric dimension of a graph G is the minimum cardinality of a subset S of vertices such that all other vertices are uniquely determined by their distances to the vertices in S. Finding the metric dimension of a graph is an NP-hard problem; the decision version, determining whether the metric dimension is less than a given value, is NP-complete.

<span class="mw-page-title-main">Dominating set</span> Subset of a graphs nodes such that all other nodes link to at least one

In graph theory, a dominating set for a graph G is a subset D of its vertices, such that any vertex of G is in D, or has a neighbor in D. The domination numberγ(G) is the number of vertices in a smallest dominating set for G.

In graph theory, a domatic partition of a graph is a partition of into disjoint sets , ,..., such that each Vi is a dominating set for G. The figure on the right shows a domatic partition of a graph; here the dominating set consists of the yellow vertices, consists of the green vertices, and consists of the blue vertices.

In the mathematical discipline of graph theory, a feedback vertex set (FVS) of a graph is a set of vertices whose removal leaves a graph without cycles. Equivalently, each FVS contains at least one vertex of any cycle in the graph. The feedback vertex set number of a graph is the size of a smallest feedback vertex set. The minimum feedback vertex set problem is an NP-complete problem; it was among the first problems shown to be NP-complete. It has wide applications in operating systems, database systems, and VLSI chip design.

<span class="mw-page-title-main">Feedback arc set</span> Edges that hit all cycles in a graph

In graph theory and graph algorithms, a feedback arc set or feedback edge set in a directed graph is a subset of the edges of the graph that contains at least one edge out of every cycle in the graph. Removing these edges from the graph breaks all of the cycles, producing an acyclic subgraph of the given graph, often called a directed acyclic graph. A feedback arc set with the fewest possible edges is a minimum feedback arc set and its removal leaves a maximum acyclic subgraph; weighted versions of these optimization problems are also used. If a feedback arc set is minimal, meaning that removing any edge from it produces a subset that is not a feedback arc set, then it has an additional property: reversing all of its edges, rather than removing them, produces a directed acyclic graph.

In graph theory, a connected graph is k-edge-connected if it remains connected whenever fewer than k edges are removed.

In computer science, a kernelization is a technique for designing efficient algorithms that achieve their efficiency by a preprocessing stage in which inputs to the algorithm are replaced by a smaller input, called a "kernel". The result of solving the problem on the kernel should either be the same as on the original input, or it should be easy to transform the output on the kernel to the desired output for the original problem.

In graph theory, the planar separator theorem is a form of isoperimetric inequality for planar graphs, that states that any planar graph can be split into smaller pieces by removing a small number of vertices. Specifically, the removal of vertices from an n-vertex graph can partition the graph into disjoint subgraphs each of which has at most vertices.

In the mathematical field of graph theory, the intersection number of a graph is the smallest number of elements in a representation of as an intersection graph of finite sets. In such a representation, each vertex is represented as a set, and two vertices are connected by an edge whenever their sets have a common element. Equivalently, the intersection number is the smallest number of cliques needed to cover all of the edges of .

<span class="mw-page-title-main">Dense subgraph</span> Highly connected subgraph

In graph theory and computer science, a dense subgraph is a subgraph with many edges per vertex. This is formalized as follows: let G = (V, E) be an undirected graph and let S = (VS, ES) be a subgraph of G. Then the density of S is defined to be:

In network theory, the Wiener connector is a means of maximizing efficiency in connecting specified "query vertices" in a network. Given a connected, undirected graph and a set of query vertices in a graph, the minimum Wiener connector is an induced subgraph that connects the query vertices and minimizes the sum of shortest path distances among all pairs of vertices in the subgraph. In combinatorial optimization, the minimum Wiener connector problem is the problem of finding the minimum Wiener connector. It can be thought of as a version of the classic Steiner tree problem, where instead of minimizing the size of the tree, the objective is to minimize the distances in the subgraph.

<span class="mw-page-title-main">Cutwidth</span> Property in graph theory

In graph theory, the cutwidth of an undirected graph is the smallest integer with the following property: there is an ordering of the vertices of the graph, such that every cut obtained by partitioning the vertices into earlier and later subsets of the ordering is crossed by at most edges. That is, if the vertices are numbered , then for every , the number of edges with and is at most .

The highway dimension is a graph parameter modelling transportation networks, such as road networks or public transportation networks. It was first formally defined by Abraham et al. based on the observation by Bast et al. that any road network has a sparse set of "transit nodes", such that driving from a point A to a sufficiently far away point B along the shortest route will always pass through one of these transit nodes. It has also been proposed that the highway dimension captures the properties of public transportation networks well, given that longer routes using busses, trains, or airplanes will typically be serviced by larger transit hubs. This relates to the spoke–hub distribution paradigm in transport topology optimization.

References

  1. Schrijver, Alexander (2003), Combinatorial Optimization: Polyhedra and Efficiency, Volume 1, Algorithms and Combinatorics, vol. 24, Springer, p. 114, ISBN   9783540443896 .
  2. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001), Introduction To Algorithms (2nd ed.), MIT Press, p. 978, ISBN   9780262032933 .
  3. Lawler, Eugene L. (2001), Combinatorial Optimization: Networks and Matroids, Courier Dover Publications, p. 64, ISBN   9780486414539 .
  4. 1 2 3 Sedgewick, Robert; Wayne, Kevin Daniel (2011), Algorithms (4th ed.), Addison-Wesley Professional, pp. 661–666, ISBN   9780321573513 .
  5. Di Battista, Giuseppe; Eades, Peter; Tamassia, Roberto; Tollis, Ioannis G. (1998), "Layered Drawings of Digraphs", Graph Drawing: Algorithms for the Visualization of Graphs, Prentice Hall, pp. 265–302, ISBN   978-0-13-301615-4 .
  6. 1 2 Björklund, Andreas; Husfeldt, Thore; Khanna, Sanjeev (2004), "Approximating longest directed paths and cycles", Proc. Int. Coll. Automata, Languages and Programming (ICALP 2004), Lecture Notes in Computer Science, vol. 3142, Berlin: Springer-Verlag, pp. 222–233, MR   2160935 .
  7. Gabow, Harold N.; Nie, Shuxin (2008), "Finding long paths, cycles and circuits", International Symposium on Algorithms and Computation, Lecture Notes in Computer Science, vol. 5369, Berlin: Springer, pp. 752–763, doi:10.1007/978-3-540-92182-0_66, ISBN   978-3-540-92181-3, MR   2539968 . For earlier work with even weaker approximation bounds, see Gabow, Harold N. (2007), "Finding paths and cycles of superpolylogarithmic length" (PDF), SIAM Journal on Computing , 36 (6): 1648–1671, doi:10.1137/S0097539704445366, MR   2299418 and Björklund, Andreas; Husfeldt, Thore (2003), "Finding a path of superlogarithmic length", SIAM Journal on Computing , 32 (6): 1395–1402, doi:10.1137/S0097539702416761, MR   2034242 .
  8. Karger, David; Motwani, Rajeev; Ramkumar, G. D. S. (1997), "On approximating the longest path in a graph", Algorithmica , 18 (1): 82–98, doi:10.1007/BF02523689, MR   1432030, S2CID   3241830 .
  9. 1 2 Alon, Noga; Yuster, Raphael; Zwick, Uri (1995), "Color-coding", Journal of the ACM , 42 (4): 844–856, doi: 10.1145/210332.210337 , MR   1411787, S2CID   208936467 .
  10. Bodlaender, Hans L. (1993), "On linear time minor tests with depth-first search", Journal of Algorithms, 14 (1): 1–23, doi:10.1006/jagm.1993.1001, MR   1199244 . For an earlier FPT algorithm with slightly better dependence on the path length, but worse dependence on the size of the graph, see Monien, B. (1985), "How to find long paths efficiently", Analysis and design of algorithms for combinatorial problems (Udine, 1982), North-Holland Math. Stud., vol. 109, Amsterdam: North-Holland, pp. 239–254, doi:10.1016/S0304-0208(08)73110-4, ISBN   9780444876997, MR   0808004 .
  11. Chen, Jianer; Lu, Songjian; Sze, Sing-Hoi; Zhang, Fenghui (2007), "Improved algorithms for path, matching, and packing problems", Proc. 18th ACM-SIAM Symposium on Discrete algorithms (SODA '07) (PDF), pp. 298–307.
  12. Koutis, Ioannis (2008), "Faster algebraic algorithms for path and packing problems", International Colloquium on Automata, Languages and Programming (PDF), Lecture Notes in Computer Science, vol. 5125, Berlin: Springer, pp. 575–586, CiteSeerX   10.1.1.141.6899 , doi:10.1007/978-3-540-70575-8_47, ISBN   978-3-540-70574-1, MR   2500302, archived from the original (PDF) on 2017-08-09, retrieved 2013-08-09.
  13. Williams, Ryan (2009), "Finding paths of length k in O*(2k) time", Information Processing Letters, 109 (6): 315–318, arXiv: 0807.3026 , doi:10.1016/j.ipl.2008.11.004, MR   2493730, S2CID   10295448 .
  14. Fomin, Fedor V.; Golovach, Petr A.; Lokshtanov, Daniel; Saurabh, Saket (2009), "Clique-width: on the price of generality", Proc. 20th ACM-SIAM Symposium on Discrete Algorithms (SODA '09) (PDF), pp. 825–834, archived from the original (PDF) on 2012-10-18, retrieved 2012-12-01.
  15. Bulterman, R.W.; van der Sommen, F.W.; Zwaan, G.; Verhoeff, T.; van Gasteren, A.J.M. (2002), "On computing a longest path in a tree", Information Processing Letters, 81 (2): 93–96, doi:10.1016/S0020-0190(01)00198-3 .
  16. Uehara, Ryuhei; Uno, Yushi (2004), "Efficient algorithms for the longest path problem", in Fleischer, Rudolf; Trippen, Gerhard (eds.), Algorithms and Computation, 15th International Symposium, ISAAC 2004, Hong Kong, China, December 20-22, 2004, Proceedings, Lecture Notes in Computer Science, vol. 3341, Springer, pp. 871–883, doi:10.1007/978-3-540-30551-4_74, ISBN   978-3-540-24131-7 .
  17. Uehara, Ryuhei; Valiente, Gabriel (2007), "Linear structure of bipartite permutation graphs and the longest path problem", Information Processing Letters, 103 (2): 71–77, CiteSeerX   10.1.1.101.96 , doi:10.1016/j.ipl.2007.02.010 .
  18. Takahara, Yoshihiro; Teramoto, Sachio; Uehara, Ryuhei (2008), "Longest path problems on Ptolemaic graphs", IEICE Transactions, 91-D (2): 170–177, doi: 10.1093/ietisy/e91-d.2.170 .
  19. Ioannidou, Kyriaki; Mertzios, George B.; Nikolopoulos, Stavros D. (2011), "The longest path problem has a polynomial solution on interval graphs", Algorithmica, 61 (2): 320–341, CiteSeerX   10.1.1.224.4927 , doi:10.1007/s00453-010-9411-3, S2CID   7577817 .
  20. Mertzios, George B.; Bezakova, Ivona (2014), "Computing and counting longest paths on circular-arc graphs in polynomial time", Discrete Applied Mathematics, 164 (2): 383–399, CiteSeerX   10.1.1.224.779 , doi:10.1016/j.dam.2012.08.024 .
  21. Mertzios, George B.; Corneil, Derek G. (2012), "A simple polynomial algorithm for the longest path problem on cocomparability graphs", SIAM Journal on Discrete Mathematics, 26 (3): 940–963, arXiv: 1004.4560 , doi:10.1137/100793529, S2CID   4645245 .
  22. Corneil, Derek G.; Krueger, Richard (2008), "A unified view of graph searching", SIAM Journal on Discrete Mathematics, 22 (4): 1259–1276, doi:10.1137/050623498 .
  23. Ioannidou, Kyriaki; Nikolopoulos, Stavros D. (2011), "The longest path problem is polynomial on cocomparability graphs" (PDF), Algorithmica, 65: 177–205, CiteSeerX   10.1.1.415.9996 , doi:10.1007/s00453-011-9583-5, S2CID   7271040 .
  24. Evans, T.S.; Calmon, L.; Vasiliauskaite, V. (2020), "The Longest Path in the Price Model", Scientific Reports, 10 (1): 10503, arXiv: 1903.03667 , Bibcode:2020NatSR..1010503E, doi:10.1038/s41598-020-67421-8, PMC   7324613 , PMID   32601403