Graph traversal

In computer science, graph traversal (also known as graph search) refers to the process of visiting (checking and/or updating) each vertex in a graph. Such traversals are classified by the order in which the vertices are visited. Tree traversal is a special case of graph traversal.

Redundancy

Unlike tree traversal, graph traversal may require that some vertices be visited more than once, since it is not necessarily known before transitioning to a vertex that it has already been explored. As graphs become more dense, this redundancy becomes more prevalent, causing computation time to increase; as graphs become more sparse, the opposite holds true.

Thus, it is usually necessary to remember which vertices have already been explored by the algorithm, so that vertices are revisited as infrequently as possible (or in the worst case, to prevent the traversal from continuing indefinitely). This may be accomplished by associating each vertex of the graph with a "color" or "visitation" state during the traversal, which is then checked and updated as the algorithm visits each vertex. If the vertex has already been visited, it is ignored and the path is pursued no further; otherwise, the algorithm checks/updates the vertex and continues down its current path.
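The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the graph is a dictionary mapping each vertex to a list of its neighbours, and the function name `reachable` is hypothetical rather than taken from any library.

```python
def reachable(adj, start):
    """Return the set of vertices reachable from `start`.

    `adj` is assumed to map each vertex to a list of its neighbours.
    """
    visited = {start}          # visitation state, checked before every transition
    stack = [start]            # vertices whose neighbours still need examining
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in visited:   # already-visited vertices are ignored
                visited.add(w)
                stack.append(w)
    return visited


# A directed 3-cycle: without the `visited` check the loop would never end.
cycle = {0: [1], 1: [2], 2: [0]}
print(reachable(cycle, 0))
```

Because each vertex enters the stack at most once, the loop terminates even though the graph contains a cycle.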

Several special cases of graphs imply the visitation of other vertices in their structure, and thus do not require that visitation be explicitly recorded during the traversal. An important example of this is a tree: during a traversal it may be assumed that all "ancestor" vertices of the current vertex (and others depending on the algorithm) have already been visited. Both the depth-first and breadth-first graph searches are adaptations of tree-based algorithms, distinguished primarily by the lack of a structurally determined "root" vertex and the addition of a data structure to record the traversal's visitation state.

Graph traversal algorithms

Note: If each vertex in a graph is to be traversed by a tree-based algorithm (such as DFS or BFS), then the algorithm must be called at least once for each connected component of the graph. This is easily accomplished by iterating through all the vertices of the graph and performing the algorithm on each vertex that is still unvisited when examined.
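As a rough sketch of this outer loop, assuming an undirected graph stored as a dictionary from each vertex to a list of its neighbours (the function name `connected_components` is hypothetical):

```python
def connected_components(adj):
    """Group the vertices of an undirected graph into connected components.

    `adj` is assumed to map every vertex to a list of its neighbours.
    """
    visited = set()
    components = []
    for s in adj:                  # iterate through all vertices of the graph
        if s in visited:
            continue               # s was already reached by an earlier traversal
        component = []
        visited.add(s)
        stack = [s]
        while stack:               # one traversal per still-unvisited vertex
            v = stack.pop()
            component.append(v)
            for w in adj[v]:
                if w not in visited:
                    visited.add(w)
                    stack.append(w)
        components.append(component)
    return components


graph = {"a": ["b"], "b": ["a"], "c": []}
print(connected_components(graph))
```

The inner traversal here is an iterative depth-first search, but any traversal that marks the vertices it reaches would serve the same purpose.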

A depth-first search (DFS) is an algorithm for traversing a finite graph. DFS visits the child vertices before visiting the sibling vertices; that is, it traverses the depth of any particular path before exploring its breadth. A stack (often the program's call stack via recursion) is generally used when implementing the algorithm.

The algorithm begins with a chosen "root" vertex; it then iteratively transitions from the current vertex to an adjacent, unvisited vertex, until it can no longer find an unexplored vertex to transition to from its current location. The algorithm then backtracks along previously visited vertices, until it finds a vertex connected to yet more uncharted territory. It will then proceed down the new path as it had before, backtracking as it encounters dead-ends, and ending only when the algorithm has backtracked past the original "root" vertex from the very first step.

DFS is the basis for many graph-related algorithms, including topological sorts and planarity testing.

Pseudocode

  • Input: A graph G and a vertex v of G.
  • Output: A labeling of the edges in the connected component of v as discovery edges and back edges.
procedure DFS(G, v) is
    label v as explored
    for all edges e in G.incidentEdges(v) do
        if edge e is unexplored then
            w ← G.adjacentVertex(v, e)
            if vertex w is unexplored then
                label e as a discovery edge
                recursively call DFS(G, w)
            else
                label e as a back edge
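One possible Python rendering of this pseudocode is sketched below. The edge-list representation, the function name `dfs_label_edges`, and the label strings are assumptions made for the example; the pseudocode itself does not prescribe them.

```python
def dfs_label_edges(edges, start):
    """Label edges reachable from `start` as "discovery" or "back".

    `edges` is assumed to be a list of (u, v) pairs of an undirected graph;
    each edge is identified by its index in that list.
    """
    # Build the incidence lists: vertex -> indices of incident edges.
    incident = {}
    for i, (u, v) in enumerate(edges):
        incident.setdefault(u, []).append(i)
        incident.setdefault(v, []).append(i)

    explored = set()   # explored vertices
    labels = {}        # edge index -> label; an unlabeled edge is unexplored

    def visit(v):
        explored.add(v)
        for i in incident.get(v, []):
            if i in labels:
                continue                   # edge already explored
            a, b = edges[i]
            w = b if a == v else a         # the endpoint of edge i other than v
            if w not in explored:
                labels[i] = "discovery"
                visit(w)
            else:
                labels[i] = "back"

    visit(start)
    return labels


triangle = [(0, 1), (1, 2), (2, 0)]
print(dfs_label_edges(triangle, 0))
# {0: 'discovery', 1: 'discovery', 2: 'back'}
```

Like the pseudocode, this version is recursive, so very deep graphs may exceed Python's default recursion limit; an explicit stack avoids that at the cost of slightly more bookkeeping.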

A breadth-first search (BFS) is another technique for traversing a finite graph. BFS visits the sibling vertices before visiting the child vertices, and a queue is used in the search process. This algorithm is often used to find the shortest path, measured in number of edges, from one vertex to another.

Pseudocode

  • Input: A graph G and a vertex v of G.
  • Output: The closest vertex to v satisfying some conditions, or null if no such vertex exists.
procedure BFS(G, v) is
    create a queue Q
    enqueue v onto Q
    mark v
    while Q is not empty do
        w ← Q.dequeue()
        if w is what we are looking for then
            return w
        for all edges e in G.adjacentEdges(w) do
            x ← G.adjacentVertex(w, e)
            if x is not marked then
                mark x
                enqueue x onto Q
    return null
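The same procedure might look as follows in Python. The adjacency-dictionary representation, the function name `bfs_find`, and the predicate parameter `is_goal` are assumptions chosen for this sketch.

```python
from collections import deque


def bfs_find(adj, start, is_goal):
    """Return the closest vertex to `start` for which `is_goal` is true, or None.

    `adj` is assumed to map each vertex to a list of its neighbours.
    """
    marked = {start}
    queue = deque([start])
    while queue:
        w = queue.popleft()            # dequeue the oldest vertex first
        if is_goal(w):
            return w
        for x in adj[w]:
            if x not in marked:
                marked.add(x)          # mark on enqueue so x is queued only once
                queue.append(x)
    return None


graph = {1: [2, 3], 2: [4], 3: [4], 4: []}
print(bfs_find(graph, 1, lambda v: v > 2))   # 3
```

A `deque` is used because removing from the front of a plain list takes time proportional to its length.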

Applications

Breadth-first search can be used to solve many problems in graph theory; for example, as noted above, it is often used to find the shortest path from one vertex to another.
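One such use, finding a path with the fewest edges between two vertices, can be sketched by recording for each vertex the vertex from which it was first reached. The function name `shortest_path` and the adjacency-dictionary representation are assumptions for this example, and edge weights are not considered.

```python
from collections import deque


def shortest_path(adj, source, target):
    """Return a fewest-edges path from `source` to `target`, or None.

    `adj` is assumed to map each vertex to a list of its neighbours.
    """
    parent = {source: None}        # doubles as the set of marked vertices
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            # Walk the parent pointers back to the source, then reverse.
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    return None


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
print(shortest_path(graph, "a", "e"))   # ['a', 'b', 'd', 'e']
```

Because BFS reaches vertices in order of increasing distance from the source, the first time the target is dequeued its recorded path uses as few edges as possible.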

Graph exploration

The problem of graph exploration can be seen as a variant of graph traversal. It is an online problem, meaning that the information about the graph is only revealed during the runtime of the algorithm. A common model is as follows: a connected graph G = (V, E) with non-negative edge weights is given. The algorithm starts at some vertex and knows all incident outgoing edges and the vertices at the ends of these edges, but nothing more. When a new vertex is visited, all of its incident outgoing edges and the vertices at their ends become known as well. The goal is to visit all n vertices and return to the starting vertex while keeping the total weight of the tour as small as possible. The problem can also be understood as a specific version of the travelling salesman problem, where the salesman has to discover the graph on the go.

For general graphs, the best known algorithm for both undirected and directed graphs is a simple greedy algorithm.
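A greedy strategy for this model can be sketched as follows, under the assumption that "greedy" means the nearest-neighbour rule: repeatedly walk to the closest unvisited vertex, using only the part of the graph revealed so far. The function name `greedy_exploration`, the weighted adjacency-dictionary format, and the restriction to connected undirected graphs with orderable vertex labels are all assumptions made for the example.

```python
import heapq


def greedy_exploration(adj, start):
    """Simulate nearest-neighbour exploration; return (visit order, total cost).

    `adj` is assumed to map each vertex to a list of (neighbour, weight) pairs
    describing a connected undirected graph with non-negative weights.
    """
    visited = {start}

    def nearest(source, is_target):
        # Dijkstra's algorithm over the known graph: only visited vertices
        # are expanded, because only their incident edges have been revealed.
        dist = {source: 0}
        heap = [(0, source)]
        done = set()
        while heap:
            d, v = heapq.heappop(heap)
            if v in done:
                continue
            done.add(v)
            if is_target(v):
                return v, d
            if v in visited:
                for w, weight in adj[v]:
                    nd = d + weight
                    if nd < dist.get(w, float("inf")):
                        dist[w] = nd
                        heapq.heappush(heap, (nd, w))
        return None, 0

    order, cost, current = [start], 0, start
    while True:
        v, d = nearest(current, lambda u: u not in visited)
        if v is None:
            break                  # every reachable vertex has been visited
        visited.add(v)
        order.append(v)
        cost += d
        current = v

    _, d = nearest(current, lambda u: u == start)   # walk back to the start
    order.append(start)
    return order, cost + d


graph = {0: [(1, 1), (2, 5)], 1: [(0, 1), (2, 1)], 2: [(1, 1), (0, 5)]}
print(greedy_exploration(graph, 0))   # ([0, 1, 2, 0], 4)
```

The returned order lists only newly visited vertices plus the final return to the start; the intermediate vertices walked through on each leg are accounted for in the cost but not listed.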

Universal traversal sequences

A universal traversal sequence is a sequence of instructions comprising a graph traversal for any regular graph with a set number of vertices and for any starting vertex. A probabilistic proof was used by Aleliunas et al. to show that there exists a universal traversal sequence with O(n^5) instructions for any regular graph with n vertices.[6] The steps specified in the sequence are relative to the current node, not absolute. For example, if the current node is v_j, and v_j has d neighbors, then the traversal sequence will specify the next node to visit, v_{j+1}, as the i-th neighbor of v_j, where 1 ≤ i ≤ d.
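How such relative instructions are followed can be sketched as below. The function name `follow_sequence` and the representation (each vertex mapped to an ordered list of its d neighbours) are assumptions for the example, and the short sequence shown is not claimed to be universal; it merely happens to cover this one labelled graph from every start.

```python
def follow_sequence(adj, start, sequence):
    """Follow relative instructions on a d-regular graph; return vertices visited.

    `adj` is assumed to map each vertex to an ordered list of its d neighbours.
    Each instruction i (with 1 <= i <= d) means "move to the i-th neighbour
    of the current vertex".
    """
    current = start
    walk = [start]
    for i in sequence:
        current = adj[current][i - 1]   # instructions are 1-based, lists 0-based
        walk.append(current)
    return walk


# A 4-cycle, which is 2-regular; each vertex lists its neighbours in a fixed order.
square = {0: [1, 3], 1: [2, 0], 2: [3, 1], 3: [0, 2]}
print(follow_sequence(square, 0, [1, 1, 1]))   # [0, 1, 2, 3]
```

The same instruction sequence leads to different vertices depending on the start and on how each vertex orders its neighbours, which is why a universal sequence must work for every such labelling.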


References

  1. Rosenkrantz, Daniel J.; Stearns, Richard E.; Lewis, II, Philip M. (1977). "An Analysis of Several Heuristics for the Traveling Salesman Problem". SIAM Journal on Computing. 6 (3): 563–581. doi:10.1137/0206041. S2CID 14764079.
  2. Birx, Alexander; Disser, Yann; Hopp, Alexander V.; Karousatou, Christina (May 2021). "An improved lower bound for competitive graph exploration". Theoretical Computer Science. 868: 65–86. arXiv:2002.10958. doi:10.1016/j.tcs.2021.04.003. S2CID 211296296.
  3. Miyazaki, Shuichi; Morimoto, Naoyuki; Okabe, Yasuo (2009). "The Online Graph Exploration Problem on Restricted Graphs". IEICE Transactions on Information and Systems. E92-D (9): 1620–1627. Bibcode:2009IEITI..92.1620M. doi:10.1587/transinf.E92.D.1620. hdl:2433/226939. S2CID 8355092.
  4. Brandt, Sebastian; Foerster, Klaus-Tycho; Maurer, Jonathan; Wattenhofer, Roger (November 2020). "Online graph exploration on a restricted graph class: Optimal solutions for tadpole graphs". Theoretical Computer Science. 839: 176–185. arXiv:1903.00581. doi:10.1016/j.tcs.2020.06.007. S2CID 67856035.
  5. Foerster, Klaus-Tycho; Wattenhofer, Roger (December 2016). "Lower and upper competitive bounds for online directed graph exploration". Theoretical Computer Science. 655: 15–29. doi:10.1016/j.tcs.2015.11.017.
  6. Aleliunas, R.; Karp, R.; Lipton, R.; Lovász, L.; Rackoff, C. (1979). "Random walks, universal traversal sequences, and the complexity of maze problems". 20th Annual Symposium on Foundations of Computer Science (SFCS 1979): 218–223. doi:10.1109/SFCS.1979.34. S2CID 18719861.