Breadth-first search

Class: Search algorithm
Data structure: Graph
Worst-case performance: O(|V| + |E|)
Worst-case space complexity: O(|V|)
Optimal: yes (for unweighted graphs)

Order in which the nodes are expanded
Animated example of a breadth-first search. Black: explored, grey: queued to be explored later on
BFS used as a maze-solving algorithm
Top part of the tic-tac-toe game tree

Breadth-first search (BFS) is an algorithm for searching a tree data structure for a node that satisfies a given property. It starts at the tree root and explores all nodes at the present depth prior to moving on to the nodes at the next depth level. Extra memory, usually a queue, is needed to keep track of the child nodes that were encountered but not yet explored.

For example, in a chess endgame, a chess engine may build the game tree from the current position by applying all possible moves and use breadth-first search to find a winning position for White. Implicit trees (such as game trees or other problem-solving trees) may be of infinite size; breadth-first search is guaranteed to find a solution node [1] if one exists.

In contrast, (plain) depth-first search (DFS), which explores the node branch as far as possible before backtracking and expanding other nodes, [2] may get lost in an infinite branch and never make it to the solution node. Iterative deepening depth-first search avoids the latter drawback at the price of exploring the tree's top parts over and over again. On the other hand, both depth-first algorithms typically require far less extra memory than breadth-first search. [3]

Breadth-first search can be generalized to both undirected graphs and directed graphs with a given start node (sometimes referred to as a 'search key'). [4] In state space search in artificial intelligence, repeated searches of vertices are often allowed, while in theoretical analysis of algorithms based on breadth-first search, precautions are typically taken to prevent repetitions.

BFS and its application in finding connected components of graphs were invented in 1945 by Konrad Zuse, in his (rejected) Ph.D. thesis on the Plankalkül programming language, but this was not published until 1972. [5] It was reinvented in 1959 by Edward F. Moore, who used it to find the shortest path out of a maze, [6] [7] and later developed by C. Y. Lee into a wire routing algorithm (published in 1961). [8]

Pseudocode

Input: A graph G and a starting vertex root of G

Output: Goal state. The parent links trace the shortest path back to root [9]

procedure BFS(G, root) is
    let Q be a queue
    label root as explored
    Q.enqueue(root)
    while Q is not empty do
        v := Q.dequeue()
        if v is the goal then
            return v
        for all edges from v to w in G.adjacentEdges(v) do
            if w is not labeled as explored then
                label w as explored
                w.parent := v
                Q.enqueue(w)
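
A minimal executable sketch of the pseudocode above, written in Python (not from the source; the adjacency-dict graph representation and the names graph, start and goal are illustrative assumptions):

from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search over `graph`, an adjacency dict mapping each
    vertex to an iterable of its neighbors. Returns the goal vertex and
    the parent links, or None and the parent links if goal is unreachable."""
    explored = {start}           # label the root as explored
    parent = {start: None}       # parent links for path reconstruction
    queue = deque([start])       # FIFO queue holding the frontier
    while queue:
        v = queue.popleft()
        if v == goal:            # goal test on dequeue, as in the pseudocode
            return v, parent
        for w in graph[v]:
            if w not in explored:
                explored.add(w)
                parent[w] = v
                queue.append(w)
    return None, parent

# Example use on a small square graph:
graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C']}
found, parent = bfs(graph, 'A', 'D')   # found == 'D', parent['D'] == 'B'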

More details

This non-recursive implementation is similar to the non-recursive implementation of depth-first search, but differs from it in two ways:

  1. it uses a queue (First In First Out) instead of a stack (Last In First Out) and
  2. it checks whether a vertex has been explored before enqueueing the vertex rather than delaying this check until the vertex is dequeued from the queue.

If G is a tree, replacing the queue of this breadth-first search algorithm with a stack will yield a depth-first search algorithm. For general graphs, replacing the stack of the iterative depth-first search implementation with a queue would also produce a breadth-first search algorithm, although a somewhat nonstandard one. [10]
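
As an illustrative sketch of this point (an assumption, not code from the cited source): replacing the FIFO queue in the Python sketch above with a LIFO stack yields, on a tree, a depth-first traversal.

def dfs_via_stack(tree, root):
    """Depth-first traversal of a tree given as an adjacency dict, obtained
    from the BFS sketch above by using a Python list as a LIFO stack."""
    explored = {root}
    stack = [root]              # LIFO stack instead of a FIFO queue
    order = []
    while stack:
        v = stack.pop()         # pop() from the end instead of popleft()
        order.append(v)
        for w in tree[v]:
            if w not in explored:
                explored.add(w)
                stack.append(w)
    return order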

The queue Q contains the frontier along which the algorithm is currently searching.

Nodes can be labelled as explored by storing them in a set, or by an attribute on each node, depending on the implementation.

Note that the word node is usually interchangeable with the word vertex.

The parent attribute of each node is useful for accessing the nodes in a shortest path, for example by backtracking from the destination node up to the starting node, once BFS has been run and the parent links have been set.
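
As an illustration (a sketch that assumes the parent dictionary produced by the Python BFS sketch above): a shortest path can be read off by following the parent links from the destination back to the start and reversing the result.

def reconstruct_path(parent, goal):
    """Follow parent links from `goal` back to the start vertex (whose
    parent is None) and return the path in start-to-goal order."""
    if goal not in parent:
        return None             # goal was never reached by the search
    path = []
    v = goal
    while v is not None:
        path.append(v)
        v = parent[v]
    return list(reversed(path))

# Continuing the earlier example: reconstruct_path(parent, 'D') == ['A', 'B', 'D']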

Breadth-first search produces a so-called breadth-first tree. The following example shows what a breadth-first tree looks like.

Example

The following is an example of the breadth-first tree obtained by running a BFS on German cities starting from Frankfurt:

An example map of Southern Germany with some connections between cities
The breadth-first tree obtained when running BFS on the given map and starting in Frankfurt

Analysis

Time and space complexity

The time complexity can be expressed as O(|V| + |E|), since every vertex and every edge will be explored in the worst case. |V| is the number of vertices and |E| is the number of edges in the graph. Note that |E| may vary between O(1) and O(|V|^2), depending on how sparse the input graph is. [11]

When the number of vertices in the graph is known ahead of time, and additional data structures are used to determine which vertices have already been added to the queue, the space complexity can be expressed as O(|V|), where |V| is the number of vertices. This is in addition to the space required for the graph itself, which may vary depending on the graph representation used by an implementation of the algorithm.

When working with graphs that are too large to store explicitly (or infinite), it is more practical to describe the complexity of breadth-first search in different terms: to find the nodes that are at distance d from the start node (measured in number of edge traversals), BFS takes O(b^(d+1)) time and memory, where b is the "branching factor" of the graph (the average out-degree). [12]:81
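
As a rough worked count (not from the source, and assuming every node has exactly b successors): the number of nodes within distance d of the start is 1 + b + b^2 + ... + b^d = (b^(d+1) - 1)/(b - 1), which is bounded by O(b^(d+1)). For b = 10 and d = 5 this is already 111,111 nodes, which is why memory, rather than time, is typically the limiting resource for breadth-first search on large implicit graphs.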

Completeness

In the analysis of algorithms, the input to breadth-first search is assumed to be a finite graph, represented as an adjacency list, adjacency matrix, or similar representation. However, in the application of graph traversal methods in artificial intelligence the input may be an implicit representation of an infinite graph. In this context, a search method is described as being complete if it is guaranteed to find a goal state if one exists. Breadth-first search is complete, but depth-first search is not. When applied to infinite graphs represented implicitly, breadth-first search will eventually find the goal state, but depth first search may get lost in parts of the graph that have no goal state and never return. [13]

BFS ordering

An enumeration of the vertices of a graph is said to be a BFS ordering if it is a possible output of the application of BFS to this graph.

Let G = (V, E) be a graph with n vertices. Recall that N(v) is the set of neighbors of v. Let σ = (v_1, ..., v_m) be a list of distinct elements of V. For v ∈ V \ {v_1, ..., v_m}, let ν_σ(v) be the least i such that v_i is a neighbor of v, if such an i exists, and be ∞ otherwise.

Let σ = (v_1, ..., v_n) be an enumeration of the vertices of V. The enumeration σ is said to be a BFS ordering (with source v_1) if, for all 1 < i ≤ n, v_i is the vertex w ∈ V \ {v_1, ..., v_{i−1}} such that ν_{(v_1, ..., v_{i−1})}(w) is minimal. Equivalently, σ is a BFS ordering if, for all 1 ≤ i < j < k ≤ n with v_i ∈ N(v_k) \ N(v_j), there exists a neighbor v_m of v_j such that m < i.
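
A small executable sketch of this definition (not from the source; the adjacency-dict representation is an assumption): the function below checks whether a given enumeration of the vertices is a BFS ordering by verifying, at each step, that the next vertex has minimal ν among the not-yet-placed vertices.

from math import inf

def is_bfs_ordering(adj, order):
    """Check whether `order` (a list of all vertices, source first) is a
    BFS ordering of the undirected graph given as an adjacency dict `adj`."""
    if set(order) != set(adj) or len(order) != len(adj):
        return False
    for i in range(1, len(order)):
        pos = {v: k for k, v in enumerate(order[:i])}   # already-placed vertices
        # nu(w): position of the earliest already-placed neighbor of w,
        # or infinity if w has no placed neighbor yet.
        def nu(w):
            return min((pos[u] for u in adj[w] if u in pos), default=inf)
        remaining = [v for v in adj if v not in pos]
        if nu(order[i]) > min(nu(v) for v in remaining):
            return False
    return True

# Example: on the path a-b-c, (a, b, c) is a BFS ordering but (a, c, b) is not.
path = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b']}
assert is_bfs_ordering(path, ['a', 'b', 'c'])
assert not is_bfs_ordering(path, ['a', 'c', 'b'])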

Applications

Breadth-first search can be used to solve many problems in graph theory, for example:

  1. finding the shortest path (in number of edges) between two vertices of an unweighted graph;
  2. finding all the connected components of a graph;
  3. testing whether a graph is bipartite (see the sketch below);
  4. computing the level-order traversal of a tree;
  5. copying garbage collection (Cheney's algorithm);
  6. finding augmenting paths in the Edmonds–Karp algorithm for computing maximum flow.
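
As one concrete illustration (a hedged sketch, not from the source, using the same adjacency-dict representation as the earlier examples): BFS can test bipartiteness by 2-coloring the graph level by level and reporting failure when an edge joins two vertices of the same color.

from collections import deque

def is_bipartite(graph):
    """Return True if the undirected graph (adjacency dict) is bipartite,
    by 2-coloring each connected component with a breadth-first search."""
    color = {}
    for source in graph:
        if source in color:
            continue
        color[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in color:
                    color[w] = 1 - color[v]   # neighbors get the opposite color
                    queue.append(w)
                elif color[w] == color[v]:    # same color on both ends of an edge
                    return False
    return True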

Related Research Articles

<span class="mw-page-title-main">Dijkstra's algorithm</span> Graph search algorithm

Dijkstra's algorithm is an algorithm for finding the shortest paths between nodes in a weighted graph, which may represent, for example, road networks. It was conceived by computer scientist Edsger W. Dijkstra in 1956 and published three years later.

The Ford–Fulkerson method or Ford–Fulkerson algorithm (FFA) is a greedy algorithm that computes the maximum flow in a flow network. It is sometimes called a "method" instead of an "algorithm" as the approach to finding augmenting paths in a residual graph is not fully specified or it is specified in several implementations with different running times. It was published in 1956 by L. R. Ford Jr. and D. R. Fulkerson. The name "Ford–Fulkerson" is often also used for the Edmonds–Karp algorithm, which is a fully defined implementation of the Ford–Fulkerson method.

<span class="mw-page-title-main">Depth-first search</span> Search algorithm

Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. The algorithm starts at the root node and explores as far as possible along each branch before backtracking. Extra memory, usually a stack, is needed to keep track of the nodes discovered so far along a specified branch which helps in backtracking of the graph.

<span class="mw-page-title-main">Bellman–Ford algorithm</span> Algorithm for finding the shortest paths in graphs

The Bellman–Ford algorithm is an algorithm that computes shortest paths from a single source vertex to all of the other vertices in a weighted digraph. It is slower than Dijkstra's algorithm for the same problem, but more versatile, as it is capable of handling graphs in which some of the edge weights are negative numbers. The algorithm was first proposed by Alfonso Shimbel (1955), but is instead named after Richard Bellman and Lester Ford Jr., who published it in 1958 and 1956, respectively. Edward F. Moore also published a variation of the algorithm in 1959, and for this reason it is also sometimes called the Bellman–Ford–Moore algorithm.

In computer science, the Edmonds–Karp algorithm is an implementation of the Ford–Fulkerson method for computing the maximum flow in a flow network in O(|V||E|^2) time. The algorithm was first published by Yefim Dinitz in 1970, and independently published by Jack Edmonds and Richard Karp in 1972. Dinitz's algorithm includes additional techniques that reduce the running time to O(|V|^2|E|).

<span class="mw-page-title-main">Spanning tree</span> Tree which includes all vertices of a graph

In the mathematical field of graph theory, a spanning tree T of an undirected graph G is a subgraph that is a tree which includes all of the vertices of G. In general, a graph may have several spanning trees, but a graph that is not connected will not contain a spanning tree. If all of the edges of G are also edges of a spanning tree T of G, then G is a tree and is identical to T.

<span class="mw-page-title-main">Graph (abstract data type)</span> Abstract data type in computer science

In computer science, a graph is an abstract data type that is meant to implement the undirected graph and directed graph concepts from the field of graph theory within mathematics.

<span class="mw-page-title-main">Strongly connected component</span> Partition of a graph whose components are reachable from all vertices

In the mathematical theory of directed graphs, a graph is said to be strongly connected if every vertex is reachable from every other vertex. The strongly connected components of an arbitrary directed graph form a partition into subgraphs that are themselves strongly connected. It is possible to test the strong connectivity of a graph, or to find its strongly connected components, in linear time (that is, Θ(V + E)).

In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge (u,v) from vertex u to vertex v, u comes before v in the ordering. For instance, the vertices of the graph may represent tasks to be performed, and the edges may represent constraints that one task must be performed before another; in this application, a topological ordering is just a valid sequence for the tasks. Precisely, a topological sort is a graph traversal in which each node v is visited only after all its dependencies are visited. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG). Any DAG has at least one topological ordering, and algorithms are known for constructing a topological ordering of any DAG in linear time. Topological sorting has many applications, especially in ranking problems such as feedback arc set. Topological sorting is possible even when the DAG has disconnected components.

Johnson's algorithm is a way to find the shortest paths between all pairs of vertices in an edge-weighted directed graph. It allows some of the edge weights to be negative numbers, but no negative-weight cycles may exist. It works by using the Bellman–Ford algorithm to compute a transformation of the input graph that removes all negative weights, allowing Dijkstra's algorithm to be used on the transformed graph. It is named after Donald B. Johnson, who first published the technique in 1977.

<span class="mw-page-title-main">Centrality</span> Degree of connectedness within a graph

In graph theory and network analysis, indicators of centrality assign numbers or rankings to nodes within a graph corresponding to their network position. Applications include identifying the most influential person(s) in a social network, key infrastructure nodes in the Internet or urban networks, super-spreaders of disease, and brain networks. Centrality concepts were first developed in social network analysis, and many of the terms used to measure centrality reflect their sociological origin.

In graph theory, reachability refers to the ability to get from one vertex to another within a graph. A vertex s can reach a vertex t if there exists a sequence of adjacent vertices (that is, a walk) which starts with s and ends with t.

In computer science, Kosaraju-Sharir's algorithm is a linear time algorithm to find the strongly connected components of a directed graph. Aho, Hopcroft and Ullman credit it to S. Rao Kosaraju and Micha Sharir. Kosaraju suggested it in 1978 but did not publish it, while Sharir independently discovered it and published it in 1981. It makes use of the fact that the transpose graph has exactly the same strongly connected components as the original graph.

In computer science, the Hopcroft–Karp algorithm is an algorithm that takes a bipartite graph as input and produces a maximum-cardinality matching as output: a set of as many edges as possible with the property that no two edges share an endpoint. It runs in O(|E|√|V|) time in the worst case, where E is the set of edges in the graph, V is the set of vertices of the graph, and it is assumed that |E| = Ω(|V|). In the case of dense graphs the time bound becomes O(|V|^2.5), and for sparse random graphs it runs in O(|E| log |V|) time with high probability.

In computer science, graph traversal refers to the process of visiting each vertex in a graph. Such traversals are classified by the order in which the vertices are visited. Tree traversal is a special case of graph traversal.

In graph theory, the planar separator theorem is a form of isoperimetric inequality for planar graphs, that states that any planar graph can be split into smaller pieces by removing a small number of vertices. Specifically, the removal of O(√n) vertices from an n-vertex graph can partition the graph into disjoint subgraphs each of which has at most 2n/3 vertices.

In computer science, lexicographic breadth-first search or Lex-BFS is a linear time algorithm for ordering the vertices of a graph. The algorithm is different from a breadth-first search, but it produces an ordering that is consistent with breadth-first search.

<span class="mw-page-title-main">Betweenness centrality</span> Measure of a graphs centrality, based on shortest paths

In graph theory, betweenness centrality is a measure of centrality in a graph based on shortest paths. For every pair of vertices in a connected graph, there exists at least one shortest path between the vertices such that either the number of edges that the path passes through or the sum of the weights of the edges is minimized. The betweenness centrality for each vertex is the number of these shortest paths that pass through the vertex.

The Shortest Path Faster Algorithm (SPFA) is an improvement of the Bellman–Ford algorithm that computes single-source shortest paths in a weighted directed graph. The algorithm is believed to work well on sparse random graphs and is particularly suitable for graphs that contain negative-weight edges. However, the worst-case complexity of SPFA is the same as that of Bellman–Ford, so for graphs with nonnegative edge weights Dijkstra's algorithm is preferred. SPFA was first published by Edward F. Moore in 1959, as a generalization of breadth first search; SPFA is Moore's “Algorithm D.” The name, “Shortest Path Faster Algorithm (SPFA),” was given by Fanding Duan, a Chinese researcher who rediscovered the algorithm in 1994.

In computing and graph theory, a dynamic connectivity structure is a data structure that dynamically maintains information about the connected components of a graph.

References

  1. that is, a node satisfying the specified property
  2. Cormen, Thomas H.; et al. (2009). "22.3". Introduction to Algorithms. MIT Press.
  3. Korf, Richard E. (1985). "Depth-First Iterative Deepening: An Optimal Admissible Tree Search". Artificial Intelligence (27): 99–100. doi:10.7916/D8HQ46X1.
  4. "Graph500 benchmark specification (supercomputer performance evaluation)". Graph500.org, 2010. Archived from the original on 2015-03-26. Retrieved 2015-03-15.
  5. Zuse, Konrad (1972), Der Plankalkül (in German), Konrad Zuse Internet Archive. See pp. 96–105 of the linked pdf file (internal numbering 2.47–2.56).
  6. Moore, Edward F. (1959). "The shortest path through a maze". Proceedings of the International Symposium on the Theory of Switching. Harvard University Press. pp. 285–292. As cited by Cormen, Leiserson, Rivest, and Stein.
  7. Skiena, Steven (2008). "Sorting and Searching". The Algorithm Design Manual. Springer. p. 480. Bibcode:2008adm..book.....S. doi:10.1007/978-1-84800-070-4_4. ISBN   978-1-84800-069-8.
  8. Lee, C. Y. (1961). "An Algorithm for Path Connections and Its Applications". IRE Transactions on Electronic Computers (3): 346–365. doi:10.1109/TEC.1961.5219222. S2CID   40700386.
  9. Cormen, Thomas H. "22.2 Breadth-first search". Introduction to algorithms. ISBN   978-81-203-4007-7. OCLC   1006880283.
  10. "Stack-based graph traversal ≠ depth first search". 11011110.github.io. Retrieved 2020-06-10.
  11. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001) [1990]. "22.2 Breadth-first search". Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill. pp. 531–539. ISBN   0-262-03293-7.
  12. Russell, Stuart; Norvig, Peter (2003) [1995]. Artificial Intelligence: A Modern Approach (2nd ed.). Prentice Hall. ISBN   978-0137903955.
  13. Coppin, B. (2004). Artificial intelligence illuminated. Jones & Bartlett Learning. pp. 79–80.
  14. Aziz, Adnan; Prakash, Amit (2010). "4. Algorithms on Graphs". Algorithms for Interviews. p. 144. ISBN   978-1453792995.
  15. Dhulipala, Laxman; Blelloch, Guy E.; Shun, Julian (August 21, 2019). Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable. p. 17. arXiv: 1805.05208 . doi:10.1145/3210377.3210414. ISBN   9781450357999. S2CID   44126609.