Pseudoforest

Last updated
A 1-forest (a maximal pseudoforest), formed by three 1-trees Pseudoforest.svg
A 1-forest (a maximal pseudoforest), formed by three 1-trees

In graph theory, a pseudoforest is an undirected graph [1] in which every connected component has at most one cycle. That is, it is a system of vertices and edges connecting pairs of vertices, such that no two cycles of consecutive edges share any vertex with each other, nor can any two cycles be connected to each other by a path of consecutive edges. A pseudotree is a connected pseudoforest.

Contents

The names are justified by analogy to the more commonly studied trees and forests. (A tree is a connected graph with no cycles; a forest is a disjoint union of trees.) Gabow and Tarjan [2] attribute the study of pseudoforests to Dantzig's 1963 book on linear programming, in which pseudoforests arise in the solution of certain network flow problems. [3] Pseudoforests also form graph-theoretic models of functions and occur in several algorithmic problems. Pseudoforests are sparse graphs – their number of edges is linearly bounded in terms of their number of vertices (in fact, they have at most as many edges as they have vertices) – and their matroid structure allows several other families of sparse graphs to be decomposed as unions of forests and pseudoforests. The name "pseudoforest" comes from Picard & Queyranne (1982).

Definitions and structure

We define an undirected graph to be a set of vertices and edges such that each edge has two vertices (which may coincide) as endpoints. That is, we allow multiple edges (edges with the same pair of endpoints) and loops (edges whose two endpoints are the same vertex). [1] A subgraph of a graph is the graph formed by any subsets of its vertices and edges such that each edge in the edge subset has both endpoints in the vertex subset. A connected component of an undirected graph is the subgraph consisting of the vertices and edges that can be reached by following edges from a single given starting vertex. A graph is connected if every vertex or edge is reachable from every other vertex or edge. A cycle in an undirected graph is a connected subgraph in which each vertex is incident to exactly two edges, or is a loop. [4]

The 21 unicyclic graphs with at most six vertices The21.GIF
The 21 unicyclic graphs with at most six vertices

A pseudoforest is an undirected graph in which each connected component contains at most one cycle. [5] Equivalently, it is an undirected graph in which each connected component has no more edges than vertices. [6] The components that have no cycles are just trees, while the components that have a single cycle within them are called 1-trees or unicyclic graphs. That is, a 1-tree is a connected graph containing exactly one cycle. A pseudoforest with a single connected component (usually called a pseudotree, although some authors define a pseudotree to be a 1-tree) is either a tree or a 1-tree; in general a pseudoforest may have multiple connected components as long as all of them are trees or 1-trees.

If one removes from a 1-tree one of the edges in its cycle, the result is a tree. Reversing this process, if one augments a tree by connecting any two of its vertices by a new edge, the result is a 1-tree; the path in the tree connecting the two endpoints of the added edge, together with the added edge itself, form the 1-tree's unique cycle. If one augments a 1-tree by adding an edge that connects one of its vertices to a newly added vertex, the result is again a 1-tree, with one more vertex; an alternative method for constructing 1-trees is to start with a single cycle and then repeat this augmentation operation any number of times. The edges of any 1-tree can be partitioned in a unique way into two subgraphs, one of which is a cycle and the other of which is a forest, such that each tree of the forest contains exactly one vertex of the cycle. [7]

Certain more specific types of pseudoforests have also been studied.

A 1-forest, sometimes called a maximal pseudoforest, is a pseudoforest to which no more edges can be added without causing some component of the graph to contain multiple cycles. If a pseudoforest contains a tree as one of its components, it cannot be a 1-forest, for one can add either an edge connecting two vertices within that tree, forming a single cycle, or an edge connecting that tree to some other component. Thus, the 1-forests are exactly the pseudoforests in which every component is a 1-tree.
The spanning pseudoforests of an undirected graph G are the pseudoforest subgraphs of G that have all the vertices of G. Such a pseudoforest need not have any edges, since for example the subgraph that has all the vertices of G and no edges is a pseudoforest (whose components are trees consisting of a single vertex).
The maximal pseudoforests ofG are the pseudoforest subgraphs of G that are not contained within any larger pseudoforest of G. A maximal pseudoforest of G is always a spanning pseudoforest, but not conversely. If G has no connected components that are trees, then its maximal pseudoforests are 1-forests, but if G does have a tree component, its maximal pseudoforests are not 1-forests. Stated precisely, in any graph G its maximal pseudoforests consist of every tree component of G, together with one or more disjoint 1-trees covering the remaining vertices of G.

Directed pseudoforests

Versions of these definitions are also used for directed graphs. Like an undirected graph, a directed graph consists of vertices and edges, but each edge is directed from one of its endpoints to the other endpoint. A directed pseudoforest is a directed graph in which each vertex has at most one outgoing edge; that is, it has outdegree at most one. A directed 1-forest most commonly called a functional graph (see below), sometimes maximal directed pseudoforest is a directed graph in which each vertex has outdegree exactly one. [8] If D is a directed pseudoforest, the undirected graph formed by removing the direction from each edge of D is an undirected pseudoforest.

Number of edges

Every pseudoforest on a set of n vertices has at most n edges, and every maximal pseudoforest on a set of n vertices has exactly n edges. Conversely, if a graph G has the property that, for every subset S of its vertices, the number of edges in the induced subgraph of S is at most the number of vertices in S, then G is a pseudoforest. 1-trees can be defined as connected graphs with equally many vertices and edges. [2]

Moving from individual graphs to graph families, if a family of graphs has the property that every subgraph of a graph in the family is also in the family, and every graph in the family has at most as many edges as vertices, then the family contains only pseudoforests. For instance, every subgraph of a thrackle (a graph drawn so that every pair of edges has one point of intersection) is also a thrackle, so Conway's conjecture that every thrackle has at most as many edges as vertices can be restated as saying that every thrackle is a pseudoforest. A more precise characterization is that, if the conjecture is true, then the thrackles are exactly the pseudoforests with no four-vertex cycle and at most one odd cycle. [9]

Streinu and Theran [10] generalize the sparsity conditions defining pseudoforests: they define a graph as being (k,l)-sparse if every nonempty subgraph with n vertices has at most kn  l edges, and (k,l)-tight if it is (k,l)-sparse and has exactly kn  l edges. Thus, the pseudoforests are the (1,0)-sparse graphs, and the maximal pseudoforests are the (1,0)-tight graphs. Several other important families of graphs may be defined from other values of k and l, and when l  k the (k,l)-sparse graphs may be characterized as the graphs formed as the edge-disjoint union of l forests and k  l pseudoforests. [11]

Almost every sufficiently sparse random graph is pseudoforest. [12] That is, if c is a constant with 0 <c< 1/2, and Pc(n) is the probability that choosing uniformly at random among the n-vertex graphs with cn edges results in a pseudoforest, then Pc(n) tends to one in the limit for large n. However, for c> 1/2, almost every random graph with cn edges has a large component that is not unicyclic.

Enumeration

A graph is simple if it has no self-loops and no multiple edges with the same endpoints. The number of simple 1-trees with n labelled vertices is [13]

The values for n up to 300 can be found in sequence OEIS:  A057500 of the On-Line Encyclopedia of Integer Sequences.

The number of maximal directed pseudoforests on n vertices, allowing self-loops, is nn, because for each vertex there are n possible endpoints for the outgoing edge. André Joyal used this fact to provide a bijective proof of Cayley's formula, that the number of undirected trees on n nodes is nn  2, by finding a bijection between maximal directed pseudoforests and undirected trees with two distinguished nodes. [14] If self-loops are not allowed, the number of maximal directed pseudoforests is instead (n  1)n.

Graphs of functions

A function from the set {0,1,2,3,4,5,6,7,8} to itself, and the corresponding functional graph Functional graph.svg
A function from the set {0,1,2,3,4,5,6,7,8} to itself, and the corresponding functional graph

Directed pseudoforests and endofunctions are in some sense mathematically equivalent. Any function ƒ from a set X to itself (that is, an endomorphism of X) can be interpreted as defining a directed pseudoforest which has an edge from x to y whenever ƒ(x) = y. The resulting directed pseudoforest is maximal, and may include self-loops whenever some value x has ƒ(x) = x. Alternatively, omitting the self-loops produces a non-maximal pseudoforest. In the other direction, any maximal directed pseudoforest determines a function ƒ such that ƒ(x) is the target of the edge that goes out from x, and any non-maximal directed pseudoforest can be made maximal by adding self-loops and then converted into a function in the same way. For this reason, maximal directed pseudoforests are sometimes called functional graphs. [2] Viewing a function as a functional graph provides a convenient language for describing properties that are not as easily described from the function-theoretic point of view; this technique is especially applicable to problems involving iterated functions, which correspond to paths in functional graphs.

Cycle detection, the problem of following a path in a functional graph to find a cycle in it, has applications in cryptography and computational number theory, as part of Pollard's rho algorithm for integer factorization and as a method for finding collisions in cryptographic hash functions. In these applications, ƒ is expected to behave randomly; Flajolet and Odlyzko [15] study the graph-theoretic properties of the functional graphs arising from randomly chosen mappings. In particular, a form of the birthday paradox implies that, in a random functional graph with n vertices, the path starting from a randomly selected vertex will typically loop back on itself to form a cycle within O(n) steps. Konyagin et al. have made analytical and computational progress on graph statistics. [16]

Martin, Odlyzko, and Wolfram [17] investigate pseudoforests that model the dynamics of cellular automata. These functional graphs, which they call state transition diagrams, have one vertex for each possible configuration that the ensemble of cells of the automaton can be in, and an edge connecting each configuration to the configuration that follows it according to the automaton's rule. One can infer properties of the automaton from the structure of these diagrams, such as the number of components, length of limiting cycles, depth of the trees connecting non-limiting states to these cycles, or symmetries of the diagram. For instance, any vertex with no incoming edge corresponds to a Garden of Eden pattern and a vertex with a self-loop corresponds to a still life pattern.

Another early application of functional graphs is in the trains used to study Steiner triple systems. [18] The train of a triple system is a functional graph having a vertex for each possible triple of symbols; each triple pqr is mapped by ƒ to stu, where pqs, prt, and qru are the triples that belong to the triple system and contain the pairs pq, pr, and qr respectively. Trains have been shown to be a powerful invariant of triple systems although somewhat cumbersome to compute.

Bicircular matroid

A matroid is a mathematical structure in which certain sets of elements are defined to be independent, in such a way that the independent sets satisfy properties modeled after the properties of linear independence in a vector space. One of the standard examples of a matroid is the graphic matroid in which the independent sets are the sets of edges in forests of a graph; the matroid structure of forests is important in algorithms for computing the minimum spanning tree of the graph. Analogously, we may define matroids from pseudoforests.

For any graph G = (V,E), we may define a matroid on the edges of G, in which a set of edges is independent if and only if it forms a pseudoforest; this matroid is known as the bicircular matroid (or bicycle matroid) of G. [19] [20] The smallest dependent sets for this matroid are the minimal connected subgraphs of G that have more than one cycle, and these subgraphs are sometimes called bicycles. There are three possible types of bicycle: a theta graph has two vertices that are connected by three internally disjoint paths, a figure 8 graph consists of two cycles sharing a single vertex, and a handcuff graph is formed by two disjoint cycles connected by a path. [21] A graph is a pseudoforest if and only if it does not contain a bicycle as a subgraph. [10]

Forbidden minors

The butterfly graph (left) and diamond graph (right), forbidden minors for pseudoforests Butterfly and diamond graphs.svg
The butterfly graph (left) and diamond graph (right), forbidden minors for pseudoforests

Forming a minor of a pseudoforest by contracting some of its edges and deleting others produces another pseudoforest. Therefore, the family of pseudoforests is closed under minors, and the Robertson–Seymour theorem implies that pseudoforests can be characterized in terms of a finite set of forbidden minors, analogously to Wagner's theorem characterizing the planar graphs as the graphs having neither the complete graph K5 nor the complete bipartite graph K3,3 as minors. As discussed above, any non-pseudoforest graph contains as a subgraph a handcuff, figure 8, or theta graph; any handcuff or figure 8 graph may be contracted to form a butterfly graph (five-vertex figure 8), and any theta graph may be contracted to form a diamond graph (four-vertex theta graph), [22] so any non-pseudoforest contains either a butterfly or a diamond as a minor, and these are the only minor-minimal non-pseudoforest graphs. Thus, a graph is a pseudoforest if and only if it does not have the butterfly or the diamond as a minor. If one forbids only the diamond but not the butterfly, the resulting larger graph family consists of the cactus graphs and disjoint unions of multiple cactus graphs. [23]

More simply, if multigraphs with self-loops are considered, there is only one forbidden minor, a vertex with two loops.

Algorithms

An early algorithmic use of pseudoforests involves the network simplex algorithm and its application to generalized flow problems modeling the conversion between commodities of different types. [3] [24] In these problems, one is given as input a flow network in which the vertices model each commodity and the edges model allowable conversions between one commodity and another. Each edge is marked with a capacity (how much of a commodity can be converted per unit time), a flow multiplier (the conversion rate between commodities), and a cost (how much loss or, if negative, profit is incurred per unit of conversion). The task is to determine how much of each commodity to convert via each edge of the flow network, in order to minimize cost or maximize profit, while obeying the capacity constraints and not allowing commodities of any type to accumulate unused. This type of problem can be formulated as a linear program, and solved using the simplex algorithm. The intermediate solutions arising from this algorithm, as well as the eventual optimal solution, have a special structure: each edge in the input network is either unused or used to its full capacity, except for a subset of the edges, forming a spanning pseudoforest of the input network, for which the flow amounts may lie between zero and the full capacity. In this application, unicyclic graphs are also sometimes called augmented trees and maximal pseudoforests are also sometimes called augmented forests. [24]

The minimum spanning pseudoforest problem involves finding a spanning pseudoforest of minimum weight in a larger edge-weighted graph G. Due to the matroid structure of pseudoforests, minimum-weight maximal pseudoforests may be found by greedy algorithms similar to those for the minimum spanning tree problem. However, Gabow and Tarjan found a more efficient linear-time approach in this case. [2]

The pseudoarboricity of a graph G is defined by analogy to the arboricity as the minimum number of pseudoforests into which its edges can be partitioned; equivalently, it is the minimum k such that G is (k,0)-sparse, or the minimum k such that the edges of G can be oriented to form a directed graph with outdegree at most k. Due to the matroid structure of pseudoforests, the pseudoarboricity may be computed in polynomial time. [25]

A random bipartite graph with n vertices on each side of its bipartition, and with cn edges chosen independently at random from each of the n2 possible pairs of vertices, is a pseudoforest with high probability whenever c is a constant strictly less than one. This fact plays a key role in the analysis of cuckoo hashing, a data structure for looking up key-value pairs by looking in one of two hash tables at locations determined from the key: one can form a graph, the "cuckoo graph", whose vertices correspond to hash table locations and whose edges link the two locations at which one of the keys might be found, and the cuckoo hashing algorithm succeeds in finding locations for all of its keys if and only if the cuckoo graph is a pseudoforest. [26]

Pseudoforests also play a key role in parallel algorithms for graph coloring and related problems. [27]

Notes

  1. 1 2 The kind of undirected graph considered here is often called a multigraph or pseudograph, to distinguish it from a simple graph.
  2. 1 2 3 4 Gabow & Tarjan (1988).
  3. 1 2 Dantzig (1963).
  4. See the linked articles and the references therein for these definitions.
  5. This is the definition used, e.g., by Gabow & Westermann (1992).
  6. This is the definition in Gabow & Tarjan (1988).
  7. See, e.g., the proof of Lemma 4 in Àlvarez, Blesa & Serna (2002).
  8. Kruskal, Rudolph & Snir (1990) instead use the opposite definition, in which each vertex has indegree one; the resulting graphs, which they call unicycular, are the transposes of the graphs considered here.
  9. Woodall (1969); Lovász, Pach & Szegedy (1997).
  10. 1 2 Streinu & Theran (2009).
  11. Whiteley (1988).
  12. Bollobás (1985). See especially Corollary 24, p.120, for a bound on the number of vertices belonging to unicyclic components in a random graph, and Corollary 19, p.113, for a bound on the number of distinct labeled unicyclic graphs.
  13. Riddell (1951); see OEIS:  A057500 in the On-Line Encyclopedia of Integer Sequences.
  14. Aigner & Ziegler (1998).
  15. Flajolet & Odlyzko (1990).
  16. Konyagin et al. (2010).
  17. Martin, Odlyzko & Wolfram (1984).
  18. White (1913); Colbourn, Colbourn & Rosenbaum (1982); Stinson (1983).
  19. Simoes-Pereira (1972).
  20. Matthews (1977).
  21. Glossary of Signed and Gain Graphs and Allied Areas
  22. For this terminology, see the list of small graphs from the Information System on Graph Class Inclusions. However, butterfly graph may also refer to a different family of graphs related to hypercubes, and the five-vertex figure 8 is sometimes instead called a bowtie graph.
  23. El-Mallah & Colbourn (1988).
  24. 1 2 Ahuja, Magnanti & Orlin (1993).
  25. Gabow & Westermann (1992). See also the faster approximation schemes of Kowalik (2006).
  26. Kutzelnigg (2006).
  27. Goldberg, Plotkin & Shannon (1988); Kruskal, Rudolph & Snir (1990).

Related Research Articles

<span class="mw-page-title-main">Tree (graph theory)</span> Undirected, connected and acyclic graph

In graph theory, a tree is an undirected graph in which any two vertices are connected by exactly one path, or equivalently a connected acyclic undirected graph. A forest is an undirected graph in which any two vertices are connected by at most one path, or equivalently an acyclic undirected graph, or equivalently a disjoint union of trees.

<span class="mw-page-title-main">Component (graph theory)</span> Maximal subgraph whose vertices can reach each other

In graph theory, a component of an undirected graph is a connected subgraph that is not part of any larger connected subgraph. The components of any graph partition its vertices into disjoint sets, and are the induced subgraphs of those sets. A graph that is itself connected has exactly one component, consisting of the whole graph. Components are sometimes called connected components.

This is a glossary of graph theory. Graph theory is the study of graphs, systems of nodes or vertices connected in pairs by lines or edges.

<span class="mw-page-title-main">Graph (discrete mathematics)</span> Vertices connected in pairs by edges

In discrete mathematics, and more specifically in graph theory, a graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense "related". The objects correspond to mathematical abstractions called vertices and each of the related pairs of vertices is called an edge. Typically, a graph is depicted in diagrammatic form as a set of dots or circles for the vertices, joined by lines or curves for the edges. Graphs are one of the objects of study in discrete mathematics.

<span class="mw-page-title-main">Spanning tree</span> Tree which includes all vertices of a graph

In the mathematical field of graph theory, a spanning treeT of an undirected graph G is a subgraph that is a tree which includes all of the vertices of G. In general, a graph may have several spanning trees, but a graph that is not connected will not contain a spanning tree. If all of the edges of G are also edges of a spanning tree T of G, then G is a tree and is identical to T.

<span class="mw-page-title-main">Strongly connected component</span> Partition of a graph whose components are reachable from all vertices

In the mathematical theory of directed graphs, a graph is said to be strongly connected if every vertex is reachable from every other vertex. The strongly connected components of an arbitrary directed graph form a partition into subgraphs that are themselves strongly connected. It is possible to test the strong connectivity of a graph, or to find its strongly connected components, in linear time (that is, Θ(V + E)).

In the mathematical theory of matroids, a graphic matroid is a matroid whose independent sets are the forests in a given finite undirected graph. The dual matroids of graphic matroids are called co-graphic matroids or bond matroids. A matroid that is both graphic and co-graphic is sometimes called a planar matroid ; these are exactly the graphic matroids formed from planar graphs.

<span class="mw-page-title-main">Circuit rank</span> Fewest graph edges whose removal breaks all cycles

In graph theory, a branch of mathematics, the circuit rank, cyclomatic number, cycle rank, or nullity of an undirected graph is the minimum number of edges that must be removed from the graph to break all its cycles, making it into a tree or forest. It is equal to the number of independent cycles in the graph. Unlike the corresponding feedback arc set problem for directed graphs, the circuit rank r is easily computed using the formula

<span class="mw-page-title-main">Connectivity (graph theory)</span> Basic concept of graph theory

In mathematics and computer science, connectivity is one of the basic concepts of graph theory: it asks for the minimum number of elements that need to be removed to separate the remaining nodes into two or more isolated subgraphs. It is closely related to the theory of network flow problems. The connectivity of a graph is an important measure of its resilience as a network.

The arboricity of an undirected graph is the minimum number of forests into which its edges can be partitioned. Equivalently it is the minimum number of spanning forests needed to cover all the edges of the graph. The Nash-Williams theorem provides necessary and sufficient conditions for when a graph is k-arboric.

<span class="mw-page-title-main">Dual graph</span> Graph representing faces of another graph

In the mathematical discipline of graph theory, the dual graph of a plane graph G is a graph that has a vertex for each face of G. The dual graph has an edge for each pair of faces in G that are separated from each other by an edge, and a self-loop when the same face appears on both sides of an edge. Thus, each edge e of G has a corresponding dual edge, whose endpoints are the dual vertices corresponding to the faces on either side of e. The definition of the dual depends on the choice of embedding of the graph G, so it is a property of plane graphs rather than planar graphs. For planar graphs generally, there may be multiple dual graphs, depending on the choice of planar embedding of the graph.

In graph theory, a connected graph is k-edge-connected if it remains connected whenever fewer than k edges are removed.

<span class="mw-page-title-main">Factor-critical graph</span> Graph of n vertices with a perfect matching for every subgraph of n-1 vertices

In graph theory, a mathematical discipline, a factor-critical graph is a graph with n vertices in which every subgraph of n − 1 vertices has a perfect matching.

<span class="mw-page-title-main">Degeneracy (graph theory)</span> Measurement of graph sparsity

In graph theory, a k-degenerate graph is an undirected graph in which every subgraph has a vertex of degree at most k: that is, some vertex in the subgraph touches k or fewer of the subgraph's edges. The degeneracy of a graph is the smallest value of k for which it is k-degenerate. The degeneracy of a graph is a measure of how sparse it is, and is within a constant factor of other sparsity measures such as the arboricity of a graph.

<span class="mw-page-title-main">Block graph</span> Graph whose biconnected components are all cliques

In graph theory, a branch of combinatorial mathematics, a block graph or clique tree is a type of undirected graph in which every biconnected component (block) is a clique.

<span class="mw-page-title-main">Ear decomposition</span> Partition of graph into sequence of paths

In graph theory, an ear of an undirected graph G is a path P where the two endpoints of the path may coincide, but where otherwise no repetition of edges or vertices is allowed, so every internal vertex of P has degree two in G. An ear decomposition of an undirected graph G is a partition of its set of edges into a sequence of ears, such that the one or two endpoints of each ear belong to earlier ears in the sequence and such that the internal vertices of each ear do not belong to any earlier ear. Additionally, in most cases the first ear in the sequence must be a cycle. An open ear decomposition or a proper ear decomposition is an ear decomposition in which the two endpoints of each ear after the first are distinct from each other.

The expected linear time MST algorithm is a randomized algorithm for computing the minimum spanning forest of a weighted graph with no isolated vertices. It was developed by David Karger, Philip Klein, and Robert Tarjan. The algorithm relies on techniques from Borůvka's algorithm along with an algorithm for verifying a minimum spanning tree in linear time. It combines the design paradigms of divide and conquer algorithms, greedy algorithms, and randomized algorithms to achieve expected linear performance.

In the mathematics of structural rigidity, a rigidity matroid is a matroid that describes the number of degrees of freedom of an undirected graph with rigid edges of fixed lengths, embedded into Euclidean space. In a rigidity matroid for a graph with n vertices in d-dimensional space, a set of edges that defines a subgraph with k degrees of freedom has matroid rank dn − k. A set of edges is independent if and only if, for every edge in the set, removing the edge would increase the number of degrees of freedom of the remaining subgraph.

In mathematics, a minimum bottleneck spanning tree (MBST) in an undirected graph is a spanning tree in which the most expensive edge is as cheap as possible. A bottleneck edge is the highest weighted edge in a spanning tree. A spanning tree is a minimum bottleneck spanning tree if the graph does not contain a spanning tree with a smaller bottleneck edge weight. For a directed graph, a similar problem is known as Minimum Bottleneck Spanning Arborescence (MBSA).

<span class="mw-page-title-main">Matroid parity problem</span> Largest independent set of paired elements

In combinatorial optimization, the matroid parity problem is a problem of finding the largest independent set of paired elements in a matroid. The problem was formulated by Lawler (1976) as a common generalization of graph matching and matroid intersection. It is also known as polymatroid matching, or the matchoid problem.

References