In graph theory, a maximal independent set (MIS) or maximal stable set is an independent set that is not a subset of any other independent set. In other words, there is no vertex outside the independent set that may join it because it is maximal with respect to the independent set property.
For example, in the graph P3, a path with three vertices a, b, and c, and two edges ab and bc, the sets {b} and {a, c} are both maximally independent. The set {a} is independent, but is not maximal independent, because it is a subset of the larger independent set {a, c}. In this same graph, the maximal cliques are the sets {a, b} and {b, c}.
A MIS is also a dominating set in the graph, and every dominating set that is independent must be maximal independent, so MISs are also called independent dominating sets.
A graph may have many MISs of widely varying sizes; [lower-alpha 1] the largest, or possibly several equally large, MISs of a graph is called a maximum independent set. The graphs in which all maximal independent sets have the same size are called well-covered graphs.
The phrase "maximal independent set" is also used to describe maximal subsets of independent elements in mathematical structures other than graphs, and in particular in vector spaces and matroids.
Two algorithmic problems are associated with MISs: finding a single MIS in a given graph and listing all MISs in a given graph.
For a graph , an independent set is a maximal independent set if for , one of the following is true: [1]
The above can be restated as a vertex either belongs to the independent set or has at least one neighbor vertex that belongs to the independent set. As a result, every edge of the graph has at least one endpoint not in . However, it is not true that every edge of the graph has at least one, or even one endpoint in
Notice that any neighbor to a vertex in the independent set cannot be in because these vertices are disjoint by the independent set definition.
If S is a maximal independent set in some graph, it is a maximal clique or maximal complete subgraph in the complementary graph. A maximal clique is a set of vertices that induces a complete subgraph, and that is not a subset of the vertices of any larger complete subgraph. That is, it is a set S such that every pair of vertices in S is connected by an edge and every vertex not in S is missing an edge to at least one vertex in S. A graph may have many maximal cliques, of varying sizes; finding the largest of these is the maximum clique problem.
Some authors include maximality as part of the definition of a clique, and refer to maximal cliques simply as cliques.
The complement of a maximal independent set, that is, the set of vertices not belonging to the independent set, forms a minimal vertex cover. That is, the complement is a vertex cover, a set of vertices that includes at least one endpoint of each edge, and is minimal in the sense that none of its vertices can be removed while preserving the property that it is a cover. Minimal vertex covers have been studied in statistical mechanics in connection with the hard-sphere lattice gas model, a mathematical abstraction of fluid-solid state transitions. [2]
Every maximal independent set is a dominating set, a set of vertices such that every vertex in the graph either belongs to the set or is adjacent to the set. A set of vertices is a maximal independent set if and only if it is an independent dominating set.
Certain graph families have also been characterized in terms of their maximal cliques or maximal independent sets. Examples include the maximal-clique irreducible and hereditary maximal-clique irreducible graphs. A graph is said to be maximal-clique irreducible if every maximal clique has an edge that belongs to no other maximal clique, and hereditary maximal-clique irreducible if the same property is true for every induced subgraph. [3] Hereditary maximal-clique irreducible graphs include triangle-free graphs, bipartite graphs, and interval graphs.
Cographs can be characterized as graphs in which every maximal clique intersects every maximal independent set, and in which the same property is true in all induced subgraphs.
Moon & Moser (1965) showed that any graph with n vertices has at most 3n/3 maximal cliques. Complementarily, any graph with n vertices also has at most 3n/3 maximal independent sets. A graph with exactly 3n/3 maximal independent sets is easy to construct: simply take the disjoint union of n/3 triangle graphs. Any maximal independent set in this graph is formed by choosing one vertex from each triangle. The complementary graph, with exactly 3n/3 maximal cliques, is a special type of Turán graph; because of their connection with Moon and Moser's bound, these graphs are also sometimes called Moon-Moser graphs. Tighter bounds are possible if one limits the size of the maximal independent sets: the number of maximal independent sets of size k in any n-vertex graph is at most
The graphs achieving this bound are again Turán graphs. [4]
Certain families of graphs may, however, have much more restrictive bounds on the numbers of maximal independent sets or maximal cliques. If all n-vertex graphs in a family of graphs have O(n) edges, and if every subgraph of a graph in the family also belongs to the family, then each graph in the family can have at most O(n) maximal cliques, all of which have size O(1). [5] For instance, these conditions are true for the planar graphs: every n-vertex planar graph has at most 3n − 6 edges, and a subgraph of a planar graph is always planar, from which it follows that each planar graph has O(n) maximal cliques (of size at most four). Interval graphs and chordal graphs also have at most n maximal cliques, even though they are not always sparse graphs.
The number of maximal independent sets in n-vertex cycle graphs is given by the Perrin numbers, and the number of maximal independent sets in n-vertex path graphs is given by the Padovan sequence. [6] Therefore, both numbers are proportional to powers of 1.324718, the plastic ratio.
Given a Graph G(V,E), it is easy to find a single MIS using the following algorithm:
The following algorithm finds a MIS in time O(log n). [1] [7] [8]
ANALYSIS: For each node v, divide its neighbours to lower neighbours (whose degree is lower than the degree of v) and higher neighbours (whose degree is higher than the degree of v), breaking ties as in the algorithm.
Call a node v bad if more than 2/3 of its neighbors are higher neighbours. Call an edge bad if both its endpoints are bad; otherwise the edge is good.
A worst-case graph, in which the average number of steps is , is a graph made of n/2 connected components, each with 2 nodes. The degree of all nodes is 1, so each node is selected with probability 1/2, and with probability 1/4 both nodes in a component are not chosen. Hence, the number of nodes drops by a factor of 4 each step, and the expected number of steps is .
The following algorithm is better than the previous one in that at least one new node is always added in each connected component: [9] [8]
Note that in every step, the node with the smallest number in each connected component always enters I, so there is always some progress. In particular, in the worst-case of the previous algorithm (n/2 connected components with 2 nodes each), a MIS will be found in a single step.
ANALYSIS:
Instead of randomizing in each step, it is possible to randomize once, at the beginning of the algorithm, by fixing a random ordering on the nodes. Given this fixed ordering, the following parallel algorithm achieves exactly the same MIS as the #Sequential algorithm (i.e. the result is deterministic): [10]
Between the totally sequential and the totally parallel algorithms, there is a continuum of algorithms that are partly sequential and partly parallel. Given a fixed ordering on the nodes and a factor δ∈(0,1], the following algorithm returns the same MIS:
Setting δ=1/n gives the totally sequential algorithm; setting δ=1 gives the totally parallel algorithm.
ANALYSIS: With a proper selection of the parameter δ in the partially parallel algorithm, it is possible to guarantee that it finishes after at most log(n) calls to the fully parallel algorithm, and the number of steps in each call is at most log(n). Hence the total run-time of the partially parallel algorithm is . Hence the run-time of the fully parallel algorithm is also at most . The main proof steps are:
An algorithm for listing all maximal independent sets or maximal cliques in a graph can be used as a subroutine for solving many NP-complete graph problems. Most obviously, the solutions to the maximum independent set problem, the maximum clique problem, and the minimum independent dominating problem must all be maximal independent sets or maximal cliques, and can be found by an algorithm that lists all maximal independent sets or maximal cliques and retains the ones with the largest or smallest size. Similarly, the minimum vertex cover can be found as the complement of one of the maximal independent sets. Lawler (1976) observed that listing maximal independent sets can also be used to find 3-colorings of graphs: a graph can be 3-colored if and only if the complement of one of its maximal independent sets is bipartite. He used this approach not only for 3-coloring but as part of a more general graph coloring algorithm, and similar approaches to graph coloring have been refined by other authors since. [11] Other more complex problems can also be modeled as finding a clique or independent set of a specific type. This motivates the algorithmic problem of listing all maximal independent sets (or equivalently, all maximal cliques) efficiently.
It is straightforward to turn a proof of Moon and Moser's 3n/3 bound on the number of maximal independent sets into an algorithm that lists all such sets in time O(3n/3). [12] For graphs that have the largest possible number of maximal independent sets, this algorithm takes constant time per output set. However, an algorithm with this time bound can be highly inefficient for graphs with more limited numbers of independent sets. For this reason, many researchers have studied algorithms that list all maximal independent sets in polynomial time per output set. [13] The time per maximal independent set is proportional to that for matrix multiplication in dense graphs, or faster in various classes of sparse graphs. [14]
The counting problem associated to maximal independent sets has been investigated in computational complexity theory. The problem asks, given an undirected graph, how many maximal independent sets it contains. This problem is #P-hard already when the input is restricted to be a bipartite graph. [15]
The problem is however tractable on some specific classes of graphs, for instance it is tractable on cographs. [16]
The maximal independent set problem was originally thought to be non-trivial to parallelize due to the fact that the lexicographical maximal independent set proved to be P-Complete; however, it has been shown that a deterministic parallel solution could be given by an reduction from either the maximum set packing or the maximal matching problem or by an reduction from the 2-satisfiability problem. [17] [18] Typically, the structure of the algorithm given follows other parallel graph algorithms - that is they subdivide the graph into smaller local problems that are solvable in parallel by running an identical algorithm.
Initial research into the maximal independent set problem started on the PRAM model and has since expanded to produce results for distributed algorithms on computer clusters. The many challenges of designing distributed parallel algorithms apply in equal to the maximum independent set problem. In particular, finding an algorithm that exhibits efficient runtime and is optimal in data communication for subdividing the graph and merging the independent set.
It was shown in 1984 by Karp et al. that a deterministic parallel solution on PRAM to the maximal independent set belonged in the Nick's Class complexity zoo of . [19] That is to say, their algorithm finds a maximal independent set in using , where is the vertex set size. In the same paper, a randomized parallel solution was also provided with a runtime of using processors. Shortly after, Luby and Alon et al. independently improved on this result, bringing the maximal independent set problem into the realm of with an runtime using processors, where is the number of edges in the graph. [18] [7] [20] In order to show that their algorithm is in , they initially presented a randomized algorithm that uses processors but could be derandomized with an additional processors. Today, it remains an open question as to if the maximal independent set problem is in .
Distributed maximal independent set algorithms are strongly influenced by algorithms on the PRAM model. The original work by Luby and Alon et al. has led to several distributed algorithms. [21] [22] [23] [20] In terms of exchange of bits, these algorithms had a message size lower bound per round of bits and would require additional characteristics of the graph. For example, the size of the graph would need to be known or the maximum degree of neighboring vertices for a given vertex could be queried. In 2010, Métivier et al. reduced the required message size per round to , which is optimal and removed the need for any additional graph knowledge. [24]
In computer science, the clique problem is the computational problem of finding cliques in a graph. It has several different formulations depending on which cliques, and what information about the cliques, should be found. Common formulations of the clique problem include finding a maximum clique, finding a maximum weight clique in a weighted graph, listing all maximal cliques, and solving the decision problem of testing whether a graph contains a clique larger than a given size.
This is a glossary of graph theory. Graph theory is the study of graphs, systems of nodes or vertices connected in pairs by lines or edges.
In graph theory, graph coloring is a special case of graph labeling; it is an assignment of labels traditionally called "colors" to elements of a graph subject to certain constraints. In its simplest form, it is a way of coloring the vertices of a graph such that no two adjacent vertices are of the same color; this is called a vertex coloring. Similarly, an edge coloring assigns a color to each edge so that no two adjacent edges are of the same color, and a face coloring of a planar graph assigns a color to each face or region so that no two faces that share a boundary have the same color.
In graph theory, an independent set, stable set, coclique or anticlique is a set of vertices in a graph, no two of which are adjacent. That is, it is a set of vertices such that for every two vertices in , there is no edge connecting the two. Equivalently, each edge in the graph has at most one endpoint in . A set is independent if and only if it is a clique in the graph's complement. The size of an independent set is the number of vertices it contains. Independent sets have also been called "internally stable sets", of which "stable set" is a shortening.
In graph theory, a vertex cover of a graph is a set of vertices that includes at least one endpoint of every edge of the graph.
In the mathematical discipline of graph theory, a matching or independent edge set in an undirected graph is a set of edges without common vertices. In other words, a subset of the edges is a matching if each vertex appears in at most one edge of that matching. Finding a matching in a bipartite graph can be treated as a network flow problem.
In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge (u,v) from vertex u to vertex v, u comes before v in the ordering. For instance, the vertices of the graph may represent tasks to be performed, and the edges may represent constraints that one task must be performed before another; in this application, a topological ordering is just a valid sequence for the tasks. Precisely, a topological sort is a graph traversal in which each node v is visited only after all its dependencies are visited. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG). Any DAG has at least one topological ordering, and algorithms are known for constructing a topological ordering of any DAG in linear time. Topological sorting has many applications, especially in ranking problems such as feedback arc set. Topological sorting is possible even when the DAG has disconnected components.
In graph theory, a cograph, or complement-reducible graph, or P4-free graph, is a graph that can be generated from the single-vertex graph K1 by complementation and disjoint union. That is, the family of cographs is the smallest class of graphs that includes K1 and is closed under complementation and disjoint union.
In mathematics and computer science, connectivity is one of the basic concepts of graph theory: it asks for the minimum number of elements that need to be removed to separate the remaining nodes into two or more isolated subgraphs. It is closely related to the theory of network flow problems. The connectivity of a graph is an important measure of its resilience as a network.
In graph theory, a dominating set for a graph G is a subset D of its vertices, such that any vertex of G is in D, or has a neighbor in D. The domination numberγ(G) is the number of vertices in a smallest dominating set for G.
In graph theory, a domatic partition of a graph is a partition of into disjoint sets , ,..., such that each Vi is a dominating set for G. The figure on the right shows a domatic partition of a graph; here the dominating set consists of the yellow vertices, consists of the green vertices, and consists of the blue vertices.
In graph theory, a path decomposition of a graph G is, informally, a representation of G as a "thickened" path graph, and the pathwidth of G is a number that measures how much the path was thickened to form G. More formally, a path-decomposition is a sequence of subsets of vertices of G such that the endpoints of each edge appear in one of the subsets and such that each vertex appears in a contiguous subsequence of the subsets, and the pathwidth is one less than the size of the largest set in such a decomposition. Pathwidth is also known as interval thickness, vertex separation number, or node searching number.
In the mathematical area of graph theory, a triangle-free graph is an undirected graph in which no three vertices form a triangle of edges. Triangle-free graphs may be equivalently defined as graphs with clique number ≤ 2, graphs with girth ≥ 4, graphs with no induced 3-cycle, or locally independent graphs.
In graph theory, boxicity is a graph invariant, introduced by Fred S. Roberts in 1969.
In computer science and graph theory, Karger's algorithm is a randomized algorithm to compute a minimum cut of a connected graph. It was invented by David Karger and first published in 1993.
In graph theory, the planar separator theorem is a form of isoperimetric inequality for planar graphs, that states that any planar graph can be split into smaller pieces by removing a small number of vertices. Specifically, the removal of vertices from an n-vertex graph can partition the graph into disjoint subgraphs each of which has at most vertices.
In the mathematical field of graph theory, the intersection number of a graph is the smallest number of elements in a representation of as an intersection graph of finite sets. In such a representation, each vertex is represented as a set, and two vertices are connected by an edge whenever their sets have a common element. Equivalently, the intersection number is the smallest number of cliques needed to cover all of the edges of .
In computing and graph theory, a dynamic connectivity structure is a data structure that dynamically maintains information about the connected components of a graph.
In computational complexity theory, a planted clique or hidden clique in an undirected graph is a clique formed from another graph by selecting a subset of vertices and adding edges between each pair of vertices in the subset. The planted clique problem is the algorithmic problem of distinguishing random graphs from graphs that have a planted clique. This is a variation of the clique problem; it may be solved in quasi-polynomial time but is conjectured not to be solvable in polynomial time for intermediate values of the clique size. The conjecture that no polynomial time solution exists is called the planted clique conjecture; it has been used as a computational hardness assumption.
A central problem in algorithmic graph theory is the shortest path problem. One of the generalizations of the shortest path problem is known as the single-source-shortest-paths (SSSP) problem, which consists of finding the shortest paths from a source vertex to all other vertices in the graph. There are classical sequential algorithms which solve this problem, such as Dijkstra's algorithm. In this article, however, we present two parallel algorithms solving this problem.