Graph isomorphism problem

Last updated

Contents

Unsolved problem in computer science:

Can the graph isomorphism problem be solved in polynomial time?

The graph isomorphism problem is the computational problem of determining whether two finite graphs are isomorphic.

The problem is not known to be solvable in polynomial time nor to be NP-complete, and therefore may be in the computational complexity class NP-intermediate. It is known that the graph isomorphism problem is in the low hierarchy of class NP, which implies that it is not NP-complete unless the polynomial time hierarchy collapses to its second level. [1] At the same time, isomorphism for many special classes of graphs can be solved in polynomial time, and in practice graph isomorphism can often be solved efficiently. [2] [3]

This problem is a special case of the subgraph isomorphism problem, [4] which asks whether a given graph G contains a subgraph that is isomorphic to another given graph H; this problem is known to be NP-complete. It is also known to be a special case of the non-abelian hidden subgroup problem over the symmetric group. [5]

In the area of image recognition it is known as the exact graph matching. [6]

State of the art

In November 2015, László Babai announced a quasi-polynomial time algorithm for all graphs, that is, one with running time for some fixed . [7] [8] [9] [10] On January 4, 2017, Babai retracted the quasi-polynomial claim and stated a sub-exponential time bound instead after Harald Helfgott discovered a flaw in the proof. On January 9, 2017, Babai announced a correction (published in full on January 19) and restored the quasi-polynomial claim, with Helfgott confirming the fix. [11] [12] Helfgott further claims that one can take c = 3, so the running time is 2O((log n)3). [13] [14]

Prior to this, the best accepted theoretical algorithm was due to Babai & Luks (1983), and was based on the earlier work by Luks (1982) combined with a subfactorial algorithm of V. N. Zemlyachenko ( Zemlyachenko, Korneenko & Tyshkevich 1985 ). The algorithm has run time 2O(n log n) for graphs with n vertices and relies on the classification of finite simple groups. Without this classification theorem, a slightly weaker bound 2O(n log2 n) was obtained first for strongly regular graphs by LászlóBabai  ( 1980 ), and then extended to general graphs by Babai & Luks (1983). Improvement of the exponent n for strongly regular graphs was done by Spielman (1996). For hypergraphs of bounded rank, a subexponential upper bound matching the case of graphs was obtained by Babai & Codenotti (2008).

There are several competing practical algorithms for graph isomorphism, such as those due to McKay (1981), Schmidt & Druffel (1976), Ullman (1976), and Stoichev (2019). While they seem to perform well on random graphs, a major drawback of these algorithms is their exponential time performance in the worst case. [15]

The graph isomorphism problem is computationally equivalent to the problem of computing the automorphism group of a graph, [16] [17] and is weaker than the permutation group isomorphism problem and the permutation group intersection problem. For the latter two problems, Babai, Kantor & Luks (1983) obtained complexity bounds similar to that for graph isomorphism.

Solved special cases

A number of important special cases of the graph isomorphism problem have efficient, polynomial-time solutions:

Complexity class GI

Since the graph isomorphism problem is neither known to be NP-complete nor known to be tractable, researchers have sought to gain insight into the problem by defining a new class GI, the set of problems with a polynomial-time Turing reduction to the graph isomorphism problem. [31] If in fact the graph isomorphism problem is solvable in polynomial time, GI would equal P . On the other hand, if the problem is NP-complete, GI would equal NP and all problems in NP would be solvable in quasi-polynomial time.

As is common for complexity classes within the polynomial time hierarchy, a problem is called GI-hard if there is a polynomial-time Turing reduction from any problem in GI to that problem, i.e., a polynomial-time solution to a GI-hard problem would yield a polynomial-time solution to the graph isomorphism problem (and so all problems in GI). A problem is called complete for GI, or GI-complete, if it is both GI-hard and a polynomial-time solution to the GI problem would yield a polynomial-time solution to .

The graph isomorphism problem is contained in both NP and co- AM . GI is contained in and low for Parity P, as well as contained in the potentially much smaller class SPP. [32] That it lies in Parity P means that the graph isomorphism problem is no harder than determining whether a polynomial-time nondeterministic Turing machine has an even or odd number of accepting paths. GI is also contained in and low for ZPP NP. [33] This essentially means that an efficient Las Vegas algorithm with access to an NP oracle can solve graph isomorphism so easily that it gains no power from being given the ability to do so in constant time.

GI-complete and GI-hard problems

Isomorphism of other objects

There are a number of classes of mathematical objects for which the problem of isomorphism is a GI-complete problem. A number of them are graphs endowed with additional properties or restrictions: [34]

GI-complete classes of graphs

A class of graphs is called GI-complete if recognition of isomorphism for graphs from this subclass is a GI-complete problem. The following classes are GI-complete: [34]

Many classes of digraphs are also GI-complete.

Other GI-complete problems

There are other nontrivial GI-complete problems in addition to isomorphism problems.

  • The recognition of self-complementarity of a graph or digraph. [40]
  • A clique problem for a class of so-called M-graphs. It is shown that finding an isomorphism for n-vertex graphs is equivalent to finding an n-clique in an M-graph of size n2. This fact is interesting because the problem of finding a clique of order (1  ε)n in a M-graph of size n2 is NP-complete for arbitrarily small positive ε. [41]
  • The problem of homeomorphism of 2-complexes. [42]
  • The definability problem for first-order logic. The input of this problem is a relational database instance I and a relation R, and the question to answer is whether there exists a first-order query Q (without constants) such that Q evaluated on I gives R as the answer. [43]

GI-hard problems

  • The problem of counting the number of isomorphisms between two graphs is polynomial-time equivalent to the problem of telling whether even one exists. [44]
  • The problem of deciding whether two convex polytopes given by either the V-description or H-description are projectively or affinely isomorphic. The latter means existence of a projective or affine map between the spaces that contain the two polytopes (not necessarily of the same dimension) which induces a bijection between the polytopes. [39]

Program checking

ManuelBlum andSampath Kannan ( 1995 ) have shown a probabilistic checker for programs for graph isomorphism. Suppose P is a claimed polynomial-time procedure that checks if two graphs are isomorphic, but it is not trusted. To check if graphs G and H are isomorphic:

This procedure is polynomial-time and gives the correct answer if P is a correct program for graph isomorphism. If P is not a correct program, but answers correctly on G and H, the checker will either give the correct answer, or detect invalid behaviour of P. If P is not a correct program, and answers incorrectly on G and H, the checker will detect invalid behaviour of P with high probability, or answer wrong with probability 2−100.

Notably, P is used only as a blackbox.

Applications

Graphs are commonly used to encode structural information in many fields, including computer vision and pattern recognition, and graph matching, i.e., identification of similarities between graphs, is an important tools in these areas. In these areas graph isomorphism problem is known as the exact graph matching. [45]

In cheminformatics and in mathematical chemistry, graph isomorphism testing is used to identify a chemical compound within a chemical database. [46] Also, in organic mathematical chemistry graph isomorphism testing is useful for generation of molecular graphs and for computer synthesis.

Chemical database search is an example of graphical data mining, where the graph canonization approach is often used. [47] In particular, a number of identifiers for chemical substances, such as SMILES and InChI, designed to provide a standard and human-readable way to encode molecular information and to facilitate the search for such information in databases and on the web, use canonization step in their computation, which is essentially the canonization of the graph which represents the molecule.

In electronic design automation graph isomorphism is the basis of the Layout Versus Schematic (LVS) circuit design step, which is a verification whether the electric circuits represented by a circuit schematic and an integrated circuit layout are the same. [48]

See also

Notes

  1. Schöning (1987).
  2. Babai, László; Erdős, Paul; Selkow, Stanley M. (1980-08-01). "Random Graph Isomorphism". SIAM Journal on Computing. 9 (3): 628–635. doi:10.1137/0209047. ISSN   0097-5397.
  3. McKay (1981).
  4. Ullman (1976).
  5. Moore, Russell & Schulman (2008).
  6. Endika Bengoetxea, "Inexact Graph Matching Using Estimation of Distribution Algorithms", Ph. D., 2002, Chapter 2:The graph matching problem (retrieved June 28, 2017)
  7. "Mathematician claims breakthrough in complexity theory". Science. November 10, 2015.
  8. Babai (2015)
  9. Video of first 2015 lecture linked from Babai's home page
  10. "The Graph Isomorphism Problem". Communications of the ACM. Retrieved 4 May 2021.
  11. Babai, László (January 9, 2017), Graph isomorphism update
  12. Erica Klarreich (January 14, 2017). "Graph Isomorphism Vanquished — Again". Quanta Magazine.
  13. Helfgott, Harald (January 16, 2017), Isomorphismes de graphes en temps quasi-polynomial (d'après Babai et Luks, Weisfeiler-Leman...), arXiv: 1701.04372 , Bibcode:2017arXiv170104372A
  14. Dona, Daniele; Bajpai, Jitendra; Helfgott, Harald Andrés (October 12, 2017). "Graph isomorphisms in quasi-polynomial time". arXiv: 1710.04574 [math.GR].
  15. Foggia, Sansone & Vento (2001).
  16. Luks, Eugene (1993-09-01). "Permutation groups and polynomial-time computation". DIMACS Series in Discrete Mathematics and Theoretical Computer Science. Vol. 11. Providence, Rhode Island: American Mathematical Society. pp. 139–175. doi:10.1090/dimacs/011/11. ISBN   978-0-8218-6599-6. ISSN   1052-1798.
  17. Algeboy (https://cs.stackexchange.com/users/90177/algeboy), Graph isomorphism and the automorphism group, URL (version: 2018-09-20): https://cs.stackexchange.com/q/97575
  18. Kelly (1957).
  19. Aho, Hopcroft & Ullman (1974), p. 84-86.
  20. Hopcroft & Wong (1974).
  21. Datta et al. (2009).
  22. 1 2 Booth & Lueker (1979).
  23. Colbourn (1981).
  24. Muzychuk (2004).
  25. Bodlaender (1990).
  26. Miller 1980; Filotti & Mayer 1980.
  27. Luks (1982).
  28. Babai, Grigoryev & Mount (1982).
  29. Miller (1983).
  30. Luks (1986).
  31. Booth & Colbourn 1977; Köbler, Schöning & Torán 1993.
  32. Köbler, Schöning & Torán 1992; Arvind & Kurur 2006
  33. Arvind & Köbler (2000).
  34. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 Zemlyachenko, Korneenko & Tyshkevich (1985)
  35. Narayanamurthy & Ravindran (2008).
  36. Grigor'ev (1981).
  37. Gabarró, Joaquim; García, Alina; Serna, Maria (2011). "The complexity of game isomorphism". Theoretical Computer Science. 412 (48): 6675–6695. doi:10.1016/j.tcs.2011.07.022.
  38. Johnson (2005); Kaibel & Schwartz (2003).
  39. 1 2 Kaibel & Schwartz (2003).
  40. Colbourn & Colbourn (1978).
  41. Kozen (1978).
  42. Shawe-Taylor & Pisanski (1994).
  43. Arenas & Diaz (2016).
  44. Mathon (1979); Johnson 2005.
  45. Endika Bengoetxea, Ph.D., Abstract
  46. Irniger (2005).
  47. Cook & Holder (2007).
  48. Baird & Cho (1975).

Related Research Articles

The P versus NP problem is a major unsolved problem in theoretical computer science. Informally, it asks whether every problem whose solution can be quickly verified can also be quickly solved.

In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and relating these classes to each other. A computational problem is a task solved by a computer. A computation problem is solvable by mechanical application of mathematical steps, such as an algorithm.

<span class="mw-page-title-main">Interactive proof system</span>

In computational complexity theory, an interactive proof system is an abstract machine that models computation as the exchange of messages between two parties: a prover and a verifier. The parties interact by exchanging messages in order to ascertain whether a given string belongs to a language or not. The prover possesses unlimited computational resources but cannot be trusted, while the verifier has bounded computation power but is assumed to be always honest. Messages are sent between the verifier and prover until the verifier has an answer to the problem and has "convinced" itself that it is correct.

<span class="mw-page-title-main">Graph isomorphism</span> Bijection between the vertex set of two graphs

In graph theory, an isomorphism of graphsG and H is a bijection between the vertex sets of G and H

<span class="mw-page-title-main">Clique problem</span> Task of computing complete subgraphs

In computer science, the clique problem is the computational problem of finding cliques in a graph. It has several different formulations depending on which cliques, and what information about the cliques, should be found. Common formulations of the clique problem include finding a maximum clique, finding a maximum weight clique in a weighted graph, listing all maximal cliques, and solving the decision problem of testing whether a graph contains a clique larger than a given size.

In theoretical computer science, the subgraph isomorphism problem is a computational task in which two graphs G and H are given as input, and one must determine whether G contains a subgraph that is isomorphic to H. Subgraph isomorphism is a generalization of both the maximum clique problem and the problem of testing whether a graph contains a Hamiltonian cycle, and is therefore NP-complete. However certain other cases of subgraph isomorphism may be solved in polynomial time.

The Gödel Prize is an annual prize for outstanding papers in the area of theoretical computer science, given jointly by the European Association for Theoretical Computer Science (EATCS) and the Association for Computing Machinery Special Interest Group on Algorithms and Computational Theory. The award is named in honor of Kurt Gödel. Gödel's connection to theoretical computer science is that he was the first to mention the "P versus NP" question, in a 1956 letter to John von Neumann in which Gödel asked whether a certain NP-complete problem could be solved in quadratic or linear time.

In computational complexity theory, an Arthur–Merlin protocol, introduced by Babai (1985), is an interactive proof system in which the verifier's coin tosses are constrained to be public. Goldwasser & Sipser (1986) proved that all (formal) languages with interactive proofs of arbitrary length with private coins also have interactive proofs with public coins.

The Fulkerson Prize for outstanding papers in the area of discrete mathematics is sponsored jointly by the Mathematical Optimization Society (MOS) and the American Mathematical Society (AMS). Up to three awards of $1,500 each are presented at each (triennial) International Symposium of the MOS. Originally, the prizes were paid out of a memorial fund administered by the AMS that was established by friends of the late Delbert Ray Fulkerson to encourage mathematical excellence in the fields of research exemplified by his work. The prizes are now funded by an endowment administered by MPS.

In computational complexity theory, the PCP theorem states that every decision problem in the NP complexity class has probabilistically checkable proofs of constant query complexity and logarithmic randomness complexity.

<span class="mw-page-title-main">László Babai</span> Hungarian mathematician and computer scientist

László "Laci" Babai is a Hungarian professor of computer science and mathematics at the University of Chicago. His research focuses on computational complexity theory, algorithms, combinatorics, and finite groups, with an emphasis on the interactions between these fields.

In computational complexity, problems that are in the complexity class NP but are neither in the class P nor NP-complete are called NP-intermediate, and the class of such problems is called NPI. Ladner's theorem, shown in 1975 by Richard E. Ladner, is a result asserting that, if P ≠ NP, then NPI is not empty; that is, NP contains problems that are neither in P nor NP-complete. Since it is also true that if NPI problems exist, then P ≠ NP, it follows that P = NP if and only if NPI is empty.

In the mathematical field of graph theory, an automorphism of a graph is a form of symmetry in which the graph is mapped onto itself while preserving the edge–vertex connectivity.

In abstract algebra, the group isomorphism problem is the decision problem of determining whether two given finite group presentations refer to isomorphic groups.

In graph theory, a branch of mathematics, graph canonization is the problem of finding a canonical form of a given graph G. A canonical form is a labeled graph Canon(G) that is isomorphic to G, such that every graph that is isomorphic to G has the same canonical form as G. Thus, from a solution to the graph canonization problem, one could also solve the problem of graph isomorphism: to test whether two graphs G and H are isomorphic, compute their canonical forms Canon(G) and Canon(H), and test whether these two canonical forms are identical.

<span class="mw-page-title-main">NP-completeness</span> Complexity class

In computational complexity theory, a problem is NP-complete when:

  1. It is a decision problem, meaning that for any input to the problem, the output is either "yes" or "no".
  2. When the answer is "yes", this can be demonstrated through the existence of a short solution.
  3. The correctness of each solution can be verified quickly and a brute-force search algorithm can find a solution by trying all possible solutions.
  4. The problem can be used to simulate every other problem for which we can verify quickly that a solution is correct. In this sense, NP-complete problems are the hardest of the problems to which solutions can be verified quickly. If we could find solutions of some NP-complete problem quickly, we could quickly find the solutions of every other problem to which a given solution can be easily verified.

In computational complexity theory and the analysis of algorithms, an algorithm is said to take quasi-polynomial time if its time complexity is quasi-polynomially bounded. That is, there should exist a constant such that the worst-case running time of the algorithm, on inputs of size , has an upper bound of the form

In structural complexity theory, the Berman–Hartmanis conjecture is an unsolved conjecture named after Leonard C. Berman and Juris Hartmanis that states that all NP-complete languages look alike, in the sense that they can be related to each other by polynomial time isomorphisms.

Eugene Michael Luks is an American mathematician and computer scientist, a professor emeritus of computer and information science at the University of Oregon. He is known for his research on the graph isomorphism problem and on algorithms for computational group theory.

References

Surveys and monographs

Software