In mathematics and computer science, graph edit distance (GED) is a measure of similarity (or dissimilarity) between two graphs. The concept of graph edit distance was first formalized mathematically by Alberto Sanfeliu and King-Sun Fu in 1983. [1] A major application of graph edit distance is in inexact graph matching, such as error-tolerant pattern recognition in machine learning. [2]
The graph edit distance between two graphs is related to the string edit distance between strings. With the interpretation of strings as connected, directed acyclic graphs of maximum degree one, classical definitions of edit distance such as Levenshtein distance, [3] [4] Hamming distance [5] and Jaro–Winkler distance may be interpreted as graph edit distances between suitably constrained graphs. Likewise, graph edit distance is also a generalization of tree edit distance between rooted trees. [6] [7] [8] [9]
The mathematical definition of graph edit distance is dependent upon the definitions of the graphs over which it is defined, i.e. whether and how the vertices and edges of the graph are labeled and whether the edges are directed. Generally, given a set of graph edit operations (also known as elementary graph operations), the graph edit distance between two graphs and , written as can be defined as
where denotes the set of edit paths transforming into (a graph isomorphic to) and is the cost of each graph edit operation .
The set of elementary graph edit operators typically includes:
Additional, but less common operators, include operations such as edge splitting that introduces a new vertex into an edge (also creating a new edge), and edge contraction that eliminates vertices of degree two between edges (of the same color). Although such complex edit operators can be defined in terms of more elementary transformations, their use allows finer parameterization of the cost function when the operator is cheaper than the sum of its constituents.
A deep analysis of the elementary graph edit operators is presented in [10] [11] [12]
And some methods have been presented to automatically deduce these elementary graph edit operators. [13] [14] [15] [16] [17] And some algorithms learn these costs online: [18]
Graph edit distance finds applications in handwriting recognition, [19] fingerprint recognition [20] and cheminformatics. [21]
Exact algorithms for computing the graph edit distance between a pair of graphs typically transform the problem into one of finding the minimum cost edit path between the two graphs. The computation of the optimal edit path is cast as a pathfinding search or shortest path problem, often implemented as an A* search algorithm.
In addition to exact algorithms, a number of efficient approximation algorithms are also known. Most of them have cubic computational time [22] [23] [24] [25] [26]
Moreover, there is an algorithm that deduces an approximation of the GED in linear time [27]
Despite the above algorithms sometimes working well in practice, in general the problem of computing graph edit distance is NP-hard (for a proof that's available online, see Section 2 of Zeng et al.), and is even hard to approximate (formally, it is APX-hard [28] ).
A minimum spanning tree (MST) or minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight. That is, it is a spanning tree whose sum of edge weights is as small as possible. More generally, any edge-weighted undirected graph has a minimum spanning forest, which is a union of the minimum spanning trees for its connected components.
In graph theory, graph coloring is a special case of graph labeling; it is an assignment of labels traditionally called "colors" to elements of a graph subject to certain constraints. In its simplest form, it is a way of coloring the vertices of a graph such that no two adjacent vertices are of the same color; this is called a vertex coloring. Similarly, an edge coloring assigns a color to each edge so that no two adjacent edges are of the same color, and a face coloring of a planar graph assigns a color to each face or region so that no two faces that share a boundary have the same color.
In computer science, a graph is an abstract data type that is meant to implement the undirected graph and directed graph concepts from the field of graph theory within mathematics.
In graph theory, a vertex cover of a graph is a set of vertices that includes at least one endpoint of every edge of the graph.
In the mathematical discipline of graph theory, a matching or independent edge set in an undirected graph is a set of edges without common vertices. In other words, a subset of the edges is a matching if each vertex appears in at most one edge of that matching. Finding a matching in a bipartite graph can be treated as a network flow problem.
In graph theory, a perfect graph is a graph in which the chromatic number equals the size of the maximum clique, both in the graph itself and in every induced subgraph. In all graphs, the chromatic number is greater than or equal to the size of the maximum clique, but they can be far apart. A graph is perfect when these numbers are equal, and remain equal after the deletion of arbitrary subsets of vertices.
In graph theory, a proper edge coloring of a graph is an assignment of "colors" to the edges of the graph so that no two incident edges have the same color. For example, the figure to the right shows an edge coloring of a graph by the colors red, blue, and green. Edge colorings are one of several different types of graph coloring. The edge-coloring problem asks whether it is possible to color the edges of a given graph using at most k different colors, for a given value of k, or with the fewest possible colors. The minimum required number of colors for the edges of a given graph is called the chromatic index of the graph. For example, the edges of the graph in the illustration can be colored by three colors but cannot be colored by two colors, so the graph shown has chromatic index three.
A Euclidean minimum spanning tree of a finite set of points in the Euclidean plane or higher-dimensional Euclidean space connects the points by a system of line segments with the points as endpoints, minimizing the total length of the segments. In it, any two points can reach each other along a path through the line segments. It can be found as the minimum spanning tree of a complete graph with the points as vertices and the Euclidean distances between points as edge weights.
In graph theory, a cograph, or complement-reducible graph, or P4-free graph, is a graph that can be generated from the single-vertex graph K1 by complementation and disjoint union. That is, the family of cographs is the smallest class of graphs that includes K1 and is closed under complementation and disjoint union.
The art gallery problem or museum problem is a well-studied visibility problem in computational geometry. It originates from the following real-world problem:
"In an art gallery, what is the minimum number of guards who together can observe the whole gallery?"
In graph theory and theoretical computer science, a maximum common induced subgraph of two graphs G and H is a graph that is an induced subgraph of both G and H, and that has as many vertices as possible.
In computer science, the Hopcroft–Karp algorithm is an algorithm that takes a bipartite graph as input and produces a maximum-cardinality matching as output — a set of as many edges as possible with the property that no two edges share an endpoint. It runs in time in the worst case, where is set of edges in the graph, is set of vertices of the graph, and it is assumed that . In the case of dense graphs the time bound becomes , and for sparse random graphs it runs in time with high probability.
Maximum cardinality matching is a fundamental problem in graph theory. We are given a graph G, and the goal is to find a matching containing as many edges as possible; that is, a maximum cardinality subset of the edges such that each vertex is adjacent to at most one edge of the subset. As each edge will cover exactly two vertices, this problem is equivalent to the task of finding a matching that covers as many vertices as possible.
In mathematics, particularly geometric graph theory, a unit distance graph is a graph formed from a collection of points in the Euclidean plane by connecting two points whenever the distance between them is exactly one. To distinguish these graphs from a broader definition that allows some non-adjacent pairs of vertices to be at distance one, they may also be called strict unit distance graphs or faithful unit distance graphs. As a hereditary family of graphs, they can be characterized by forbidden induced subgraphs. The unit distance graphs include the cactus graphs, the matchstick graphs and penny graphs, and the hypercube graphs. The generalized Petersen graphs are non-strict unit distance graphs.
In graph theory, a branch of mathematics, a skew-symmetric graph is a directed graph that is isomorphic to its own transpose graph, the graph formed by reversing all of its edges, under an isomorphism that is an involution without any fixed points. Skew-symmetric graphs are identical to the double covering graphs of bidirected graphs.
In the study of graph coloring problems in mathematics and computer science, a greedy coloring or sequential coloring is a coloring of the vertices of a graph formed by a greedy algorithm that considers the vertices of the graph in sequence and assigns each vertex its first available color. Greedy colorings can be found in linear time, but they do not, in general, use the minimum number of colors possible.
In graph theory, the blossom algorithm is an algorithm for constructing maximum matchings on graphs. The algorithm was developed by Jack Edmonds in 1961, and published in 1965. Given a general graph G = (V, E), the algorithm finds a matching M such that each vertex in V is incident with at most one edge in M and |M| is maximized. The matching is constructed by iteratively improving an initial empty matching along augmenting paths in the graph. Unlike bipartite matching, the key new idea is that an odd-length cycle in the graph (blossom) is contracted to a single vertex, with the search continuing iteratively in the contracted graph.
In graph theory, a branch of mathematics, an indifference graph is an undirected graph constructed by assigning a real number to each vertex and connecting two vertices by an edge when their numbers are within one unit of each other. Indifference graphs are also the intersection graphs of sets of unit intervals, or of properly nested intervals. Based on these two types of interval representations, these graphs are also called unit interval graphs or proper interval graphs; they form a subclass of the interval graphs.
In graph theory and computational complexity theory, a Frankl–Rödl graph is a graph defined by connecting pairs of vertices of a hypercube that are at a specified even distance from each other. The graphs of this type are parameterized by the dimension of the hypercube and by the distance between adjacent vertices.
Graph matching is the problem of finding a similarity between graphs.
{{cite journal}}
: CS1 maint: bot: original URL status unknown (link)