Wiener index

Last updated

In chemical graph theory, the Wiener index (also Wiener number) introduced by Harry Wiener, is a topological index of a molecule, defined as the sum of the lengths of the shortest paths between all pairs of vertices in the chemical graph representing the non-hydrogen atoms in the molecule. [1]

Contents

Wiener index can be used for the representation of computer networks and enhancing lattice hardware security.

History

The Wiener index is named after Harry Wiener, who introduced it in 1947; at the time, Wiener called it the "path number". [2] It is the oldest topological index related to molecular branching. [3] Based on its success, many other topological indexes of chemical graphs, based on information in the distance matrix of the graph, have been developed subsequently to Wiener's work. [4]

The same quantity has also been studied in pure mathematics, under various names including the gross status, [5] the distance of a graph, [6] and the transmission. [7] The Wiener index is also closely related to the closeness centrality of a vertex in a graph, a quantity inversely proportional to the sum of all distances between the given vertex and all other vertices that has been frequently used in sociometry and the theory of social networks. [6]

Example

Butane (C4H10) has two different structural isomers: n-butane, with a linear structure of four carbon atoms, and isobutane, with a branched structure. The chemical graph for n-butane is a four-vertex path graph, and the chemical graph for isobutane is a tree with one central vertex connected to three leaves.

The n-butane molecule has three pairs of vertices at distance one from each other, two pairs at distance two, and one pair at distance three, so its Wiener index is

The isobutane molecule has three pairs of vertices at distances one from each other (the three leaf-center pairs), and three pairs at distance two (the leaf-leaf pairs). Therefore, its Wiener index is

These numbers are instances of formulas for special cases of the Wiener index: it is for any -vertex path graph such as the graph of n-butane, [8] and for any -vertex star such as the graph of isobutane. [9]

Thus, even though these two molecules have the same chemical formula, and the same numbers of carbon-carbon and carbon-hydrogen bonds, their different structures give rise to different Wiener indices.

Relation to chemical properties

Wiener showed that the Wiener index number is closely correlated with the boiling points of alkane molecules. [2] Later work on quantitative structure–activity relationships showed that it is also correlated with other quantities including the parameters of its critical point, [10] the density, surface tension, and viscosity of its liquid phase, [11] and the van der Waals surface area of the molecule. [12]

Calculation in arbitrary graphs

The Wiener index may be calculated directly using an algorithm for computing all pairwise distances in the graph. When the graph is unweighted (so the length of a path is just its number of edges), these distances may be calculated by repeating a breadth-first search algorithm, once for each starting vertex. [13] The total time for this approach is O(nm), where n is the number of vertices in the graph and m is its number of edges.

For weighted graphs, one may instead use the Floyd–Warshall algorithm [13] [14] [15] or Johnson's algorithm, [16] with running time O(n3) or O(nm + n2 log n) respectively. Alternative but less efficient algorithms based on repeated matrix multiplication have also been developed within the chemical informatics literature. [17] [18]

Calculation in special types of graph

When the underlying graph is a tree (as is true for instance for the alkanes originally studied by Wiener), the Wiener index may be calculated more efficiently. If the graph is partitioned into two subtrees by removing a single edge e, then its Wiener index is the sum of the Wiener indices of the two subtrees, together with a third term representing the paths that pass through e. This third term may be calculated in linear time by computing the sum of distances of all vertices from e within each subtree and multiplying the two sums. [19] This divide and conquer algorithm can be generalized from trees to graphs of bounded treewidth, and leads to near-linear-time algorithms for such graphs. [20]

An alternative method for calculating the Wiener index of a tree, by Bojan Mohar and Tomaž Pisanski, works by generalizing the problem to graphs with weighted vertices, where the weight of a path is the product of its length with the weights of its two endpoints. If v is a leaf vertex of the tree then the Wiener index of the tree may be calculated by merging v with its parent (adding their weights together), computing the index of the resulting smaller tree, and adding a simple correction term for the paths that pass through the edge from v to its parent. By repeatedly removing leaves in this way, the Wiener index may be calculated in linear time. [13]

For graphs that are constructed as products of simpler graphs, the Wiener index of the product graph can often be computed by a simple formula that combines the indices of its factors. [21] Benzenoids (graphs formed by gluing regular hexagons edge-to-edge) can be embedded isometrically into the Cartesian product of three trees, allowing their Wiener indices to be computed in linear time by using the product formula together with the linear time tree algorithm. [22]

Inverse problem

Gutman & Yeh (1995) considered the problem of determining which numbers can be represented as the Wiener index of a graph. [23] They showed that all but two positive integers have such a representation; the two exceptions are the numbers 2 and 5, which are not the Wiener index of any graph. For graphs that must be bipartite, they found that again almost all integers can be represented, with a larger set of exceptions: none of the numbers in the set

{2, 3, 5, 6, 7, 11, 12, 13, 15, 17, 19, 33, 37, 39}

can be represented as the Wiener index of a bipartite graph.

Gutman and Yeh conjectured, but were unable to prove, a similar description of the numbers that can be represented as Wiener indices of trees, with a set of 49 exceptional values:

2, 3, 5, 6, 7, 8, 11, 12, 13, 14, 15, 17, 19, 21, 22, 23, 24, 26, 27, 30, 33, 34, 37, 38, 39, 41, 43, 45, 47, 51, 53, 55, 60, 61, 69, 73, 77, 78, 83, 85, 87, 89, 91, 99, 101, 106, 113, 147, 159 (sequence A122686 in the OEIS )

The conjecture was later proven by Wagner, Wang, and Yu. [24] [25]

Related Research Articles

The Hamiltonian path problem is a topic discussed in the fields of complexity theory and graph theory. It decides if a directed or undirected graph, G, contains a Hamiltonian path, a path that visits every vertex in the graph exactly once. The problem may specify the start and end of the path, in which case the starting vertex s and ending vertex t must be identified.

In the mathematical discipline of graph theory, a matching or independent edge set in an undirected graph is a set of edges without common vertices. In other words, a subset of the edges is a matching if each vertex appears in at most one edge of that matching. Finding a matching in a bipartite graph can be treated as a network flow problem.

In graph theory, the resistance distance between two vertices of a simple, connected graph, G, is equal to the resistance between two equivalent points on an electrical network, constructed so as to correspond to G, with each edge being replaced by a resistance of one ohm. It is a metric on graphs.

<span class="mw-page-title-main">Chordal graph</span> Graph where all long cycles have a chord

In the mathematical area of graph theory, a chordal graph is one in which all cycles of four or more vertices have a chord, which is an edge that is not part of the cycle but connects two vertices of the cycle. Equivalently, every induced cycle in the graph should have exactly three vertices. The chordal graphs may also be characterized as the graphs that have perfect elimination orderings, as the graphs in which each minimal separator is a clique, and as the intersection graphs of subtrees of a tree. They are sometimes also called rigid circuit graphs or triangulated graphs.

In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge (u,v) from vertex u to vertex v, u comes before v in the ordering. For instance, the vertices of the graph may represent tasks to be performed, and the edges may represent constraints that one task must be performed before another; in this application, a topological ordering is just a valid sequence for the tasks. Precisely, a topological sort is a graph traversal in which each node v is visited only after all its dependencies are visited. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG). Any DAG has at least one topological ordering, and algorithms are known for constructing a topological ordering of any DAG in linear time. Topological sorting has many applications, especially in ranking problems such as feedback arc set. Topological sorting is possible even when the DAG has disconnected components.

<span class="mw-page-title-main">Euclidean minimum spanning tree</span> Shortest network connecting points

A Euclidean minimum spanning tree of a finite set of points in the Euclidean plane or higher-dimensional Euclidean space connects the points by a system of line segments with the points as endpoints, minimizing the total length of the segments. In it, any two points can reach each other along a path through the line segments. It can be found as the minimum spanning tree of a complete graph with the points as vertices and the Euclidean distances between points as edge weights.

<span class="mw-page-title-main">Cograph</span> Graph formed by complementation and disjoint union

In graph theory, a cograph, or complement-reducible graph, or P4-free graph, is a graph that can be generated from the single-vertex graph K1 by complementation and disjoint union. That is, the family of cographs is the smallest class of graphs that includes K1 and is closed under complementation and disjoint union.

In graph theory, a path decomposition of a graph G is, informally, a representation of G as a "thickened" path graph, and the pathwidth of G is a number that measures how much the path was thickened to form G. More formally, a path-decomposition is a sequence of subsets of vertices of G such that the endpoints of each edge appear in one of the subsets and such that each vertex appears in a contiguous subsequence of the subsets, and the pathwidth is one less than the size of the largest set in such a decomposition. Pathwidth is also known as interval thickness, vertex separation number, or node searching number.

<span class="mw-page-title-main">Hosoya index</span> Number of matchings in a graph

The Hosoya index, also known as the Z index, of a graph is the total number of matchings in it. The Hosoya index is always at least one, because the empty set of edges is counted as a matching for this purpose. Equivalently, the Hosoya index is the number of non-empty matchings plus one. The index is named after Haruo Hosoya. It is used as a topological index in chemical graph theory.

In the fields of chemical graph theory, molecular topology, and mathematical chemistry, a topological index, also known as a connectivity index, is a type of a molecular descriptor that is calculated based on the molecular graph of a chemical compound. Topological indices are numerical parameters of a graph which characterize its topology and are usually graph invariant. Topological indices are used for example in the development of quantitative structure-activity relationships (QSARs) in which the biological activity or other properties of molecules are correlated with their chemical structure.

<span class="mw-page-title-main">Distance-hereditary graph</span> Graph whose induced subgraphs preserve distance

In graph theory, a branch of discrete mathematics, a distance-hereditary graph is a graph in which the distances in any connected induced subgraph are the same as they are in the original graph. Thus, any induced subgraph inherits the distances of the larger graph.

In graph theory, the planar separator theorem is a form of isoperimetric inequality for planar graphs, that states that any planar graph can be split into smaller pieces by removing a small number of vertices. Specifically, the removal of vertices from an n-vertex graph can partition the graph into disjoint subgraphs each of which has at most vertices.

<span class="mw-page-title-main">Caterpillar tree</span> Tree graph with all nodes within distance 1 from central path

In graph theory, a caterpillar or caterpillar tree is a tree in which all the vertices are within distance 1 of a central path.

In graph theory, a partial cube is a graph that is isometric to a subgraph of a hypercube. In other words, a partial cube can be identified with a subgraph of a hypercube in such a way that the distance between any two vertices in the partial cube is the same as the distance between those vertices in the hypercube. Equivalently, a partial cube is a graph whose vertices can be labeled with bit strings of equal length in such a way that the distance between two vertices in the graph is equal to the Hamming distance between their labels. Such a labeling is called a Hamming labeling; it represents an isometric embedding of the partial cube into a hypercube.

In computer science, the minimum routing cost spanning tree of a weighted graph is a spanning tree minimizing the sum of pairwise distances between vertices in the tree. It is also called the optimum distance spanning tree, shortest total path length spanning tree, minimum total distance spanning tree, or minimum average distance spanning tree. In an unweighted graph, this is the spanning tree of minimum Wiener index. Hu (1974) writes that the problem of constructing these trees was proposed by Francesco Maffioli.

<span class="mw-page-title-main">Widest path problem</span> Path-finding using high-weight graph edges

In graph algorithms, the widest path problem is the problem of finding a path between two designated vertices in a weighted graph, maximizing the weight of the minimum-weight edge in the path. The widest path problem is also known as the maximum capacity path problem. It is possible to adapt most shortest path algorithms to compute widest paths, by modifying them to use the bottleneck distance instead of path length. However, in many cases even faster algorithms are possible.

In network theory, the Wiener connector is a means of maximizing efficiency in connecting specified "query vertices" in a network. Given a connected, undirected graph and a set of query vertices in a graph, the minimum Wiener connector is an induced subgraph that connects the query vertices and minimizes the sum of shortest path distances among all pairs of vertices in the subgraph. In combinatorial optimization, the minimum Wiener connector problem is the problem of finding the minimum Wiener connector. It can be thought of as a version of the classic Steiner tree problem, where instead of minimizing the size of the tree, the objective is to minimize the distances in the subgraph.

In mathematics and computer science, graph edit distance (GED) is a measure of similarity between two graphs. The concept of graph edit distance was first formalized mathematically by Alberto Sanfeliu and King-Sun Fu in 1983. A major application of graph edit distance is in inexact graph matching, such as error-tolerant pattern recognition in machine learning.

In chemical graph theory, the Szeged index is a topological index of a molecule, used in biochemistry. The Szeged index, introduced by Iván Gutman, generalizes the concept of the Wiener index introduced by Harry Wiener. The Szeged index of a connected graph G is defined as

In chemical graph theory, the Padmakar–Ivan (PI) index is a topological index of a molecule, used in biochemistry. The Padmakar–Ivan index is a generalization introduced by Padmakar V. Khadikar and Iván Gutman of the concept of the Wiener index, introduced by Harry Wiener. The Padmakar–Ivan index of a graph G is the sum over all edges uv of G of number of edges which are not equidistant from u and v. Let G be a graph and e = uv an edge of G. Here denotes the number of edges lying closer to the vertex u than the vertex v, and is the number of edges lying closer to the vertex v than the vertex u. The Padmakar–Ivan index of a graph G is defined as

References

  1. Rouvray, Dennis H. (2002), "The rich legacy of half a century of the Wiener index", in Rouvray, Dennis H.; King, Robert Bruce (eds.), Topology in Chemistry: Discrete Mathematics of Molecules, Horwood Publishing, pp. 16–37, ISBN   9781898563761 .
  2. 1 2 Wiener, H. (1947), "Structural determination of paraffin boiling points", Journal of the American Chemical Society, 1 (69): 17–20, doi:10.1021/ja01193a005, PMID   20291038 .
  3. Todeschini, Roberto; Consonni, Viviana (2000), Handbook of Molecular Descriptors, Wiley, ISBN   3-527-29913-0 .
  4. Rouvray (2002). See in particular Table 2 on p. 32.
  5. Harary, Frank (1959), "Status and contrastatus", Sociometry, 22 (1): 23–43, doi:10.2307/2785610, JSTOR   2785610, MR   0134798
  6. 1 2 Entringer, R. C.; Jackson, D. E.; Snyder, D. A. (1976), "Distance in graphs", Czechoslovak Mathematical Journal, 26 (101): 283–296, doi: 10.21136/CMJ.1976.101401 , MR   0543771 .
  7. Šoltés, Ľubomír (1991), "Transmission in graphs: a bound and vertex removing", Mathematica Slovaca, 41 (1): 11–16, MR   1094978 .
  8. As Rouvray (2002) describes, this formula was already given by Wiener (1947).
  9. Bonchev, D.; Trinajstić, N. (1977), "Information theory, distance matrix, and molecular branching", Journal of Chemical Physics, 67 (10): 4517–4533, Bibcode:1977JChPh..67.4517B, doi:10.1063/1.434593, hdl: 10338.dmlcz/104047 .
  10. Stiel, Leonard I.; Thodos, George (1962), "The normal boiling points and critical constants of saturated aliphatic hydrocarbons", AIChE Journal, 8 (4): 527–529, Bibcode:1962AIChE...8..527S, doi:10.1002/aic.690080421 .
  11. Rouvray, D. H.; Crafford, B. C. (1976), "The dependence of physical-chemical properties on topological factors", South African Journal of Science, 72: 47.
  12. Gutman, Ivan; Körtvélyesi, T. (1995), "Wiener indices and molecular surfaces", Zeitschrift für Naturforschung, 50a (7): 669–671, Bibcode:1995ZNatA..50..669G, doi: 10.1515/zna-1995-0707 , S2CID   96881621 .
  13. 1 2 3 Mohar, Bojan; Pisanski, Tomaž (1988), "How to compute the Wiener index of a graph", Journal of Mathematical Chemistry, 2 (3): 267–277, doi:10.1007/BF01167206, MR   0966088, S2CID   15275183 .
  14. Floyd, Robert W. (June 1962), "Algorithm 97: Shortest Path", Communications of the ACM , 5 (6): 345, doi: 10.1145/367766.368168 , S2CID   2003382 .
  15. Warshall, Stephen (January 1962), "A theorem on Boolean matrices", Journal of the ACM , 9 (1): 11–12, doi: 10.1145/321105.321107 , S2CID   33763989 .
  16. Johnson, Donald B. (1977), "Efficient algorithms for shortest paths in sparse networks", Journal of the ACM , 24 (1): 1–13, doi: 10.1145/321992.321993 , S2CID   207678246 .
  17. Bersohn, Malcom (1983), "A fast algorithm for calculation of the distance matrix of a molecule", Journal of Computational Chemistry, 4 (1): 110–113, doi:10.1002/jcc.540040115, S2CID   122640545 .
  18. Müller, W. R.; Szymanski, K.; Knop, J. V.; Trinajstić, N. (1987), "An algorithm for construction of the molecular distance matrix", Journal of Computational Chemistry, 8 (2): 170–173, doi:10.1002/jcc.540080209, S2CID   122278769 .
  19. Canfield, E. R.; Robinson, R. W.; Rouvray, D. H. (1985), "Determination of the Wiener molecular branching index for the general tree", Journal of Computational Chemistry, 6 (6): 598–609, doi:10.1002/jcc.540060613, MR   0824918, S2CID   121861478 .
  20. Cabello, Sergio; Knauer, Christian (2009), "Algorithms for graphs of bounded treewidth via orthogonal range searching", Computational Geometry , 42 (9): 815–824, CiteSeerX   10.1.1.127.8127 , doi: 10.1016/j.comgeo.2009.02.001 , MR   2543804 .
  21. Yeh, Yeong Nan; Gutman, Ivan (1994), "On the sum of all distances in composite graphs", Discrete Mathematics , 135 (1–3): 359–365, doi:10.1016/0012-365X(93)E0092-I, MR   1310892 .
  22. Chepoi, Victor; Klavžar, Sandi (1997), "The Wiener index and the Szeged index of benzenoid systems in linear time", Journal of Chemical Information and Computer Sciences, 37 (4): 752–755, doi:10.1021/ci9700079 . For earlier algorithms for benzenoids based on their partial cube structure, see Klavžar, Sandi; Gutman, Ivan; Mohar, Bojan (1995), "Labeling of benzenoid systems which reflects the vertex-distance relations" (PDF), Journal of Chemical Information and Computer Sciences, 35 (3): 590–593, doi:10.1021/ci00025a030 .
  23. Gutman, Ivan; Yeh, Yeong-Nan (1995), "The sum of all distances in bipartite graphs", Mathematica Slovaca, 45 (4): 327–334, MR   1387048 .
  24. Wagner, Stephan G. (2006), "A class of trees and its Wiener index", Acta Applicandae Mathematicae, 91 (2): 119–132, doi:10.1007/s10440-006-9026-5, MR   2249544, S2CID   53644527 .
  25. Wang, Hua; Yu, Guang (2006), "All but 49 numbers are Wiener indices of trees", Acta Applicandae Mathematicae, 92 (1): 15–20, doi:10.1007/s10440-006-9037-2, MR   2263469, S2CID   14425684 .