Efficiency (network science)

Last updated

In network science, the efficiency of a network is a measure of how efficiently it exchanges information [1] and it is also called communication efficiency. The underlying idea (and main assumption) is that the more distant two nodes are in the network, the less efficient their communication will be. The concept of efficiency can be applied to both local and global scales in a network. On a global scale, efficiency quantifies the exchange of information across the whole network where information is concurrently exchanged. The local efficiency quantifies a network's resistance to failure on a small scale. That is the local efficiency of a node characterizes how well information is exchanged by its neighbors when it is removed.

Contents

Definition

The definition of communication efficiency assumes that the efficiency is inversely proportional to the distance, so in mathematical terms

where is the pairwise efficiency of nodes in network and is their distance.

The average communication efficiency of the network is then defined as the average over the pairwise efficiencies: [1]

where denotes the number of nodes in the network.

Distances can be measured in different ways, depending on the type of networks. The most natural distance for unweighted networks is the length of a shortest path between a nodes and , i.e. a shortest path between is a path with minimum number of edges and the number of edges is its length. Observe that if then —and that is why the sum above is over — while if there is no path connecting and , and their pairwise efficiency is zero. Being a count, for and so is bounded between 0 and 1, i.e. it is a normalised descriptor.

Weighted networks

The shortest path distance can also be generalised to weighted networks, see the weighted shortest path distance, but in this case and the average communication efficiency needs to be properly normalised in order to be comparable among different networks. [2] [3]

In [1] [2] the authors proposed to normalise dividing it by the efficiency of an idealised version of the network :

is the "ideal" graph on nodes wherein all possible edges are present. In the unweighted case every edge has unitary weight, is a clique, a full network, and . When the edges are weighted, a sufficient condition (for having a proper normalisation, i.e. ) on the distances in the ideal network, called this time , is [1] [2]

for . should be known (and different from zero) for all node pairs. A common choice is to take them as the geographical or physical distances in spatial networks [2] or as the maximum cost over all links, e.g. where indicates the maximum interaction strength in the network. [4] However, in [3] the authors highlight the issues of these choices when dealing with real world networks, which are characterised by heterogeneous structure and flows. For instance, choosing makes the global measure very sensitive to outliers in the distribution of weights and tends to under-estimate the actual efficiency of a network. The authors also propose a normalising procedure, i.e. a way for building using all and only the information contained in the edge weights (and no other meta-data such as geographical distances), which is statistically robust and physically grounded.

Efficiency and small-world behaviour

The global efficiency of a network is a measure comparable to , rather than just the average path length itself. The key distinction is that, while measures efficiency in a system where only one packet of information is being moved through the network, measures the efficiency of parallel communication, that is when all the nodes are exchanging packets of information with each other concurrently.

A local average of pairwise communication efficiencies can be used as an alternative to the clustering coefficient of a network. The local efficiency of a network is defined as:

where is the local subgraph consisting only of a node 's immediate neighbors, but not the node itself.

Applications

Broadly speaking, the efficiency of a network can be used to quantify small world behavior in networks. Efficiency can also be used to determine cost-effective structures in weighted and unweighted networks. [2] Comparing the two measures of efficiency in a network to a random network of the same size to see how economically a network is constructed. Furthermore, global efficiency is easier to use numerically than its counterpart, path length. [5]

For these reasons the concept of efficiency has been used across the many diverse applications of network science. [2] [6] Efficiency is useful in analysis of man-made networks such as transportation networks and communications networks. It is used to help determine how cost-efficient a particular network construction is, as well as how fault tolerant it is. Studies of such networks reveal that they tend to have high global efficiency, implying good use of resources, but low local efficiency. This is because, for example, a subway network is not closed, and passengers can be re-routed, by buses for example, even if a particular line in the network is down. [1]

Beyond human constructed networks, efficiency is a useful metric when talking about physical biological networks. In any facet of biology, the scarcity of resource plays a key role, and biological networks are no exception. Efficiency is used in neuroscience to discuss information transfer across neural networks, where the physical space and resource constraints are a major factor. [5] Efficiency has also been used in the study of ant colony tunnel systems, which are usually composed of large rooms as well as many sprawling tunnels. [7] This application to ant colonies is not too surprising because the large structure of a colony must serve as a transportation network for various resources, most namely food. [6]

Related Research Articles

<span class="mw-page-title-main">Backpropagation</span> Optimization algorithm for artificial neural networks

In machine learning, backpropagation is a widely used algorithm for training feedforward artificial neural networks or other parameterized networks with differentiable nodes. It is an efficient application of the Leibniz chain rule (1673) to such networks. It is also known as the reverse mode of automatic differentiation or reverse accumulation, due to Seppo Linnainmaa (1970). The term "back-propagating error correction" was introduced in 1962 by Frank Rosenblatt, but he did not know how to implement this, although Henry J. Kelley had a continuous precursor of backpropagation already in 1960 in the context of control theory.

In graph theory, a clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. Evidence suggests that in most real-world networks, and in particular social networks, nodes tend to create tightly knit groups characterised by a relatively high density of ties; this likelihood tends to be greater than the average probability of a tie randomly established between two nodes.

<span class="mw-page-title-main">Centrality</span> Degree of connectedness within a graph

In graph theory and network analysis, indicators of centrality assign numbers or rankings to nodes within a graph corresponding to their network position. Applications include identifying the most influential person(s) in a social network, key infrastructure nodes in the Internet or urban networks, super-spreaders of disease, and brain networks. Centrality concepts were first developed in social network analysis, and many of the terms used to measure centrality reflect their sociological origin.

Mason's gain formula (MGF) is a method for finding the transfer function of a linear signal-flow graph (SFG). The formula was derived by Samuel Jefferson Mason, whom it is also named after. MGF is an alternate method to finding the transfer function algebraically by labeling each signal, writing down the equation for how that signal depends on other signals, and then solving the multiple equations for the output signal in terms of the input signal. MGF provides a step by step method to obtain the transfer function from a SFG. Often, MGF can be determined by inspection of the SFG. The method can easily handle SFGs with many variables and loops including loops with inner loops. MGF comes up often in the context of control systems, microwave circuits and digital filters because these are often represented by SFGs.

Average path length, or average shortest path length is a concept in network topology that is defined as the average number of steps along the shortest paths for all possible pairs of network nodes. It is a measure of the efficiency of information or mass transport on a network.

<span class="mw-page-title-main">Random geometric graph</span> In graph theory, the mathematically simplest spatial network

In graph theory, a random geometric graph (RGG) is the mathematically simplest spatial network, namely an undirected graph constructed by randomly placing N nodes in some metric space and connecting two nodes by a link if and only if their distance is in a given range, e.g. smaller than a certain neighborhood radius, r.

Isomap is a nonlinear dimensionality reduction method. It is one of several widely used low-dimensional embedding methods. Isomap is used for computing a quasi-isometric, low-dimensional embedding of a set of high-dimensional data points. The algorithm provides a simple method for estimating the intrinsic geometry of a data manifold based on a rough estimate of each data point’s neighbors on the manifold. Isomap is highly efficient and generally applicable to a broad range of data sources and dimensionalities.

Different definitions have been given for the dimension of a complex network or graph. For example, metric dimension is defined in terms of the resolving set for a graph. Dimension has also been defined based on the box covering method applied to graphs. Here we describe the definition based on the complex network zeta function. This generalises the definition based on the scaling property of the volume with distance. The best definition depends on the application.

<span class="mw-page-title-main">Network science</span> Academic field

Network science is an academic field which studies complex networks such as telecommunication networks, computer networks, biological networks, cognitive and semantic networks, and social networks, considering distinct elements or actors represented by nodes and the connections between the elements or actors as links. The field draws on theories and methods including graph theory from mathematics, statistical mechanics from physics, data mining and information visualization from computer science, inferential modeling from statistics, and social structure from sociology. The United States National Research Council defines network science as "the study of network representations of physical, biological, and social phenomena leading to predictive models of these phenomena."

<span class="mw-page-title-main">Closeness centrality</span>

In a connected graph, closeness centrality of a node is a measure of centrality in a network, calculated as the reciprocal of the sum of the length of the shortest paths between the node and all other nodes in the graph. Thus, the more central a node is, the closer it is to all other nodes.

<span class="mw-page-title-main">Google matrix</span> Stochastic matrix representing links between entities

A Google matrix is a particular stochastic matrix that is used by Google's PageRank algorithm. The matrix represents a graph with edges representing links between pages. The PageRank of each page can then be generated iteratively from the Google matrix using the power method. However, in order for the power method to converge, the matrix must be stochastic, irreducible and aperiodic.

<span class="mw-page-title-main">Katz centrality</span>

In graph theory, the Katz centrality of a node is a measure of centrality in a network. It was introduced by Leo Katz in 1953 and is used to measure the relative degree of influence of an actor within a social network. Unlike typical centrality measures which consider only the shortest path between a pair of actors, Katz centrality measures influence by taking into account the total number of walks between a pair of actors.

<span class="mw-page-title-main">Betweenness centrality</span> Measure of a graphs centrality, based on shortest paths

In graph theory, betweenness centrality is a measure of centrality in a graph based on shortest paths. For every pair of vertices in a connected graph, there exists at least one shortest path between the vertices such that either the number of edges that the path passes through or the sum of the weights of the edges is minimized. The betweenness centrality for each vertex is the number of these shortest paths that pass through the vertex.

Random walk closeness centrality is a measure of centrality in a network, which describes the average speed with which randomly walking processes reach a node from other nodes of the network. It is similar to the closeness centrality except that the farness is measured by the expected length of a random walk rather than by the shortest path.

The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) is a multi-criteria decision analysis method, which was originally developed by Ching-Lai Hwang and Yoon in 1981 with further developments by Yoon in 1987, and Hwang, Lai and Liu in 1993. TOPSIS is based on the concept that the chosen alternative should have the shortest geometric distance from the positive ideal solution (PIS) and the longest geometric distance from the negative ideal solution (NIS). A dedicated book in the fuzzy context was published in 2021

Strategic network formation defines how and why networks take particular forms. In many networks, the relation between nodes is determined by the choice of the participating players involved, not by an arbitrary rule. A "strategic2 modeling of network requires defining a network’s costs and benefits and predicts how individual preferences become outcomes.

Weighted correlation network analysis, also known as weighted gene co-expression network analysis (WGCNA), is a widely used data mining method especially for studying biological networks based on pairwise correlations between variables. While it can be applied to most high-dimensional data sets, it has been most widely used in genomic applications. It allows one to define modules (clusters), intramodular hubs, and network nodes with regard to module membership, to study the relationships between co-expression modules, and to compare the network topology of different networks. WGCNA can be used as a data reduction technique, as a clustering method, as a feature selection method, as a framework for integrating complementary (genomic) data, and as a data exploratory technique. Although WGCNA incorporates traditional data exploratory techniques, its intuitive network language and analysis framework transcend any standard analysis technique. Since it uses network methodology and is well suited for integrating complementary genomic data sets, it can be interpreted as systems biologic or systems genetic data analysis method. By selecting intramodular hubs in consensus modules, WGCNA also gives rise to network based meta analysis techniques.

Network Science based basketball analytics comprise a various recent attempts to apply the perspective of networks to the analysis of basketball.

<span class="mw-page-title-main">Multidimensional network</span> Networks with multiple kinds of relations

In network theory, multidimensional networks, a special type of multilayer network, are networks with multiple kinds of relations. Increasingly sophisticated attempts to model real-world systems as multidimensional networks have yielded valuable insight in the fields of social network analysis, economics, urban and international transport, ecology, psychology, medicine, biology, commerce, climatology, physics, computational neuroscience, operations management, and finance.

<span class="mw-page-title-main">Biased random walk on a graph</span> Structural analysis of a network

In network science, a biased random walk on a graph is a time path process in which an evolving variable jumps from its current state to one of various potential new states; unlike in a pure random walk, the probabilities of the potential new states are unequal.

References

  1. 1 2 3 4 5 Latora, Vito; Marchiori, Massimo (17 October 2001). "Efficient Behavior of Small-World Networks". Phys. Rev. Lett. 87 (19): 198701. arXiv: cond-mat/0101396 . Bibcode:2001PhRvL..87s8701L. doi:10.1103/PhysRevLett.87.198701. PMID   11690461. S2CID   15457305.
  2. 1 2 3 4 5 6 Latora, Vito; Marchiori, Massimo (March 2003). "Economic small-world behavior in weighted networks". The European Physical Journal B. 32 (2): 249–263. arXiv: cond-mat/0204089 . Bibcode:2003EPJB...32..249L. doi:10.1140/epjb/e2003-00095-5. S2CID   15430987.
  3. 1 2 Bertagnolli, Giulia; Gallotti, Riccardo; De Domenico, Manlio (June 2021). "Quantifying efficient information exchange in real network flows". Communications Physics. 4 (125): 125. arXiv: 2003.11374 . Bibcode:2021CmPhy...4..125B. doi:10.1038/s42005-021-00612-5. S2CID   214641335.
  4. Rubinov, Mikail; Sporns, Olaf (2010). "Complex network measures of brain connectivity: uses and interpretations". NeuroImage. 52 (3): 1059–1069. doi:10.1016/j.neuroimage.2009.10.003. PMID   19819337. S2CID   1245121.
  5. 1 2 Bullmore, Ed; Sporns, Olaf (March 2009). "Complex brain networks graph theoretical analysis of structural and functional systems". Nature Reviews Neuroscience. 10 (3): 186–198. doi:10.1038/nrn2575. PMID   19190637. S2CID   205504722.
  6. 1 2 Bocaletti, S.; Latora, V.; Moreno, Y.; Chavez, M.; Hwang, D.-U. (February 2006). "Complex networks: Structure and dynamics". Physics Reports. 424 (4–5): 175–308. Bibcode:2006PhR...424..175B. CiteSeerX   10.1.1.408.2061 . doi:10.1016/j.physrep.2005.10.009.
  7. Buhl, J.; Gautrais, J.; Solé, R.V.; Kuntz, P.; Valverde, S.; Deneubourg, J.L.; Theraulaz, G. (November 2002). "Efficiency and robustness in ant networks of galleries". The European Physical Journal B. 42 (1): 123–129. Bibcode:2004EPJB...42..123B. doi:10.1140/epjb/e2004-00364-9. S2CID   14975826.