Rich-club coefficient

Last updated

The rich-club coefficient is a metric on graphs and networks, designed to measure the extent to which well-connected nodes also connect to each other. Networks which have a relatively high rich-club coefficient are said to demonstrate the rich-club effect and will have many connections between nodes of high degree. The rich-club coefficient was first introduced in 2004 in a paper studying Internet topology. [1] [2]

Contents

The "Rich-club" effect has been measured and noted on scientific collaboration networks and air transportation networks. It has been shown to be significantly lacking on protein interaction networks.

Definition

Non-normalized form

The rich-club coefficient was first introduced as an unscaled metric parametrized by node degree ranks. [1] More recently, this has been updated to be parameterized in terms of node degrees k , indicating a degree cut-off. The rich-club coefficient for a given network N is then defined as:

 

 

 

 

(1)

[3] [4] [5]

where is the number of edges between the nodes of degree greater than or equal to k, and is the number of nodes with degree greater than or equal to k. This measures how many edges are present between nodes of degree at least k, normalized by how many edges there could be between these nodes in a complete graph. When this value is close to 1 for values of k close to , it is interpreted that high degree nodes of the network are well connected. The associated subgraph of nodes with degree at least k is also called the "Rich Club" graph.

Normalized for topology randomization

A criticism of the above metric is that it does not necessarily imply the existence of the rich-club effect, as it is monotonically increasing even for random networks. In certain degree distributions, it is not possible to avoid connecting high degree hubs. To account for this, it is necessary to compare the above metric to the same metric on a degree distribution preserving randomized version of the network. This updated metric is defined as:

 

 

 

 

(2)

where is the rich-club metric on a maximally randomized network with the same degree distribution of the network under study. This new ratio discounts unavoidable structural correlations that are a result of the degree distribution, giving a better indicator of the significance of the rich-club effect.

For this metric, if for certain values of k we have , this denotes the presence of the rich-club effect.

Generalizations

General richness properties

The natural definition of a node's "richness" is its number of neighbours. If instead we replace this with a generic richness metric on nodes r, then we can rewrite the unscaled Rich-Club coefficient as:

 

 

 

 

(3)

Where we are instead considering the sub graph on only nodes with a richness measure of at least r. For example, on scientific collaboration networks, replacing the degree richness (number of coauthors) with a strength richness (number of published papers), the topology of the rich club graph changes dramatically.

Assortativity

The Assortativity of a network is a measurement of how connected similar nodes are, where similarity is typically viewed in terms of node degree. Rich-club can be viewed as a more specific notation of assortativity, where we are only concerned with the connectivity of nodes beyond a certain richness metric. For example, if a network consisted of a collection of hub and spokes, where the hubs were well connected, such a network would be considered disassortative. However, due to the strong connectedness of the hubs in the network, the network would demonstrate the rich-club effect.

An example of a network which is both disassortative and demonstrates the Rich Club effect. The red nodes are hubs and form the "Rich Club." Disassortative network demonstrating the Rich Club effect.png
An example of a network which is both disassortative and demonstrates the Rich Club effect. The red nodes are hubs and form the "Rich Club."

Applications

The rich-club coefficient of a network is useful as a heuristic measurement of the robustness of a network. A high rich-club coefficient implies that the hubs are well connected, and global connectivity is resilient to any one hub being removed. It is also useful for verifying theories that generalize to other networks. For example, the consistent observation of high rich-club coefficients for scientific collaboration networks adds evidence to the theory that within social groups, the elite tend to associate with one another.

Implementations

The rich-club coefficient has been implemented in NetworkX, a Python library for network analysis. This implementation includes both the non-normalized and normalized forms as described above.

See also

Related Research Articles

<span class="mw-page-title-main">Scale-free network</span> Network whose degree distribution follows a power law

A scale-free network is a network whose degree distribution follows a power law, at least asymptotically. That is, the fraction P(k) of nodes in the network having k connections to other nodes goes for large values of k as

<span class="mw-page-title-main">Small-world network</span> Graph where most nodes are reachable in a small number of steps

A small-world network is a mathematical graph in which most nodes are not neighbors of one another, but the neighbors of any given node are likely to be neighbors of each other. Due to this, most neighboring nodes can be reached from every other node by a small number of hops or steps. Specifically, a small-world network is defined to be a network where the typical distance L between two randomly chosen nodes grows proportionally to the logarithm of the number of nodes N in the network, that is:

In graph theory, a clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. Evidence suggests that in most real-world networks, and in particular social networks, nodes tend to create tightly knit groups characterised by a relatively high density of ties; this likelihood tends to be greater than the average probability of a tie randomly established between two nodes.

<span class="mw-page-title-main">Centrality</span> Degree of connectedness within a graph

In graph theory and network analysis, indicators of centrality assign numbers or rankings to nodes within a graph corresponding to their network position. Applications include identifying the most influential person(s) in a social network, key infrastructure nodes in the Internet or urban networks, super-spreaders of disease, and brain networks. Centrality concepts were first developed in social network analysis, and many of the terms used to measure centrality reflect their sociological origin.

<span class="mw-page-title-main">Complex network</span> Network with non-trivial topological features

In the context of network theory, a complex network is a graph (network) with non-trivial topological features—features that do not occur in simple networks such as lattices or random graphs but often occur in networks representing real systems. The study of complex networks is a young and active area of scientific research inspired largely by empirical findings of real-world networks such as computer networks, biological networks, technological networks, brain networks, climate networks and social networks.

<span class="mw-page-title-main">Degree distribution</span>

In the study of graphs and networks, the degree of a node in a network is the number of connections it has to other nodes and the degree distribution is the probability distribution of these degrees over the whole network.

<span class="mw-page-title-main">Barabási–Albert model</span>

The Barabási–Albert (BA) model is an algorithm for generating random scale-free networks using a preferential attachment mechanism. Several natural and human-made systems, including the Internet, the World Wide Web, citation networks, and some social networks are thought to be approximately scale-free and certainly contain few nodes with unusually high degree as compared to the other nodes of the network. The BA model tries to explain the existence of such nodes in real networks. The algorithm is named for its inventors Albert-László Barabási and Réka Albert.

<span class="mw-page-title-main">Community structure</span> Concept in graph theory

In the study of complex networks, a network is said to have community structure if the nodes of the network can be easily grouped into sets of nodes such that each set of nodes is densely connected internally. In the particular case of non-overlapping community finding, this implies that the network divides naturally into groups of nodes with dense connections internally and sparser connections between groups. But overlapping communities are also allowed. The more general definition is based on the principle that pairs of nodes are more likely to be connected if they are both members of the same community(ies), and less likely to be connected if they do not share communities. A related but different problem is community search, where the goal is to find a community that a certain vertex belongs to.

<span class="mw-page-title-main">Watts–Strogatz model</span> Method of generating random small-world graphs

The Watts–Strogatz model is a random graph generation model that produces graphs with small-world properties, including short average path lengths and high clustering. It was proposed by Duncan J. Watts and Steven Strogatz in their article published in 1998 in the Nature scientific journal. The model also became known as the (Watts) beta model after Watts used to formulate it in his popular science book Six Degrees.

Mixing patterns refer to systematic tendencies of one type of nodes in a network to connect to another type. For instance, nodes might tend to link to others that are very similar or very different. This feature is common in many social networks, although it also appears sometimes in non-social networks. Mixing patterns are closely related to assortativity; however, for the purposes of this article, the term is used to refer to assortative or disassortative mixing based on real-world factors, either topological or sociological.

<span class="mw-page-title-main">Assortativity</span> Tendency for similar nodes to be connected

Assortativity, or assortative mixing is a preference for a network's nodes to attach to others that are similar in some way. Though the specific measure of similarity may vary, network theorists often examine assortativity in terms of a node's degree. The addition of this characteristic to network models more closely approximates the behaviors of many real world networks.

<span class="mw-page-title-main">Random geometric graph</span> In graph theory, the mathematically simplest spatial network

In graph theory, a random geometric graph (RGG) is the mathematically simplest spatial network, namely an undirected graph constructed by randomly placing N nodes in some metric space and connecting two nodes by a link if and only if their distance is in a given range, e.g. smaller than a certain neighborhood radius, r.

<span class="mw-page-title-main">Network science</span> Academic field

Network science is an academic field which studies complex networks such as telecommunication networks, computer networks, biological networks, cognitive and semantic networks, and social networks, considering distinct elements or actors represented by nodes and the connections between the elements or actors as links. The field draws on theories and methods including graph theory from mathematics, statistical mechanics from physics, data mining and information visualization from computer science, inferential modeling from statistics, and social structure from sociology. The United States National Research Council defines network science as "the study of network representations of physical, biological, and social phenomena leading to predictive models of these phenomena."

<span class="mw-page-title-main">Modularity (networks)</span> Measure of network community structure

Modularity is a measure of the structure of networks or graphs which measures the strength of division of a network into modules. Networks with high modularity have dense connections between the nodes within modules but sparse connections between nodes in different modules. Modularity is often used in optimization methods for detecting community structure in networks. However, it has been shown that modularity suffers a resolution limit and, therefore, it is unable to detect small communities. Biological networks, including animal brains, exhibit a high degree of modularity.

<span class="mw-page-title-main">Evolving network</span>

Evolving networks are networks that change as a function of time. They are a natural extension of network science since almost all real world networks evolve over time, either by adding or removing nodes or links over time. Often all of these processes occur simultaneously, such as in social networks where people make and lose friends over time, thereby creating and destroying edges, and some people become part of new social networks or leave their networks, changing the nodes in the network. Evolving network concepts build on established network theory and are now being introduced into studying networks in many diverse fields.

<span class="mw-page-title-main">Hierarchical network model</span>

Hierarchical network models are iterative algorithms for creating networks which are able to reproduce the unique properties of the scale-free topology and the high clustering of the nodes at the same time. These characteristics are widely observed in nature, from biology to language to some social networks.

<span class="mw-page-title-main">Structural cut-off</span>

The structural cut-off is a concept in network science which imposes a degree cut-off in the degree distribution of a finite size network due to structural limitations. Networks with vertices with degree higher than the structural cut-off will display structural disassortativity.

<span class="mw-page-title-main">Disparity filter algorithm of weighted network</span>

Disparity filter is a network reduction algorithm to extract the backbone structure of undirected weighted network. Many real world networks such as citation networks, food web, airport networks display heavy tailed statistical distribution of nodes' weight and strength. Disparity filter can sufficiently reduce the network without destroying the multi-scale nature of the network. The algorithm is developed by M. Angeles Serrano, Marian Boguna and Alessandro Vespignani.

<span class="mw-page-title-main">Hub (network science)</span> Node with a number of links that greatly exceeds the average

In network science, a hub is a node with a number of links that greatly exceeds the average. Emergence of hubs is a consequence of a scale-free property of networks. While hubs cannot be observed in a random network, they are expected to emerge in scale-free networks. The uprise of hubs in scale-free networks is associated with power-law distribution. Hubs have a significant impact on the network topology. Hubs can be found in many real networks, such as the brain or the Internet.

In network science, the network entropy is a disorder measure derived from information theory to describe the level of randomness and the amount of information encoded in a graph. It is a relevant metric to quantitatively characterize real complex networks and can also be used to quantify network complexity

References

  1. 1 2 Zhou, Shi & Mondragón, Raúl J. (2004). "The Rich-Club Phenomenon In The Internet Topology". IEEE Communications Letters. 8 (3): 180–182. arXiv: cs/0308036 . doi:10.1109/lcomm.2004.823426. S2CID   7007263.
  2. Mattia Gasparini, Javier Luis Canovas Izquierdo, Robert Clariso, Marco Brambilla, Jordi Cabot: Analyzing Rich-Club Behavior in Open Source Projects. OpenSym 2019 proceedings
  3. Colizza, V. and Flammini, A. and Serrano, M. A. and Vespignani, A. (2006). "Detecting rich-club ordering in complex networks". Nature Physics. 2. 2 (2): 110–115. arXiv: physics/0602134 . Bibcode:2006NatPh...2..110C. doi:10.1038/nphys209. S2CID   2418153.{{cite journal}}: CS1 maint: multiple names: authors list (link)
  4. McAuley, Julian J. and da Fontoura Costa, Luciano and Caetano, Tibério S. (2007). "Rich-club phenomenon across complex network hierarchies". Applied Physics Letters. 91 (8): 084103. arXiv: physics/0701290 . Bibcode:2007ApPhL..91h4103M. doi:10.1063/1.2773951. S2CID   16544534.{{cite journal}}: CS1 maint: multiple names: authors list (link)
  5. Opsahl, Tore; Colizza, Vittoria; Panzarasa, Pietro; Ramasco, José J. (2008). "Prominence and Control: The Weighted Rich-Club Effect". Physical Review Letters. 101 (16): 168702. arXiv: 0804.0417 . Bibcode:2008PhRvL.101p8702O. doi:10.1103/physrevlett.101.168702. PMID   18999722. S2CID   29349737.