A small-world network is a graph characterized by a high clustering coefficient and low distances. In a social network, for example, high clustering means that two friends of one person are likely to be friends themselves. Low distances, on the other hand, mean that there is a short chain of social connections between any two people (this effect is known as six degrees of separation). [1] Specifically, a small-world network is defined to be a network where the typical distance L between two randomly chosen nodes (the number of steps required) grows proportionally to the logarithm of the number of nodes N in the network, that is: [2]
$$L \propto \log N$$
while the global clustering coefficient is not small.
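The two quantities in this definition can be measured directly on a small graph. The following is a minimal sketch, not a reference implementation: it assumes an undirected, connected graph stored as a dictionary of neighbour sets, and the helper names (`ring_lattice`, `global_clustering`, `average_path_length`) are illustrative choices rather than anything taken from the cited sources.

```python
from collections import deque
from itertools import combinations


def ring_lattice(n, k):
    """Ring of n nodes, each joined to its k nearest neighbours (k even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            u = (v + step) % n
            adj[v].add(u)
            adj[u].add(v)
    return adj


def global_clustering(adj):
    """Fraction of connected triples (paths of length two) that are closed."""
    closed = triples = 0
    for v, nbrs in adj.items():
        for a, b in combinations(nbrs, 2):
            triples += 1          # a - v - b is a connected triple centred on v
            if b in adj[a]:
                closed += 1       # the triple is closed by the edge a - b
    return closed / triples if triples else 0.0


def average_path_length(adj):
    """Mean shortest-path distance over all node pairs (graph assumed connected)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:              # breadth-first search from source
            v = queue.popleft()
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


lattice = ring_lattice(20, 4)
print(global_clustering(lattice))    # high clustering
print(average_path_length(lattice))  # but distances grow linearly with n
```

A regular ring lattice like this one has high clustering but long paths, so it satisfies only one of the two conditions; a small-world network keeps the clustering while bringing the path length down to the logarithmic regime.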
In the context of a social network, this results in the small world phenomenon of strangers being linked by a short chain of acquaintances. Many empirical graphs show the small-world effect, including social networks, wikis such as Wikipedia, gene networks, and even the underlying architecture of the Internet. It is the inspiration for many network-on-chip architectures in contemporary computer hardware. [3]
A certain category of small-world networks was identified as a class of random graphs by Duncan Watts and Steven Strogatz in 1998. [4] They noted that graphs could be classified according to two independent structural features, namely the clustering coefficient and the average node-to-node distance (also known as the average shortest path length). Purely random graphs, built according to the Erdős–Rényi (ER) model, exhibit a small average shortest path length (varying typically as the logarithm of the number of nodes) along with a small clustering coefficient. Watts and Strogatz found that many real-world networks in fact have a small average shortest path length, but also a clustering coefficient significantly higher than expected by random chance. They then proposed a novel graph model, now known as the Watts–Strogatz model, with (i) a small average shortest path length, and (ii) a large clustering coefficient. The crossover in the Watts–Strogatz model between a "large world" (such as a lattice) and a small world was first described by Barthelemy and Amaral in 1999. [5] This work was followed by many studies, including exact results (Barrat and Weigt, 1999; Dorogovtsev and Mendes; Barmpoutis and Murray, 2010).
Small-world networks tend to contain cliques and near-cliques, meaning sub-networks which have connections between almost any two nodes within them. This follows from the defining property of a high clustering coefficient. In addition, most pairs of nodes will be connected by at least one short path. This follows from the defining property that the mean shortest path length be small. Several other properties are often associated with small-world networks. Typically there is an over-abundance of hubs: nodes in the network with a high number of connections (known as high-degree nodes). These hubs serve as the common connections mediating the short path lengths between other nodes. By analogy, the small-world network of airline flights has a small mean path length (i.e. between any two cities you are likely to have to take three or fewer flights) because many flights are routed through hub cities. This property is often analyzed by considering the fraction of nodes in the network that have a particular number of connections going into them (the degree distribution of the network). Networks with a greater than expected number of hubs will have a greater fraction of nodes with high degree, and consequently the degree distribution will be enriched at high degree values. This is known colloquially as a fat-tailed distribution. Graphs of very different topology qualify as small-world networks as long as they satisfy the two definitional requirements above.
Network small-worldness has been quantified by a small-world coefficient, σ, calculated by comparing the clustering and path length of a given network to those of an Erdős–Rényi random graph with the same average degree. [6] [7]
Another method for quantifying network small-worldness uses the original definition of the small-world network, comparing the clustering of a given network to that of an equivalent lattice network and its path length to that of an equivalent random network. The small-world measure (ω) is defined as [8]
$$\omega = \frac{L_r}{L} - \frac{C}{C_\ell}$$
where the characteristic path length L and clustering coefficient C are calculated from the network being tested, Cℓ is the clustering coefficient of an equivalent lattice network, and Lr is the characteristic path length of an equivalent random network.
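Still another method for quantifying small-worldness normalizes both the network's clustering and path length relative to these characteristics in equivalent lattice and random networks. The Small World Index (SWI) is defined as [9]
$$\mathrm{SWI} = \frac{L - L_\ell}{L_r - L_\ell} \times \frac{C - C_r}{C_\ell - C_r}$$
where Lℓ is the characteristic path length of the equivalent lattice network and Cr is the clustering coefficient of the equivalent random network.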
Both ω′ and SWI range between 0 and 1, and have been shown to capture aspects of small-worldness. However, they adopt slightly different conceptions of ideal small-worldness. For a given set of constraints (e.g. size, density, degree distribution), there exists a network for which ω′ = 1, and thus ω′ aims to capture the extent to which a network with given constraints is as small-worldly as possible. In contrast, there may not exist a network for which SWI = 1, and thus SWI aims to capture the extent to which a network with given constraints approaches the theoretical small-world ideal of a network where C ≈ Cℓ and L ≈ Lr. [9]
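The three measures reduce to simple arithmetic once the reference values are known. The sketch below is illustrative only: the σ formula used, (C/Cr)/(L/Lr), is the commonly quoted form of the small-world coefficient and is stated here as an assumption, the function and argument names are hypothetical, and the numbers in the usage lines are made up for demonstration rather than measured from any real network.

```python
def sigma(C, L, C_r, L_r):
    """Small-world coefficient: clustering ratio over path-length ratio,
    both taken relative to an equivalent random graph (assumed form)."""
    return (C / C_r) / (L / L_r)


def omega(C, L, C_l, L_r):
    """Small-world measure: L_r / L - C / C_l, comparing path length to a
    random graph and clustering to a lattice."""
    return L_r / L - C / C_l


def swi(C, L, C_l, L_l, C_r, L_r):
    """Small World Index: product of the normalized path length and the
    normalized clustering, each scaled between lattice and random values."""
    return ((L - L_l) / (L_r - L_l)) * ((C - C_r) / (C_l - C_r))


# Made-up example values: a network that is nearly as clustered as a lattice
# (C_l = 0.5) and nearly as short-pathed as a random graph (L_r = 2.5).
C, L = 0.4, 3.0
C_l, L_l = 0.5, 10.0
C_r, L_r = 0.05, 2.5

print(sigma(C, L, C_r, L_r))          # well above 1
print(omega(C, L, C_l, L_r))          # close to 0
print(swi(C, L, C_l, L_l, C_r, L_r))  # between 0 and 1
```

In the idealized case C = Cℓ and L = Lr, `swi` returns exactly 1 and `omega` returns exactly 0, which matches the "theoretical small-world ideal" described above.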
Small-world properties are found in many real-world phenomena, including websites with navigation menus, food webs, electric power grids, metabolite processing networks, networks of brain neurons, voter networks, telephone call graphs, and airport networks. [10] Cultural networks [11] and word co-occurrence networks [12] have also been shown to be small-world networks.
Networks of connected proteins have small world properties such as power-law obeying degree distributions. [13] Similarly transcriptional networks, in which the nodes are genes, and they are linked if one gene has an up or down-regulatory genetic influence on the other, have small world network properties. [14]
In another example, the famous theory of "six degrees of separation" between people tacitly presumes that the domain of discourse is the set of people alive at any one time. The number of degrees of separation between Albert Einstein and Alexander the Great is almost certainly greater than 30 [15] and this network does not have small-world properties. A similarly constrained network would be the "went to school with" network: if two people went to the same college ten years apart from one another, it is unlikely that they have acquaintances in common amongst the student body.
Similarly, the number of relay stations through which a message must pass was not always small. In the days when the post was carried by hand or on horseback, the number of times a letter changed hands between its source and destination would have been much greater than it is today. The number of times a message changed hands in the days of the visual telegraph (circa 1800–1850) was determined by the requirement that two stations be connected by line-of-sight.
Tacit assumptions, if not examined, can cause a bias in the literature on graphs in favor of finding small-world networks (an example of the file drawer effect resulting from the publication bias).
It is hypothesized by some researchers, such as Albert-László Barabási, that the prevalence of small world networks in biological systems may reflect an evolutionary advantage of such an architecture. One possibility is that small-world networks are more robust to perturbations than other network architectures. If this were the case, it would provide an advantage to biological systems that are subject to damage by mutation or viral infection.
In a small-world network with a degree distribution following a power law, deletion of a random node rarely causes a dramatic increase in mean shortest path length (or a dramatic decrease in the clustering coefficient). This follows from the fact that most shortest paths between nodes flow through hubs, and if a peripheral node is deleted it is unlikely to interfere with passage between other peripheral nodes. As the fraction of peripheral nodes in a small-world network is much higher than the fraction of hubs, the probability of deleting an important node is very low. For example, if the small airport in Sun Valley, Idaho were shut down, it would not increase the average number of flights that other passengers traveling in the United States would have to take to arrive at their respective destinations. However, if random deletion of a node hits a hub by chance, the average path length can increase dramatically. This can be observed annually when northern hub airports, such as Chicago's O'Hare airport, are shut down because of snow; many people have to take additional flights.
By contrast, in a random network, in which all nodes have roughly the same number of connections, deleting a random node is likely to increase the mean-shortest path length slightly but significantly for almost any node deleted. In this sense, random networks are vulnerable to random perturbations, whereas small-world networks are robust. However, small-world networks are vulnerable to targeted attack of hubs, whereas random networks cannot be targeted for catastrophic failure.
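The asymmetry between losing a peripheral node and losing a hub can be seen on a deliberately tiny toy graph. The sketch below assumes a wheel graph (one hub joined to every node of a ring), which is a hypothetical stand-in chosen for simplicity and not a power-law network; the helper names are illustrative.

```python
from collections import deque


def wheel(rim_size):
    """One hub connected to every node of a ring of rim_size nodes."""
    adj = {"hub": set()}
    for i in range(rim_size):
        adj[i] = set()
    for i in range(rim_size):
        j = (i + 1) % rim_size
        adj[i].add(j)
        adj[j].add(i)
        adj[i].add("hub")
        adj["hub"].add(i)
    return adj


def without(adj, node):
    """Copy of the graph with one node and its edges deleted."""
    return {v: nbrs - {node} for v, nbrs in adj.items() if v != node}


def average_path_length(adj):
    """Mean shortest-path distance over all node pairs (graph assumed connected)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


g = wheel(20)
baseline = average_path_length(g)
after_peripheral = average_path_length(without(g, 5))
after_hub = average_path_length(without(g, "hub"))
print(baseline, after_peripheral, after_hub)
```

Deleting a rim node leaves the average path length essentially unchanged, because every remaining pair can still route through the hub; deleting the hub leaves only the ring, and the average distance nearly triples.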
The main mechanism to construct small-world networks is the Watts–Strogatz mechanism.
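A minimal sketch of the usual textbook description of this mechanism follows: start from a ring lattice in which each of n nodes is joined to its k nearest neighbours, then rewire each lattice edge with probability p to a uniformly chosen new endpoint, avoiding self-loops and duplicate edges. The function name, parameters and seeding are illustrative assumptions, not the authors' code.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Ring lattice on n nodes (k nearest neighbours, k even) with each
    lattice edge rewired with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}

    # Step 1: regular ring lattice.
    for v in range(n):
        for step in range(1, k // 2 + 1):
            u = (v + step) % n
            adj[v].add(u)
            adj[u].add(v)

    # Step 2: visit each lattice edge (v, v + step) once and maybe rewire it.
    for step in range(1, k // 2 + 1):
        for v in range(n):
            u = (v + step) % n
            if u in adj[v] and rng.random() < p:
                # New endpoints must not create a self-loop or a duplicate edge.
                candidates = [w for w in range(n) if w != v and w not in adj[v]]
                if candidates:
                    w = rng.choice(candidates)
                    adj[v].discard(u)
                    adj[u].discard(v)
                    adj[v].add(w)
                    adj[w].add(v)
    return adj


lattice = watts_strogatz(20, 4, 0.0)               # p = 0: unchanged ring lattice
small_world = watts_strogatz(20, 4, 0.1, seed=1)   # a few random shortcuts
```

Each rewiring removes one edge and adds one, so the total number of edges stays at nk/2 for any p; only the placement of the edges changes. With p = 0 the result is the regular lattice, and with p = 1 it approaches a random graph.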
Small-world networks can also be introduced with time-delay, [16] which will not only produce fractals but also chaos [17] under the right conditions, or transition to chaos in dynamics networks. [18]
Soon after the publication of the Watts–Strogatz mechanism, approaches were developed by Mashaghi and co-workers to generate network models that exhibit high degree correlations while preserving the desired degree distribution and small-world properties. These approaches are based on edge-dual transformation and can be used to generate analytically solvable small-world network models for research into these systems. [19]
Degree–diameter graphs are constructed such that the number of neighbors each vertex in the network has is bounded, while the distance from any given vertex in the network to any other vertex (the diameter of the network) is minimized. Constructing such small-world networks is done as part of the effort to find graphs of order close to the Moore bound.
Another way to construct a small world network from scratch is given in Barmpoutis et al., [20] where a network with very small average distance and very large average clustering is constructed. A fast algorithm of constant complexity is given, along with measurements of the robustness of the resulting graphs. Depending on the application of each network, one can start with one such "ultra small-world" network, and then rewire some edges, or use several small such networks as subgraphs to a larger graph.
Small-world properties can arise naturally in social networks and other real-world systems via the process of dual-phase evolution. This is particularly common where time or spatial constraints limit the addition of connections between vertices. The mechanism generally involves periodic shifts between phases, with connections being added during a "global" phase and being reinforced or removed during a "local" phase.
Small-world networks can change from the scale-free class to the broad-scale class, whose connectivity distribution has a sharp cutoff following a power-law regime, due to constraints limiting the addition of new links. [21] For strong enough constraints, scale-free networks can even become single-scale networks, whose connectivity distribution is characterized as fast decaying. [21] It was also shown analytically that scale-free networks are ultra-small, meaning that the distance scales according to L ∝ log log N. [22]
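To get a feel for how much slower doubly logarithmic growth is than the logarithmic growth of an ordinary small-world network, the two scaling functions can be compared numerically. This is only a sketch of the growth rates: the proportionality constants are unknown and are set to 1 here as an assumption, so the printed values are not predicted distances, and the function names are illustrative.

```python
import math


def small_world_scale(n):
    """Growth rate log N (prefactor assumed to be 1)."""
    return math.log(n)


def ultra_small_scale(n):
    """Growth rate log log N (prefactor assumed to be 1)."""
    return math.log(math.log(n))


for n in (10**3, 10**6, 10**9):
    print(n, round(small_world_scale(n), 2), round(ultra_small_scale(n), 2))
```

Going from a thousand to a billion nodes triples log N, while log log N grows by only about 1.1.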
The advantages of small-world networking for social movement groups are its resistance to change, due to the filtering apparatus of using highly connected nodes, and its greater effectiveness in relaying information while keeping the number of links required to connect a network to a minimum. [23]
The small world network model is directly applicable to affinity group theory represented in sociological arguments by William Finnegan. Affinity groups are social movement groups that are small and semi-independent pledged to a larger goal or function. Though largely unaffiliated at the node level, a few members of high connectivity function as connectivity nodes, linking the different groups through networking. This small world model has proven an extremely effective protest organization tactic against police action. [24] Clay Shirky argues that the larger the social network created through small world networking, the more valuable the nodes of high connectivity within the network. [23] The same can be said for the affinity group model, where the few people within each group connected to outside groups allowed for a large amount of mobilization and adaptation. A practical example of this is small world networking through affinity groups that William Finnegan outlines in reference to the 1999 Seattle WTO protests.
Many networks studied in geology and geophysics have been shown to have characteristics of small-world networks. Networks defined in fracture systems and porous substances have demonstrated these characteristics. [25] The seismic network in the Southern California region may be a small-world network. [26] The examples above occur on very different spatial scales, demonstrating the scale invariance of the phenomenon in the earth sciences.
Small-world networks have been used to estimate the usability of information stored in large databases. The measure is termed the Small World Data Transformation Measure. [27] [28] The more closely the database links align to a small-world network, the more likely a user is to be able to extract information in the future. This usability typically comes at the cost of the amount of information that can be stored in the same repository.
The Freenet peer-to-peer network has been shown to form a small-world network in simulation, [29] allowing information to be stored and retrieved in a manner that scales efficiently as the network grows.
Nearest-neighbor search solutions such as HNSW use small-world networks to efficiently find information in large item corpora. [30] [31]
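The idea behind such graph-based search can be illustrated with a single-layer greedy walk: starting from an entry node, repeatedly move to whichever neighbour is closest to the query, and stop when no neighbour improves on the current node. The sketch below is a simplified illustration under stated assumptions, not the HNSW algorithm itself: items are points on a line, the graph is a chain with optional long-range shortcuts, and all names are hypothetical.

```python
def build_graph(n, shortcut_every=None):
    """Chain 0 - 1 - ... - (n - 1), optionally with long-range links
    joining every shortcut_every-th node to the next such node."""
    adj = {i: set() for i in range(n)}
    for i in range(n - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    if shortcut_every:
        for i in range(0, n - shortcut_every, shortcut_every):
            j = i + shortcut_every
            adj[i].add(j)
            adj[j].add(i)
    return adj


def greedy_search(adj, position, query, entry):
    """Walk to the neighbour nearest the query until no neighbour is closer.
    Returns the final node and the number of hops taken."""
    current, hops = entry, 0
    while True:
        best = min(adj[current], key=lambda v: abs(position[v] - query))
        if abs(position[best] - query) >= abs(position[current] - query):
            return current, hops
        current, hops = best, hops + 1


position = [float(i) for i in range(100)]   # item i sits at coordinate i
query = 57.3

print(greedy_search(build_graph(100), position, query, entry=0))
print(greedy_search(build_graph(100, shortcut_every=10), position, query, entry=0))
```

Both searches end at the item nearest the query, but the graph with shortcuts gets there in far fewer hops, which is the small-world effect the search structure exploits. On general graphs a purely greedy walk can stop at a local minimum, which is one reason practical systems keep several candidates rather than one.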
Both anatomical connections in the brain [32] and the synchronization networks of cortical neurons [33] exhibit small-world topology.
Structural and functional connectivity in the brain has also been found to reflect the small-world topology of short path length and high clustering. [34] The network structure has been found in the mammalian cortex across species as well as in large-scale imaging studies in humans. [35] Advances in connectomics and network neuroscience have found the small-worldness of neural networks to be associated with efficient communication. [36]
In neural networks, short path length between nodes and high clustering at network hubs support efficient communication between brain regions at the lowest energetic cost. [36] The brain is constantly processing and adapting to new information, and the small-world network model supports the intense communication demands of neural networks. [37] High clustering of nodes forms local networks which are often functionally related. Short path length between these hubs supports efficient global communication. [38] This balance enables the efficiency of the global network while simultaneously equipping the brain to handle disruptions and maintain homeostasis, due to local subsystems being isolated from the global network. [39] Loss of small-world network structure has been found to indicate changes in cognition and increased risk of psychological disorders. [9]
In addition to characterizing whole-brain functional and structural connectivity, specific neural systems, such as the visual system, exhibit small-world network properties. [6]
A small-world network of neurons can exhibit short-term memory. A computer model developed by Sara Solla [40] [41] had two stable states, a property (called bistability) thought to be important in memory storage. An activating pulse generated self-sustaining loops of communication activity among the neurons. A second pulse ended this activity. The pulses switched the system between stable states: flow (recording a "memory"), and stasis (holding it). Small world neuronal networks have also been used as models to understand seizures. [42]
In statistical physics and mathematics, percolation theory describes the behavior of a network when nodes or links are added. This is a geometric type of phase transition, since at a critical fraction of addition the network of small, disconnected clusters merges into significantly larger connected, so-called spanning clusters. The applications of percolation theory to materials science and in many other disciplines are discussed here and in the articles Network theory and Percolation.
A scale-free network is a network whose degree distribution follows a power law, at least asymptotically. That is, the fraction P(k) of nodes in the network having k connections to other nodes goes for large values of k as
$$P(k) \sim k^{-\gamma}$$
where γ is the exponent of the power law.
In graph theory, a clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. Evidence suggests that in most real-world networks, and in particular social networks, nodes tend to create tightly knit groups characterised by a relatively high density of ties; this likelihood tends to be greater than the average probability of a tie randomly established between two nodes.
In the context of network theory, a complex network is a graph (network) with non-trivial topological features—features that do not occur in simple networks such as lattices or random graphs but often occur in networks representing real systems. The study of complex networks is a young and active area of scientific research inspired largely by empirical findings of real-world networks such as computer networks, biological networks, technological networks, brain networks, climate networks and social networks.
In the study of graphs and networks, the degree of a node in a network is the number of connections it has to other nodes and the degree distribution is the probability distribution of these degrees over the whole network.
The Barabási–Albert (BA) model is an algorithm for generating random scale-free networks using a preferential attachment mechanism. Several natural and human-made systems, including the Internet, the World Wide Web, citation networks, and some social networks are thought to be approximately scale-free and certainly contain few nodes with unusually high degree as compared to the other nodes of the network. The BA model tries to explain the existence of such nodes in real networks. The algorithm is named for its inventors Albert-László Barabási and Réka Albert.
In the study of complex networks, a network is said to have community structure if the nodes of the network can be easily grouped into sets of nodes such that each set of nodes is densely connected internally. In the particular case of non-overlapping community finding, this implies that the network divides naturally into groups of nodes with dense connections internally and sparser connections between groups. But overlapping communities are also allowed. The more general definition is based on the principle that pairs of nodes are more likely to be connected if they are both members of the same community(ies), and less likely to be connected if they do not share communities. A related but different problem is community search, where the goal is to find a community that a certain vertex belongs to.
The Watts–Strogatz model is a random graph generation model that produces graphs with small-world properties, including short average path lengths and high clustering. It was proposed by Duncan J. Watts and Steven Strogatz in their article published in 1998 in the scientific journal Nature. The model also became known as the (Watts) beta model after Watts used β to formulate it in his popular science book Six Degrees.
Fractal analysis is useful in the study of complex networks, present in both natural and artificial systems such as computer systems, brain and social networks, allowing further development of the field in network science.
In the mathematical field of graph theory, the Erdős–Rényi model refers to one of two closely related models for generating random graphs or the evolution of a random network. These models are named after Hungarian mathematicians Paul Erdős and Alfréd Rényi, who introduced one of the models in 1959. Edgar Gilbert introduced the other model contemporaneously with and independently of Erdős and Rényi. In the model of Erdős and Rényi, all graphs on a fixed vertex set with a fixed number of edges are equally likely. In the model introduced by Gilbert, also called the Erdős–Rényi–Gilbert model, each edge has a fixed probability of being present or absent, independently of the other edges. These models can be used in the probabilistic method to prove the existence of graphs satisfying various properties, or to provide a rigorous definition of what it means for a property to hold for almost all graphs.
In graph theory, a random geometric graph (RGG) is the mathematically simplest spatial network, namely an undirected graph constructed by randomly placing N nodes in some metric space and connecting two nodes by a link if and only if their distance is in a given range, e.g. smaller than a certain neighborhood radius, r.
Network science is an academic field which studies complex networks such as telecommunication networks, computer networks, biological networks, cognitive and semantic networks, and social networks, considering distinct elements or actors represented by nodes and the connections between the elements or actors as links. The field draws on theories and methods including graph theory from mathematics, statistical mechanics from physics, data mining and information visualization from computer science, inferential modeling from statistics, and social structure from sociology. The United States National Research Council defines network science as "the study of network representations of physical, biological, and social phenomena leading to predictive models of these phenomena."
Modularity is a measure of the structure of networks or graphs which measures the strength of division of a network into modules. Networks with high modularity have dense connections between the nodes within modules but sparse connections between nodes in different modules. Modularity is often used in optimization methods for detecting community structure in networks. Biological networks, including animal brains, exhibit a high degree of modularity. However, modularity maximization is not statistically consistent, and finds communities in its own null model, i.e. fully random graphs, and therefore it cannot be used to find statistically significant community structures in empirical networks. Furthermore, it has been shown that modularity suffers a resolution limit and, therefore, it is unable to detect small communities.
Evolving networks are networks that change as a function of time. They are a natural extension of network science since almost all real world networks evolve over time, either by adding or removing nodes or links over time. Often all of these processes occur simultaneously, such as in social networks where people make and lose friends over time, thereby creating and destroying edges, and some people become part of new social networks or leave their networks, changing the nodes in the network. Evolving network concepts build on established network theory and are now being introduced into studying networks in many diverse fields.
In social network analysis, the co-stardom network represents the collaboration graph of film actors i.e. movie stars. The co-stardom network can be represented by an undirected graph of nodes and links. Nodes correspond to the movie star actors and two nodes are linked if they co-starred (performed) in the same movie. The links are un-directed, and can be weighted or not depending on the goals of study. If the number of times two actors appeared in a movie is needed, links are assigned weights. The co-stardom network can also be represented by a bipartite graph where nodes are of two types: actors and movies. And edges connect different types of nodes if they have a relationship. Initially the network was found to have a small-world property. Afterwards, it was discovered that it exhibits a scale-free (power-law) behavior.
Hierarchical network models are iterative algorithms for creating networks which are able to reproduce the unique properties of the scale-free topology and the high clustering of the nodes at the same time. These characteristics are widely observed in nature, from biology to language to some social networks.
Disparity filter is a network reduction algorithm to extract the backbone structure of undirected weighted network. Many real world networks such as citation networks, food web, airport networks display heavy tailed statistical distribution of nodes' weight and strength. Disparity filter can sufficiently reduce the network without destroying the multi-scale nature of the network. The algorithm is developed by M. Angeles Serrano, Marian Boguna and Alessandro Vespignani.
In network science, the efficiency of a network is a measure of how efficiently it exchanges information and it is also called communication efficiency. The underlying idea is that the more distant two nodes are in the network, the less efficient their communication will be. The concept of efficiency can be applied to both local and global scales in a network. On a global scale, efficiency quantifies the exchange of information across the whole network where information is concurrently exchanged. The local efficiency quantifies a network's resistance to failure on a small scale. That is the local efficiency of a node characterizes how well information is exchanged by its neighbors when it is removed.
Robustness, the ability to withstand failures and perturbations, is a critical attribute of many complex systems including complex networks.
In network science, the configuration model is a method for generating random networks from a given degree sequence. It is widely used as a reference model for real-life social networks, because it allows the modeler to incorporate arbitrary degree distributions.