Sparse network

Last updated January 05, 2024

In network science, a sparse network has far fewer links than the maximum possible number of links within that network (the opposite is a dense network). The study of sparse networks is a relatively new area, primarily stimulated by the study of real networks such as social and computer networks.^[1]

The notion of far fewer links is, of course, colloquial and informal. While a threshold for a particular network may be invented, there is no universal threshold that defines what far fewer actually means. As a result, there is no formal sense of sparsity for any finite network, despite widespread agreement that most empirical networks are indeed sparse. There is, however, a formal sense of sparsity in the case of infinite network models, determined by the behavior of the number of edges (M) and/or the average degree (⟨k⟩) as the number of nodes (N) goes to infinity.^[2]

Definitions

A simple unweighted network of size $N$ is called sparse if the number of links $M$ in it is much smaller than the maximum possible number of links $M_{max}$ :^[1]

$M\ll M_{max}={N \choose 2}$ .

In any given (real) network, the number of nodes N and the number of links M are just two numbers, so the meaning of the much smaller sign ( $\ll$ above) is purely colloquial and informal, and so are statements like "many real networks are sparse."
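For a single finite network, the ratio $M/M_{max}$ is still easy to compute, even though no universal cutoff turns it into a formal verdict. The following is a minimal sketch; the function name `link_density` and the example sizes (1000 nodes, 3000 links) are illustrative assumptions, not values from any cited source.

```python
from math import comb


def link_density(num_nodes: int, num_links: int) -> float:
    """Return M / M_max for a simple, unweighted, undirected network."""
    if num_nodes < 2:
        raise ValueError("a network needs at least two nodes to have links")
    max_links = comb(num_nodes, 2)  # M_max = N choose 2
    if not 0 <= num_links <= max_links:
        raise ValueError("number of links must lie between 0 and M_max")
    return num_links / max_links


# Hypothetical network: N = 1000 nodes, M = 3000 links.
density = link_density(1000, 3000)
print(f"M / M_max = {density:.6f}")
```

Whether the resulting number counts as "much smaller than 1" remains a judgment call for a finite network, which is the point made above.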

However, if we deal with a synthetic graph sequence $G_{N}$ , or a network model that is well defined for networks $G_{N}$ of any size N = 1,2,..., $\infty$ , then the $\ll$ attains its usual formal meaning:

$M\ll M_{max}\iff M=o(M_{max})\iff \lim _{N\rightarrow \infty }{\frac {M}{M_{max}}}=0$ .
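The link ratio can also be rewritten in terms of the average degree. Writing $\langle k\rangle =2M/N$ for the average degree and using $M_{max}={N \choose 2}=N(N-1)/2$ from above, a short worked step (added here as a sketch, not taken from the cited sources) gives

$\frac{M}{M_{max}}=\frac{2M}{N(N-1)}=\frac{\langle k\rangle }{N-1}$ ,

so that $M/M_{max}\rightarrow 0$ exactly when $\langle k\rangle /N\rightarrow 0$ , that is, when $\langle k\rangle =o(N)$ .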

In other words, a network sequence or model $G_{N}$ is called dense or sparse depending on whether the (expected) average degree $\langle k\rangle =2M/N$ in $G_{N}$ scales linearly or sublinearly with N:^[2]^[3]

$G_{N}$ is dense if $\langle k\rangle =\Theta (N)$ ;

$G_{N}$ is sparse if $\langle k\rangle =o(N)$ .

An important subclass of sparse networks consists of networks whose average degree is either constant or converges to a constant. Some authors call only such networks sparse, while others reserve special names for them:^[4]

$G_{N}$ is truly sparse or extremely sparse or ultrasparse if $\langle k\rangle =O(1)$ .
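To see how these scaling classes look numerically, the sketch below tabulates the average degree $\langle k\rangle =2M/N$ for three simple graph families as N grows. The choice of families is an illustrative assumption, and the star graph is taken here to have N nodes in total (one hub and N − 1 leaves); other conventions for $S_{N}$ exist.

```python
def average_degree(num_nodes: int, num_links: int) -> float:
    """Average degree <k> = 2M / N of a simple undirected network."""
    return 2 * num_links / num_nodes


# Link counts M(N) for three illustrative families (assumptions for this
# sketch). The ring formula is valid for N >= 3.
FAMILIES = {
    "complete": lambda n: n * (n - 1) // 2,  # every pair of nodes is linked
    "star": lambda n: n - 1,                 # one hub linked to N - 1 leaves
    "ring": lambda n: n,                     # each node linked to 2 neighbors
}

for n in (10, 100, 1000, 10000):
    for name, links in FAMILIES.items():
        k = average_degree(n, links(n))
        print(f"N={n:>5}  {name:<8}  <k>={k:10.3f}  <k>/N={k / n:.4f}")
```

In this sketch the complete graph keeps $\langle k\rangle /N$ close to 1, matching the dense case, while the ring and the star keep $\langle k\rangle$ bounded by 2, matching the $\langle k\rangle =O(1)$ case.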

There also exist alternative, stricter definitions of network sparsity requiring the convergence of the degree distribution in $G_{N}$ to a well-defined limit as $N\rightarrow \infty$ .^[5] According to this definition, the N-star graph $S_{N}$ , for example, is not sparse.

Node degree distribution

The node degree distribution changes with increasing connectivity. Networks with different link densities have different node degree distributions, as an analysis of the Flickr network suggests.^[6] According to this analysis, sparsely connected networks have a scale-free, power-law degree distribution, and with increasing connectivity the networks diverge increasingly from a power law. One of the main factors influencing network connectivity is node similarity. For instance, in social networks, people are likely to be linked to each other if they share a common social background, interests, tastes, beliefs, etc. In the context of biological networks, proteins or other molecules are linked if their complex surfaces fit exactly or are complementary.^[6]

Common terminology

If the links of a network are not weighted, the structure of the network can be represented by an adjacency matrix. If most elements of the matrix are zero, the matrix is referred to as a sparse matrix. In contrast, if most of the elements are nonzero, the matrix is dense. The sparsity of the matrix is measured by the ratio of the number of zero elements to the total number of elements in the matrix. Similarly, in the context of graph theory, if the number of links is close to its maximum, the graph is known as a dense graph. If the number of links is much lower than the maximum number of links, the graph is referred to as a sparse graph.^[7]
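The zero-element fraction described above can be computed directly from an adjacency matrix. The sketch below uses plain Python lists; the helper names `adjacency_matrix` and `zero_fraction` and the five-node path network are hypothetical, chosen only for illustration.

```python
def adjacency_matrix(num_nodes, links):
    """Build a symmetric 0/1 adjacency matrix for an unweighted, undirected network."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in links:
        matrix[u][v] = 1
        matrix[v][u] = 1
    return matrix


def zero_fraction(matrix):
    """Fraction of matrix elements that are zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(row.count(0) for row in matrix)
    return zeros / total


# Hypothetical example: a path 0 - 1 - 2 - 3 - 4 on five nodes.
links = [(0, 1), (1, 2), (2, 3), (3, 4)]
matrix = adjacency_matrix(5, links)
sparsity = zero_fraction(matrix)
print(f"zero fraction = {sparsity:.2f}")  # 17 of the 25 entries are zero
```

Note that the diagonal is counted among the zero elements here, so this fraction differs slightly from one minus the link ratio $M/M_{max}$ .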

Applications

Sparse networks can be found among social, computer, and biological networks, as well as among transportation, power-line, and citation networks. Since most real networks are large and sparse, several models have been developed to understand and analyze them.^[8] These networks have inspired sparse network-on-chip designs in multiprocessor embedded computer engineering.

Sparse networks also allow cheaper computations, because it is efficient to store the network as an adjacency list rather than an adjacency matrix. For example, with an adjacency list, iterating over a node's neighbors takes O(M/N) time on average, whereas it takes O(N) time with an adjacency matrix.^[2]
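The difference between the two storage schemes can be sketched as follows. This is a minimal illustration, assuming nodes are labeled 0 to N − 1; the function names and the small example network are hypothetical.

```python
def build_adjacency_list(num_nodes, links):
    """Store, for each node, only the nodes it is linked to."""
    neighbors = [[] for _ in range(num_nodes)]
    for u, v in links:
        neighbors[u].append(v)
        neighbors[v].append(u)
    return neighbors


def build_adjacency_matrix(num_nodes, links):
    """Store a full N x N table of 0/1 entries."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in links:
        matrix[u][v] = 1
        matrix[v][u] = 1
    return matrix


def neighbors_from_matrix(matrix, node):
    """Scan the whole row: N entries are inspected regardless of the degree."""
    return [other for other, linked in enumerate(matrix[node]) if linked]


# Hypothetical sparse network: six nodes, four links.
links = [(0, 1), (1, 2), (3, 4), (4, 5)]
adj_list = build_adjacency_list(6, links)
adj_matrix = build_adjacency_matrix(6, links)

# Adjacency list: only the entries for node 1's actual neighbors are visited.
print(sorted(adj_list[1]))
# Adjacency matrix: all six entries of row 1 are checked to find the same nodes.
print(neighbors_from_matrix(adj_matrix, 1))
```

With the list, the work per node is proportional to its degree, which averages $2M/N$ over all nodes; with the matrix it is always N.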

References

[1] Barabási, Albert-László (2015). Network Science. Cambridge University Press. Retrieved 25 May 2015.
[2] Newman, Mark. Networks (2nd ed.). Retrieved 14 Feb 2021.
[3] Bollobás, Béla (1985). Random Graphs. Academic Press.
[4] Janson, Svante (2018). "On Edge Exchangeable Random Graphs". J Stat Phys. 173 (3–4): 448–484. arXiv:1702.06396. Bibcode:2018JSP...173..448J. doi:10.1007/s10955-017-1832-9. PMC 6405020. PMID 30930480.
[5] van der Hofstad, Remco (2017). Random Graphs and Complex Networks. Cambridge University Press. doi:10.1017/9781316779422. ISBN 9781316779422.
[6] Scholz, Matthias (7 January 2015). "Node similarity as a basic principle behind connectivity in complex networks". Journal of Data Mining and Digital Humanities. 2015 (77). arXiv:1010.0803. doi:10.46298/jdmdh.33. S2CID 221799. Retrieved 25 May 2015.
[7] Nykamp, Duane Q. "An introduction to networks". Math Insight. Retrieved 25 May 2015.
[8] Gribonval, Rémi. "Sparse Models, Algorithms and Learning for Large-scale data". SMALL. Retrieved 25 May 2015.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
