In the study of graphs and networks, the degree of a node in a network is the number of connections it has to other nodes and the degree distribution is the probability distribution of these degrees over the whole network.
The degree of a node in a network (sometimes referred to incorrectly as the connectivity) is the number of connections or edges the node has to other nodes. If a network is directed, meaning that edges point in one direction from one node to another node, then nodes have two different degrees, the in-degree, which is the number of incoming edges, and the out-degree, which is the number of outgoing edges.
The degree distribution P(k) of a network is then defined to be the fraction of nodes in the network with degree k. Thus if there are $n$ nodes in total in a network and $n_k$ of them have degree $k$, we have

$$P(k) = \frac{n_k}{n}.$$
The same information is also sometimes presented as a cumulative degree distribution $C(k)$, the fraction of nodes with degree smaller than $k$, or as the complementary cumulative degree distribution $1 - C(k)$, the fraction of nodes with degree greater than or equal to $k$.
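As a concrete illustration, the definitions above map directly onto code. The following is a minimal sketch, assuming the networkx library; the graph and all variable names are arbitrary examples:

```python
# A minimal sketch: computing the degree distribution P(k) = n_k / n and the
# complementary cumulative distribution of an example graph with networkx.
from collections import Counter

import networkx as nx

G = nx.erdos_renyi_graph(n=1000, p=0.01, seed=42)  # arbitrary example graph

n = G.number_of_nodes()
degree_counts = Counter(d for _, d in G.degree())  # n_k for each degree k

# P(k) = n_k / n
P = {k: n_k / n for k, n_k in degree_counts.items()}

# Complementary cumulative distribution: fraction of nodes with degree >= k
ks = sorted(P)
ccdf = {k: sum(P[j] for j in ks if j >= k) for k in ks}
```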
The degree distribution is very important in studying both real networks, such as the Internet and social networks, and theoretical networks. The simplest network model, the Erdős–Rényi random graph, in which each possible edge between $n$ nodes is independently present (or not) with probability $p$ (or $1 - p$), has a binomial distribution of degrees $k$:

$$P(k) = \binom{n-1}{k} p^k (1 - p)^{\,n-1-k}$$
(or Poisson in the limit of large $n$, if the average degree $\langle k \rangle = p(n-1)$ is held fixed). Most networks in the real world, however, have degree distributions very different from this. Most are highly right-skewed, meaning that a large majority of nodes have low degree but a small number, known as "hubs", have high degree. Some networks, notably the Internet, the World Wide Web, and some social networks, were argued to have degree distributions that approximately follow a power law: $P(k) \sim k^{-\gamma}$, where $\gamma$ is a constant. Such networks are called scale-free networks and have attracted particular attention for their structural and dynamical properties. [1] [2] [3] [4] However, a survey of a wide range of real-world networks suggests that scale-free networks are rare when assessed using statistically rigorous measures. [5] Some researchers have disputed these findings, arguing that the definitions used in the study are inappropriately strict, [6] while others have argued that the precise functional form of the degree distribution is less important than knowing whether the degree distribution is fat-tailed or not. [7] The over-interpretation of specific forms of the degree distribution has also been criticised for failing to consider how networks may evolve over time. [8]
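The binomial/Poisson claim can be checked numerically. The following is a minimal sketch, assuming networkx, comparing the empirical degree distribution of an Erdős–Rényi graph with the Poisson limit $P(k) = e^{-c} c^k / k!$ where $c = p(n-1)$; the parameters are arbitrary:

```python
# A minimal sketch: empirical ER degree distribution vs. the Poisson limit.
import math
from collections import Counter

import networkx as nx

n, p = 2000, 0.005                       # arbitrary example parameters
G = nx.erdos_renyi_graph(n, p, seed=1)
c = (n - 1) * p                          # mean degree

counts = Counter(d for _, d in G.degree())
for k in range(15):
    empirical = counts.get(k, 0) / n
    poisson = math.exp(-c) * c**k / math.factorial(k)
    print(f"k={k:2d}  empirical={empirical:.4f}  poisson={poisson:.4f}")
```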
The excess degree distribution is the probability distribution, for a node reached by following an edge, of the number of other edges attached to that node. [9] In other words, it is the distribution of the outgoing links of a node reached by following a link.
Suppose a network has a degree distribution $P(k)$. If we select a node (randomly or not) and move to one of its neighbors (assuming it has at least one), then the probability that this neighbor has degree $k$ is not given by $P(k)$. The reason is that, in a heterogeneous network, following an edge from a node is more likely to lead to a hub, since high-degree nodes are attached to more edges. The probability that such a node has $k$ edges besides the one we arrived along is called the excess degree of that node. In the configuration model, in which correlations between nodes are ignored and every node is assumed to connect to any other node with equal probability, the excess degree distribution is: [9]

$$q(k) = \frac{(k+1)\, P(k+1)}{\langle k \rangle},$$
where $\langle k \rangle$ is the mean degree (average degree) of the model. It follows that the average degree of a neighbor of any node is greater than the average degree of that node. In social networks, this means that your friends, on average, have more friends than you do, a result known as the friendship paradox. It can be shown that a network has a giant component if its average excess degree is larger than one:

$$\sum_k k\, q(k) = \frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle} > 1 \quad\Longleftrightarrow\quad \langle k^2 \rangle - 2\langle k \rangle > 0.$$
Bear in mind that the last two equations hold only for the configuration model; to derive the excess degree distribution of a real-world network, we must also take degree correlations into account. [9]
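The two configuration-model formulas above translate into a few lines of code. The following is a minimal sketch; `P` is assumed to be a dictionary mapping each degree $k$ to $P(k)$, and the function names are hypothetical:

```python
# A minimal sketch: the excess degree distribution q(k) = (k+1) P(k+1) / <k>
# and the giant-component criterion <k^2> - 2<k> > 0 (configuration model).
def excess_degree_distribution(P):
    mean_k = sum(k * pk for k, pk in P.items())
    return {k: (k + 1) * P.get(k + 1, 0.0) / mean_k for k in range(max(P))}

def has_giant_component(P):
    mean_k = sum(k * pk for k, pk in P.items())
    mean_k2 = sum(k * k * pk for k, pk in P.items())
    return mean_k2 - 2 * mean_k > 0

P = {1: 0.3, 2: 0.4, 3: 0.3}  # arbitrary example distribution
print(excess_degree_distribution(P), has_giant_component(P))
```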
Generating functions can be used to calculate different properties of random networks. Given the degree distribution $P(k)$ and the excess degree distribution $q(k)$ of some network, it is possible to write two power series of the following forms:

$$G_0(x) = \sum_k P(k)\, x^k$$

and

$$G_1(x) = \sum_k q(k)\, x^k = \sum_k \frac{(k+1)\, P(k+1)}{\langle k \rangle}\, x^k.$$
$G_1(x)$ can also be obtained from derivatives of $G_0(x)$:

$$G_1(x) = \frac{G_0'(x)}{G_0'(1)}.$$
If we know the generating function for a probability distribution, then we can recover the values of $P(k)$ by differentiating:

$$P(k) = \frac{1}{k!} \left. \frac{d^k G_0}{d x^k} \right|_{x=0}.$$
Some properties, e.g. the moments, can be easily calculated from $G_0(x)$ and its derivatives:

$$\langle k \rangle = G_0'(1),$$

$$\langle k^2 \rangle = G_0''(1) + G_0'(1),$$

and in general: [9]

$$\langle k^n \rangle = \left[ \left( x \frac{d}{dx} \right)^{\!n} G_0(x) \right]_{x=1}.$$
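These identities can be verified symbolically. The following is a minimal sketch, assuming the sympy library and a small example distribution:

```python
# A minimal sketch (assuming sympy): recovering moments of a degree
# distribution from its generating function G0(x) = sum_k P(k) x^k.
import sympy as sp

x = sp.symbols("x")
P = {0: sp.Rational(1, 4), 1: sp.Rational(1, 2), 2: sp.Rational(1, 4)}

G0 = sum(pk * x**k for k, pk in P.items())

# <k> = G0'(1)
mean_k = sp.diff(G0, x).subs(x, 1)

# <k^n> = [(x d/dx)^n G0(x)] evaluated at x = 1
def moment(n):
    expr = G0
    for _ in range(n):
        expr = x * sp.diff(expr, x)
    return expr.subs(x, 1)

assert moment(1) == mean_k
print(mean_k, moment(2))  # <k> and <k^2>
```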
For Poisson-distributed random networks, such as the ER graph, $G_0(x) = G_1(x) = e^{\langle k \rangle (x-1)}$, which is the reason why the theory of random networks of this type is especially simple. The probability distributions for the 1st- and 2nd-nearest neighbors are generated by the functions $G_0(x)$ and $G_0(G_1(x))$. By extension, the distribution of $m$-th neighbors is generated by:
$$G_0\bigl(G_1(\cdots G_1(x)\cdots)\bigr),$$

with $m-1$ iterations of the function $G_1$ acting on itself. [10]
The average number of 1st neighbors, $c_1$, is $\langle k \rangle = G_0'(1)$, and the average number of 2nd neighbors is:

$$c_2 = \left[ \frac{d}{dx}\, G_0\bigl(G_1(x)\bigr) \right]_{x=1} = G_1'(1)\, G_0'(1) = G_0''(1).$$
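Since $c_1 = G_0'(1) = \langle k \rangle$ and $c_2 = G_0''(1) = \sum_k k(k-1)\,P(k)$, both counts can be read off a degree distribution directly. A minimal sketch (the function name is hypothetical):

```python
# A minimal sketch: mean numbers of 1st and 2nd neighbors, c1 = G0'(1) and
# c2 = G0''(1), computed from a degree distribution P (dict mapping k to P(k)).
def neighbor_counts(P):
    c1 = sum(k * pk for k, pk in P.items())            # G0'(1) = <k>
    c2 = sum(k * (k - 1) * pk for k, pk in P.items())  # G0''(1)
    return c1, c2

print(neighbor_counts({1: 0.5, 2: 0.25, 3: 0.25}))  # arbitrary example
```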
In a directed network, each node has an in-degree $j$ and an out-degree $k$, which are the numbers of links running into and out of that node, respectively. If $P(j, k)$ is the probability that a randomly chosen node has in-degree $j$ and out-degree $k$, then the generating function assigned to this joint probability distribution can be written with two variables $x$ and $y$ as:

$$\mathcal{G}(x, y) = \sum_{j,k} P(j, k)\, x^j y^k.$$
Since every link in a directed network must leave some node and enter another, the net average number of links entering a node is zero. Therefore,
$$\langle j - k \rangle = \sum_{j,k} (j - k)\, P(j, k) = 0,$$

which implies that the generating function must satisfy:

$$\left. \frac{\partial \mathcal{G}}{\partial x} \right|_{x,y=1} = \left. \frac{\partial \mathcal{G}}{\partial y} \right|_{x,y=1} = c,$$

where $c$ is the mean degree (both in and out) of the nodes in the network.
Using the function $\mathcal{G}(x, y)$, we can again find the generating functions for the in/out-degree distribution and in/out-excess degree distribution, as before. $G^{\text{in}}_0(x)$ can be defined as the generating function for the number of arriving links at a randomly chosen node, and $G^{\text{in}}_1(x)$ as that for the number of arriving links at a node reached by following a randomly chosen link. We can also define generating functions $G^{\text{out}}_0(y)$ and $G^{\text{out}}_1(y)$ for the numbers leaving such a node: [10]

$$G^{\text{in}}_0(x) = \mathcal{G}(x, 1), \qquad G^{\text{in}}_1(x) = \frac{1}{c} \left. \frac{\partial \mathcal{G}}{\partial y} \right|_{y=1},$$

$$G^{\text{out}}_0(y) = \mathcal{G}(1, y), \qquad G^{\text{out}}_1(y) = \frac{1}{c} \left. \frac{\partial \mathcal{G}}{\partial x} \right|_{x=1}.$$
Here, the average number of 1st neighbors, $c$, or as previously introduced, $c_1$, is $\left. \frac{\partial \mathcal{G}}{\partial x} \right|_{x,y=1}$, and the average number of 2nd neighbors reachable from a randomly chosen node is given by $c_2 = \left. \frac{\partial^2 \mathcal{G}}{\partial x\, \partial y} \right|_{x,y=1}$. These are also the numbers of 1st and 2nd neighbors from which a random node can be reached, since these equations are manifestly symmetric in $x$ and $y$. [10]
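The directed-network relations above can likewise be checked symbolically. The following is a minimal sketch, assuming sympy and a toy joint distribution chosen so that $\langle j \rangle = \langle k \rangle$:

```python
# A minimal sketch (assuming sympy): the joint generating function
# G(x, y) = sum_{j,k} P(j,k) x^j y^k, the consistency check
# dG/dx = dG/dy = c at x = y = 1, and c2 = d^2 G / (dx dy) at x = y = 1.
import sympy as sp

x, y = sp.symbols("x y")

# Toy joint distribution P(j, k); it must satisfy <j> = <k>.
P = {(0, 1): sp.Rational(1, 4), (1, 0): sp.Rational(1, 4),
     (1, 1): sp.Rational(1, 4), (2, 2): sp.Rational(1, 4)}

G = sum(pjk * x**j * y**k for (j, k), pjk in P.items())

c_in = sp.diff(G, x).subs({x: 1, y: 1})   # <j>
c_out = sp.diff(G, y).subs({x: 1, y: 1})  # <k>
assert c_in == c_out                      # mean in-degree equals mean out-degree

c2 = sp.diff(G, x, y).subs({x: 1, y: 1})  # average number of 2nd neighbors
print(c_in, c2)
```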
In a signed network, each node has a positive degree $k^+$ and a negative degree $k^-$, which are the numbers of positive links and negative links connected to that node, respectively. Thus $P(k^-)$ and $P(k^+)$ denote the negative degree distribution and the positive degree distribution of the signed network. [11] [12]
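For completeness, a minimal sketch of computing the two distributions, assuming networkx with edge signs stored as a ±1 'sign' attribute (a convention assumed here, not a fixed networkx API):

```python
# A minimal sketch: positive- and negative-degree distributions of a signed
# network, with edge signs stored as a +1/-1 'sign' edge attribute.
from collections import Counter

import networkx as nx

G = nx.Graph()  # arbitrary example signed network
G.add_edge("a", "b", sign=+1)
G.add_edge("b", "c", sign=-1)
G.add_edge("a", "c", sign=+1)

n = G.number_of_nodes()
pos = Counter(sum(1 for _, _, s in G.edges(v, data="sign") if s > 0) for v in G)
neg = Counter(sum(1 for _, _, s in G.edges(v, data="sign") if s < 0) for v in G)

P_plus = {k: c / n for k, c in pos.items()}   # positive degree distribution
P_minus = {k: c / n for k, c in neg.items()}  # negative degree distribution
```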