Entropy of network ensembles

A set of networks that satisfies given structural characteristics can be treated as a network ensemble. [1] Introduced by Ginestra Bianconi in 2007, the entropy of a network ensemble measures the level of order or uncertainty of such an ensemble. [2]

The entropy of a network ensemble is the logarithm of the number of graphs in the ensemble. [3] Entropy can also be defined for a single network: the basin entropy of a Boolean network is the logarithm of the number of its attractors. [4]
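
As a small illustration of the basin-entropy definition, the sketch below exhaustively enumerates the state space of a toy Boolean network and takes the logarithm of its attractor count. The 3-node update rule is a hypothetical example chosen for illustration, not one taken from [4].

```python
import math
from itertools import product

def count_attractors(update, n):
    """Count attractors by iterating every one of the 2^n states to its cycle."""
    attractors = set()
    for state in product((0, 1), repeat=n):
        trajectory = []
        while state not in trajectory:
            trajectory.append(state)
            state = update(state)
        # `state` is the first repeated state, so the attractor is the segment
        # of the trajectory from its first occurrence onward; its smallest
        # state (lexicographically) serves as a canonical label for the cycle.
        cycle = trajectory[trajectory.index(state):]
        attractors.add(min(cycle))
    return len(attractors)

# Hypothetical rule: nodes 1 and 2 copy each other, node 3 keeps its state.
update = lambda s: (s[1], s[0], s[2])

n_attractors = count_attractors(update, 3)
print(n_attractors, math.log(n_attractors))  # 6 attractors, basin entropy log 6
```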

By employing approaches from statistical mechanics, the complexity, uncertainty, and randomness of networks can be described by network ensembles subject to different types of constraints. [5]

Gibbs and Shannon entropy

By analogy to statistical mechanics, microcanonical ensembles and canonical ensembles of networks are introduced. A partition function $Z$ of an ensemble can be defined as:

$$Z = \sum_{\{a_{ij}\}} \prod_{\mu} \delta\big(F_\mu(\{a_{ij}\}) - C_\mu\big) \exp\Big(\sum_{i<j} \big[h_{ij}\,\Theta(a_{ij}) + r_{ij}\,a_{ij}\big]\Big)$$

where $\vec{C} = \{C_\mu\}$ is the constraint, and $a_{ij}$ ($a_{ij} \geq 0$) are the elements in the adjacency matrix, with $a_{ij} > 0$ if and only if there is a link between node $i$ and node $j$. $\Theta(x)$ is a step function with $\Theta(x) = 1$ if $x > 0$, and $\Theta(x) = 0$ if $x = 0$. The auxiliary fields $h_{ij}$ and $r_{ij}$ have been introduced as an analogy to the bath in classical mechanics.

For simple undirected networks, the partition function can be simplified as [6]

$$Z = \sum_{\{a_{ij}\}} \prod_{\mu} \delta\big(F_\mu(\{a_{ij}\}) - C_\mu\big) \exp\Big(\sum_{i<j} \sum_{\alpha} h_{ij}(\alpha)\,\delta_{a_{ij},\alpha}\Big)$$

where $\delta_{a_{ij},\alpha}$ is the Kronecker delta, $\alpha$ is the index of the weight, and for a simple network $a_{ij} \in \{0, 1\}$.
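
As a concrete illustration, assume the only constraint is the total number of links, $F(\{a_{ij}\}) = \sum_{i<j} a_{ij} = L$ (an example chosen here, not one prescribed by the sources above). Setting the auxiliary fields to zero then reduces the partition function to a count of graphs:

$$Z\big|_{h=0} = \sum_{\{a_{ij}\}} \delta\Big(\sum_{i<j} a_{ij} - L\Big) = \binom{N(N-1)/2}{L},$$

the number of ways of placing $L$ links among the $N(N-1)/2$ node pairs.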

Below, microcanonical and canonical ensembles are illustrated with simple undirected networks.

For a microcanonical ensemble, the Gibbs entropy $\Sigma$ is defined by:

$$\Sigma = \frac{1}{N} \log \mathcal{N}$$

where $\mathcal{N}$ indicates the cardinality of the ensemble, i.e., the total number of networks in the ensemble that satisfy the constraints.
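
A minimal sketch of this definition, assuming the single constraint is the number of links $L$ (the $G(N, L)$ ensemble discussed below, whose cardinality is the binomial count given above; the function name is illustrative):

```python
import math

def gibbs_entropy(N, L):
    """Gibbs entropy per node of the microcanonical ensemble G(N, L)."""
    pairs = N * (N - 1) // 2
    cardinality = math.comb(pairs, L)  # number of networks in the ensemble
    return math.log(cardinality) / N   # Sigma = (1/N) log(cardinality)

print(gibbs_entropy(100, 250))
```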

The probability of having a link between nodes $i$ and $j$, with weight $\alpha$, is given by:

$$\pi_{ij}(\alpha) = \frac{\partial \log Z}{\partial h_{ij}(\alpha)}$$

For a canonical ensemble, the entropy is presented in the form of a Shannon entropy:

$$S = -\sum_{i<j} \sum_{\alpha} \pi_{ij}(\alpha) \log \pi_{ij}(\alpha)$$
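
A minimal sketch of this formula for simple networks, where each pair $(i, j)$ carries an independent link probability $p_{ij}$ and the weight index reduces to present/absent (the probability matrix below is illustrative):

```python
import math

def shannon_entropy(p):
    """Shannon entropy of a canonical ensemble of simple networks.

    Each link (i, j) is present independently with probability p[i][j];
    the two "weights" are present (p_ij) and absent (1 - p_ij).
    """
    S = 0.0
    n = len(p)
    for i in range(n):
        for j in range(i + 1, n):
            pij = p[i][j]
            if 0.0 < pij < 1.0:  # pairs with pij in {0, 1} contribute nothing
                S -= pij * math.log(pij) + (1 - pij) * math.log(1 - pij)
    return S

# Illustrative 3-node ensemble with uniform link probability 0.2
p = [[0.0, 0.2, 0.2], [0.2, 0.0, 0.2], [0.2, 0.2, 0.0]]
print(shannon_entropy(p))
```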

Relation between Gibbs and Shannon entropy

A network ensemble $G(N, L)$ with given numbers of nodes $N$ and links $L$, and its conjugate-canonical ensemble $G(N, p)$, are characterized as microcanonical and canonical ensembles, and they have the Gibbs entropy $\Sigma$ and the Shannon entropy $S$, respectively. The Gibbs entropy in the $G(N, L)$ ensemble is given by: [7]

$$N\Sigma = \log \binom{N(N-1)/2}{L}$$

For the $G(N, p)$ ensemble,

$$p = \frac{2L}{N(N-1)}.$$

Inserting $p$ into the Shannon entropy: [6]

$$S = -\frac{N(N-1)}{2}\big[\,p \log p + (1-p)\log(1-p)\,\big]$$

The relation $S \simeq N\Sigma$, which follows from Stirling's approximation of the binomial coefficient, indicates that the Gibbs entropy $\Sigma$ and the Shannon entropy per node $S/N$ of random graphs are equal in the thermodynamic limit $N \to \infty$.
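
A quick numerical check of this limit (a sketch under the assumption $L = p\,N(N-1)/2$, with illustrative function names): the two per-node entropies drift together as $N$ grows.

```python
import math

def sigma_and_s_per_node(N, p):
    """Gibbs entropy per node of G(N, L) and Shannon entropy per node of G(N, p)."""
    pairs = N * (N - 1) // 2
    L = round(p * pairs)                           # links matching density p
    sigma = math.log(math.comb(pairs, L)) / N      # microcanonical: (1/N) log(cardinality)
    q = L / pairs                                  # realized link probability
    s = -pairs * (q * math.log(q) + (1 - q) * math.log(1 - q))
    return sigma, s / N

for N in (10, 30, 100, 300):
    print(N, *sigma_and_s_per_node(N, 0.1))        # the two columns converge
```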

See also

Network entropy

References

  1. Levin, E.; Tishby, N.; Solla, S. A. (October 1990). "A statistical approach to learning and generalization in layered neural networks". Proceedings of the IEEE. 78 (10): 1568–1574. doi:10.1109/5.58339. ISSN 1558-2256. S2CID 5254307.
  2. Bianconi, Ginestra (2008). "The entropy of randomized network ensembles". EPL (Europhysics Letters). 81 (2): 28005. arXiv:0708.0153. Bibcode:2008EL.....8128005B. doi:10.1209/0295-5075/81/28005. ISSN 0295-5075. S2CID 17269886.
  3. Menichetti, Giulia; Remondini, Daniel (2014). "Entropy of a network ensemble: definitions and applications to genomic data". Theoretical Biology Forum. 107 (1–2): 77–87. ISSN 0035-6050. PMID 25936214.
  4. Krawitz, Peter; Shmulevich, Ilya (27 September 2007). "Entropy of complex relevant components of Boolean networks". Physical Review E. 76 (3): 036115. arXiv:0708.1538. Bibcode:2007PhRvE..76c6115K. doi:10.1103/PhysRevE.76.036115. PMID 17930314. S2CID 6192682.
  5. Bianconi, Ginestra (27 March 2009). "Entropy of network ensembles". Physical Review E. 79 (3): 036114. arXiv:0802.2888. Bibcode:2009PhRvE..79c6114B. doi:10.1103/PhysRevE.79.036114. PMID 19392025. S2CID 26082469.
  6. Anand, Kartik; Bianconi, Ginestra (13 October 2009). "Entropy measures for networks: Toward an information theory of complex topologies". Physical Review E. 80 (4): 045102. arXiv:0907.1514. Bibcode:2009PhRvE..80d5102A. doi:10.1103/PhysRevE.80.045102. PMID 19905379. S2CID 27419558.
  7. Bogacz, Leszek; Burda, Zdzisław; Wacław, Bartłomiej (1 July 2006). "Homogeneous complex networks". Physica A: Statistical Mechanics and Its Applications. 366: 587–607. arXiv:cond-mat/0502124. Bibcode:2006PhyA..366..587B. doi:10.1016/j.physa.2005.10.024. ISSN 0378-4371. S2CID 119428248.