Simon model

In applied probability theory, the Simon model is a class of stochastic models that result in a power-law distribution function. It was proposed by Herbert A. Simon [1] to account for the wide range of empirical distributions that follow a power law. It models the dynamics of a system of elements with associated counters (e.g., words and their frequencies in texts, or nodes in a network and their connectivity). In this model, the dynamics of the system are based on constant growth through the addition of new elements (new instances of words) and on incrementing the counters (new occurrences of a word) at a rate proportional to their current values.

Description

To model network growth of the type described above, Bornholdt and Ebel [2] considered a network with $n$ nodes, where node $i$ has connectivity $k_i$, $i = 1, \ldots, n$. These nodes form classes $[k]$ of $f(k)$ nodes with identical connectivity $k$. The following steps are repeated:

(i) With probability $\alpha$, add a new node and attach a link to it from an arbitrarily chosen node.

(ii) With probability $1 - \alpha$, add one link from an arbitrary node to a node $j$ of class $[k]$, where the class is chosen with a probability proportional to $k\,f(k)$.

For this stochastic process, Simon found a stationary solution exhibiting power-law scaling, $P(k) \propto k^{-\gamma}$, with exponent $\gamma = 1 + \frac{1}{1 - \alpha}$.
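The two growth steps above can be sketched as a short simulation. This is a minimal illustration rather than the procedure used by Simon or by Bornholdt and Ebel: the function names `simulate_simon` and `class_sizes`, the symbol `alpha` for the probability of adding a new node, and the initial condition (a single node with connectivity 1) are all assumptions made for the example. Connectivity is taken to mean the number of links a node has received.

```python
import random
from collections import Counter


def simulate_simon(steps, alpha, seed=None):
    """Run `steps` iterations of the growth process and return the
    connectivity of every node, indexed by node id.

    alpha is the probability of step (i), adding a new node.
    Assumption: the process starts from one node with connectivity 1.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    rng = random.Random(seed)

    # Each node id appears in this list once per link it has received,
    # so a uniform draw from the list picks a node with probability
    # proportional to its connectivity k.
    link_targets = [0]
    n_nodes = 1

    for _ in range(steps):
        if rng.random() < alpha:
            # Step (i): add a new node that receives one link.
            link_targets.append(n_nodes)
            n_nodes += 1
        else:
            # Step (ii): add a link to an existing node, chosen
            # proportionally to its current connectivity.
            link_targets.append(rng.choice(link_targets))

    received = Counter(link_targets)
    return [received[i] for i in range(n_nodes)]


def class_sizes(connectivities):
    """Return f(k), the number of nodes with connectivity k."""
    return Counter(connectivities)


if __name__ == "__main__":
    ks = simulate_simon(steps=100_000, alpha=0.1, seed=42)
    f = class_sizes(ks)
    for k in sorted(f)[:10]:
        print(k, f[k])
```

Storing one list entry per link keeps each step constant-time, at the cost of memory proportional to the number of links. Plotting the class sizes against `k` on logarithmic axes is one way to inspect the tail of the distribution for large runs.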

Properties

(i) The Barabási–Albert (BA) model can be mapped to the subclass of Simon's model with node-addition probability $\alpha = 1/2$, using the simpler probability $P(\text{new link to } i) \propto k_i$ for a node being connected to another node $i$ with connectivity $k_i$ (the same preferential attachment rule as in the BA model). In other words, the Simon model describes a general class of stochastic processes that can result in a scale-free network and is appropriate for capturing Pareto's and Zipf's laws.

(ii) The only free parameter of the model, $\alpha$, reflects the relative growth of the number of nodes versus the number of links. In general, $\alpha$ takes small values; the scaling exponent can therefore be predicted to be $\gamma \approx 2$. For instance, Bornholdt and Ebel [2] studied the linking dynamics of the World Wide Web and predicted the scaling exponent to be $\gamma \approx 2.1$, which was consistent with observation.

(iii) The interest in scale-free models comes from their ability to describe the topology of complex networks. The Simon model, however, does not have an underlying network structure, as it was designed to describe events whose frequency follows a power law. Thus, network measures that go beyond the degree distribution, such as the average path length, spectral properties, and the clustering coefficient, cannot be obtained from this mapping.
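The exponent values mentioned in properties (i) and (ii) can be checked with a few lines of arithmetic. The sketch below assumes the stationary-solution exponent $\gamma = 1 + 1/(1 - \alpha)$ reported by Bornholdt and Ebel [2], with $\alpha$ denoting the probability of adding a new node; the function name `simon_exponent` and the sample values of `alpha` are illustrative choices, not values taken from the source.

```python
def simon_exponent(alpha):
    """Power-law exponent gamma = 1 + 1 / (1 - alpha).

    Assumes 0 <= alpha < 1, where alpha is the probability that a
    step adds a new node rather than a link to an existing node.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    return 1.0 + 1.0 / (1.0 - alpha)


if __name__ == "__main__":
    # alpha = 0.5 corresponds to the BA subclass; the smaller values
    # illustrate the small-alpha regime, where gamma approaches 2.
    for alpha in (0.5, 0.1, 0.01):
        print(alpha, simon_exponent(alpha))
```

With $\alpha = 1/2$ the formula gives $\gamma = 3$, the exponent associated with the BA model, and as $\alpha$ decreases toward zero the exponent approaches 2 from above.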

The Simon model is related to generalized scale-free models with growth and preferential attachment properties. For further details, see [3][4].

References

  1. Simon, Herbert A. (1955). "On a Class of Skew Distribution Functions". Biometrika. Oxford University Press (OUP). 42 (3–4): 425–440. doi:10.1093/biomet/42.3-4.425. ISSN 0006-3444.
  2. Bornholdt, Stefan; Ebel, Holger (2001-08-27). "World Wide Web scaling exponent from Simon's 1955 model". Physical Review E. American Physical Society (APS). 64 (3): 035104(R). arXiv:cond-mat/0008465. Bibcode:2001PhRvE..64c5104B. doi:10.1103/physreve.64.035104. ISSN 1063-651X. PMID 11580377. S2CID 2582211.
  3. Albert, Réka; Barabási, Albert-László (2002-01-30). "Statistical mechanics of complex networks". Reviews of Modern Physics. 74 (1): 47–97. arXiv:cond-mat/0106096. Bibcode:2002RvMP...74...47A. doi:10.1103/revmodphys.74.47. ISSN 0034-6861. S2CID 60545.
  4. Amaral, L. A. N.; Scala, A.; Barthelemy, M.; Stanley, H. E. (2000-09-26). "Classes of small-world networks". Proceedings of the National Academy of Sciences USA. Proceedings of the National Academy of Sciences. 97 (21): 11149–11152. arXiv:cond-mat/0001458. Bibcode:2000PNAS...9711149A. doi:10.1073/pnas.200327197. ISSN 0027-8424. PMC 17168. PMID 11005838.