# Trophic coherence

Trophic coherence is a property of directed graphs (or directed networks). [1] It is based on the concept of trophic levels, which is used mainly in ecology [2] but can be defined for directed networks in general, where it provides a measure of hierarchical structure among nodes. Trophic coherence is the tendency of nodes to fall into well-defined trophic levels. It has been related to several structural and dynamical properties of directed networks, including the prevalence of cycles [3] and network motifs, [4] ecological stability, [1] intervality, [5] and spreading processes like epidemics and neuronal avalanches. [6]

## Definition

Consider a directed network defined by the ${\displaystyle N\times N}$ adjacency matrix ${\displaystyle A=(a_{ij})}$. Each node ${\displaystyle i}$ can be assigned a trophic level ${\displaystyle s_{i}}$ according to

${\displaystyle s_{i}=1+{\frac {1}{k_{i}^{\text{in}}}}\sum _{j}a_{ij}s_{j},}$

where ${\displaystyle k_{i}^{\text{in}}=\sum _{j}a_{ij}}$ is ${\displaystyle i}$'s in-degree, and nodes with ${\displaystyle k_{i}^{\text{in}}=0}$ (basal nodes) have ${\displaystyle s_{i}=1}$ by convention. Each edge has an associated trophic difference (or trophic distance), defined as ${\displaystyle x_{ij}=s_{i}-s_{j}}$. The trophic coherence of the network is a measure of how tightly peaked the distribution of trophic distances, ${\displaystyle p(x)}$, is around its mean value, which is always ${\displaystyle \langle x\rangle =1}$. This can be captured by an incoherence parameter ${\displaystyle q}$, equal to the standard deviation of ${\displaystyle p(x)}$:

${\displaystyle q={\sqrt {{\frac {1}{L}}\sum _{ij}a_{ij}x_{ij}^{2}-1}},}$

where ${\displaystyle L=\sum _{ij}a_{ij}}$ is the number of edges in the network. [1]
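The two formulas above can be evaluated numerically. What follows is a minimal sketch rather than a reference implementation: it assumes NumPy, a dense 0/1 adjacency matrix in which `A[i, j] = 1` means that `j` is an in-neighbour of `i` (matching the definition of ${\displaystyle k_{i}^{\text{in}}}$), at least one basal node, and every node reachable from some basal node. The function names `trophic_levels` and `incoherence` are illustrative only. The defining equation is rewritten as the linear system ${\displaystyle (D-A)s=v}$, with ${\displaystyle v_{i}=\max(k_{i}^{\text{in}},1)}$ and ${\displaystyle D}$ the diagonal matrix built from ${\displaystyle v}$.

```python
import numpy as np

def trophic_levels(A):
    """Solve s_i = 1 + (1/k_i^in) * sum_j a_ij s_j, with s_i = 1 for basal nodes."""
    A = np.asarray(A, dtype=float)
    k_in = A.sum(axis=1)              # in-degree of node i is the i-th row sum
    if not np.any(k_in == 0):
        raise ValueError("at least one basal node (zero in-degree) is required")
    v = np.maximum(k_in, 1.0)         # basal rows reduce to s_i = 1
    return np.linalg.solve(np.diag(v) - A, v)

def incoherence(A):
    """Incoherence parameter q: standard deviation of the trophic differences."""
    A = np.asarray(A, dtype=float)
    s = trophic_levels(A)
    x = s[:, None] - s[None, :]       # x[i, j] = s_i - s_j
    L = A.sum()                       # number of edges
    # max(..., 0) guards against tiny negative values from rounding
    return float(np.sqrt(max((A * x**2).sum() / L - 1.0, 0.0)))

# Chain 0 -> 1 -> 2: integer levels, maximally coherent
chain = np.zeros((3, 3))
chain[1, 0] = chain[2, 1] = 1.0
q_chain = incoherence(chain)

# Adding the shortcut 0 -> 2 gives node 2 a fractional level of 2.5
triangle = chain.copy()
triangle[2, 0] = 1.0
q_triangle = incoherence(triangle)
```

For the chain the levels are 1, 2, 3 and ${\displaystyle q=0}$; with the shortcut edge the trophic differences are 1, 0.5 and 1.5, so ${\displaystyle q={\sqrt {1/6}}\approx 0.41}$.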

The figure shows two networks which differ in their trophic coherence. The position of the nodes on the vertical axis corresponds to their trophic level. In the network on the left, nodes fall into distinct (integer) trophic levels, so the network is maximally coherent ${\displaystyle (q=0)}$. In the one on the right, many of the nodes have fractional trophic levels, and the network is more incoherent ${\displaystyle (q=0.49)}$. [6]

## Trophic coherence in nature

The degree to which empirical networks are trophically coherent (or incoherent) can be investigated by comparison with a null model. This is provided by the basal ensemble, which comprises networks in which all non-basal nodes have the same proportion of basal nodes for in-neighbours. [3] Expected values in this ensemble converge to those of the widely used configuration ensemble [7] in the limit ${\displaystyle N\rightarrow \infty }$, ${\displaystyle L/N\rightarrow \infty }$ (with ${\displaystyle N}$ and ${\displaystyle L}$ the numbers of nodes and edges), and can be shown numerically to be a good approximation for finite random networks. The basal ensemble expectation for the incoherence parameter is

${\displaystyle {\tilde {q}}={\sqrt {{\frac {L}{L_{B}}}-1}},}$

where ${\displaystyle L_{B}}$ is the number of edges connected to basal nodes. [3] The ratio ${\displaystyle q/{\tilde {q}}}$ measured in empirical networks reveals whether they are more or less coherent than the random expectation. For instance, Johnson and Jones [3] find in a set of networks that food webs are significantly coherent ${\displaystyle (q/{\tilde {q}}=0.44\pm 0.17)}$, metabolic networks are significantly incoherent ${\displaystyle (q/{\tilde {q}}=1.81\pm 0.11)}$, and gene regulatory networks are close to the random expectation ${\displaystyle (q/{\tilde {q}}=0.99\pm 0.05)}$.
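As an illustration of the null-model comparison, the sketch below computes the basal ensemble expectation ${\displaystyle {\tilde {q}}}$ from an adjacency matrix. It assumes NumPy and the same convention as the definition section (`A[i, j] = 1` when `j` is an in-neighbour of `i`), and it takes ${\displaystyle L_{B}}$ to be the number of edges leaving basal nodes. The function name is hypothetical.

```python
import numpy as np

def basal_ensemble_incoherence(A):
    """Basal-ensemble expectation q~ = sqrt(L / L_B - 1)."""
    A = np.asarray(A, dtype=float)
    basal = A.sum(axis=1) == 0        # nodes with zero in-degree
    L = A.sum()                       # total number of edges
    L_B = A[:, basal].sum()           # edges whose source is a basal node
    if L_B == 0:
        raise ValueError("no edges are connected to basal nodes")
    return float(np.sqrt(L / L_B - 1.0))

# Chain 0 -> 1 -> 2 plus the shortcut 0 -> 2: L = 3, L_B = 2
A = np.zeros((3, 3))
A[1, 0] = A[2, 1] = A[2, 0] = 1.0
q_tilde = basal_ensemble_incoherence(A)   # sqrt(3/2 - 1) = sqrt(0.5)
```

Dividing a measured ${\displaystyle q}$ by this value gives the ratio ${\displaystyle q/{\tilde {q}}}$ discussed above; values below 1 indicate a network more coherent than the random expectation.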

## Trophic levels and node function

There is as yet little understanding of the mechanisms which might lead to particular kinds of networks becoming significantly coherent or incoherent. [3] However, in systems which present correlations between trophic level and other features of nodes, processes which tended to favour the creation of edges between nodes with particular characteristics could induce coherence or incoherence. In the case of food webs, predators tend to specialise on consuming prey with certain biological properties (such as size, speed or behaviour) which correlate with their diet, and hence with trophic level. This has been suggested as the reason for food-web coherence. [1] However, food-web models based on a niche axis do not reproduce realistic trophic coherence, [1] which may mean either that this explanation is insufficient, or that several niche dimensions need to be considered. [8]

The relation between trophic level and node function can be seen in networks other than food webs. The figure shows a word adjacency network derived from the book Green Eggs and Ham, by Dr Seuss. [3] The height of nodes represents their trophic levels (according here to the edge direction which is the opposite of that suggested by the arrows, which indicate the order in which words are concatenated in sentences). The syntactic function of words is also shown with node colour. There is a clear relationship between syntactic function and trophic level: the mean trophic level of common nouns (blue) is ${\displaystyle s_{noun}=1.4\pm 1.2}$, whereas that of verbs (red) is ${\displaystyle s_{verb}=7.0\pm 2.7}$. This example illustrates how trophic coherence or incoherence might emerge from node function, and also that the trophic structure of networks provides a means of identifying node function in certain systems.

## Generating trophically coherent networks

There are various ways of generating directed networks with specified trophic coherence, all based on gradually introducing new edges to the system in such a way that the probability of each new candidate edge being accepted depends on the expected trophic difference it would have.

The preferential preying model is an evolving network model similar to the Barabási-Albert model of preferential attachment, but inspired by an ecosystem that grows through immigration of new species. [1] One begins with ${\displaystyle B}$ basal nodes and proceeds to introduce new nodes up to a total of ${\displaystyle N}$. Each new node ${\displaystyle i}$ is assigned a first in-neighbour ${\displaystyle j}$ (a prey species in the food-web context) and a new edge is placed from ${\displaystyle j}$ to ${\displaystyle i}$. The new node is given a temporary trophic level ${\displaystyle s_{i}^{t}=s_{j}+1}$. Then a further ${\displaystyle \kappa _{i}}$ new in-neighbours ${\displaystyle l}$ are chosen for ${\displaystyle i}$ from among those in the network according to their trophic levels. Specifically, for a new candidate in-neighbour ${\displaystyle l}$, the probability of being chosen is a function of ${\displaystyle x_{il}^{t}=s_{i}^{t}-s_{l}}$. Johnson et al. [1] use

${\displaystyle P_{il}\propto \exp \left(-{\frac {|x_{il}^{t}-1|}{T}}\right),}$

where ${\displaystyle T}$ is a parameter which tunes the trophic coherence: for ${\displaystyle T=0}$ maximally coherent networks are generated, and ${\displaystyle q}$ increases monotonically with ${\displaystyle T}$ for ${\displaystyle T>0}$. The choice of ${\displaystyle \kappa _{i}}$ is arbitrary. One possibility is to set ${\displaystyle \kappa _{i}=z_{i}n_{i}}$, where ${\displaystyle n_{i}}$ is the number of nodes already in the network when ${\displaystyle i}$ arrives, and ${\displaystyle z_{i}}$ is a random variable drawn from a Beta distribution with parameters ${\displaystyle \alpha =1}$ and

${\displaystyle \beta ={\frac {N^{2}-B^{2}}{2L_{d}}}-1}$

(${\displaystyle L_{d}}$ being the desired number of edges). This way, the generalised cascade model [9] [10] is recovered in the limit ${\displaystyle T\rightarrow \infty }$, and the degree distributions are as in the niche model [11] and generalised niche model. [10] This algorithm, as described above, generates networks with no cycles (except for self-cycles, if the new node ${\displaystyle i}$ is itself considered among its candidate in-neighbours ${\displaystyle l}$). In order for cycles of all lengths to be possible, one can consider new candidate edges in which the new node ${\displaystyle i}$ is the in-neighbour as well as those in which it would be the out-neighbour. The probability of acceptance of these edges, ${\displaystyle P_{li}}$, then depends on ${\displaystyle x_{li}^{t}=s_{l}-s_{i}^{t}}$.
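The acyclic version of the preferential preying model can be sketched as follows. This is an illustrative reading of the description above, not the authors' code, and several details are assumptions: the first in-neighbour is drawn uniformly at random, ${\displaystyle \kappa _{i}=z_{i}n_{i}}$ is rounded to the nearest integer and capped at the number of available candidates, additional in-neighbours are drawn without replacement, and ${\displaystyle T>0}$ is required. Because in-neighbours are always earlier nodes, the final trophic level of each new node follows directly from the definition as one plus the mean level of its in-neighbours. The function name and signature are hypothetical.

```python
import numpy as np

def preferential_preying(N, B, L_d, T, seed=None):
    """Sketch of the preferential preying model (acyclic variant, T > 0)."""
    if T <= 0:
        raise ValueError("this sketch requires T > 0")
    beta = (N**2 - B**2) / (2.0 * L_d) - 1.0
    if beta <= 0:
        raise ValueError("L_d is too large for a valid Beta parameter")
    rng = np.random.default_rng(seed)
    A = np.zeros((N, N))              # A[i, l] = 1 if l is an in-neighbour of i
    s = np.ones(N)                    # basal nodes keep s = 1
    for i in range(B, N):
        n_i = i                       # number of nodes already present
        j = int(rng.integers(n_i))    # first in-neighbour (assumed uniform)
        prey = [j]
        s_temp = s[j] + 1.0           # temporary trophic level of i
        others = np.delete(np.arange(n_i), j)
        kappa = int(round(rng.beta(1.0, beta) * n_i))
        if others.size > 0 and kappa > 0:
            # weight of candidate l: exp(-|x_il^t - 1| / T), x_il^t = s_temp - s_l
            w = np.exp(-np.abs(s_temp - s[others] - 1.0) / T)
            k = min(kappa, int(np.count_nonzero(w)))
            if k > 0:
                chosen = rng.choice(others, size=k, replace=False, p=w / w.sum())
                prey.extend(chosen.tolist())
        A[i, prey] = 1.0
        s[i] = 1.0 + s[prey].mean()   # exact, since all in-neighbours are earlier nodes
    return A, s

A, s = preferential_preying(N=30, B=5, L_d=90, T=0.5, seed=1)
```

The number of edges produced fluctuates around ${\displaystyle L_{d}}$ rather than matching it exactly, since each ${\displaystyle \kappa _{i}}$ is random.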

The generalised preferential preying model [6] is similar to the one described above, but has certain advantages. In particular, it is more analytically tractable, and one can generate networks with a precise number of edges ${\displaystyle L}$. The network begins with ${\displaystyle B}$ basal nodes, and then a further ${\displaystyle N-B}$ new nodes are added in the following way. When each enters the system, it is assigned a single in-neighbour chosen at random from among those already there. Every node then has an integer temporary trophic level ${\displaystyle s_{i}^{t}}$. The remaining ${\displaystyle L-N+B}$ edges are introduced as follows. Each pair of nodes ${\displaystyle (i,j)}$ has two associated temporary trophic distances, ${\displaystyle x_{ij}^{t}=s_{i}^{t}-s_{j}^{t}}$ and ${\displaystyle x_{ji}^{t}=s_{j}^{t}-s_{i}^{t}}$. Each of these candidate edges is accepted with a probability that depends on this temporary distance. Klaise and Johnson [6] use

${\displaystyle P_{ij}\propto \exp \left(-{\frac {(x_{ij}^{t}-1)^{2}}{2T^{2}}}\right),}$

because they find the distribution of trophic distances in several kinds of networks to be approximately normal, and this choice leads to a range of the parameter ${\displaystyle T}$ in which ${\displaystyle q\simeq T}$. Once all the edges have been introduced, one must recalculate the trophic levels of all nodes, since these will differ from the temporary ones originally assigned unless ${\displaystyle T\simeq 0}$. As with the preferential preying model, the average incoherence parameter ${\displaystyle q}$ of the resulting networks is a monotonically increasing function of ${\displaystyle T}$ for ${\displaystyle T\geq 0}$. The figure above shows two networks with different trophic coherence generated with this algorithm.
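A compact sketch of the generalised preferential preying model is given below. It is a hedged illustration under stated assumptions, not the published implementation: the extra ${\displaystyle L-N+B}$ edges are drawn without replacement with probability proportional to the Gaussian weight, self-loops and duplicate edges are excluded, edges pointing into basal nodes are excluded so that basal nodes remain basal, and ${\displaystyle T>0}$ is required. The final trophic levels are recomputed by solving the linear system implied by the definition. All function names are hypothetical.

```python
import numpy as np

def trophic_levels(A):
    """Solve s_i = 1 + (1/k_i^in) * sum_j a_ij s_j, with s_i = 1 for basal nodes."""
    k_in = A.sum(axis=1)
    v = np.maximum(k_in, 1.0)
    return np.linalg.solve(np.diag(v) - A, v)

def generalised_preferential_preying(N, B, L, T, seed=None):
    """Sketch of the generalised preferential preying model (T > 0)."""
    if T <= 0:
        raise ValueError("this sketch requires T > 0")
    rng = np.random.default_rng(seed)
    A = np.zeros((N, N))              # A[i, j] = 1 if j is an in-neighbour of i
    s_temp = np.ones(N)               # temporary (integer) trophic levels
    for i in range(B, N):
        j = int(rng.integers(i))      # single in-neighbour among earlier nodes
        A[i, j] = 1.0
        s_temp[i] = s_temp[j] + 1.0
    x = s_temp[:, None] - s_temp[None, :]          # x[i, j] = s_i^t - s_j^t
    w = np.exp(-(x - 1.0) ** 2 / (2.0 * T**2))     # Gaussian weight per candidate
    w[A > 0] = 0.0                    # no duplicate edges
    np.fill_diagonal(w, 0.0)          # no self-loops (assumption)
    w[:B, :] = 0.0                    # keep basal nodes basal (assumption)
    extra = L - (N - B)
    if extra < 0 or extra > np.count_nonzero(w):
        raise ValueError("L is incompatible with N, B and T")
    if extra > 0:
        flat = rng.choice(N * N, size=extra, replace=False,
                          p=(w / w.sum()).ravel())
        A[np.unravel_index(flat, (N, N))] = 1.0
    return A, trophic_levels(A)       # levels recalculated after all edges are placed

A, s = generalised_preferential_preying(N=40, B=8, L=120, T=0.3, seed=0)
```

Unlike the previous model, the edge count here is exactly ${\displaystyle L}$, and cycles can appear because candidate edges are considered in both directions between non-basal nodes.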

## References

1. Johnson S, Domínguez-García V, Donetti L, Muñoz MA (2014). "Trophic coherence determines food-web stability". Proc Natl Acad Sci USA. 111 (50): 17923–17928. Bibcode:2014PNAS..11117923J. PMID 25468963.
2. Levine S (1980). "Several measures of trophic structure applicable to complex food webs". J Theor Biol. 83 (2): 195–207. doi:10.1016/0022-5193(80)90288-X.
3. Johnson S and Jones NS (2017). "Looplessness in networks is linked to trophic coherence". Proc Natl Acad Sci USA. 114 (22): 5618–5623. PMID 28512222.
4. Klaise J and Johnson S (2017). "The origin of motif families in food webs". Scientific Reports. 7 (1): 16197. Bibcode:2017NatSR...716197K. doi:10.1038/s41598-017-15496-1. PMID 29170384.
5. Domínguez-García V, Johnson S, Muñoz MA (2016). "Intervality and coherence in complex networks". Chaos. 26 (6): 065308. Bibcode:2016Chaos..26f5308D. doi:10.1063/1.4953163. PMID 27368797. S2CID 16081869.
6. Klaise J and Johnson S (2016). "From neurons to epidemics: How trophic coherence affects spreading processes". Chaos. 26 (6): 065310. Bibcode:2016Chaos..26f5310K. doi:10.1063/1.4953160. PMID 27368799. S2CID 205214650.
7. Newman MEJ (2003). "The structure and function of complex networks". SIAM Review. 45 (2): 167–256. Bibcode:2003SIAMR..45..167N. doi:10.1137/S003614450342480. S2CID 221278130.
8. Rossberg AG, Brännström A, Dieckmann U (2010). "Food-web structure in low- and high-dimensional trophic niche spaces". J R Soc Interface. 7 (53): 1735–1743. doi:10.1098/rsif.2010.0111. PMID 20462875.
9. Cohen JE and Newman CM (1985). "A stochastic theory of community food webs I. Models and aggregated data". Proc. R. Soc. B. 224 (1237): 421–448. Bibcode:1985RSPSB.224..421C. doi:10.1098/rspb.1985.0042. S2CID 52993453.
10. Stouffer DB, Camacho J, Amaral LAN (2006). "A robust measure of food web intervality". Proc Natl Acad Sci USA. 103 (50): 19015–19020. Bibcode:2006PNAS..10319015S. PMID 17146055.
11. Williams RJ and Martinez ND (2000). "Simple rules yield complex food webs". Nature. 404 (6774): 180–183. Bibcode:2000Natur.404..180W. doi:10.1038/35004572. PMID 10724169. S2CID 205004984.