Small-world routing

In network theory, small-world routing refers to routing methods for small-world networks. Networks of this type are peculiar in that relatively short paths exist between any two nodes. Determining these paths, however, can be a difficult problem from the perspective of an individual routing node in the network if no further information is known about the network as a whole.

Greedy routing

Nearly every solution to the problem of routing in small-world networks involves greedy routing. This sort of routing depends on a relative reference point by which each node on the path can choose the next node it believes is closest to the destination. That is, there must be something to be greedy about, such as geographic location or IP address. In Milgram's original small-world experiment, participants knew the location and occupation of the final recipient and could therefore forward messages based on those parameters.[ citation needed ]
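The forwarding rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `greedy_route`, the adjacency-dictionary representation, and the toy line network are all assumptions made for the example. Each node only inspects its own neighbors and a distance function to the target.

```python
from typing import Callable, Dict, Hashable, List, Optional


def greedy_route(
    neighbors: Dict[Hashable, List[Hashable]],
    distance: Callable[[Hashable, Hashable], float],
    source: Hashable,
    target: Hashable,
) -> Optional[List[Hashable]]:
    """Forward hop by hop to the neighbor closest to the target.

    Returns the path taken, or None if routing gets stuck at a node
    that has no neighbor strictly closer to the target than itself.
    """
    path = [source]
    current = source
    while current != target:
        best = min(
            neighbors[current],
            key=lambda v: distance(v, target),
            default=None,
        )
        if best is None or distance(best, target) >= distance(current, target):
            return None  # local minimum: nothing to be greedy about
        path.append(best)
        current = best
    return path


# Hypothetical toy network: nodes are integers on a line, and a node's
# "address" is the integer itself. Node 0 has one shortcut to node 5.
line = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0, 6], 6: [5]}
print(greedy_route(line, lambda a, b: abs(a - b), 0, 6))  # [0, 5, 6]
```

Because every hop must strictly reduce the distance to the target, the loop always terminates on a finite network, either at the target or at a node where greedy forwarding fails.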

Constructing a reference base

Greedy routing will not readily work when there is no obvious reference base. This can occur, for example, in overlay networks where information about the destination's location in the underlying network is not available. Friend-to-friend networks are a particular example of this problem. In such networks, trust is ensured by the fact that a node knows underlying information only about the nodes that are already its neighbors.[ citation needed ]

One solution in this case is to impose some sort of artificial addressing on the nodes in such a way that greedy routing methods can use it effectively. A 2005 paper by a developer of the Freenet Project discusses how this can be accomplished in friend-to-friend networks. Given the assumption that these networks exhibit small-world properties, often as the result of real-world or acquaintance relationships, it should be possible to recover an embedded Kleinberg small-world graph. This is accomplished by selecting random pairs of nodes and potentially swapping them based on an objective function that minimizes the product of all the distances between any given node and its neighbors.[ citation needed ]

An important problem with this solution is the possibility of local minima. These can occur when nodes are in an arrangement that is optimal only within a local neighborhood, ignoring the possibility that swaps with distant nodes would produce a better overall arrangement. In the above paper, the authors proposed a simulated annealing method in which less-than-optimal swaps were made with a small probability. This probability was proportional to the value of making the switches. Another possible metaheuristic optimization method is tabu search, which adds a memory to the swap decision. In its simplest form, a limited history of past swaps is remembered so that they are excluded from the list of possible swapping nodes.[ citation needed ]
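A single swap decision of this kind might look like the sketch below. Several details are assumptions made for illustration and are not taken from the paper: node addresses are taken to be points on a ring of circumference 1, the product of distances is handled as a sum of logarithms, and a standard Metropolis-style acceptance rule stands in for the paper's exact probability. The names `ring_distance`, `log_cost` and `try_swap` are hypothetical.

```python
import math
import random
from typing import Dict, Hashable, List


def ring_distance(a: float, b: float) -> float:
    """Distance between two addresses on a ring of circumference 1 (assumed)."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def log_cost(node: Hashable, pos: Dict[Hashable, float],
             adj: Dict[Hashable, List[Hashable]]) -> float:
    """Log of the product of distances from a node to its neighbors."""
    return sum(math.log(ring_distance(pos[node], pos[v])) for v in adj[node])


def try_swap(a: Hashable, b: Hashable, pos: Dict[Hashable, float],
             adj: Dict[Hashable, List[Hashable]],
             temperature: float, rng: random.Random) -> bool:
    """Swap the addresses of a and b, keeping the swap if it helps.

    A worsening swap is kept with probability exp(-delta / temperature),
    an assumed Metropolis rule; temperature 0 disables such swaps.
    """
    before = log_cost(a, pos, adj) + log_cost(b, pos, adj)
    pos[a], pos[b] = pos[b], pos[a]
    after = log_cost(a, pos, adj) + log_cost(b, pos, adj)
    delta = after - before  # local change in the log-objective
    if delta <= 0:
        return True
    if temperature > 0 and rng.random() < math.exp(-delta / temperature):
        return True
    pos[a], pos[b] = pos[b], pos[a]  # revert the swap
    return False


# Hypothetical 4-cycle 0-1-2-3-0 whose addresses for nodes 1 and 2 are mixed up.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
pos = {0: 0.0, 1: 0.5, 2: 0.25, 3: 0.75}
print(try_swap(1, 2, pos, adj, temperature=0.0, rng=random.Random(0)), pos)
```

Working with logarithms avoids numerical underflow when many small distances are multiplied, and only the two swapped nodes' neighborhoods need to be re-evaluated for each decision.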

This method for constructing a reference base can also be adapted to distributed settings, where decisions can only be made at the level of individual nodes that have no knowledge of the overall network. The only modification necessary is in the method for selecting random pairs of nodes. In a distributed setting, each node periodically sends out a random walker, and the node at which the walker terminates is considered for swapping.[ citation needed ]
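As a rough illustration of the distributed pair selection, the sketch below runs a fixed-length random walk over neighbor lists and returns the node where it stops. The fixed walk length, the uniform choice among neighbors, and the name `random_walk_endpoint` are assumptions made for this example; the article does not specify them.

```python
import random
from typing import Dict, Hashable, List


def random_walk_endpoint(
    adj: Dict[Hashable, List[Hashable]],
    start: Hashable,
    steps: int,
    rng: random.Random,
) -> Hashable:
    """Walk `steps` hops from `start`, choosing a uniform random neighbor
    at each hop, and return the node where the walker terminates."""
    current = start
    for _ in range(steps):
        current = rng.choice(adj[current])
    return current


# Hypothetical friend-to-friend overlay; each node knows only its own list.
overlay = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
candidate = random_walk_endpoint(overlay, "a", steps=6, rng=random.Random(42))
print("swap candidate for a:", candidate)
```

Each hop uses only information local to the current node, so no participant needs a view of the whole network to propose a swap partner.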

The Kleinberg model

The Kleinberg model of a network is useful for demonstrating the effectiveness of greedy small-world routing. The model uses an n × n grid of nodes to represent a network, where each node is connected by an undirected edge to its neighbors. To give it the "small world" effect, a number of long-range edges are added to the network that tend to favor nodes closer in distance rather than farther. When adding edges, the probability of connecting a random vertex $v$ to another random vertex $w$ is proportional to $1/d(v,w)^{q}$, where $d(v,w)$ is the distance between the two vertices and $q$ is the clustering exponent. [1]
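The construction can be sketched as follows. The sketch assumes lattice (Manhattan) distance for $d(v,w)$, exactly one long-range edge per node, and long-range edges stored only as outgoing contacts; the article fixes none of these, and the names `lattice_distance` and `kleinberg_grid` are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Node = Tuple[int, int]


def lattice_distance(v: Node, w: Node) -> int:
    """Manhattan distance on the grid (assumed choice of d(v, w))."""
    return abs(v[0] - w[0]) + abs(v[1] - w[1])


def kleinberg_grid(n: int, q: float, rng: random.Random) -> Dict[Node, List[Node]]:
    """Build an n x n grid with one long-range contact per node.

    The contact w of v is drawn with probability proportional to
    1 / d(v, w)**q, where q is the clustering exponent.
    """
    nodes = [(i, j) for i in range(n) for j in range(n)]
    adj: Dict[Node, List[Node]] = {v: [] for v in nodes}

    # Undirected local edges to the grid neighbors.
    for (i, j) in nodes:
        for w in ((i + 1, j), (i, j + 1)):
            if w in adj:
                adj[(i, j)].append(w)
                adj[w].append((i, j))

    # One long-range contact per node, stored as an outgoing edge only.
    for v in nodes:
        others = [w for w in nodes if w != v]
        weights = [lattice_distance(v, w) ** -q for w in others]
        contact = rng.choices(others, weights=weights, k=1)[0]
        adj[v].append(contact)
    return adj


grid = kleinberg_grid(n=5, q=2.0, rng=random.Random(7))
print(grid[(0, 0)])  # two local neighbors followed by the long-range contact
```

Setting `q=0.0` makes every weight equal to 1, so contacts are chosen uniformly at random, while a large `q` concentrates almost all of the probability on nearby nodes.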

Greedy routing in the Kleinberg model

It is easy to see that a greedy algorithm, without using the long-range edges, can navigate between random vertices $v \rightarrow w$ on the grid in $O(n)$ time. By following the guaranteed connections to our neighbors, we can move one unit at a time in the direction of our destination. This is also the case when the clustering exponent $q$ is large and the "long-range" edges end up staying very close; we simply do not take advantage of the weaker ties in this model. When $q=0$, the long-range edges are connected uniformly at random, which means they are "too random" to be used efficiently for decentralized search. Kleinberg has shown that the optimal clustering exponent for this model is $q=2$, or an inverse-square distribution. [2]

To see why this is the case, draw a circle of radius $r$ around the initial node; it has nodal density $n/(\pi r^{2})$, where $n$ is the number of nodes in the circular area. As the circle expands, the number of nodes in the area grows in proportion to $r^{2}$, while the probability of having a random link with any given node remains proportional to $1/r^{2}$, so the probability of the original node having a weak tie with some node a given distance away is effectively independent of distance. It follows that with $q=2$, long-range edges are evenly distributed over all distance scales, which is effective for funneling a message toward its final destination.[ citation needed ]
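The scale-independence argument can be made concrete with a short calculation. It is a heuristic sketch under a continuum approximation of the two-dimensional grid, in which the number of nodes at distance between $r$ and $r+dr$ from a node is taken to be roughly $c\,r\,dr$ for some constant $c$ (an assumption made here for illustration). The quantity computed is the unnormalized weight of a long-range edge landing at a distance between $R$ and $2R$.

```latex
% Weight of a long-range edge ending at distance in [R, 2R]:
% (number of nodes at distance r) x (weight per node, 1/r^q)
\begin{align*}
W_q(R) &\propto \int_{R}^{2R} c\,r \cdot \frac{1}{r^{q}}\,dr
        = c \int_{R}^{2R} r^{1-q}\,dr \\[4pt]
q = 2:\quad W_2(R) &\propto c \int_{R}^{2R} \frac{dr}{r} = c \ln 2
        \quad\text{(independent of } R\text{)} \\[4pt]
q \neq 2:\quad W_q(R) &\propto c\,\frac{2^{\,2-q} - 1}{2 - q}\,R^{\,2-q}
        \quad\text{(grows with } R \text{ if } q < 2,\ \text{shrinks if } q > 2\text{)}
\end{align*}
```

Only for $q=2$ does every doubling of distance receive the same share of long-range edges, which matches the statement that the edges are spread evenly over all distance scales.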

Some structured peer-to-peer systems based on DHTs implement variants of Kleinberg's small-world topology to enable efficient routing within a peer-to-peer network with limited node degrees. [3]

References

  1. Kleinberg, Jon. "Networks, Crowds, and Markets: Reasoning about a Highly Connected World" (PDF). Retrieved 10 May 2011.
  2. Kleinberg, Jon M. (August 2000). "Navigation in a small world". Nature. 406 (6798): 845. Bibcode:2000Natur.406..845K. doi:10.1038/35022643. ISSN 1476-4687. PMID 10972276.
  3. Manku, Gurmeet Singh. "Symphony: Distributed Hashing in a Small World" (PDF). usenix.org.