Topology of the World Wide Web

World Wide Web topology is distinct from Internet topology. While the former focuses on how web pages are interconnected through hyperlinks, the latter refers to the layout of network infrastructure like routers, ISPs, and backbone connections.

The Jellyfish model of World Wide Web topology represents the web as a core of highly connected nodes (web pages) surrounded by layers of less connected nodes. The Bow Tie model, on the other hand, divides the web into distinct zones: a strongly connected core, an 'IN' group leading into the core, an 'OUT' group leading out of it, and disconnected components. This model emphasizes the flow of hyperlinks between different parts of the web. [1] [2]

Models of web page topology

Jellyfish Model

The simplistic Jellyfish model of the World Wide Web centers on a large, strongly connected core of high-degree web pages that form a clique, so there is a path from any page within the core to any other page. In other words, starting from any node within the core, it is possible to visit any other node in the core just by clicking hyperlinks. From there, a distinction is made between pages of degree one and pages of higher degree. Pages with many links form rings around the center: all such pages that are a single link away from the core make up the first ring, all such pages that are two links away make up the second ring, and so on. From each ring, pages of degree one are depicted as hanging downward; a degree-one page linked from the core, for example, hangs from the center. In this manner, the rings form a sort of dome around the center that is reminiscent of a jellyfish, with the hanging nodes making up the creature's tentacles. [3]
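The layering described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure from the cited paper: the function name `jellyfish_layers` is hypothetical, the core is assumed to be known in advance and passed in, and hyperlinks are treated as undirected so that "links away from the core" becomes a plain breadth-first distance.

```python
from collections import defaultdict, deque


def jellyfish_layers(edges, core):
    """Group pages into rings and hanging nodes around a given core.

    Assumptions (not from the source): links are treated as undirected,
    and `core` is supplied by the caller rather than detected.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Breadth-first search outward from every core page at once.
    dist = {page: 0 for page in core}
    queue = deque(core)
    while queue:
        page = queue.popleft()
        for neighbour in adj[page]:
            if neighbour not in dist:
                dist[neighbour] = dist[page] + 1
                queue.append(neighbour)

    rings = defaultdict(set)    # ring k: higher-degree pages k links from the core
    hanging = defaultdict(set)  # hanging[k]: degree-one pages attached to layer k
    for page, d in dist.items():
        if page in core:
            continue
        if len(adj[page]) == 1:
            hanging[d - 1].add(page)  # hangs from the layer it is attached to
        else:
            rings[d].add(page)
    return dict(rings), dict(hanging)


# Hypothetical toy graph: a three-page core, two rings, three hanging pages.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "r1"),
         ("r1", "r2"), ("r1", "h1"), ("c", "h0"), ("r2", "h2")]
rings, hanging = jellyfish_layers(edges, {"a", "b", "c"})
print(sorted(rings[1]), sorted(rings[2]))  # ['r1'] ['r2']
print(sorted(hanging[0]))                  # ['h0']
```

Pages that cannot be reached from the core do not appear in either result; the sketch simply leaves them unclassified.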

Bow Tie Model

The Bow Tie model comprises four main groups of web pages, plus some smaller ones. As in the Jellyfish model, there is a strongly connected core. There are two other large groups of roughly equal size. One consists of all pages that link into the strongly connected core, directly or through a chain of links, but that cannot be reached from the core. This is the "Origination" or "In" group, as it contains links that lead into the core and originate outside it. Its counterpart is the group of all pages that can be reached from the strongly connected core but that have no path of links back into it. This is the "Termination" or "Out" group, as it contains links that lead out of the core and terminate outside it. The fourth group consists of all the disconnected pages, which neither link to the core nor are linked from it. [4] [5]

The Bow Tie model has additional, smaller groups of web pages. Both the "In" and "Out" groups have smaller "Tendrils" [6] attached to them. These consist of pages that are reachable from the "In" group, or that link to the "Out" group, but that belong to neither group and are not connected to the core; in essence, they are the "Termination" group of "In" and the "Origination" group of "Out". This can be carried on ad nauseam, adding tendrils to the tendrils, and so on. Additionally, there is another important group known as "Tubes". This group consists of pages that are reachable from "In" and that link to "Out", but that are not part of the large core. Visually, they form alternative routes from "In" to "Out", like tubes bending around the central strongly connected component. [4] [5]
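The grouping described in this section can be illustrated with two reachability searches over a directed link graph. The following is a minimal sketch rather than the method used in the cited studies: the names `reachable` and `bow_tie` are hypothetical, and the caller is assumed to supply a `seed` page already known to lie in the core, which avoids a full strongly connected component search.

```python
from collections import defaultdict, deque


def reachable(adj, sources):
    """Return every node reachable from `sources` by following `adj`."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bow_tie(edges, seed):
    """Split pages into bow-tie groups around the core containing `seed`.

    Assumption (not from the source): `seed` is a page inside the core.
    """
    fwd, rev, pages = defaultdict(set), defaultdict(set), {seed}
    for src, dst in edges:
        fwd[src].add(dst)   # hyperlink src -> dst
        rev[dst].add(src)   # same link, followed backwards
        pages.update((src, dst))

    down = reachable(fwd, [seed])  # pages the core can reach
    up = reachable(rev, [seed])    # pages that can reach the core
    core = down & up
    in_group, out_group = up - core, down - core
    rest = pages - up - down       # pages with no path to or from the core

    from_in = reachable(fwd, in_group) & rest
    to_out = reachable(rev, out_group) & rest
    tubes = from_in & to_out       # bypass routes from "In" to "Out"
    return {
        "core": core,
        "in": in_group,
        "out": out_group,
        "tubes": tubes,
        "tendrils": (from_in | to_out) - tubes,
        "disconnected": rest - from_in - to_out,
    }


# Hypothetical toy web: core a-b-c, one page in each remaining role.
links = [("a", "b"), ("b", "c"), ("c", "a"), ("i", "a"), ("c", "o"),
         ("i", "t"), ("t", "o"), ("i", "x"), ("y", "o"), ("p", "q")]
groups = bow_tie(links, "a")
print(sorted(groups["core"]))      # ['a', 'b', 'c']
print(sorted(groups["tendrils"]))  # ['x', 'y']
```

Because any page outside `up` and `down` has no path to or from the core, a path from "In" to such a page, or from such a page to "Out", cannot pass through the core, which is why a plain intersection with `rest` suffices for tubes and tendrils.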

References

  1. Siganos, Georgos; Tauro, Sudhir L.; Faloutsos, Michalis (Dec 7, 2004). "Jellyfish: A Conceptual Model for the AS Internet Topology" (PDF). Retrieved 2007-12-29.
  2. "IBM Almaden - News - Researchers Map the Web". Retrieved 2008-11-11.
  3. Siganos, Georgos; Tauro, Sudhir Leslie; Faloutsos, Michalis (September 2006). "Jellyfish: A conceptual model for the AS Internet topology". Journal of Communications and Networks. 8 (3): 339–350. doi:10.1109/JCN.2006.6182774. ISSN 1229-2370.
  4. Broder, Andrei; Kumar, Ravi; Maghoul, Farzin; Raghavan, Prabhakar; Rajagopalan, Sridhar; Stata, Raymie; Tomkins, Andrew; Wiener, Janet (2000). "Graph structure in the Web". Computer Networks. 33 (1–6): 309–320. doi:10.1016/S1389-1286(00)00083-9.
  5. Metaxas, Panagiotis (2012). Why Is the Shape of the Web a Bowtie?. World Wide Web (WWW) Conference, WebScience Track. Lyon, France. Retrieved 2018-04-02.
  6. Kaufmann, Michael; Mchedlidze, Tamara; Symvonis, Antonios (August 2013). "On upward point set embeddability". Computational Geometry. 46 (6): 774–804. arXiv:1010.5937. doi:10.1016/j.comgeo.2012.11.008. ISSN 0925-7721.