A shared risk resource group (commonly referred to as a shared risk group or SRG) is a concept in optical mesh network routing: different networks or circuits may suffer from a common failure if they share a common risk, that is, a common SRG. SRGs are not limited to optical mesh networks; they are also used in MPLS, IP networks, and synchronous optical networks.
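An SRG failure brings down multiple circuits at once because a resource that those circuits share has failed. There are three main shared risk groups: shared risk link groups (SRLG), shared risk node groups (SRNG), and shared risk equipment groups (SREG).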
Failure recovery is crucial in all types of networks. MPLS and IP networks both rely on the high-speed capabilities of modern optical networks. SRLGs typically deal with links between fiber-optic nodes, but that is not always the case. [1] [2] An SRLG can also be modeled when the links are transmission lines rather than fiber-optic cable. SRG modeling is also used when a provider draws up a service-level agreement with a client that offers various protection schemes. [3]
Fiber spans are fiber-optic cables that connect two nodes. In practice, these cables are bundled in a single concrete conduit or on a power or telephone pole (aerial), which creates a shared risk link group. If, for example, a fiber span is cut, the cut takes down all circuits (upper-layer logical links) that use that particular SRLG. The term SRLG may have first appeared in 2000. [4] [5] Early work from the 1990s, predating the term, examined the implications of SRLGs and designed for survivability and restoration with SRLGs in mind. [6] [7] [8]
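The effect of a fiber cut on upper-layer circuits can be illustrated with a small lookup. The following is a minimal sketch only: the circuit names, the SRLG identifiers, and the helper functions are hypothetical and are not taken from any particular management system.

```python
from collections import defaultdict

# Hypothetical inventory: each circuit lists the SRLGs (conduits, pole lines)
# that its fiber spans run through.
circuit_srlgs = {
    "circuit-A": {"conduit-1", "conduit-2"},
    "circuit-B": {"conduit-2", "pole-7"},
    "circuit-C": {"pole-9"},
}

def circuits_by_srlg(circuit_srlgs):
    """Invert the circuit -> SRLG mapping into SRLG -> circuits."""
    index = defaultdict(set)
    for circuit, srlgs in circuit_srlgs.items():
        for srlg in srlgs:
            index[srlg].add(circuit)
    return index

def affected_circuits(circuit_srlgs, failed_srlg):
    """Return the set of circuits that go down when one SRLG fails."""
    return set(circuits_by_srlg(circuit_srlgs).get(failed_srlg, set()))

# A cut in conduit-2 takes down every circuit routed through it.
print(sorted(affected_circuits(circuit_srlgs, "conduit-2")))
```

A single failed conduit here affects two circuits, which is the situation SRLG modeling is meant to expose.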
In optical mesh networks, nodes are junctions of fiber spans. Some nodes contain highly sophisticated routing equipment, while others may be just a patch panel. Whatever the case, a node is a shared risk node group, because if the node fails, the failure affects all signals that pass through that node.
Shared risk groups also extend inside a node itself, in particular in nodes that contain multi-port network cards. Dense wavelength-division multiplexing (DWDM) equipment is also considered an SREG, because the failure of a DWDM multiplexer affects all of the channels that pass through it. The same is true for multi-port network cards. When routing around an SRNG is not possible, circuit-pack diversity within the same node can lessen the risk of failure.
Failure recovery is an essential part of any optical-based network. When provisioning a circuit, engineers typically use a shortest path algorithm, such as Dijkstra's algorithm. Calculations for a protection path must ensure that the protection path provides 100% SRG protection. In other words, the protection path cannot go through the same SRLG or SRNG as the primary path. If SRG diversity is not achieved, the failure of a shared SRG takes down the primary path and the backup path simultaneously. Therefore, the two calculated paths must be SRG diverse. [6] [8] [9]
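The diversity requirement can be stated as a simple check: the set of SRGs touched by the primary path and the set touched by the protection path must not intersect. The sketch below assumes a hypothetical data model in which each link and each intermediate node is tagged with a set of SRG identifiers. The end nodes are excluded from the check because both paths necessarily share them.

```python
def path_srgs(path, link_srgs, node_srgs=None):
    """Collect the SRGs touched by a path (links plus intermediate nodes)."""
    srgs = set()
    for u, v in zip(path, path[1:]):
        srgs |= link_srgs.get(frozenset((u, v)), set())
    if node_srgs:
        for node in path[1:-1]:  # end nodes are shared by definition
            srgs |= node_srgs.get(node, set())
    return srgs

def is_srg_diverse(primary, backup, link_srgs, node_srgs=None):
    """True if the two paths have no SRG in common."""
    primary_set = path_srgs(primary, link_srgs, node_srgs)
    backup_set = path_srgs(backup, link_srgs, node_srgs)
    return primary_set.isdisjoint(backup_set)

# Hypothetical topology: links A-B and A-C leave node A in the same conduit.
link_srgs = {
    frozenset(("A", "B")): {"srlg-1"},
    frozenset(("B", "Z")): {"srlg-2"},
    frozenset(("A", "C")): {"srlg-1"},
    frozenset(("C", "Z")): {"srlg-3"},
    frozenset(("A", "D")): {"srlg-4"},
    frozenset(("D", "Z")): {"srlg-5"},
}

print(is_srg_diverse(["A", "B", "Z"], ["A", "C", "Z"], link_srgs))
print(is_srg_diverse(["A", "B", "Z"], ["A", "D", "Z"], link_srgs))
```

The first pair is node-disjoint yet not SRG diverse, since both paths depend on srlg-1. The second pair passes the check.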
Studies have shown that SRG diverse routing is NP-complete. [10] There is currently no known efficient exact method for solving this real-world problem on large-scale networks. In practice, the problem is addressed with heuristic solutions. [1]
The SRG diverse routing problem has been proven NP-complete. To prove that a problem is NP-complete, it is sufficient to show that the problem is in NP and that a known NP-complete problem can be reduced to it in polynomial time. To make the case, engineers introduce a graph, as shown in the picture. In the graph, multiple paths exist between two nodes, and these paths may pass through other nodes. The parallel paths in sub-graphs (circled in blue) belong to the same SRLG.
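Finding an SRG diverse path pair on this graph corresponds to the set-splitting problem, which asks whether a set can be partitioned into two disjoint subsets such that no member of a given family of subsets lies entirely within one of them. The set-splitting problem has been proven NP-complete. Therefore, the SRG diverse routing problem is also NP-complete. [11] (When every SRLG contains only a single link, the problem reduces to finding link-disjoint paths, which is solvable in polynomial time using Suurballe's algorithm.)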
There have been many attempts to work around the lack of a known efficient exact solution to the SRG diverse routing problem. One of these is a graph transformation approach. [9] This method applies transformations to the original network graph to obtain a transformed graph that overcomes the SRG diverse routing problem to some degree. However, this method has its own shortcomings.
After obtaining the transformed graph, one computes the primary path using a known shortest path algorithm such as Dijkstra's. Once the primary path is computed, all nodes and links in that path are removed, and the algorithm is run again on the remaining network. In some instances, topological restrictions introduce unavoidable traps that prevent the algorithm from finding a solution. There are also avoidable traps, which come from parameter restrictions such as cost. These can be overcome by reconsidering the parameter values or by altering the algorithm to make it more robust.
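The two-step procedure, and the way it can fall into an avoidable trap, can be sketched as follows. This is a simplified illustration that works directly on a small undirected graph and does not perform the graph transformation of [9]. The function names, topologies, costs, and SRG labels are all hypothetical. As an added assumption, the second run also excludes every link that shares an SRG with the primary path.

```python
import heapq

def dijkstra(adj, src, dst, banned_links=frozenset(), banned_nodes=frozenset()):
    """Least-cost path from src to dst as a node list, or None if unreachable."""
    dist, prev, done = {src: 0}, {}, set()
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, cost in adj.get(u, {}).items():
            if v in banned_nodes or frozenset((u, v)) in banned_links:
                continue
            if d + cost < dist.get(v, float("inf")):
                dist[v], prev[v] = d + cost, u
                heapq.heappush(heap, (d + cost, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def two_step_diverse_paths(adj, link_srgs, src, dst):
    """Primary path first, then a backup avoiding its nodes, links, and SRGs."""
    primary = dijkstra(adj, src, dst)
    if primary is None:
        return None, None
    banned = {frozenset(pair) for pair in zip(primary, primary[1:])}
    used = set()
    for link in banned:
        used |= link_srgs.get(link, set())
    # Assumption: also exclude links that share any SRG with the primary path.
    banned |= {link for link, srgs in link_srgs.items() if srgs & used}
    backup = dijkstra(adj, src, dst, banned, set(primary[1:-1]))
    return primary, backup

def undirected(edges):
    """Build a symmetric adjacency map from (u, v, cost) triples."""
    adj = {}
    for u, v, cost in edges:
        adj.setdefault(u, {})[v] = cost
        adj.setdefault(v, {})[u] = cost
    return adj

# Case 1: S-A and S-B share a duct, so the backup has to go through C.
ok_adj = undirected([("S", "A", 1), ("A", "T", 1), ("S", "B", 2),
                     ("B", "T", 2), ("S", "C", 3), ("C", "T", 3)])
ok_srgs = {frozenset(("S", "A")): {"duct-1"}, frozenset(("S", "B")): {"duct-1"}}
print(two_step_diverse_paths(ok_adj, ok_srgs, "S", "T"))

# Case 2 (trap): the cheapest path S-A-B-T uses both middle nodes, so no
# backup is found, although S-A-T and S-B-T would form a diverse pair.
trap_adj = undirected([("S", "A", 1), ("A", "B", 1), ("B", "T", 1),
                       ("S", "B", 5), ("A", "T", 5)])
print(two_step_diverse_paths(trap_adj, {}, "S", "T"))
```

In the second case the trap is avoidable in the sense used above: it stems from the link costs, and a different choice of primary path would leave room for a backup.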
This method is limited: the following conditions must be met to calculate two SRG diverse paths:
This approach works only in very narrow circumstances. For actual large-scale deployed networks it is of little use, because the links in those networks greatly exceed these restrictions. A typical link can contain as many as 50,000 SRLGs. [12] One reason the approach falls short is the case of two seemingly independent edges whose links fall in the same SRLG: the algorithm might report a path pair that is incorrect, because no physically diverse route exists. [9]
Modern network providers have various ways to deal with shared risk group diverse routing. [13] SRGs are now closely linked to service-level agreements. Full SRG diversity is not possible in some cases. An example is the link from the client's office to the provider's local office. Often, the primary path and the backup path exit the building at the same point, which is itself an SRG.
The most common way to deal with SRGs is to keep a database of all of the network's SRGs. How these databases are updated is a major concern, because manual updating leaves room for human error. Manual updating can also lag behind the network, because the network topology changes rapidly. Auto-discovery of SRGs has been proposed. SRG auto-discovery uses all components in the actual physical layer. Active components are those that can be monitored, and they include amplifiers, transponders, regenerators, and DWDM multiplexers and demultiplexers. Passive components cannot be monitored electronically, and they include conduits, simple patch panels, and splice points.
Fitting these components with GPS would let an SRLG management system identify each component's position. The system could then generate all of the SRLGs from that information. This would also help localize a failure, which would further reduce the downtime of the failed SRG. A supervisory channel could connect to all active components to provide management and supervision. [14]
Longer SRLGs are easier to detect because they contain more components. Shorter SRLGs are harder to detect because they contain fewer. The parameter that determines how well an SRLG can be detected is the ratio of amplifier spacing to SRLG length. SRLGs that span more than 50 miles are detected nearly 100% of the time. [15]