STC104

The STC104 switch, also known as the C104 during its early development, is an asynchronous packet-routing chip designed for building high-performance point-to-point computer communication networks. It was developed by INMOS in the 1990s and was the first general-purpose production packet-routing chip. It was also the first routing chip to implement wormhole routing, to decouple packet size from the flow-control protocol, and to implement interval routing and two-phase randomized routing. [1] [2]

The STC104 has 32 bidirectional communication links, called DS-Links, that each operate at 100 Mbit/s. These links are connected by a non-blocking crossbar that allows simultaneous transmission of packets between all input and output links.

Switching

The STC104 uses wormhole switching to reduce latency and the per-link buffering requirement. Wormhole switching works by splitting packets into fixed-size chunks (called flits) for transmission, allowing the packet to be pipelined in the network. The first header flit opens a route (or circuit) through each switch in the network, allowing subsequent flits to experience no switching delay. The final flit closes the route. [3]

Since the header flit can proceed independently of the subsequent flits, the latency of the packet is independent of its size. Consequently, the amount of buffering provided by links can also be chosen independently of the packet size. Furthermore, the total buffering requirement is small since, typically, only a small number of flits need to be stored for each link. This is in contrast to store-and-forward switching, where a whole packet must be buffered at each link end point.
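
As an illustration of the difference, the short sketch below compares zero-load latencies for the two approaches. The packet size, flit size and hop count are arbitrary example values rather than STC104 figures, and contention and routing overheads are ignored.

```c
#include <stdio.h>

/* Zero-load latency comparison between store-and-forward and wormhole
 * switching, ignoring contention and routing overheads.
 *
 *   store-and-forward: every hop must receive the whole packet before
 *                      forwarding it, so latency grows as hops * packet time
 *   wormhole:          only the header flit is delayed at each hop; the rest
 *                      of the packet is pipelined behind it
 */
int main(void) {
    double link_bw = 100e6 / 8;   /* link bandwidth in bytes/s (100 Mbit/s) */
    double packet  = 1024;        /* example packet size in bytes */
    double flit    = 1;           /* example flit (header) size in bytes */
    int    hops    = 4;           /* example number of switches traversed */

    double t_packet = packet / link_bw;  /* time to transmit whole packet */
    double t_flit   = flit / link_bw;    /* time to transmit one flit */

    double store_and_forward = hops * t_packet;
    double wormhole          = hops * t_flit + t_packet;

    printf("store-and-forward: %.2f us\n", store_and_forward * 1e6);
    printf("wormhole:          %.2f us\n", wormhole * 1e6);
    return 0;
}
```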

Routing

Messages are routed in networks of C104s using interval routing. [4] In a network where each destination is uniquely numbered, interval routing associates non-overlapping, contiguous ranges of destinations with each output link. An output link for a packet is chosen by comparing the destination (contained in the packet's header) to each interval and choosing the one that contains the destination. [5] The benefits of interval routing are that it is sufficient to provide deterministic routing on a range of network topologies and that it can be implemented simply with a table-based lookup, so routing decisions are delivered with low latency. Interval routing can be used to implement efficient routing strategies for many classes of regular network topology. [6]
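
A minimal sketch of an interval-routing decision is shown below. Representing the intervals as a sorted list of boundary values is one straightforward implementation; the number of links and the interval boundaries are made-up example values, not the STC104's actual table format.

```c
#include <stdio.h>

#define NUM_LINKS 4   /* illustrative switch with 4 output links */

/* Interval routing: output link i is responsible for destinations in the
 * half-open range [base[i], base[i+1]).  The ranges are contiguous and
 * non-overlapping, so a simple scan (or binary search) finds the link. */
static const int base[NUM_LINKS + 1] = { 0, 8, 16, 24, 32 };

int route(int destination) {
    for (int link = 0; link < NUM_LINKS; link++) {
        if (destination >= base[link] && destination < base[link + 1])
            return link;
    }
    return -1;  /* destination outside the labelled range */
}

int main(void) {
    printf("destination 5  -> link %d\n", route(5));   /* link 0 */
    printf("destination 20 -> link %d\n", route(20));  /* link 2 */
    return 0;
}
```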

In some networks, multiple links will connect to the same STC104 or processor endpoint, or to a set of equivalent devices. In this circumstance, the STC104 provides a mechanism for grouped adaptive routing, where bundles of links can share the same interval and a link is chosen adaptively from a bundle based on its availability. [7] This mechanism makes efficient use of the available link bandwidth by ensuring a packet does not wait for a link while another equivalent one is available.
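
The following sketch illustrates the idea of grouped adaptive routing. Modelling availability with a simple per-link busy flag is an assumption made for illustration and does not reflect the STC104's actual arbitration logic.

```c
#include <stdbool.h>

/* Grouped adaptive routing: several output links share one interval and a
 * packet may use any of them.  Here a link group is a contiguous range of
 * link numbers and a packet takes the first link in the group that is free. */
struct link_group {
    int first, count;             /* links first .. first+count-1 */
};

int choose_link(const struct link_group *g, const bool busy[]) {
    for (int i = 0; i < g->count; i++) {
        int link = g->first + i;
        if (!busy[link])
            return link;          /* an equivalent link is free: use it */
    }
    return -1;                    /* all links in the group are busy: wait */
}
```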

An additional ability of interval routing is to partition the network into independent subnetworks. This can be used to prevent deadlock or to allow high-priority traffic to travel without contention.

Header deletion

To support routing in hierarchical networks, such as multi-stage butterfly or Clos networks, the STC104 provides a mechanism for header deletion. Each output link that is connected to the next level of the hierarchy can be programmed to discard the header, so that the packet is subsequently routed by the new packet header, which immediately follows the deleted one. [8]

Header deletion can also be used to implement two-phase randomized routing, a method for avoiding network contention in which each packet is first routed to a randomly chosen intermediate node and then routed from there to its destination. [9] The effect is to make worst-case traffic patterns perform like average-case ones, with predictable latency and bandwidth. The STC104 implements this by configuring the links where traffic enters the network to prepend a header containing a random intermediate destination. That destination is another STC104 device, which recognises and deletes the random header before routing the packet, on its original header, to its actual destination.
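
A rough sketch of the two phases is shown below, treating a packet as a byte array whose leading byte is the current header. The one-byte header width and the random number source are illustrative assumptions, not the STC104's actual header format.

```c
#include <stdlib.h>
#include <string.h>

/* Two-phase randomized routing with header deletion, sketched on a packet
 * represented as a byte array whose first byte is the current header.
 * The caller must provide a buffer with at least one spare byte. */

/* Phase 1: at the link where traffic enters the network, prepend a header
 * naming a randomly chosen intermediate switch. */
size_t prepend_random_header(unsigned char *packet, size_t len,
                             int num_intermediates) {
    memmove(packet + 1, packet, len);            /* make room at the front */
    packet[0] = (unsigned char)(rand() % num_intermediates);
    return len + 1;
}

/* Phase 2: the output link at the intermediate switch is configured for
 * header deletion, so it strips the random header; the packet is then
 * routed on the original header, which now leads the packet. */
size_t delete_header(unsigned char *packet, size_t len) {
    memmove(packet, packet + 1, len - 1);
    return len - 1;
}
```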

Since randomly routing messages via an intermediate destination can create cyclic dependencies between different packets, deadlock can occur. However, deadlock can be avoided by partitioning the network into two components: one for the randomizing phase and one for the destination phase. [10]

Network topologies

The STC104 can be used to construct a variety of network topologies, including multi-dimensional grids and tori, hypercubes, and Clos networks (and the closely related fat tree). [11]

Links

The STC104 links are called DS-Links. A single DS-Link is a unidirectional, asynchronous, flow-controlled connection that operates serially, with a bandwidth of up to 100 Mbit/s. [12]

Physically, a DS-Link is implemented with two wires: a data wire that carries the signal and a strobe wire that changes only when the data does not. The strobe signal allows the receiver to recover the transmitter's clock and synchronise to it, so the transmitter and receiver can maintain their own clocks with potentially varying frequency and phase.
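
The sketch below shows the data-strobe encoding rule, under which exactly one of the two wires changes in each bit period. It assumes both wires start low; the framing of real DS-Link traffic is not modelled.

```c
#include <stdio.h>

/* Data-strobe (DS) encoding: in each bit period exactly one of the two
 * wires changes.  The data wire carries the bit value; the strobe wire
 * toggles only when the data wire does not change, so the receiver gets
 * a transition (and hence a clock edge) for every bit. */
void ds_encode(const int *bits, int n, int *data_out, int *strobe_out) {
    int data = 0, strobe = 0;         /* assumed initial line state */
    for (int i = 0; i < n; i++) {
        if (bits[i] == data)
            strobe = !strobe;         /* data unchanged: toggle strobe */
        data = bits[i];               /* data wire always carries the bit */
        data_out[i]   = data;
        strobe_out[i] = strobe;
    }
}

int main(void) {
    int bits[] = { 1, 1, 0, 0, 1, 0 };
    int d[6], s[6];
    ds_encode(bits, 6, d, s);
    for (int i = 0; i < 6; i++)
        printf("bit %d: data=%d strobe=%d\n", bits[i], d[i], s[i]);
    return 0;
}
```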

A DS-Link transfers data on the wires using a token protocol. A token can either carry one byte of data or a control message, such as flow control, end of packet, or end of message. A single bit distinguishes the token type, and an additional parity bit is used for error detection. A byte is therefore encoded in 10 bits and a control token in 4 bits.
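
The token formats can be sketched as follows. The bit packing, the control-code values, and the per-token parity calculation are illustrative simplifications rather than the exact DS-Link encoding.

```c
#include <stdint.h>

/* Simplified token layout for a DS-Link style protocol.  A data token is
 * 10 bits (a parity bit, a type flag of 0, and 8 data bits); a control token
 * is 4 bits (a parity bit, a type flag of 1, and a 2-bit control code). */

enum control_code { FLOW_CONTROL = 0, END_OF_PACKET = 1, END_OF_MESSAGE = 2 };

static int odd_parity(uint32_t bits) {
    int p = 1;                        /* make the total number of 1 bits odd */
    while (bits) { p ^= bits & 1; bits >>= 1; }
    return p;
}

/* 10-bit data token: [parity | flag=0 | 8 data bits] */
uint16_t encode_data_token(uint8_t byte) {
    uint16_t t = (uint16_t)byte;                 /* flag bit (bit 8) is 0 */
    return t | ((uint16_t)odd_parity(t) << 9);
}

/* 4-bit control token: [parity | flag=1 | 2-bit code] */
uint8_t encode_control_token(enum control_code code) {
    uint8_t t = (uint8_t)code | (1u << 2);
    return t | ((uint8_t)odd_parity(t) << 3);
}
```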

Each DS-Link has a buffer large enough to store eight tokens. To prevent tokens from arriving when the buffer is full, a token-level flow-control mechanism is used, in which the receiver automatically sends control tokens to the sender when there is space in its buffer.
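
A minimal credit-based sketch of this mechanism follows. Returning one credit per consumed token is an illustrative simplification rather than the STC104's exact flow-control behaviour.

```c
/* Token-level flow control sketched as a credit scheme: the sender may only
 * transmit while it holds credit, and the receiver returns credit (by
 * sending a flow-control token) whenever it frees a buffer slot. */
#define BUFFER_TOKENS 8

int sender_credit   = BUFFER_TOKENS;  /* initial credit: the empty buffer */
int receiver_buffer = 0;              /* tokens waiting in the 8-token buffer */

void send_token(void) {
    if (sender_credit > 0) {          /* never overrun the receiver's buffer */
        sender_credit--;
        receiver_buffer++;
    }
}

void receiver_consumes_token(void) {
    if (receiver_buffer > 0) {
        receiver_buffer--;
        sender_credit++;              /* flow-control token returns the credit */
    }
}
```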

Microarchitecture

The STC104 can be classified as a special-purpose MIMD processor with distributed control. [1] The main components are 32 link slices connected to the crossbar, together with logic for global services such as initialisation and reset. Each link slice contains a pair of DS-Links providing a single input and output, along with additional logic to implement the routing functionality and provide buffering. The link slices operate concurrently and independently, with their state determined only by their configuration parameters and the data flowing through them.

Physical implementation

The STC104 was designed and manufactured on a 1.0 micron CMOS process (SGS-Thomson HCMOS4) with three metal layers for routing. The chip had an area of approximately 204.6 mm², contained 1.875 million transistors, and dissipated up to 5 W of power when operating at 50 MHz. [1]

Notes

  1. Thompson 1994.
  2. May 1993.
  3. May 1993, Chapter 3.
  4. Van Leeuwen 1987.
  5. May 1993, Section 3.6.3.
  6. Jones 1997, Section 3.4.
  7. May 1993, Section 3.6.5.
  8. May 1993, Section 3.6.2.
  9. Valiant 1982.
  10. May 1993, Section 1.6.1.
  11. Jones 1997.
  12. May 1993, Chapter 3, Chapter 4.
