Flit (computer networking)

In computer networking, a flit (flow control unit or flow control digit) is a link-level atomic piece that forms a network packet or stream. [1] The first flit, called the header flit, holds information about the packet's route (namely the destination address) and sets up the routing behavior for all subsequent flits associated with the packet. The header flit is followed by zero or more body flits, which contain the actual payload of data. The final flit, called the tail flit, performs some bookkeeping to close the connection between the two nodes.
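The header/body/tail structure described above can be illustrated with a minimal sketch. The `Flit` class, the `packetize` function, and the choice to carry payload only in body flits are assumptions made for illustration; real networks define their own flit formats.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Flit:
    kind: str                    # "head", "body", or "tail"
    packet_id: int               # ties the flit to its packet
    dest: Optional[int] = None   # routing information, header flit only
    payload: bytes = b""         # data, body flits only (an assumption)


def packetize(packet_id: int, dest: int, data: bytes, flit_bytes: int) -> List[Flit]:
    """Split one packet into a header flit, zero or more body flits, and a tail flit."""
    if flit_bytes <= 0:
        raise ValueError("flit_bytes must be positive")
    flits = [Flit("head", packet_id, dest=dest)]
    for start in range(0, len(data), flit_bytes):
        flits.append(Flit("body", packet_id, payload=data[start:start + flit_bytes]))
    flits.append(Flit("tail", packet_id))
    return flits


# A 10-byte payload with 4-byte flits yields three body flits.
example = packetize(packet_id=7, dest=3, data=b"abcdefghij", flit_bytes=4)
```

Only the header flit carries the destination here, which is why the subsequent flits must follow the route that the header flit set up.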

A virtual connection holds the state needed to coordinate the handling of the flits of a packet. At a minimum, this state identifies the output port of the current node for the next hop of the route and the state of the virtual connection (idle, waiting for resources, or active). The virtual connection may also include pointers to the flits of the packet that are buffered on the current node and the number of flit buffers available on the next node. [2] :237
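As a rough sketch of the state listed above, the following models a virtual connection with an output port, a status, a queue of buffered flits, and a count of free flit buffers on the next node. The class and method names, and the rule that one credit is consumed per forwarded flit, are illustrative assumptions rather than a description of any particular router.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum
from typing import Deque, Optional


class Status(Enum):
    IDLE = "idle"
    WAITING = "waiting for resources"
    ACTIVE = "active"


@dataclass
class VirtualConnection:
    credits: int = 0                        # free flit buffers on the next node
    status: Status = Status.IDLE
    output_port: Optional[int] = None       # next hop chosen for the packet
    buffered: Deque[str] = field(default_factory=deque)  # flits held on this node

    def enqueue(self, flit_kind: str, output_port: Optional[int] = None) -> None:
        """Buffer a flit; a header flit also sets up the route and status."""
        if flit_kind == "head":
            self.output_port = output_port
            self.status = Status.ACTIVE if self.credits > 0 else Status.WAITING
        self.buffered.append(flit_kind)

    def add_credit(self) -> None:
        """The next node freed one buffer slot."""
        self.credits += 1
        if self.status is Status.WAITING:
            self.status = Status.ACTIVE

    def forward(self) -> Optional[str]:
        """Send one buffered flit downstream if the connection is active."""
        if self.status is not Status.ACTIVE or not self.buffered:
            return None
        flit = self.buffered.popleft()
        self.credits -= 1
        if flit == "tail":                  # the tail flit releases the connection
            self.status = Status.IDLE
            self.output_port = None
        elif self.credits == 0:
            self.status = Status.WAITING
        return flit
```

In this sketch the connection returns to idle only when the tail flit leaves, matching the idea that the tail flit closes the connection.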

Interconnect network: basics

The growing demand for performance from computing systems drove the industry into the multi-core and many-core arena. In this setup, the execution of a kernel (a program) is split across multiple processors and the computation happens in parallel, which reduces execution time. This, however, implies that the processors must be able to communicate with each other and exchange data and control signals seamlessly. One straightforward approach is the bus-based interconnect, a group of wires connecting all the processors. This approach, however, does not scale as the number of processors in the system increases. [3] Hence, a scalable, high-performance interconnect network lies at the core of parallel computer architecture.

Basic network terminology and background

Definitions of an Interconnection network

The formal definition of an interconnection network is:

"An interconnection network I is represented by a strongly connected directed multigraph, I = G(N,C). The set of vertices of the multigraph N includes the set of processing element nodes P and the set of router nodes RT. The set of arcs C represents the set of unidirectional channels (possibly virtual) that connect either the processing elements to the routers or the routers to each other". [4]

The primary expectation of an interconnection network is low latency, that is, the time taken to transfer a message from one node to another should be minimal, while allowing a large number of such transactions to take place concurrently. [5] As with any engineering design trade-off, the interconnection network must achieve these traits while keeping the cost of implementation as low as possible. With these expectations in mind, the following design points can be tuned to obtain the necessary performance.

The basic building blocks of an interconnection network are its topology, routing algorithm, switching strategy and the flow control mechanism.

Topology: This refers to the general infrastructure of the interconnection network, that is, the pattern in which multiple processors are connected. This pattern can be either regular or irregular, though many multi-core architectures today use highly regular interconnection networks.

Routing algorithm: This determines the path a message must take to ensure delivery to the destination node. The choice of path is based on metrics such as latency, security and the number of nodes involved. There are many different routing algorithms, providing different guarantees and offering different performance trade-offs.

Switching strategy: The routing algorithm only determines the path that a message must take to reach its destination node. The actual traversal of the message within the network is the responsibility of the switching strategy. There are two basic types of switching strategy. In a circuit-switched network, a path is reserved and blocked off from other messages until the message is delivered to its destination node. A well-known example is the telephone network, which establishes a circuit through many switches for a call. The alternative is the packet-switched network, in which messages are broken down into smaller entities called packets. Each packet contains part of the data along with a sequence number, so each packet can be transferred individually and the message reassembled at the destination based on the sequence numbers.

Flow control: Multiple messages can flow through the interconnect network at any given time. The flow control mechanism, implemented at the router level, decides which message gets to flow and which message is held back.
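The packet-switched approach under the switching strategy above, where each packet carries a sequence number and the message is reassembled at the destination, might look like the following minimal sketch. The function names and the representation of a packet as a `(sequence number, chunk)` tuple are assumptions made for illustration.

```python
import random
from typing import List, Tuple

Packet = Tuple[int, bytes]  # (sequence number, chunk of the message)


def to_packets(message: bytes, chunk_bytes: int) -> List[Packet]:
    """Break a message into packets, each tagged with a sequence number."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    starts = range(0, len(message), chunk_bytes)
    return [(seq, message[start:start + chunk_bytes]) for seq, start in enumerate(starts)]


def reassemble(packets: List[Packet]) -> bytes:
    """Rebuild the message by ordering packets on their sequence numbers."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(chunk for _, chunk in ordered)


# Packets may arrive in any order; shuffling stands in for independent routes.
message = b"flits form packets, packets form messages"
in_flight = to_packets(message, chunk_bytes=8)
random.Random(0).shuffle(in_flight)
restored = reassemble(in_flight)
```

Because every packet is self-describing, no path needs to be reserved in advance, which is the main contrast with circuit switching.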

Characteristics and metrics of a network

Every network has a width w and a transmission rate f, which together determine the bandwidth of the network as b = w*f. The amount of data transferred in a single cycle is called a physical unit, or phit. The width of a network is therefore equal to the phit size, and the bandwidth can also be expressed in phits per second. Each message to be transferred can be broken down into smaller fixed-length chunks called packets. Packets may in turn be broken down into message flow control units, or flits.
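The relation b = w*f can be checked with a short calculation. The link parameters below (a 16-bit-wide link clocked at 1 GHz) are hypothetical values chosen only to make the arithmetic concrete, and the function name is likewise an assumption.

```python
def bandwidth_bits_per_second(width_bits: int, rate_hz: float) -> float:
    """Bandwidth b = w * f for a link of width w bits and transmission rate f."""
    if width_bits <= 0 or rate_hz <= 0:
        raise ValueError("width and rate must be positive")
    return width_bits * rate_hz


# Hypothetical link: 16 wires (one 16-bit phit per cycle) at 1 GHz.
width_bits = 16
rate_hz = 1e9
b = bandwidth_bits_per_second(width_bits, rate_hz)

# Since one phit crosses the link per cycle, the rate in phits per second
# is the bandwidth divided by the phit size, which equals f.
phits_per_second = b / width_bits
```

With these assumed values the link carries 16 Gbit/s, or equivalently one billion phits per second.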

The need for flits

It is important to note that flits represent logical units of information, while phits represent the physical domain, that is, the number of bits that can be transferred in parallel in a single cycle. Consider the Cray T3D. [6] Its interconnection network uses flit-level message flow control in which each flit is composed of eight 16-bit phits. Its flit size is therefore 128 bits and its phit size is 16 bits. Also consider the IBM SP2 switch. [7] It also uses flit-level message flow control, but its flit size is equal to its phit size, which is set to 8 bits.
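The two examples above can be restated as a small calculation of how many phits, and hence link cycles, one flit occupies. The function name is an assumption; the sizes are the ones quoted in the text.

```python
def phits_per_flit(flit_bits: int, phit_bits: int) -> int:
    """Number of phits (one per link cycle) needed to move one flit."""
    if flit_bits <= 0 or phit_bits <= 0:
        raise ValueError("sizes must be positive")
    if flit_bits % phit_bits != 0:
        raise ValueError("this sketch assumes the flit is a whole number of phits")
    return flit_bits // phit_bits


# Sizes as stated in the text.
cray_t3d = phits_per_flit(flit_bits=128, phit_bits=16)  # eight phits per flit
ibm_sp2 = phits_per_flit(flit_bits=8, phit_bits=8)      # flit and phit coincide
```

A flit on the Cray T3D therefore takes eight link cycles to transfer, while on the IBM SP2 switch each flit crosses the link in a single cycle.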

Flit width determination

The message size is the dominant factor, among many others, in deciding the flit width. Based on the message size, there are two conflicting design choices:

Based on the size of the packets, the width of the physical link between two routers has to be decided. If the packet size is large, the link width also has to be large; however, a larger link width implies more area and higher power dissipation. In general, link widths are kept to a minimum. The link width (which also determines the phit width) then factors into the choice of flit width. [8]

Although inter-router transfers are necessarily carried out in terms of phits, the switching techniques deal in terms of flits. [8] For more details regarding the various switching techniques, refer to Wormhole switching and Cut-through switching. Since the majority of switching techniques operate on flits, they also have a major impact on the choice of flit width. Other determining factors include reliability, performance and implementation complexity.

Example

Figure: An example of how flits work in a network.

Consider an example of how packets are transmitted in terms of flits. In this case, a packet is transmitted between A and B in the figure. The transmission proceeds in the following steps.

Summary

A flit (flow control unit or flow control digit) is the unit of data in which a message is transmitted at the link level. A flit can be accepted or rejected at the receiver side based on the flow control protocol and the size of the receive buffer. Link-level flow control works by having the receiver send a continuous stream of signals that tell the sender whether to keep sending flits or to stop. When a packet is transmitted over a link, it must be split into multiple flits before transmission begins. [3]
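The stop/go behavior summarized above can be sketched as a toy cycle-by-cycle simulation. Everything here is an assumption made for illustration: the `Receiver` class, the `transmit` function, the single-flit-per-cycle link, and the fixed drain rate are not taken from any specific protocol.

```python
from collections import deque
from typing import List, Tuple


class Receiver:
    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.buffer = deque()

    def go(self) -> bool:
        """Signal sent back to the sender: True means keep sending."""
        return len(self.buffer) < self.capacity

    def accept(self, flit: str) -> bool:
        """Accept the flit if buffer space allows, otherwise reject it."""
        if not self.go():
            return False
        self.buffer.append(flit)
        return True


def transmit(flits: List[str], capacity: int, drain_every: int) -> Tuple[List[str], int, int]:
    """Send flits one per cycle; the receiver consumes one flit every drain_every cycles."""
    if drain_every < 1:
        raise ValueError("drain_every must be at least 1")
    receiver = Receiver(capacity)
    pending = deque(flits)
    delivered: List[str] = []
    cycles = 0
    peak = 0
    while pending or receiver.buffer:
        cycles += 1
        # The sender offers the next flit; a rejected flit is retried next cycle.
        if pending and receiver.accept(pending[0]):
            pending.popleft()
        peak = max(peak, len(receiver.buffer))
        if cycles % drain_every == 0 and receiver.buffer:
            delivered.append(receiver.buffer.popleft())
    return delivered, cycles, peak


delivered, cycles, peak = transmit(["head", "body", "body", "body", "tail"], capacity=2, drain_every=2)
```

In this sketch the flits arrive in order and the receive buffer never holds more than `capacity` flits, at the cost of stalled cycles whenever the receiver signals stop.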

References

  1. "Archived copy" (PDF). Archived from the original (PDF) on 2015-03-20. Retrieved 2018-10-25.
  2. William James Dally; Brian Towles (2004). "13.2.1". Principles and Practices of Interconnection Networks. Morgan Kaufmann Publishers, Inc. ISBN 978-0-12-200751-4.
  3. Solihin, Yan (2009). Fundamentals of Parallel Computer Architecture, Multichip and Multicore Systems. Solihin Publishing & Consulting LLC. p. 363.
  4. Duato, J.; Lysne, O.; Pang, R.; Pinkston, T. M. (2005-05-01). "A theory for deadlock-free dynamic network reconfiguration. Part I". IEEE Transactions on Parallel and Distributed Systems. 16 (5): 412–427. doi:10.1109/TPDS.2005.58. ISSN 1045-9219. S2CID 15354425.
  5. Elsevier. "Parallel Computer Architecture - 1st Edition". www.elsevier.com. Retrieved 2016-12-03.
  6. Scott, Steven L.; Thorson, Greg (1994-01-01). "Optimized Routing in the Cray T3D". Proceedings of the First International Workshop on Parallel Computer Routing and Communication. PCRCW '94. London, UK: Springer-Verlag. 853: 281–294. doi:10.1007/3-540-58429-3_44. ISBN 978-3540584292.
  7. "The communication software and parallel environment of the IBM SP2". domino.research.ibm.com. 2001-02-23. Retrieved 2016-11-29.
  8. Duato, Jose (2011-08-06). Interconnection Networks. Morgan Kaufmann. ISBN 9780123991805.