Head-of-line blocking (HOL blocking) in computer networking is a performance-limiting phenomenon that occurs when a queue of packets is held up by the first packet in the queue. Examples include input-buffered network switches, out-of-order delivery, and multiple requests in HTTP pipelining.
A switch may be composed of buffered input ports, a switch fabric, and buffered output ports. If first-in first-out (FIFO) input buffers are used, only the oldest packet is available for forwarding. If the oldest packet cannot be transmitted because its target output is busy, then more recent arrivals cannot be forwarded. An output is busy when two or more inputs hold packets destined for it in the same cycle (output contention).
Without HOL blocking, the new arrivals could potentially be forwarded around the stuck oldest packet to their respective destinations. HOL blocking can produce performance-degrading effects in input-buffered systems.
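A minimal sketch of one forwarding time slot makes the effect concrete. The function and packet representation below are illustrative assumptions, not any particular switch's design: each packet is represented only by the index of the output port it wants, and each output can accept at most one packet per slot.

```python
# Illustrative sketch of HOL blocking with FIFO input buffers.
# Each packet is just the index of the output port it is destined for.
from collections import deque

def forward_one_slot(input_queues):
    """Forward at most one packet to each output in a single time slot.

    Each input may only offer its *head* packet. If that packet's output
    has already been claimed this slot, the whole input queue stalls,
    even when packets behind the head target idle outputs.
    """
    claimed = set()   # outputs already matched in this slot
    delivered = []
    for queue in input_queues:
        if queue and queue[0] not in claimed:
            output = queue.popleft()
            claimed.add(output)
            delivered.append(output)
    return delivered

# Both head packets want output 0, so input B's packet for the idle
# output 1 is blocked this slot: only one packet moves.
input_a = deque([0, 1])
input_b = deque([0, 1])
print(forward_one_slot([input_a, input_b]))  # [0] -- output 1 sits idle
```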
This phenomenon limits the throughput of switches. For FIFO input buffers, a simple model with fixed-size cells and uniformly distributed destinations shows that throughput is limited to 58.6% of the total as the number of links becomes large. [1]
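Under those assumptions (ports growing without bound, each head-of-line cell choosing its output uniformly at random), the 58.6% figure is the saturation throughput from the head-of-line queueing analysis in [1], the root in (0, 1) of a quadratic:

```latex
% Saturation throughput of a FIFO input-queued switch: uniform traffic,
% fixed-size cells, number of ports -> infinity.
\rho^2 - 4\rho + 2 = 0
\quad\Longrightarrow\quad
\rho = 2 - \sqrt{2} \approx 0.586
```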
One way to overcome this limitation is to use virtual output queues. [2]
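A sketch of the same time slot with virtual output queues shows the difference. As above, the structure and names are illustrative; a real switch would run a matching algorithm such as iSLIP rather than the simple first-come scan used here.

```python
# Illustrative sketch of virtual output queueing: each input keeps one
# queue per output, so a contended output only delays its own traffic.
from collections import deque

def forward_one_slot_voq(voqs):
    """voqs[i][j] holds input i's packets destined for output j."""
    delivered = []
    num_outputs = len(voqs[0])
    for out in range(num_outputs):
        # Grant the output to the first input holding traffic for it.
        for input_queues in voqs:
            if input_queues[out]:
                delivered.append(input_queues[out].popleft())
                break
    return delivered

# Same traffic as the FIFO example. With VOQs, both outputs are busy in
# the same slot: two packets forwarded instead of one.
input_a = [deque(["a->0"]), deque(["a->1"])]
input_b = [deque(["b->0"]), deque(["b->1"])]
print(forward_one_slot_voq([input_a, input_b]))  # ['a->0', 'a->1']
```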
Only switches with input buffering can suffer HOL blocking. With sufficient internal bandwidth, input buffering is unnecessary; all buffering is handled at outputs, and HOL blocking is avoided. This no-input-buffering architecture is common in small- to medium-sized Ethernet switches.
Out-of-order delivery occurs when sequenced packets arrive out of order. This may happen because packets follow different paths through the network or because packets are dropped and retransmitted. HOL blocking can significantly increase packet reordering. [3] [4]
Reliably broadcasting messages across a lossy network among a large number of peers is a difficult problem. While atomic broadcast algorithms solve the single point of failure problem of centralized servers, those algorithms introduce a head-of-line blocking problem. [5] The Bimodal Multicast algorithm, a randomized algorithm that uses a gossip protocol, avoids head-of-line blocking by allowing some messages to be received out of order. [6]
One form of HOL blocking in HTTP/1.1 occurs when the number of allowed parallel requests in the browser is used up, and subsequent requests must wait for earlier ones to complete. HTTP/2 addresses this issue through request multiplexing, which eliminates HOL blocking at the application layer, but HOL blocking still exists at the transport (TCP) layer. [7] [8]
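The effect can be approximated with a toy timing model (purely illustrative; the request durations and the greedy scheduler are assumptions): with a cap of one connection, three fast responses queue behind one slow one, while with enough parallelism, analogous to multiplexing, they complete independently.

```python
# Toy timing model of the browser's parallel-request cap. Each request
# greedily takes the earliest-free connection; with too few connections,
# fast responses queue behind slow ones (application-layer HOL blocking).
import heapq

def finish_times(durations, max_conns):
    free_at = [0.0] * max_conns          # when each connection frees up
    heapq.heapify(free_at)
    finished = []
    for d in durations:
        start = heapq.heappop(free_at)   # earliest available connection
        heapq.heappush(free_at, start + d)
        finished.append(start + d)
    return finished

requests = [3.0, 0.1, 0.1, 0.1]  # one slow response ahead of three fast ones
print(finish_times(requests, max_conns=1))  # [3.0, 3.1, 3.2, 3.3]
print(finish_times(requests, max_conns=4))  # [3.0, 0.1, 0.1, 0.1]
```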
Head-of-line blocking can occur in reliable byte streams: if packets are reordered or lost and need to be retransmitted (and thus arrive out of order), data from sequentially later parts of the stream may be received before sequentially earlier parts; the later data typically cannot be used until the earlier data has arrived, incurring latency. If multiple independent higher-level messages are encapsulated and multiplexed onto a single reliable byte stream, then head-of-line blocking can cause processing of a fully received message that was sent later to wait for delivery of a message that was sent earlier. [9] This affects, for example, HTTP/2, which frames multiple request–response pairs onto a single stream; HTTP/3, which frames messages over independent streams carried in datagrams rather than over a single byte stream, avoids this problem. [10] [11] The latency degradation from head-of-line blocking depends on the underlying packet loss rate and round-trip time, with higher losses producing worse latency. [12] [13] Without changing the stream abstraction, reducing packet loss can reduce the harm from head-of-line blocking; an alternative is to implement the reliable byte stream using forward error correction, sending redundant data so that a certain amount of loss can be tolerated without incurring retransmissions. [9]
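A minimal receiver sketch (a hypothetical class, not an actual TCP implementation) shows the mechanism: segments that arrive ahead of a gap sit in a buffer, and nothing is delivered to the application until the missing earlier bytes arrive.

```python
# Illustrative in-order receiver for a reliable byte stream: segments
# are buffered until every earlier byte has arrived, so one lost or
# delayed segment stalls delivery of all data behind it.
class InOrderReceiver:
    def __init__(self):
        self.next_offset = 0   # next byte offset the application may read
        self.pending = {}      # buffered segments, keyed by byte offset

    def receive(self, offset, data):
        """Buffer a segment; return whatever is now deliverable in order."""
        self.pending[offset] = data
        delivered = b""
        while self.next_offset in self.pending:
            chunk = self.pending.pop(self.next_offset)
            delivered += chunk
            self.next_offset += len(chunk)
        return delivered

rx = InOrderReceiver()
print(rx.receive(5, b"world"))  # b'' -- blocked: bytes 0-4 are missing
print(rx.receive(0, b"hello"))  # b'helloworld' -- the gap fill unblocks all
```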
See also: Transmission Control Protocol; TCP congestion control; wormhole flow control; scheduling (computing); network congestion; network traffic models; cache replacement policies; out-of-order execution; random early detection; active queue management; network processor; load-balanced switch; packet loss; fair queuing; bandwidth management; reliable byte stream; virtual output queueing; bufferbloat; Deterministic Networking.