CoDel

Last updated

CoDel (Controlled Delay; pronounced "coddle") is an active queue management (AQM) algorithm in network routing, developed by Van Jacobson and Kathleen Nichols and published as RFC8289. [1] It is designed to overcome bufferbloat in networking hardware, such as routers, by setting limits on the delay network packets experience as they pass through buffers in this equipment. CoDel aims to improve on the overall performance of the random early detection (RED) algorithm by addressing some of its fundamental misconceptions, as perceived by Jacobson, and by being easier to manage.

Contents

In 2012, an implementation of CoDel was written by Dave Täht and Eric Dumazet for the Linux kernel and dual licensed under the GNU General Public License and the 3-clause BSD license. Dumazet's improvement on CoDel is called FQ-CoDel, standing for "Fair/Flow Queue CoDel"; it was first adopted as the standard AQM and packet scheduling solution in 2014 in the OpenWrt 14.07 release called "Barrier Breaker". From there, CoDel and FQ-CoDel have migrated into various downstream projects such as Tomato, dd-wrt, OPNsense and Ubiquiti's "Smart Queues" feature.

Theory

CoDel is based on observations of packet behavior in packet-switched networks under the influence of data buffers. Some of these observations are about the fundamental nature of queueing and the causes of bufferbloat, others relate to weaknesses of alternative queue management algorithms. CoDel was developed as an attempt to address the problem of bufferbloat. [2]

Bufferbloat

The flow of packets slows down while traveling through a network link between a fast and a slow network, especially at the start of a TCP session, when there is a sudden burst of packets and the slower network may not be able to accept the burst quickly enough. Buffers exist to ease this problem by giving the fast network a place to store packets to be read by the slower network at its own pace. [3] In other words, buffers act like shock absorbers to convert bursty arrivals into smooth, steady departures. However, a buffer has limited capacity. The ideal buffer is sized so it can handle a sudden burst of communication and match the speed of that burst to the speed of the slower network. Ideally, the shock-absorbing situation is characterized by a temporary delay for packets in the buffer during the transmission burst, after which the delay rapidly disappears and the network reaches a balance in offering and handling packets. [3]

The TCP congestion control algorithm relies on packet drops to determine the available bandwidth between two communicating devices. It speeds up the data transfer until packets start to drop, and then slows down the transmission rate. Ideally, it keeps speeding up and slowing down as it finds equilibrium at the speed of the link. For this to work, the packet drops must occur in a timely manner so that the algorithm can responsively select a suitable transfer speed. With packets held in an overly-large buffer, the packets will arrive at their destination but with a higher latency but no packets are dropped so TCP does not slow down. Under these conditions, TCP may even decide that the path of the connection has changed and repeat the search for a new equilibrium. [4] [5]

Having a big and constantly full buffer that causes increased transmission delays and reduced interactivity, especially when looking at two or more simultaneous transmissions over the same channel, is called bufferbloat. Available channel bandwidth can also end up being unused, as some fast destinations may not be reached due to buffers being clogged with data awaiting delivery to slow destinations.

Good and bad queues

CoDel distinguishes between two types of queue: [3] [6] A good queue is one that exhibits no bufferbloat. Communication bursts cause no more than a temporary increase in queue delay. The network link utilization is maximized. A bad queue exhibits bufferbloat. Communication bursts cause the buffer to fill up and stay filled, resulting in low utilization and a constantly high buffer delay. In order to be effective against bufferbloat, a solution in the form of an active queue management (AQM) algorithm must be able to recognize an occurrence of bufferbloat and react by deploying effective countermeasures.

Van Jacobson asserted in 2006 that existing algorithms have been using incorrect means of recognizing bufferbloat. [5] Algorithms like RED measure the average queue length and consider it a case of bufferbloat if the average grows too large. Jacobson demonstrated in 2006 that this measurement is not a good metric, as the average queue length rises sharply in the case of a communications burst. The queue can then dissipate quickly (good queue) or become a standing queue (bad queue). Other factors in network traffic can also cause false positives or negatives, causing countermeasures to be deployed unnecessarily. Jacobson suggested that average queue length actually contains no information at all about packet demand or network load. [3] [5] He suggested that a better metric might be the minimum queue length during a sliding time window. [3]

Algorithm

Based on Jacobson's notion from 2006, CoDel was developed to manage queues under control of the minimum delay experienced by packets in the running buffer window. The goal is to keep this minimum delay below 5 milliseconds. If the minimum delay rises to too high a value, packets are dropped from the queue until the delay drops below the maximum level. [3] Nichols and Jacobson cite several advantages to using nothing more than this metric: [3]

CoDel does nothing to manage the buffer if the minimum delay for the buffer window is below the maximum allowed value. It also does nothing if the buffer is relatively empty (if there are fewer than one MTU's worth of bytes in the buffer). [3] If these conditions do not hold, then CoDel drops packets probabilistically. [3]

The algorithm is independently computed at each network hop. The algorithm operates over an interval, initially 100 milliseconds. Per-packet queuing delay is monitored through the hop. As each packet is dequeued for forwarding, the queuing delay (amount of time the packet spent waiting in the queue) is calculated. The lowest queuing delay for the interval is stored. When the last packet of the interval is dequeued, if the lowest queuing delay for the interval is greater than 5 milliseconds, this single packet is dropped and the interval used for the next group of packets is shortened. If the lowest queuing delay for the interval is less than 5 milliseconds, the packet is forwarded and the interval is reset to 100 milliseconds.

When the interval is shortened, it is done so in accordance with the inverse square root of the number of successive intervals in which packets were dropped due to excessive queuing delay. The sequence of intervals is , , , , ...

Simulation results

CoDel has been tested in simulation tests by Nichols and Jacobson, at different MTUs and link rates and other variations of conditions. In general, results indicate: [3] [7]

Simulation was also performed by Greg White and Joey Padden at CableLabs. [8]

Implementation

A full implementation of CoDel was realized in May 2012 and made available as open-source software. [3] It was implemented within the Linux kernel (starting with the 3.5 mainline). [9] Dave Täht back-ported CoDel to Linux kernel 3.3 for project CeroWrt, which concerns itself among other things with bufferbloat, [10] where it was exhaustively tested. CoDel began to appear as an option in some proprietary/turnkey bandwidth management platforms in 2013. [11] FreeBSD had CoDel integrated into the 11.x [12] and 10.x [13] code branches in 2016. [14] An implementation is distributed with OpenBSD since version 6.2. [15]

Derived algorithms

Fair/Flow Queue CoDel (FQ-CoDel; fq_codel in Linux code) adds flow queuing to CoDel so that it differentiates between multiple simultaneous connections and works fairly. It gives the first packet in each stream priority, so that small streams can start and finish quickly for better use of network resources. CoDel co-author Van Jacobson recommends the use of fq_codel over codel where it's available. [16] FQ-CoDel is published as RFC8290. It is written by T. Hoeiland-Joergensen, P. McKenney, D. Täht, J. Gettys, and E. Dumazet, all members of the "bufferbloat project". [17]

Common Applications Kept Enhanced (CAKE; sch_cake in Linux code) is a combined traffic shaper and AQM algorithm presented by the bufferbloat project in 2018. It builds on the experience of using fq_codel with the HTB (Hierarchy Token Bucket) traffic shaper. It improves over the Linux htb+fq_codel implementation by reducing hash collisions between flows, reducing CPU utilization in traffic shaping, and in a few other ways. [18]

In 2022, Dave Täht reviewed the state of fq_codel and sch_cake implementations in the wild. He found that while many systems have switched to either as the default AQM, several implementations have dubious deviations from the standard. For example, Apple's implementation of fq_codel (default in iOS) has a very large number of users but no "codel" component. Täht also notes the general lack of hardware offloading, made more important by the increase in network traffic brought by the COVID-19 pandemic. [19]

See also

Related Research Articles

The Internet Control Message Protocol (ICMP) is a supporting protocol in the Internet protocol suite. It is used by network devices, including routers, to send error messages and operational information indicating success or failure when communicating with another IP address, for example, an error is indicated when a requested service is not available or that a host or router could not be reached. ICMP differs from transport protocols such as TCP and UDP in that it is not typically used to exchange data between systems, nor is it regularly employed by end-user network applications.

The Transmission Control Protocol (TCP) is one of the main protocols of the Internet protocol suite. It originated in the initial network implementation in which it complemented the Internet Protocol (IP). Therefore, the entire suite is commonly referred to as TCP/IP. TCP provides reliable, ordered, and error-checked delivery of a stream of octets (bytes) between applications running on hosts communicating via an IP network. Major internet applications such as the World Wide Web, email, remote administration, and file transfer rely on TCP, which is part of the Transport Layer of the TCP/IP suite. SSL/TLS often runs on top of TCP.

Explicit Congestion Notification (ECN) is an extension to the Internet Protocol and to the Transmission Control Protocol and is defined in RFC 3168 (2001). ECN allows end-to-end notification of network congestion without dropping packets. ECN is an optional feature that may be used between two ECN-enabled endpoints when the underlying network infrastructure also supports it.

Network congestion in data networking and queueing theory is the reduced quality of service that occurs when a network node or link is carrying more data than it can handle. Typical effects include queueing delay, packet loss or the blocking of new connections. A consequence of congestion is that an incremental increase in offered load leads either only to a small increase or even a decrease in network throughput.

<span class="mw-page-title-main">Jim Gettys</span> American computer programmer

Jim Gettys is an American computer programmer. He was involved in multiple computer related projects.

<span class="mw-page-title-main">Leaky bucket</span> Network traffic shaping and policing algorithm

The leaky bucket is an algorithm based on an analogy of how a bucket with a constant leak will overflow if either the average rate at which water is poured in exceeds the rate at which the bucket leaks or if more water than the capacity of the bucket is poured in all at once. It can be used to determine whether some sequence of discrete events conforms to defined limits on their average and peak rates or frequencies, e.g. to limit the actions associated with these events to these rates or delay them until they do conform to the rates. It may also be used to check conformance or limit to an average rate alone, i.e. remove any variation from the average.

The token bucket is an algorithm used in packet-switched and telecommunications networks. It can be used to check that data transmissions, in the form of packets, conform to defined limits on bandwidth and burstiness. It can also be used as a scheduling algorithm to determine the timing of transmissions that will comply with the limits set for the bandwidth and burstiness: see network scheduler.

<span class="mw-page-title-main">Random early detection</span> Algorithm

Random early detection (RED), also known as random early discard or random early drop is a queuing discipline for a network scheduler suited for congestion avoidance.

Transmission Control Protocol (TCP) uses a network congestion-avoidance algorithm that includes various aspects of an additive increase/multiplicative decrease (AIMD) scheme, along with other schemes including slow start and congestion window (CWND), to achieve congestion avoidance. The TCP congestion-avoidance algorithm is the primary basis for congestion control in the Internet. Per the end-to-end principle, congestion control is largely a function of internet hosts, not the network itself. There are several variations and versions of the algorithm implemented in protocol stacks of operating systems of computers that connect to the Internet.

Nagle's algorithm is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. It was defined by John Nagle while working for Ford Aerospace. It was published in 1984 as a Request for Comments (RFC) with title Congestion Control in IP/TCP Internetworks in RFC 896.

TCP tuning techniques adjust the network congestion avoidance parameters of Transmission Control Protocol (TCP) connections over high-bandwidth, high-latency networks. Well-tuned networks can perform up to 10 times faster in some cases. However, blindly following instructions without understanding their real consequences can hurt performance as well.

The Media Delivery Index (MDI) is a set of measures that can be used to monitor both the quality of a delivered video stream as well as to show system margin for IPTV systems by providing an accurate measurement of jitter and delay at network level (Internet Protocol, IP), which are the main causes for quality loss. Identifying and quantizing such problems in this kind of networks is key to maintaining high quality video delivery and providing indications that warn system operators with enough advance notice to allow corrective action.

Bandwidth management is the process of measuring and controlling the communications on a network link, to avoid filling the link to capacity or overfilling the link, which would result in network congestion and poor performance of the network. Bandwidth is described by bit rate and measured in units of bits per second (bit/s) or bytes per second (B/s).

In routers and switches, active queue management (AQM) is the policy of dropping packets inside a buffer associated with a network interface controller (NIC) before that buffer becomes full, often with the goal of reducing network congestion or improving end-to-end latency. This task is performed by the network scheduler, which for this purpose uses various algorithms such as random early detection (RED), Explicit Congestion Notification (ECN), or controlled delay (CoDel). RFC 7567 recommends active queue management as a best practice.

The TCP window scale option is an option to increase the receive window size allowed in Transmission Control Protocol above its former maximum value of 65,535 bytes. This TCP option, along with several others, is defined in RFC 7323 which deals with long fat networks (LFNs).

Blue is a scheduling discipline for the network scheduler developed by graduate student Wu-chang Feng for Professor Kang G. Shin at the University of Michigan and others at the Thomas J. Watson Research Center of IBM in 1999.

Bufferbloat is a cause of high latency and jitter in packet-switched networks caused by excess buffering of packets. Bufferbloat can also cause packet delay variation, as well as reduce the overall network throughput. When a router or switch is configured to use excessively large buffers, even very high-speed networks can become practically unusable for many interactive applications like voice over IP (VoIP), audio streaming, online gaming, and even ordinary web browsing.

<span class="mw-page-title-main">Network scheduler</span> Arbiter on a node in packet switching communication network

A network scheduler, also called packet scheduler, queueing discipline (qdisc) or queueing algorithm, is an arbiter on a node in a packet switching communication network. It manages the sequence of network packets in the transmit and receive queues of the protocol stack and network interface controller. There are several network schedulers available for the different operating systems, that implement many of the existing network scheduling algorithms.

Kathleen Nichols is an American computer scientist and computer networking expert. Nichols is the founder and CEO of Pollere, Inc, a network architecture and performance company based in California, US. Before founding Pollere, Nichols was VP of Network Science at Packet Design, where she was part of the founding team. Prior to Packet Design she was director of advanced Internet architectures in the Office of CTO at Cisco Systems.

<span class="mw-page-title-main">Dave Taht</span> American network engineer

Dave Täht is an American network engineer, musician, lecturer, asteroid exploration advocate, and Internet activist. He is the chief executive officer of TekLibre.

References

  1. Nichols, K.; Jacobson, V.; McGregor, A.; Iyengar, J. (Jan 2018). Controlled Delay Active Queue Management. IETF. doi: 10.17487/RFC8289 . RFC 8289.
  2. Joe Brockmeier (2012-05-08). "Good News for Solving Bufferbloat: CoDel Provides "No Knobs" Solution". ReadWriteWeb. Archived from the original on 2012-07-12. Retrieved 2012-08-16.
  3. 1 2 3 4 5 6 7 8 9 10 11 Nichols, Kathleen; Jacobson, Van (6 May 2012). "Controlling Queue Delay". ACM Queue. ACM Publishing. 55 (7): 42–50. doi:10.1145/2209249.2209264. S2CID   381738 . Retrieved 12 August 2012.
  4. Jacobson, Van; Karels, MJ (1988). "Congestion avoidance and control" (PDF). ACM SIGCOMM Computer Communication Review. 18 (4): 314–329. doi:10.1145/52325.52356. Archived from the original (PDF) on 2004-06-22.
  5. 1 2 3 Jacobson, Van (2006). "A rant on queues. A talk presented at MIT Lincoln Labs, Lexington, MA" (PDF). Retrieved 12 August 2012.
  6. Iljitsch van Beijnum (2012-05-10). "CoDel buffer management could solve the Internet's bufferbloat jams". Ars Technica. Retrieved 2012-08-16.
  7. Nichols, Kathleen (July 2012). "Controlled Delay (CoDel) Active Queue Management". Pollere Inc. Archived from the original on 22 August 2012. Retrieved 12 August 2012.
  8. Greg White; Joey Padden (November 2012). "Preliminary Study of Codel AGM in a Docsis Network" (PDF). cablelabs.com. Retrieved 2015-06-14.
  9. Gettys, Jim (22 May 2012). "A Milestone Reached: CoDel is in Linux!". jg's Ramblings. Retrieved 12 August 2012.
  10. "Cerowrt - Overview". Bufferbloat. Retrieved 2014-01-24.
  11. "Procera Packetlogic Changelog". proceranetworks.com. Archived from the original on 2013-07-24. Retrieved 2013-07-24.
  12. truckman (2016-05-26). "Import Dummynet AQM version 0.2.1 (CoDel, FQ-CoDel, PIE and FQ-PIE)".
  13. truckman (2016-06-10). "MFC Import Dummynet AQM version 0.2.1 (CoDel, FQ-CoDel, PIE and FQ-PIE)".
  14. Al Saadi, Rasool; Armitage, Grenville. "Implementing AQM in FreeBSD".
  15. "OpenBSD 6.2" . Retrieved 13 October 2017.
  16. "Benchmarking Codel and FQ Codel - Bufferbloat.net". www.bufferbloat.net.
  17. Toke Høiland-Jørgensen; Paul McKenney; Jim Gettys; Eric Dumazet (January 2018). The Flow Queue CoDel Packet Scheduler and Active Queue Management Algorithm. IETF. doi: 10.17487/RFC8290 . ISSN   2070-1721. RFC 8290.Experimental.
  18. "Cake - Bufferbloat.net". www.bufferbloat.net.
  19. Dave Täht (April 23, 2022). "The state of fq_codel and sch_cake worldwide". CeroWRT.