Token bucket

Last updated August 28, 2024

The token bucket is an algorithm used in packet-switched and telecommunications networks. It can be used to check that data transmissions, in the form of packets, conform to defined limits on bandwidth and burstiness (a measure of the unevenness or variations in the traffic flow). It can also be used as a scheduling algorithm to determine the timing of transmissions that will comply with the limits set for the bandwidth and burstiness: see network scheduler.

Overview

The token bucket algorithm is based on an analogy of a fixed capacity bucket into which tokens, normally representing a unit of bytes or a single packet of predetermined size, are added at a fixed rate. When a packet is to be checked for conformance to the defined limits, the bucket is inspected to see if it contains sufficient tokens at that time. If so, the appropriate number of tokens, e.g. equivalent to the length of the packet in bytes, are removed ("cashed in"), and the packet is passed, e.g., for transmission. The packet does not conform if there are insufficient tokens in the bucket, and the contents of the bucket are not changed. Non-conformant packets can be treated in various ways:

They may be dropped.
They may be enqueued for subsequent transmission when sufficient tokens have accumulated in the bucket.
They may be transmitted, but marked as being non-conformant, possibly to be dropped subsequently if the network is overloaded.

A conforming flow can thus contain traffic with an average rate up to the rate at which tokens are added to the bucket, and have a burstiness determined by the depth of the bucket. This burstiness may be expressed in terms of either a jitter tolerance, i.e. how much sooner a packet might conform (e.g. arrive or be transmitted) than would be expected from the limit on the average rate, or a burst tolerance or maximum burst size, i.e. how much more than the average level of traffic might conform in some finite period.

Algorithm

The token bucket algorithm can be conceptually understood as follows:

A token is added to the bucket every $1/r$ seconds.
The bucket can hold at the most $b$ tokens. If a token arrives when the bucket is full, it is discarded.
When a packet (network layer PDU) of n bytes arrives,
- if at least n tokens are in the bucket, n tokens are removed from the bucket, and the packet is sent to the network.
- if fewer than n tokens are available, no tokens are removed from the bucket, and the packet is considered to be non-conformant.

Variations

Implementers of this algorithm on platforms lacking the clock resolution necessary to add a single token to the bucket every $1/r$ seconds may want to consider an alternative formulation. Given the ability to update the token bucket every S milliseconds, the number of tokens to add every S milliseconds = $(r*S)/1000$ .

Properties

Average rate

Over the long run the output of conformant packets is limited by the token rate, $r$ .

Burst size

Let $M$ be the maximum possible transmission rate in bytes/second.

Then $T_{\text{max}}={\begin{cases}b/(M-r)&{\text{ if }}r<M\\\infty &{\text{ otherwise }}\end{cases}}$ is the maximum burst time, that is the time for which the rate $M$ is fully utilized.

The maximum burst size is thus $B_{\text{max}}=T_{\text{max}}*M$

Uses

The token bucket can be used in either traffic shaping or traffic policing. In traffic policing, nonconforming packets may be discarded (dropped) or may be reduced in priority (for downstream traffic management functions to drop if there is congestion). In traffic shaping, packets are delayed until they conform. Traffic policing and traffic shaping are commonly used to protect the network against excess or excessively bursty traffic, see bandwidth management and congestion avoidance. Traffic shaping is commonly used in the network interfaces in hosts to prevent transmissions being discarded by traffic management functions in the network.

The token bucket algorithm is also used in controlling database IO flow.^[1] In it, limitation applies to neither IOPS nor the bandwidth but rather to a linear combination of both. By defining tokens to be the normalized sum of IO request weight and its length, the algorithm makes sure that the time derivative of the aforementioned function stays below the needed threshold.

Comparison to leaky bucket

The token bucket algorithm is directly comparable to one of the two versions of the leaky bucket algorithm described in the literature.^[2]^[3]^[4]^[5] This comparable version of the leaky bucket is described on the relevant Wikipedia page as the leaky bucket algorithm as a meter. This is a mirror image of the token bucket, in that conforming packets add fluid, equivalent to the tokens removed by a conforming packet in the token bucket algorithm, to a finite capacity bucket, from which this fluid then drains away at a constant rate, equivalent to the process in which tokens are added at a fixed rate.

There is, however, another version of the leaky bucket algorithm,^[3] described on the relevant Wikipedia page as the leaky bucket algorithm as a queue. This is a special case of the leaky bucket as a meter, which can be described by the conforming packets passing through the bucket. The leaky bucket as a queue is therefore applicable only to traffic shaping, and does not, in general, allow the output packet stream to be bursty, i.e. it is jitter free. It is therefore significantly different from the token bucket algorithm.

These two versions of the leaky bucket algorithm have both been described in the literature under the same name. This has led to considerable confusion over the properties of that algorithm and its comparison with the token bucket algorithm. However, fundamentally, the two algorithms are the same, and will, if implemented correctly and given the same parameters, see exactly the same packets as conforming and nonconforming.

Hierarchical token bucket

The hierarchical token bucket (HTB) is a faster replacement for the class-based queueing (CBQ) queuing discipline in Linux.^[6] It is useful for limiting each client's download/upload rate so that the limited client cannot saturate the total bandwidth.

Three clients sharing the same outbound bandwidth. Link-bandwidth.jpg — Three clients sharing the same outbound bandwidth.

Conceptually, HTB is an arbitrary number of token buckets arranged in a hierarchy. The primary egress queuing discipline (qdisc) on any device is known as the root qdisc. The root qdisc will contain one class. This single HTB class will be set with two parameters, a rate and a ceil. These values should be the same for the top-level class, and will represent the total available bandwidth on the link.

In HTB, rate means the guaranteed bandwidth available for a given class and ceil (short for ceiling) indicates the maximum bandwidth that class is allowed to consume. When a class requests a bandwidth more than guaranteed, it may borrow bandwidth from its parent as long as both ceils are not reached. Hierarchical Token Bucket implements a classful queuing mechanism for the Linux traffic control system, and provides rate and ceil to allow the user to control the absolute bandwidth to particular classes of traffic as well as indicate the ratio of distribution of bandwidth when extra bandwidth become available (up to ceil).

When choosing the bandwidth for a top-level class, traffic shaping only helps at the bottleneck between the LAN and the Internet. Typically, this is the case in home and office network environments, where an entire LAN is serviced by a DSL or T1 connection.

Related Research Articles

Asynchronous Transfer Mode (ATM) is a telecommunications standard defined by the American National Standards Institute and ITU-T for digital transmission of multiple types of traffic. ATM was developed to meet the needs of the Broadband Integrated Services Digital Network as defined in the late 1980s, and designed to integrate telecommunication networks. It can handle both traditional high-throughput data traffic and real-time, low-latency content such as telephony (voice) and video. ATM provides functionality that uses features of circuit switching and packet switching networks by using asynchronous time-division multiplexing. ATM was seen in the 1990s as a competitor to Ethernet and networks carrying IP traffic as, unlike Ethernet, it was faster and designed with quality-of-service in mind, but it fell out of favor once Ethernet reached speeds of 1 gigabits per second.

Network throughput refers to the rate of message delivery over a communication channel, such as Ethernet or packet radio, in a communication network. The data that these messages contain may be delivered over physical or logical links, or through network nodes. Throughput is usually measured in bits per second, and sometimes in data packets per second or data packets per time slot.

In computer networking, integrated services or IntServ is an architecture that specifies the elements to guarantee quality of service (QoS) on networks. IntServ can for example be used to allow video and sound to reach the receiver without interruption.

Traffic shaping is a bandwidth management technique used on computer networks which delays some or all datagrams to bring them into compliance with a desired traffic profile. Traffic shaping is used to optimize or guarantee performance, improve latency, or increase usable bandwidth for some kinds of packets by delaying other kinds. It is often confused with traffic policing, the distinct but related practice of packet dropping and packet marking.

Network congestion in data networking and queueing theory is the reduced quality of service that occurs when a network node or link is carrying more data than it can handle. Typical effects include queueing delay, packet loss or the blocking of new connections. A consequence of congestion is that an incremental increase in offered load leads either only to a small increase or even a decrease in network throughput.

Deficit Round Robin (DRR), also Deficit Weighted Round Robin (DWRR), is a scheduling algorithm for the network scheduler. DRR is, like weighted fair queuing (WFQ), a packet-based implementation of the ideal Generalized Processor Sharing (GPS) policy. It was proposed by M. Shreedhar and G. Varghese in 1995 as an efficient and fair algorithm.

The leaky bucket is an algorithm based on an analogy of how a bucket with a constant leak will overflow if either the average rate at which water is poured in exceeds the rate at which the bucket leaks or if more water than the capacity of the bucket is poured in all at once. It can be used to determine whether some sequence of discrete events conforms to defined limits on their average and peak rates or frequencies, e.g. to limit the actions associated with these events to these rates or delay them until they do conform to the rates. It may also be used to check conformance or limit to an average rate alone, i.e. remove any variation from the average.

In communications, traffic policing is the process of monitoring network traffic for compliance with a traffic contract and taking steps to enforce that contract. Traffic sources which are aware of a traffic contract may apply traffic shaping to ensure their output stays within the contract and is thus not discarded. Traffic exceeding a traffic contract may be discarded immediately, marked as non-compliant, or left as-is, depending on administrative policy and the characteristics of the excess traffic.

The generic cell rate algorithm (GCRA) is a leaky bucket-type scheduling algorithm for the network scheduler that is used in Asynchronous Transfer Mode (ATM) networks. It is used to measure the timing of cells on virtual channels (VCs) and or Virtual Paths (VPs) against bandwidth and jitter limits contained in a traffic contract for the VC or VP to which the cells belong. Cells that do not conform to the limits given by the traffic contract may then be re-timed (delayed) in traffic shaping, or may be dropped (discarded) or reduced in priority (demoted) in traffic policing. Nonconforming cells that are reduced in priority may then be dropped, in preference to higher priority cells, by downstream components in the network that are experiencing congestion. Alternatively they may reach their destination if there is enough capacity for them, despite them being excess cells as far as the contract is concerned: see priority control.

Generalized processor sharing (GPS) is an ideal scheduling algorithm for process schedulers and network schedulers. It is related to the fair-queuing principle which groups packets into classes and shares the service capacity between them. GPS shares this capacity according to some fixed weights.

TCP tuning techniques adjust the network congestion avoidance parameters of Transmission Control Protocol (TCP) connections over high-bandwidth, high-latency networks. Well-tuned networks can perform up to 10 times faster in some cases. However, blindly following instructions without understanding their real consequences can hurt performance as well.

Fair queuing is a family of scheduling algorithms used in some process and network schedulers. The algorithm is designed to achieve fairness when a limited resource is shared, for example to prevent flows with large packets or processes that generate small jobs from consuming more throughput or CPU time than other flows or processes.

Bandwidth management is the process of measuring and controlling the communications on a network link, to avoid filling the link to capacity or overfilling the link, which would result in network congestion and poor performance of the network. Bandwidth is described by bit rate and measured in units of bits per second (bit/s) or bytes per second (B/s).

In routers and switches, active queue management (AQM) is the policy of dropping packets inside a buffer associated with a network interface controller (NIC) before that buffer becomes full, often with the goal of reducing network congestion or improving end-to-end latency. This task is performed by the network scheduler, which for this purpose uses various algorithms such as random early detection (RED), Explicit Congestion Notification (ECN), or controlled delay (CoDel). RFC 7567 recommends active queue management as a best practice.

Weighted fair queueing (WFQ) is a network scheduling algorithm. WFQ is both a packet-based implementation of the generalized processor sharing (GPS) policy, and a natural extension of fair queuing (FQ). Whereas FQ shares the link's capacity in equal subparts, WFQ allows schedulers to specify, for each flow, which fraction of the capacity will be given.

Usage Parameter Control (UPC) and Network Parameter Control (NPC) are functions that may be performed in a computer network. UPC may be performed at the input to a network "to protect network resources from malicious as well as unintentional misbehaviour". NPC is the same and done for the same reasons as UPC, but at the interface between two networks.

CoDel is an active queue management (AQM) algorithm in network routing, developed by Van Jacobson and Kathleen Nichols and published as RFC8289. It is designed to overcome bufferbloat in networking hardware, such as routers, by setting limits on the delay network packets experience as they pass through buffers in this equipment. CoDel aims to improve on the overall performance of the random early detection (RED) algorithm by addressing some of its fundamental misconceptions, as perceived by Jacobson, and by being easier to manage.

In the field of computer networking, TCP pacing is the denomination of a set of techniques to make the pattern of packet transmission generated by the Transmission Control Protocol less bursty. It can be conducted by the network scheduler.

A network scheduler, also called packet scheduler, queueing discipline (qdisc) or queueing algorithm, is an arbiter on a node in a packet switching communication network. It manages the sequence of network packets in the transmit and receive queues of the protocol stack and network interface controller. There are several network schedulers available for the different operating systems, that implement many of the existing network scheduling algorithms.

Time-Sensitive Networking (TSN) is a set of standards under development by the Time-Sensitive Networking task group of the IEEE 802.1 working group. The TSN task group was formed in November 2012 by renaming the existing Audio Video Bridging Task Group and continuing its work. The name changed as a result of the extension of the working area of the standardization group. The standards define mechanisms for the time-sensitive transmission of data over deterministic Ethernet networks.

References

↑ "Implementing a New IO Scheduler Algorithm for Mixed Read/Write Workloads". 3 August 2022. Retrieved 2022-08-04.
↑ Turner, J., New directions in communications (or which way to the information age?). IEEE Communications Magazine 24 (10): 8–15. ISSN 0163-6804, 1986.
1 2 Andrew S. Tanenbaum, Computer Networks, Fourth Edition, ISBN 0-13-166836-6, Prentice Hall PTR, 2003., page 401.
↑ ATM Forum, The User Network Interface (UNI), v. 3.1, ISBN 0-13-393828-X, Prentice Hall PTR, 1995.
↑ ITU-T, Traffic control and congestion control in B ISDN, Recommendation I.371, International Telecommunication Union, 2004, Annex A, page 87.
↑ "Linux HTB Home Page" . Retrieved 2013-11-30.