TCP global synchronization

Last updated August 21, 2024

TCP global synchronization in computer networks is a pattern of each sender decreasing and increasing transmission rates at the same time as other senders. It can happen to TCP/IP flows during periods of congestion because each sender will reduce their transmission rate at the same time when packet loss occurs.

Causes

Routers on the Internet normally have packet queues, to allow them to hold packets when the network is busy, rather than discarding them.

Because routers have limited resources, the size of these queues is also limited. The simplest technique to limit queue size is known as tail drop. The queue is allowed to fill to its maximum size, and then any new packets are simply discarded until there is space in the queue again.

This causes problems when used on TCP/IP routers handling multiple TCP streams, especially during peaks in traffic. While the network is stable, the queue is constantly full, and there are no problems except that the full queue results in high latency. However, the introduction of a sudden burst of traffic may cause large numbers of established, steady streams to lose packets simultaneously.

TCP has automatic recovery from dropped packets, which it interprets as congestion on the network (which is usually correct). The sender reduces its sending rate for a certain amount of time and then tries to find out if the network is no longer congested by increasing the rate again subject to a ramp-up. This is known as the slow start algorithm.

Almost all the senders will use the same time delay before increasing their rates. When these delays expire at the same time, all the senders will send additional packets and the router queue will again overflow in a repeating feedback loop. This pattern is referred to as "global synchronization" and leads to inefficient use of bandwidth, due to the large numbers of dropped packets, which must be retransmitted, and because the senders have a reduced sending rate, compared to the stable state, while they are backed-off, following each loss. This pattern is similar with the thundering herd problem in process management.

Solutions

This problem has been the subject of much research. The consensus appears to be that the tail drop algorithm is the leading cause of the problem, and other queue size management algorithms such as Random Early Detection (RED) and Weighted RED will reduce the likelihood of global synchronization, as well as keeping queue sizes down in the face of heavy load and unexpected peak traffic.

Related Research Articles

Asynchronous Transfer Mode (ATM) is a telecommunications standard defined by the American National Standards Institute and ITU-T for digital transmission of multiple types of traffic. ATM was developed to meet the needs of the Broadband Integrated Services Digital Network as defined in the late 1980s, and designed to integrate telecommunication networks. It can handle both traditional high-throughput data traffic and real-time, low-latency content such as telephony (voice) and video. ATM provides functionality that uses features of circuit switching and packet switching networks by using asynchronous time-division multiplexing. ATM was seen in the 1990s as a competitor to Ethernet and networks carrying IP traffic as, unlike Ethernet, it was faster and designed with quality-of-service in mind, but it fell out of favor once Ethernet reached speeds of 1 gigabits per second.

The Internet Control Message Protocol (ICMP) is a supporting protocol in the Internet protocol suite. It is used by network devices, including routers, to send error messages and operational information indicating success or failure when communicating with another IP address. For example, an error is indicated when a requested service is not available or that a host or router could not be reached. ICMP differs from transport protocols such as TCP and UDP in that it is not typically used to exchange data between systems, nor is it regularly employed by end-user network applications.

The Transmission Control Protocol (TCP) is one of the main protocols of the Internet protocol suite. It originated in the initial network implementation in which it complemented the Internet Protocol (IP). Therefore, the entire suite is commonly referred to as TCP/IP. TCP provides reliable, ordered, and error-checked delivery of a stream of octets (bytes) between applications running on hosts communicating via an IP network. Major internet applications such as the World Wide Web, email, remote administration, and file transfer rely on TCP, which is part of the Transport layer of the TCP/IP suite. SSL/TLS often runs on top of TCP.

Frame Relay is a standardized wide area network (WAN) technology that specifies the physical and data link layers of digital telecommunications channels using a packet switching methodology. Originally designed for transport across Integrated Services Digital Network (ISDN) infrastructure, it may be used today in the context of many other network interfaces.

Explicit Congestion Notification (ECN) is an extension to the Internet Protocol and to the Transmission Control Protocol and is defined in RFC 3168 (2001). ECN allows end-to-end notification of network congestion without dropping packets. ECN is an optional feature that may be used between two ECN-enabled endpoints when the underlying network infrastructure also supports it.

Network congestion in data networking and queueing theory is the reduced quality of service that occurs when a network node or link is carrying more data than it can handle. Typical effects include queueing delay, packet loss or the blocking of new connections. A consequence of congestion is that an incremental increase in offered load leads either only to a small increase or even a decrease in network throughput.

FAST TCP is a TCP congestion avoidance algorithm especially targeted at long-distance, high latency links, developed at the Netlab, California Institute of Technology and now being commercialized by FastSoft. FastSoft was acquired by Akamai Technologies in 2012.

Random early detection (RED), also known as random early discard or random early drop, is a queuing discipline for a network scheduler suited for congestion avoidance.

In communications, traffic policing is the process of monitoring network traffic for compliance with a traffic contract and taking steps to enforce that contract. Traffic sources which are aware of a traffic contract may apply traffic shaping to ensure their output stays within the contract and is thus not discarded. Traffic exceeding a traffic contract may be discarded immediately, marked as non-compliant, or left as-is, depending on administrative policy and the characteristics of the excess traffic.

Transmission Control Protocol (TCP) uses a congestion control algorithm that includes various aspects of an additive increase/multiplicative decrease (AIMD) scheme, along with other schemes including slow start and a congestion window (CWND), to achieve congestion avoidance. The TCP congestion-avoidance algorithm is the primary basis for congestion control in the Internet. Per the end-to-end principle, congestion control is largely a function of internet hosts, not the network itself. There are several variations and versions of the algorithm implemented in protocol stacks of operating systems of computers that connect to the Internet.

TCP tuning techniques adjust the network congestion avoidance parameters of Transmission Control Protocol (TCP) connections over high-bandwidth, high-latency networks. Well-tuned networks can perform up to 10 times faster in some cases. However, blindly following instructions without understanding their real consequences can hurt performance as well.

Packet loss occurs when one or more packets of data travelling across a computer network fail to reach their destination. Packet loss is either caused by errors in data transmission, typically across wireless networks, or network congestion. Packet loss is measured as a percentage of packets lost with respect to packets sent.

Weighted random early detection (WRED) is a queueing discipline for a network scheduler suited for congestion avoidance. It is an extension to random early detection (RED) where a single queue may have several different sets of queue thresholds. Each threshold set is associated to a particular traffic class.

Bandwidth management is the process of measuring and controlling the communications on a network link, to avoid filling the link to capacity or overfilling the link, which would result in network congestion and poor performance of the network. Bandwidth is described by bit rate and measured in units of bits per second (bit/s) or bytes per second (B/s).

Compound TCP (CTCP) is a Microsoft algorithm that was introduced as part of the Windows Vista and Window Server 2008 TCP stack. It is designed to aggressively adjust the sender's congestion window to optimise TCP for connections with large bandwidth-delay products while trying not to harm fairness. It is also available for Linux, as well as for Windows XP and Windows Server 2003 via a hotfix.

In routers and switches, active queue management (AQM) is the policy of dropping packets inside a buffer associated with a network interface controller (NIC) before that buffer becomes full, often with the goal of reducing network congestion or improving end-to-end latency. This task is performed by the network scheduler, which for this purpose uses various algorithms such as random early detection (RED), Explicit Congestion Notification (ECN), or controlled delay (CoDel). RFC 7567 recommends active queue management as a best practice.

Tail drop is a simple queue management algorithm used by network schedulers in network equipment to decide when to drop packets. With tail drop, when the queue is filled to its maximum capacity, the newly arriving packets are dropped until the queue has enough room to accept incoming traffic.

In computing, Microsoft's Windows Vista and Windows Server 2008 introduced in 2007/2008 a new networking stack named Next Generation TCP/IP stack, to improve on the previous stack in several ways. The stack includes native implementation of IPv6, as well as a complete overhaul of IPv4. The new TCP/IP stack uses a new method to store configuration settings that enables more dynamic control and does not require a computer restart after a change in settings. The new stack, implemented as a dual-stack model, depends on a strong host-model and features an infrastructure to enable more modular components that one can dynamically insert and remove.

Bufferbloat is a cause of high latency and jitter in packet-switched networks caused by excess buffering of packets. Bufferbloat can also cause packet delay variation, as well as reduce the overall network throughput. When a router or switch is configured to use excessively large buffers, even very high-speed networks can become practically unusable for many interactive applications like voice over IP (VoIP), audio streaming, online gaming, and even ordinary web browsing.

The Fast Adaptive and Secure Protocol (FASP) is a proprietary data transfer protocol. FASP is a network-optimized network protocol created by Michelle C. Munson and Serban Simu, productized by Aspera, and now owned by IBM subsequent to its acquisition of Aspera. The associated client/server software packages are also commonly called Aspera. The technology is patented under US Patent #8085781, Bulk Data Transfer, #20090063698, Method and system for aggregate bandwidth control. and others.

References

Vegesna, Srinivas (2001). IP Quality of Service, chap. 6. Cisco press. ISBN 1-57870-116-3.
Szigeti, Tim (2005). End-to-End QoS Network Design, chap. 6. Cisco press. ISBN 1-58705-176-1.

External links

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.