In computer networking, an elephant flow is an extremely large (in total bytes) continuous flow established by a TCP (or other protocol) connection, measured over a network link. Elephant flows, though not numerous, can occupy a disproportionate share of the total bandwidth over a period of time. It is not clear who coined the term elephant flow, but it began appearing in published Internet network research in 2001, when researchers observed that a small number of flows carry the majority of Internet traffic while the remainder consists of a large number of flows that carry very little Internet traffic (mice flows). [2] [3] For example, researchers Mori et al. studied traffic flows on several Japanese university and research networks. [4] On the WIDE network they found that elephant flows were only 4.7% of all flows but carried 41.3% of all data transmitted during the measurement period.
The actual impact of elephant flows on Internet traffic is still an area of research and debate. Some research shows that elephant flows may be highly correlated with traffic spikes and with other elephant flows (Lan & Heidemann and Mori et al.). [5] Researchers have proposed varying definitions of an elephant flow, including a flow that occupies more than 1% of total traffic in a time period, [6] a flow that exceeds a certain duration, [7] and a flow whose size is greater than the mean plus three standard deviations of all flow sizes during the time period. [5] One of the main goals of research into elephant flows is to develop more efficient bandwidth management tools and predictive models for the Internet. For example, researchers have focused on providing better quality of service to flows of small sizes (mice flows) by de-prioritizing elephant flows. [8]
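For illustration, the mean-plus-three-standard-deviations definition can be applied with a short sketch; the flow sizes and the function name below are hypothetical and are not taken from the cited studies.

```python
from statistics import mean, stdev

def classify_elephant_flows(flow_sizes_bytes):
    """Flag flows whose byte count exceeds the mean plus three standard
    deviations of all flow sizes observed in the measurement period."""
    mu = mean(flow_sizes_bytes)
    sigma = stdev(flow_sizes_bytes)
    threshold = mu + 3 * sigma
    return [size > threshold for size in flow_sizes_bytes]

# 100 small "mice" flows of about 1 kB each and one 5 MB flow
sizes = [1_000] * 100 + [5_000_000]
labels = classify_elephant_flows(sizes)
print(sum(labels))  # 1 -- only the 5 MB flow exceeds the threshold
```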
Elephant flows can also be viewed from the perspective of a network appliance such as an Intrusion Prevention System (IPS). In this context, the number of bytes on the flow is less significant than the instantaneous processing load required to service the flow, where the processing load depends on the IPS configuration (how much work it is supposed to do) and the byte rate (flow throughput). An elephant flow could thus be defined as a flow that exceeds a given total service time within a particular time interval.
For example, if just a single CPU core is used to process a flow, an elephant flow could be considered any flow whose processing load exceeds the capacity of that core. This in turn could be defined by dropped packets or by excess latency for any packet to transit the device. Lower thresholds can of course be applied and more cores can be used, but the basic concept of required processing load relative to processing capacity holds.
To see how this differs from simply looking at the total bytes on a flow, consider two flows F1 and F2 with N1 and N2 total bytes respectively, where N2 = 1000*N1. It is possible that F1 is an elephant flow while F2 is not, if, for example, the required inspection of F1 is more complex than that of F2 and/or the rate of F1 is much greater than the rate of F2.
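A minimal sketch of this processing-load view follows, assuming hypothetical per-byte inspection costs and a single-core service capacity; none of these figures come from the text above.

```python
# The per-byte inspection costs and the one-core capacity below are
# illustrative assumptions, not figures from any particular IPS.

def service_time_s(total_bytes, inspection_cost_s_per_byte):
    """CPU time needed to inspect a flow at a given per-byte cost."""
    return total_bytes * inspection_cost_s_per_byte

CORE_CAPACITY_S = 1.0  # one core provides 1 second of CPU time per 1 s interval

# F1: smaller flow but deep inspection; F2: 1000x the bytes, light inspection
f1_load = service_time_s(total_bytes=10_000_000, inspection_cost_s_per_byte=2e-7)       # 2.0 s
f2_load = service_time_s(total_bytes=10_000_000_000, inspection_cost_s_per_byte=5e-11)  # 0.5 s

print(f1_load > CORE_CAPACITY_S)  # True  -> F1 overloads the core (elephant flow)
print(f2_load > CORE_CAPACITY_S)  # False -> F2 does not, despite carrying 1000x the bytes
```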
The Internet protocol suite, commonly known as TCP/IP, is a framework for organizing the set of communication protocols used in the Internet and similar computer networks according to functional criteria. The foundational protocols in the suite are the Transmission Control Protocol (TCP), the User Datagram Protocol (UDP), and the Internet Protocol (IP). Early versions of this networking model were known as the Department of Defense (DoD) model because the research and development were funded by the United States Department of Defense through DARPA.
The Transmission Control Protocol (TCP) is one of the main protocols of the Internet protocol suite. It originated in the initial network implementation in which it complemented the Internet Protocol (IP). Therefore, the entire suite is commonly referred to as TCP/IP. TCP provides reliable, ordered, and error-checked delivery of a stream of octets (bytes) between applications running on hosts communicating via an IP network. Major internet applications such as the World Wide Web, email, remote administration, and file transfer rely on TCP, which is part of the Transport layer of the TCP/IP suite. SSL/TLS often runs on top of TCP.
In computer networking, the transport layer is a conceptual division of methods in the layered architecture of protocols in the network stack in the Internet protocol suite and the OSI model. The protocols of this layer provide end-to-end communication services for applications. It provides services such as connection-oriented communication, reliability, flow control, and multiplexing.
Network congestion in data networking and queueing theory is the reduced quality of service that occurs when a network node or link is carrying more data than it can handle. Typical effects include queueing delay, packet loss or the blocking of new connections. A consequence of congestion is that an incremental increase in offered load leads either to only a small increase in network throughput or even to a decrease in network throughput.
Internet traffic is the flow of data within the entire Internet, or in certain network links of its constituent networks. Common traffic measurements are total volume, in units of multiples of the byte, or as transmission rates in bytes per certain time units.
Van Jacobson is an American computer scientist, renowned for his work on TCP/IP network performance and scaling. He is one of the primary contributors to the TCP/IP protocol stack—the technological foundation of today’s Internet. Since 2013, Jacobson has been an adjunct professor at the University of California, Los Angeles (UCLA), working on Named Data Networking.
A network processor is an integrated circuit which has a feature set specifically targeted at the networking application domain.
Transmission Control Protocol (TCP) uses a congestion control algorithm that includes various aspects of an additive increase/multiplicative decrease (AIMD) scheme, along with other schemes including slow start and a congestion window (CWND), to achieve congestion avoidance. The TCP congestion-avoidance algorithm is the primary basis for congestion control in the Internet. Per the end-to-end principle, congestion control is largely a function of internet hosts, not the network itself. There are several variations and versions of the algorithm implemented in protocol stacks of operating systems of computers that connect to the Internet.
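A minimal sketch of the additive-increase/multiplicative-decrease idea follows; the step size, decrease factor and function name are illustrative assumptions, and real TCP implementations combine this rule with slow start, fast recovery and other mechanisms.

```python
def aimd_update(cwnd_segments, loss_detected,
                additive_step=1.0, multiplicative_factor=0.5):
    """Return the next congestion window (in segments) under AIMD."""
    if loss_detected:
        # Multiplicative decrease in response to congestion (e.g. packet loss)
        return max(1.0, cwnd_segments * multiplicative_factor)
    # Additive increase: roughly one segment per round-trip time
    return cwnd_segments + additive_step

cwnd = 10.0
for rtt, loss in enumerate([False, False, False, True, False]):
    cwnd = aimd_update(cwnd, loss)
    print(f"after RTT {rtt}: cwnd = {cwnd}")
```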
The PARC Universal Packet was one of the two earliest internetworking protocol suites; it was created by researchers at Xerox PARC in the mid-1970s. The entire suite provided routing and packet delivery, as well as higher-level functions such as a reliable byte stream, along with numerous applications.
In data communications, the bandwidth-delay product is the product of a data link's capacity and its round-trip delay time. The result, an amount of data measured in bits, is equivalent to the maximum amount of data on the network circuit at any given time, i.e., data that has been transmitted but not yet acknowledged. The bandwidth-delay product was originally proposed as a rule of thumb for sizing router buffers in conjunction with congestion avoidance algorithm random early detection (RED).
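As a worked example with illustrative link figures (not taken from the text above), a 100 Mbit/s link with an 80 ms round-trip time gives the following bandwidth-delay product.

```python
# Illustrative link figures; not taken from the article text.
bandwidth_bits_per_s = 100e6   # 100 Mbit/s link capacity
rtt_s = 0.080                  # 80 ms round-trip time

bdp_bits = bandwidth_bits_per_s * rtt_s   # 8,000,000 bits
print(bdp_bits / 8)                       # 1,000,000 bytes may be "in flight" at once
```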
In computer networking, jumbo frames are Ethernet frames with more than 1500 bytes of payload, the limit set by the IEEE 802.3 standard. The payload limit for jumbo frames is variable: while 9000 bytes is the most commonly used limit, smaller and larger limits exist. Many Gigabit Ethernet switches and Gigabit Ethernet network interface controllers and some Fast Ethernet switches and Fast Ethernet network interface cards can support jumbo frames.
In computer networks, network traffic measurement is the process of measuring the amount and type of traffic on a particular network. This is especially important with regard to effective bandwidth management.
In routers and switches, active queue management (AQM) is the policy of dropping packets inside a buffer associated with a network interface controller (NIC) before that buffer becomes full, often with the goal of reducing network congestion or improving end-to-end latency. This task is performed by the network scheduler, which for this purpose uses various algorithms such as random early detection (RED), Explicit Congestion Notification (ECN), or controlled delay (CoDel). RFC 7567 recommends active queue management as a best practice.
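A minimal sketch of the random early detection idea follows; the thresholds and maximum drop probability are illustrative assumptions, and real RED implementations additionally maintain an exponentially weighted moving average of the queue length.

```python
import random

def red_should_drop(avg_queue_len, min_threshold=20, max_threshold=60, max_drop_p=0.1):
    """Decide whether to drop an arriving packet, RED-style."""
    if avg_queue_len < min_threshold:
        return False      # queue is short: never drop early
    if avg_queue_len >= max_threshold:
        return True       # queue is long: always drop
    # In between, the drop probability rises linearly toward max_drop_p
    drop_p = max_drop_p * (avg_queue_len - min_threshold) / (max_threshold - min_threshold)
    return random.random() < drop_p
```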
WAN optimization is a collection of techniques for improving data transfer across wide area networks (WANs). In 2008, the WAN optimization market was estimated to be $1 billion and was expected to grow to $4.4 billion by 2014, according to Gartner, a technology research firm. In 2015, Gartner estimated the WAN optimization market to be a $1.1 billion market.
Multipath routing is a routing technique simultaneously using multiple alternative paths through a network. This can yield a variety of benefits such as fault tolerance, increased bandwidth, and improved security.
Internet Mix or IMIX refers to typical Internet traffic passing through network equipment such as routers, switches or firewalls. When measuring equipment performance using an IMIX of packets, the performance is assumed to resemble what can be seen in "real-world" conditions.
Multipath TCP (MPTCP) is an ongoing effort of the Internet Engineering Task Force's (IETF) Multipath TCP working group, that aims at allowing a Transmission Control Protocol (TCP) connection to use multiple paths to maximize throughput and increase redundancy.
The Recursive InterNetwork Architecture (RINA) is a new computer network architecture proposed as an alternative to the architecture of the currently mainstream Internet protocol suite. The principles behind RINA were first presented by John Day in his 2008 book Patterns in Network Architecture: A Return to Fundamentals. This work starts afresh, taking into account lessons learned in the 35 years of TCP/IP’s existence, as well as the lessons of OSI’s failure and the lessons of other network technologies of the past few decades, such as CYCLADES, DECnet, and Xerox Network Systems. RINA's fundamental principles are that computer networking is just Inter-Process Communication or IPC, and that layering should be done based on scope/scale, with a single recurring set of protocols, rather than based on function, with specialized protocols. The protocol instances in one layer interface with the protocol instances on higher and lower layers via new concepts and entities that effectively reify networking functions currently specific to protocols like BGP, OSPF and ARP. In this way, RINA claims to support features like mobility, multihoming and quality of service without the need for additional specialized protocols like RTP and UDP, as well as to allow simplified network administration without the need for concepts like autonomous systems and NAT.
John Heidemann is an engineer at the USC Information Sciences Institute in Marina del Rey, California. He was named a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) in 2014 for his contributions to sensor networks, internet measurement, and simulations. He has authored more than two hundred and fifty scholarly papers in these fields. His research has been supported by the American National Science Foundation, among others.