Measuring network throughput

Throughput of a network can be measured using various tools available on different platforms. This page explains the theory behind what these tools set out to measure and the issues regarding these measurements.

Reasons for measuring throughput in networks

People are often concerned about measuring the maximum data throughput in bits per second of a communications link or network access. A typical method of performing a measurement is to transfer a 'large' file from one system to another system and measure the time required to complete the transfer or copy of the file. The throughput is then calculated by dividing the file size by the time to get the throughput in megabits, kilobits, or bits per second.

Unfortunately, such an exercise often yields a goodput that is lower than the maximum theoretical data throughput, leading people to believe that their communications link is not operating correctly. In fact, the measured figure is reduced by many factors in addition to transmission overheads, including latency, TCP Receive Window size and system limitations, so the calculated goodput does not reflect the maximum achievable throughput. [1]

Theory: Short Summary

The maximum bandwidth of a single TCP connection can be calculated as follows: [2]

Max Bandwidth = RWIN / RTT

where RWIN is the TCP Receive Window and RTT is the round-trip time for the path. The maximum TCP window size in the absence of the TCP window scale option is 65,535 bytes. Example: Max Bandwidth = 65,535 bytes / 0.220 s = 297,886.36 B/s × 8 = 2.383 Mbit/s. Over a single TCP connection between those endpoints, the tested bandwidth will be restricted to 2.383 Mbit/s even if the contracted bandwidth is greater.
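The window-limited bound can be checked with a few lines of Python. This is a minimal sketch: the function name max_tcp_throughput_bps is made up for illustration, and the inputs are the 65,535-byte window and 220 ms round-trip time from the example.

```python
def max_tcp_throughput_bps(rwin_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection TCP throughput in bit/s (RWIN / RTT)."""
    if rtt_seconds <= 0:
        raise ValueError("RTT must be positive")
    # At most one full receive window can be in flight per round trip.
    return rwin_bytes * 8 / rtt_seconds


limit = max_tcp_throughput_bps(65_535, 0.220)
print(f"{limit / 1e6:.3f} Mbit/s")  # prints "2.383 Mbit/s"
```

Halving the round-trip time, or doubling the window via the window scale option, doubles the bound.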

Bandwidth test software

Bandwidth test software is used to determine the maximum bandwidth of a network or internet connection. A test is typically performed by attempting to download or upload the maximum amount of data in a certain period of time, or a certain amount of data in the minimum amount of time. For this reason, bandwidth tests can delay other traffic on the internet connection while they run, and can cause inflated data charges.

Nomenclature

Bit rates (data-rate units)

Name                   Symbol     Multiple
bit per second         bit/s      1
Metric prefixes (SI)
kilobit per second     kbit/s     10^3  = 1000^1
megabit per second     Mbit/s     10^6  = 1000^2
gigabit per second     Gbit/s     10^9  = 1000^3
terabit per second     Tbit/s     10^12 = 1000^4
Binary prefixes (IEC 80000-13)
kibibit per second     Kibit/s    2^10  = 1024^1
mebibit per second     Mibit/s    2^20  = 1024^2
gibibit per second     Gibit/s    2^30  = 1024^3
tebibit per second     Tibit/s    2^40  = 1024^4

The throughput of communications links is measured in bits per second (bit/s), kilobits per second (kbit/s), megabits per second (Mbit/s) and gigabits per second (Gbit/s). In this application, kilo, mega and giga are the standard S.I. prefixes indicating multiplication by 1,000 (kilo), 1,000,000 (mega), and 1,000,000,000 (giga).

File sizes are typically measured in bytes, with kilobytes, megabytes, and gigabytes being the usual units; a byte is eight bits. In modern textbooks one kilobyte is defined as 1,000 bytes, one megabyte as 1,000,000 bytes, etc., in accordance with the 1998 International Electrotechnical Commission (IEC) standard. However, the convention adopted by Windows systems is to define 1 kilobyte as 1,024 (or 2^10) bytes, which is equal to 1 kibibyte. Similarly, a file size of "1 megabyte" is 1,024 × 1,024 bytes (equal to 1 mebibyte), and "1 gigabyte" is 1,024 × 1,024 × 1,024 bytes (equal to 1 gibibyte).

Confusing and inconsistent use of Suffixes

It is usual for people to abbreviate commonly used expressions. For file sizes, it is usual for someone to say that they have a '64 k' file (meaning 64 kilobytes), or a '100 meg' file (meaning 100 megabytes). When talking about circuit bit rates, people will interchangeably use the terms throughput, bandwidth and speed, and refer to a circuit as being a '64 k' circuit, or a '2 meg' circuit meaning 64 kbit/s or 2 Mbit/s (see also the List of connection bandwidths). However, a '64 k' circuit will not transmit a '64 k' file in one second. This may not be obvious to those unfamiliar with telecommunications and computing, so misunderstandings sometimes arise. In actuality, a 64 kilobyte file is 64 × 1,024 × 8 bits in size and the 64 k circuit will transmit bits at a rate of 64 × 1,000 bit/s, so the amount of time taken to transmit a 64 kilobyte file over the 64 k circuit will be at least (64 × 1,024 × 8)/(64 × 1,000) seconds, which works out to be 8.192 seconds.
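The arithmetic for the '64 k' file over the '64 k' circuit can be reproduced as follows. This is an illustrative sketch, and the helper name min_transfer_seconds is hypothetical. It takes the file size in binary kilobytes and the circuit rate in decimal kilobits, as in the paragraph above, and ignores all protocol overhead.

```python
def min_transfer_seconds(file_bytes: int, link_bit_per_s: float) -> float:
    """Theoretical minimum time to push file_bytes over a link, no overhead."""
    return file_bytes * 8 / link_bit_per_s


file_size = 64 * 1024   # a '64 k' file: 64 kilobytes of 1,024 bytes each
link_rate = 64 * 1000   # a '64 k' circuit: 64 kbit/s with the SI prefix

print(min_transfer_seconds(file_size, link_rate))  # 8.192
```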

Compression

Some equipment can improve matters by compressing the data as it is sent. This is a feature of most analog modems and of several popular operating systems. If the 64 k file can be shrunk by compression, the time taken to transmit it can be reduced. This can be done invisibly to the user, so a highly compressible file may be transmitted considerably faster than expected. As this 'invisible' compression cannot easily be disabled, throughput measurements that time the transfer of a file should use files that cannot be compressed. Typically, this is done using a file of random data, which becomes harder to compress the closer it is to truly random.

Assuming your data cannot be compressed, the 8.192 seconds to transmit a 64 kilobyte file over a 64 kilobit/s communications link is a theoretical minimum time which will not be achieved in practice. This is due to the effect of overheads which are used to format the data in an agreed manner so that both ends of a connection have a consistent view of the data.

There are at least two issues that aren't immediately obvious for transmitting compressed files:

  1. The throughput of the network itself isn't improved by compression. From the end-to-end (server to client) perspective, however, compression does improve throughput, because each transmitted bit carries more information once the files are compressed.
  2. Compressing files takes more processor resources at both ends. The server has to use its processor to compress the files, if they aren't already compressed, and the client has to decompress them upon receipt. This can be considered an expense (for the server and client) paid for the benefit of increased end-to-end throughput, although the throughput of the network itself hasn't changed. [3]

Overheads and data formats

A common communications link used by many people is the asynchronous start-stop, or just "asynchronous", serial link. If you have an external modem attached to your home or office computer, the chances are that the connection is over an asynchronous serial connection. Its advantage is that it is simple: it can be implemented using only three wires, Send, Receive and Signal Ground (or Signal Common). In an RS-232 interface, an idle connection has a continuous negative voltage applied. A 'zero' bit is represented as a positive voltage difference with respect to the Signal Ground, and a 'one' bit is a negative voltage with respect to Signal Ground, thus indistinguishable from the idle state. This means you need to know when a 'one' bit starts in order to distinguish it from idle. This is done by agreeing in advance how fast data will be transmitted over a link, then using a start bit to signal the start of a byte; this start bit is a 'zero' bit. Stop bits are 'one' bits, i.e. negative voltage. [4]

Actually, more things will have been agreed in advance: the speed of bit transmission, the number of bits per character, the parity and the number of stop bits (signifying the end of a character). So a designation of 9600-8-E-2 would be 9,600 bits per second, with eight bits per character, even parity and two stop bits.

A common set-up of an asynchronous serial connection would be 9600-8-N-1 (9,600 bit/s, 8 bits per character, no parity and 1 stop bit), a total of 10 bits transmitted to send one 8-bit character (one start bit, the 8 bits making up the byte transmitted and one stop bit). Two of every ten bits on the line are therefore overhead (20% of the line rate, or 25% relative to the data bits), so a 9,600 bit/s asynchronous serial link will not transmit data at 9600/8 bytes per second (1,200 byte/s) but, in this case, at 9600/10 bytes per second (960 byte/s), which is considerably slower than expected.

It can get worse. If parity is specified and we use 2 stop bits, the overhead for carrying one 8-bit character is 4 bits (one start bit, one parity bit and two stop bits), which is 50% relative to the data bits, or one third of the bits on the line. In this case a 9,600 bit/s connection will carry 9600/12 byte/s (800 byte/s). Asynchronous serial interfaces commonly support bit transmission speeds of up to 230.4 kbit/s. If such an interface is set up to have no parity and one stop bit, the byte transmission rate is 23.04 kbyte/s.
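The byte rates quoted above follow directly from the number of bits sent per character. A small sketch, assuming one start bit per character as described; the function name and parameters are illustrative only:

```python
def async_bytes_per_second(bit_rate: int, data_bits: int = 8,
                           parity: bool = False, stop_bits: int = 1) -> float:
    """Characters per second on an asynchronous start-stop serial link."""
    # One start bit, the data bits, an optional parity bit, then the stop bits.
    bits_per_char = 1 + data_bits + (1 if parity else 0) + stop_bits
    return bit_rate / bits_per_char


print(async_bytes_per_second(9600))                            # 9600-8-N-1 -> 960.0
print(async_bytes_per_second(9600, parity=True, stop_bits=2))  # 9600-8-E-2 -> 800.0
print(async_bytes_per_second(230_400))                         # 230.4 kbit/s -> 23040.0
```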

The advantage of the asynchronous serial connection is its simplicity. One disadvantage is its low efficiency in carrying data. This can be overcome by using a synchronous interface. In this type of interface, a clock signal is added on a separate wire, and the bits are transmitted in synchrony with the clock, so the interface no longer has to look for the start and stop bits of each individual character. However, it is necessary to have a mechanism to ensure the sending and receiving clocks are kept in synchrony, so data is divided up into frames of multiple characters separated by known delimiters. There are three common coding schemes for framed communications: HDLC, PPP, and Ethernet.

HDLC

When using HDLC, rather than each byte having a start, optional parity, and one or two stop bits, the bytes are gathered together into a frame. The start and end of the frame are signalled by the 'flag', and error detection is carried out by the frame check sequence. If the frame has a maximum sized address of 32 bits, a maximum sized control part of 16 bits and a maximum sized frame check sequence of 16 bits, the overhead per frame could be as high as 64 bits. If each frame carried but a single byte, the data throughput efficiency would be extremely low. However, the bytes are normally gathered together, so that even with a maximal overhead of 64 bits, frames carrying more than 32 bytes are more efficient than an 8-N-1 asynchronous serial connection (256 data bits out of 320 is the same 80% efficiency as 8 bits out of 10). As frames can vary in size because they can carry different numbers of data bytes, the overhead of an HDLC connection is not fixed. [5]
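How frame efficiency grows with payload size can be sketched in a few lines. The example assumes the maximal 64 bits of per-frame overhead given above and ignores flag bytes and bit stuffing. The 8-N-1 baseline of 80% is also an assumption; an asynchronous link with parity or two stop bits has a lower baseline, and therefore a smaller break-even payload. All names are illustrative.

```python
def hdlc_efficiency(payload_bytes: int, overhead_bits: int = 64) -> float:
    """Fraction of transmitted bits that are payload in one frame."""
    payload_bits = payload_bytes * 8
    return payload_bits / (payload_bits + overhead_bits)


ASYNC_8N1_EFFICIENCY = 8 / 10  # 8 data bits out of 10 bits on the line

for n in (1, 24, 32, 64, 1500):
    better = hdlc_efficiency(n) > ASYNC_8N1_EFFICIENCY
    print(f"{n:5d} bytes: {hdlc_efficiency(n):6.1%}  beats 8-N-1: {better}")
```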

PPP

The "point-to-point protocol" (PPP) is defined by the Internet Request For Comment documents RFC 1570, RFC 1661 and RFC 1662. With respect to the framing of packets, PPP is quite similar to HDLC, but supports both bit-oriented and byte-oriented ("octet-stuffed") methods of delimiting frames while maintaining data transparency. [6]

Ethernet

Ethernet is a "local area network" (LAN) technology, which is also framed. The way the frame is electrically defined on a connection between two systems differs from the typical wide-area networking technologies that implement HDLC or PPP, but these details are not important for throughput calculations. Ethernet is a shared medium, so it is not guaranteed that the two systems transferring a file between themselves will have exclusive access to the connection. If several systems are attempting to communicate simultaneously, the throughput between any pair can be substantially lower than the nominal bandwidth available. [7]

Other low-level protocols

Dedicated point-to-point links are not the only option for many connections between systems. Frame Relay, ATM, and MPLS based services can also be used. When calculating or estimating data throughputs, the details of the frame/cell/packet format and the technology's detailed implementation need to be understood. [8]

Frame Relay

Frame Relay uses a modified HDLC format to define the frame format that carries data. [9]

ATM

Asynchronous Transfer Mode (ATM) uses a radically different method of carrying data. Rather than using variable length frames or packets, data is carried in fixed size cells. Each cell is 53 bytes long, with the first 5 bytes defined as the header, and the following 48 bytes as payload. Data networking commonly requires packets of data that are larger than 48 bytes, so there is a defined adaptation process that specifies how larger packets of data should be divided up in a standard manner to be carried by the smaller cells. This process varies according to the data carried, so in ATM nomenclature, there are different ATM Adaptation Layers. The process defined for most data is named ATM Adaptation Layer No. 5 or AAL5.

Understanding throughput on ATM links requires a knowledge of which ATM adaptation layer has been used for the data being carried. [10]
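As an illustration of why the adaptation layer matters, the sketch below estimates how many 53-byte cells an AAL5 packet occupies. It assumes the standard AAL5 behaviour of appending an 8-byte trailer and padding the result to a whole number of 48-byte cell payloads; these details are not given in the text above. It ignores any additional encapsulation such as LLC/SNAP headers. The function names are hypothetical.

```python
CELL_BYTES = 53           # 5-byte header + 48-byte payload
CELL_PAYLOAD_BYTES = 48
AAL5_TRAILER_BYTES = 8    # assumption: standard AAL5 trailer size


def aal5_cells(packet_bytes: int) -> int:
    """Number of ATM cells needed to carry one packet over AAL5."""
    total = packet_bytes + AAL5_TRAILER_BYTES
    # Integer ceiling division: the last cell is padded up to 48 bytes.
    return -(-total // CELL_PAYLOAD_BYTES)


def aal5_efficiency(packet_bytes: int) -> float:
    """Packet bytes divided by bytes actually sent on the ATM link."""
    return packet_bytes / (aal5_cells(packet_bytes) * CELL_BYTES)


for size in (40, 41, 1500):
    print(size, aal5_cells(size), f"{aal5_efficiency(size):.1%}")
```

Under these assumptions a 40-byte packet fits in one cell while a 41-byte packet needs two, so efficiency can change sharply with small changes in packet size.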

MPLS

Multiprotocol Label Switching (MPLS) adds a standard tag or header known as a 'label' to existing packets of data. In certain situations it is possible to use MPLS in a 'stacked' manner, so that labels are added to packets that have already been labelled. Connections between MPLS systems can also be 'native', with no underlying transport protocol, or MPLS labelled packets can be carried inside Frame Relay or HDLC packets as payloads. Correct throughput calculations need to take such configurations into account. For example, a data packet could have two MPLS labels attached via 'label-stacking', then be placed as payload inside an HDLC frame. This generates more overhead than a single MPLS label attached to a packet which is then sent 'natively', with no underlying protocol, to a receiving system. [11]
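The two configurations in the example can be compared numerically. The sketch assumes that each MPLS label stack entry is 4 bytes and reuses the maximal 64-bit (8-byte) HDLC overhead from the HDLC section. The 1,500-byte packet size and the function name are arbitrary choices made for illustration.

```python
MPLS_LABEL_BYTES = 4  # assumption: one 32-bit label stack entry per label


def mpls_efficiency(packet_bytes: int, labels: int,
                    link_overhead_bytes: int = 0) -> float:
    """Packet bytes divided by packet plus label and link-layer overhead."""
    overhead = labels * MPLS_LABEL_BYTES + link_overhead_bytes
    return packet_bytes / (packet_bytes + overhead)


# Two stacked labels carried inside an HDLC frame (8 bytes of overhead assumed).
stacked_in_hdlc = mpls_efficiency(1500, labels=2, link_overhead_bytes=8)
# A single label sent 'natively' with no underlying protocol.
single_native = mpls_efficiency(1500, labels=1)

print(f"{stacked_in_hdlc:.4f} {single_native:.4f}")
```

The difference is small for large packets, but grows as packets get smaller because the overhead per packet is fixed.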

Higher-level protocols

Few systems transfer files and data by simply copying the contents of the file into the 'Data' field of HDLC or PPP frames; another protocol layer is used to format the data inside the 'Data' field of the HDLC or PPP frame. The most commonly used such protocol is Internet Protocol (IP), defined by RFC 791. This imposes its own overheads.

Again, few systems simply copy the contents of files into IP packets; they use yet another protocol that manages the connection between two systems: TCP (Transmission Control Protocol), defined by RFC 793. This adds its own overhead.

Finally, a further protocol layer manages the actual data transfer process. A commonly used protocol for this is the "file transfer protocol" (FTP). [12]
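The combined effect of the IP and TCP headers can be estimated as below. This is a rough sketch under stated assumptions: headers of 20 bytes each for IPv4 and TCP with no options, and a 1,500-byte IP packet size. Neither figure appears in the text above. It leaves out link-layer framing, acknowledgements and retransmissions.

```python
IP_HEADER_BYTES = 20    # assumption: IPv4 header without options
TCP_HEADER_BYTES = 20   # assumption: TCP header without options


def tcp_ip_payload_fraction(ip_packet_bytes: int = 1500) -> float:
    """Share of each IP packet that is application data."""
    payload = ip_packet_bytes - IP_HEADER_BYTES - TCP_HEADER_BYTES
    if payload <= 0:
        raise ValueError("packet too small to carry any data")
    return payload / ip_packet_bytes


print(f"{tcp_ip_payload_fraction(1500):.2%}")  # 1460 of 1500 bytes -> 97.33%
print(f"{tcp_ip_payload_fraction(576):.2%}")   # smaller packets lose more
```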

References

  1. Comer, D. E. (2008). Computer Networks and Internets, 5th Edition.
  2. "Mathematical Modelling of TCP Throughput Performance" (PDF).
  3. Comer, D. E. (2008). Computer Networks and Internets, 5th Edition.
  4. Comer, D. E. (2008). Computer Networks and Internets, 5th Edition.
  5. Cisco Systems, Inc. (2001-2006). Cisco IOS IP Configuration Guide.
  6. Lydia Parziale, D. T. (2006). TCP/IP Tutorial and Technical Overview.
  7. Lammle, T. (2002). Cisco Certified Network Associate. London.
  8. Lydia Parziale, D. T. (2006). TCP/IP Tutorial and Technical Overview.
  9. Comer, D. E. (2008). Computer Networks and Internets, 5th Edition.
  10. Comer, D. E. (2008). Computer Networks and Internets, 5th Edition.
  11. Smith, S. (2003). Introductions To MPLS. Cisco.
  12. Lydia Parziale, D. T. (2006). TCP/IP Tutorial and Technical Overview.