Traffic classification

Traffic classification is an automated process which categorises computer network traffic according to various parameters (for example, port number or protocol) into a number of traffic classes. [1] Each resulting traffic class can then be treated differently, so that the service delivered to the data generator or consumer is differentiated.
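
As an illustration of classifying by port number, the following minimal Python sketch maps well-known destination or source ports to traffic classes. The table contents, the class names, and the function name `classify_by_port` are assumptions made for this example only; real classifiers use far larger tables and cannot rely on ports alone.

```python
# Hypothetical port-to-class table; the class names are illustrative only.
PORT_CLASSES = {
    5060: "voip",   # SIP signalling
    80: "web",      # HTTP
    443: "web",     # HTTPS
    25: "email",    # SMTP
}


def classify_by_port(src_port: int, dst_port: int) -> str:
    """Return a traffic class based on either port, or 'unknown'."""
    # Check the destination port first, then fall back to the source port
    # so that replies from a server are classified like the requests.
    for port in (dst_port, src_port):
        if port in PORT_CLASSES:
            return PORT_CLASSES[port]
    return "unknown"
```

Traffic on ports that are not in the table falls through to `unknown`, which is one reason port-based classification is usually combined with other methods.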

Typical uses

Packets are classified so that the network scheduler can process them differently. Once a traffic flow has been classified as using a particular protocol, a predetermined policy can be applied to it and to other flows, either to guarantee a certain quality (as with VoIP or media streaming services [2]) or to provide best-effort delivery. This may be applied at the ingress point (the point at which traffic enters the network, typically an edge device) with a granularity that allows traffic management mechanisms to separate traffic into individual flows and to queue, police and shape them differently. [3]
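
Separating traffic into individual flows can be sketched as grouping packets by a flow key. The example below assumes the common 5-tuple (source address, destination address, source port, destination port, protocol) as the key; the `Packet` type and the function name `group_into_flows` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical, simplified packet record used only in this sketch."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    size: int


def group_into_flows(packets):
    """Group packets into flows keyed by the 5-tuple."""
    flows = defaultdict(list)
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        flows[key].append(p)
    return dict(flows)
```

Each resulting flow can then be given its own queueing, policing or shaping policy.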

Classification methods

Classification is achieved by various means.

Port numbers

Deep Packet Inspection

Matching bit patterns of data to those of known protocols is a simple, widely used technique. For example, the BitTorrent handshake can be matched by checking whether a packet's payload begins with a byte of value 19 (0x13) followed by the 19-byte string 'BitTorrent protocol'. [4]
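
The handshake check described above can be sketched in a few lines of Python. The function name `looks_like_bittorrent_handshake` is an assumption for this example, and the sketch inspects only the first 20 bytes of a payload that has already been extracted from the packet.

```python
BT_PSTR = b"BitTorrent protocol"       # 19-byte protocol identifier
BT_PREFIX = bytes([len(BT_PSTR)]) + BT_PSTR  # length byte 19 (0x13) + string


def looks_like_bittorrent_handshake(payload: bytes) -> bool:
    """Return True if the payload starts with the BitTorrent handshake prefix."""
    return payload.startswith(BT_PREFIX)
```

A match on this prefix identifies only unencrypted handshakes; obfuscated or encrypted sessions do not carry it in the clear.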

A comprehensive comparison of various network traffic classifiers, which depend on Deep Packet Inspection (PACE, OpenDPI, 4 different configurations of L7-filter, NDPI, Libprotoident, and Cisco NBAR), is shown in the Independent Comparison of Popular DPI Tools for Traffic Classification. [5]

Statistical classification

Encrypted traffic classification

Traffic today is more complex and more secure, so encrypted traffic needs to be classified by means other than the classic approach (IP traffic analysis by probes in the core network). One way to achieve this is to perform the classification using traffic descriptors taken from connection traces in the radio interface. [7]

The same classification problem arises for multimedia traffic. Methods based on neural networks, support vector machines, statistics, and nearest neighbors have been reported to work well for this kind of traffic classification, although some methods outperform others in specific cases; for example, neural networks work better when the whole observation set is taken into account. [8]

Implementation

Both the Linux network scheduler and Netfilter contain logic to identify and mark or classify network packets.

Typical traffic classes

Operators often distinguish three broad types of network traffic: sensitive, best-effort, and undesired. [citation needed]

Sensitive traffic

Sensitive traffic is traffic that the operator expects to deliver on time. This includes VoIP, online gaming, video conferencing, and web browsing. Traffic management schemes are typically tailored so that the quality of service of these selected uses is guaranteed, or at least prioritized over other classes of traffic. This can be accomplished by not shaping this traffic class, or by prioritizing sensitive traffic above other classes.

Best-effort traffic

Best-effort traffic is all other kinds of non-detrimental traffic. This is traffic that the ISP deems not to be sensitive to quality of service metrics (jitter, packet loss, latency). Typical examples are peer-to-peer and email applications. [9] Traffic management schemes are generally tailored so that best-effort traffic gets what is left after sensitive traffic.
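
One way to give best-effort traffic "what is left" is strict priority queueing, sketched below. This is only one possible scheme, chosen here for brevity; the class name `StrictPriorityScheduler` and the class labels are assumptions for this example.

```python
from collections import deque


class StrictPriorityScheduler:
    """Two-queue scheduler: sensitive traffic is always served first."""

    def __init__(self):
        self.sensitive = deque()
        self.best_effort = deque()

    def enqueue(self, packet, traffic_class):
        # Anything not labelled "sensitive" is treated as best-effort.
        if traffic_class == "sensitive":
            self.sensitive.append(packet)
        else:
            self.best_effort.append(packet)

    def dequeue(self):
        """Return the next packet to send, or None if both queues are empty."""
        if self.sensitive:
            return self.sensitive.popleft()
        if self.best_effort:
            return self.best_effort.popleft()
        return None
```

Under sustained sensitive load, strict priority can starve the best-effort queue, which is why practical schedulers often add weights or rate limits.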

Undesired traffic

This category is generally limited to the delivery of spam and traffic created by worms, botnets, and other malicious attacks. In some networks, this definition can include such traffic as non-local VoIP (for example, Skype) or video streaming services to protect the market for the 'in-house' services of the same type. In these cases, traffic classification mechanisms identify this traffic, allowing the network operator to either block this traffic entirely, or severely hamper its operation.

File sharing

Peer-to-peer file sharing applications are often designed to use any and all available bandwidth, which impacts QoS-sensitive applications (like online gaming) that use comparatively small amounts of bandwidth. P2P programs can also suffer from download strategy inefficiencies, namely downloading files from any available peer, regardless of link cost. The applications use ICMP and regular HTTP traffic to discover servers and download directories of available files.

In 2002, Sandvine Incorporated determined, through traffic analysis, that P2P traffic accounted for up to 60% of traffic on most networks. [10] This shows, in contrast to previous studies and forecasts, that P2P has become mainstream.

P2P protocols can be, and often are, designed so that the resulting packets are harder to identify (to avoid detection by traffic classifiers), and with enough robustness that they do not depend on specific QoS properties of the network (in-order packet delivery, jitter, etc.); typically this is achieved through increased buffering and reliable transport, with the user experiencing increased download time as a result. The encrypted BitTorrent protocol, for example, relies on obfuscation and randomized packet sizes to avoid identification. [11] File sharing traffic can appropriately be classified as best-effort traffic. At peak times, when sensitive traffic is at its height, download speeds will decrease. However, since P2P downloads are often background activities, this affects the subscriber experience little, as long as download speeds return to their full potential once other subscribers hang up their VoIP phones. Exceptions are real-time P2P VoIP and P2P video streaming services, which need permanent QoS and use excessive [citation needed] overhead and parity traffic to enforce this as far as possible.

Some P2P applications [12] can be configured to act as self-limiting sources, serving as a traffic shaper configured to the user's (as opposed to the network operator's) traffic specification.
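
A self-limiting source can be approximated with a token bucket, a common rate-limiting technique; the source does not say which mechanism any particular client uses, so this is an assumption. The class name `TokenBucket` and its parameters are hypothetical, and time is passed in explicitly to keep the sketch deterministic.

```python
class TokenBucket:
    """Token bucket limiter: tokens are bytes that may be sent."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s      # long-term rate chosen by the user
        self.capacity = burst_bytes       # maximum burst size
        self.tokens = burst_bytes         # start with a full bucket
        self.last = 0.0                   # time of the previous update

    def allow(self, size: int, now: float) -> bool:
        """Return True if `size` bytes may be sent at time `now` (seconds)."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False
```

A shaper built on this check would hold a packet back until `allow` returns true, rather than dropping it.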

Some vendors advocate managing clients rather than specific protocols, particularly for ISPs. By managing per-client (that is, per customer), if the client chooses to use their fair share of the bandwidth running P2P applications, they can do so, but if their application is abusive, they only clog their own bandwidth and cannot affect the bandwidth used by other customers.
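
Per-client management can be illustrated with max-min fair allocation, a standard way to define a "fair share"; the source does not state which algorithm vendors actually use, so this is a sketch under that assumption. The function name `max_min_fair_share` and the client labels are hypothetical.

```python
def max_min_fair_share(capacity: float, demands: dict) -> dict:
    """Split `capacity` among clients so no client gets more than it demands
    and unused share is redistributed to clients that want more."""
    allocation = {}
    remaining = capacity
    pending = dict(demands)
    while pending:
        share = remaining / len(pending)
        # Clients whose demand fits within an equal share are fully served.
        satisfied = {c: d for c, d in pending.items() if d <= share}
        if not satisfied:
            # Everyone left wants more than an equal share: cap them all.
            for c in pending:
                allocation[c] = share
            break
        for c, d in satisfied.items():
            allocation[c] = d
            remaining -= d
            del pending[c]
    return allocation
```

With a capacity of 100 and demands of 10, 40 and 200, the heavy client is capped at the 50 units left over, while the two lighter clients receive their full demand.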

References

  1. IETF RFC 2475 "An Architecture for Differentiated Services" section 2.3.1 - IETF definition of classifier.
  2. SIN 450 Issue 1.2 May 2007 Suppliers' Information Note For The BT Network BT Wholesale - BT IPstream Advanced Services - End User Speed Control and Downstream Quality of Service - Service Description
  3. Ferguson P., Huston G., Quality of Service: Delivering QoS on the Internet and in Corporate Networks, John Wiley & Sons, Inc., 1998. ISBN 0-471-24358-2.
  4. BitTorrent Protocol
  5. Tomasz Bujlow; Valentín Carela-Español; Pere Barlet-Ros. "Independent Comparison of Popular DPI Tools for Traffic Classification". In press (Computer Networks). Retrieved 2014-11-10.
  6. E. Hjelmvik and W. John, “Statistical Protocol IDentification with SPID: Preliminary Results”, in Proceedings of SNCNW, 2009
  7. Gijón, Carolina (2020). "Encrypted Traffic Classification Based on Unsupervised Learning in Cellular Radio Access Networks". IEEE Access. 8: 167252–167263. doi:10.1109/ACCESS.2020.3022980. S2CID 221913926.
  8. Canovas, Alejandro (2018). "Multimedia Data Flow Traffic Classification Using Intelligent Models Based on Traffic Patterns". IEEE Network. 32 (6): 100–107. doi:10.1109/MNET.2018.1800121. hdl:10251/116174. S2CID 54437310.
  9. The spam problem has actually led some network operators to implement Traffic shaping on SMTP traffic. See Tarpit (networking)
  10. Leydon, John. "P2P swamps broadband networks". The Register. Article referring to the Sandvine report; access to the actual report requires registration with Sandvine.
  11. Identifying the Message Stream Encryption (MSE) protocol
  12. "Optimize uTorrent Speeds Jatex Weblog". Example for client side P2P traffic limiting