Internet Content Adaptation Protocol

The Internet Content Adaptation Protocol (ICAP) is a lightweight HTTP-like protocol specified in RFC 3507, which is used to extend transparent proxy servers, thereby freeing up resources and standardizing the way in which new features are implemented. ICAP is generally used to implement virus scanning and content filters in transparent HTTP proxy caches. Content adaptation refers to performing a particular value-added service (content manipulation) on the associated client request or response.

ICAP concentrates on leveraging edge-based devices (caching proxies) to help deliver value-added services. At the core of this process is a cache that proxies all client transactions and processes them through ICAP servers. These ICAP servers are focused on a specific function, for example, ad insertion, virus scanning, multi-AV scanning, content translation, language translation, or content filtering. Off-loading value-added services from web servers to ICAP servers allows those web servers to be scaled according to raw HTTP throughput rather than having to handle these extra tasks.
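
The exchange itself looks much like HTTP. As a minimal sketch, the Python snippet below sends an ICAP OPTIONS request over a raw TCP socket, assuming (for illustration only) an ICAP server on localhost at port 1344, the IANA-registered ICAP port, exposing a hypothetical service named /echo:

    import socket

    # ICAP OPTIONS request for the (hypothetical) /echo service.
    # "Encapsulated: null-body=0" signals that no HTTP message is
    # embedded in this ICAP request.
    request = (
        "OPTIONS icap://localhost/echo ICAP/1.0\r\n"
        "Host: localhost\r\n"
        "Encapsulated: null-body=0\r\n"
        "\r\n"
    )

    with socket.create_connection(("localhost", 1344)) as sock:
        sock.sendall(request.encode("ascii"))
        # The response mirrors HTTP: a status line such as
        # "ICAP/1.0 200 OK" followed by headers (Methods, Preview, ISTag).
        print(sock.recv(4096).decode("ascii", errors="replace"))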

History

ICAP was proposed in late 1999 by Peter Danzig and John Schuster[1] from Network Appliance.[2] Don Gillies took over the project in the spring of 2000 and enhanced the protocol in three main ways.

Gillies prototyped the first ICAP client and server for the NetCache series of Internet caches in mid-2000 (the ICAP 0.9 protocol) and produced training materials for vendors. The client was written in C++ in the core of the NetCache server, and the demonstration ICAP server was written in Perl; it employed the Debian word-replacement filters to rewrite web pages, skipping over the HTML tags and translating web pages into Swedish Chef or Jive in real time.[3] With knowledge gained from the prototyping experience, Gillies revised the IETF draft standard to make RPCs using only chunked encoding, greatly simplifying the ICAP protocol.[1]

Related Research Articles

The Domain Name System (DNS) is the hierarchical and decentralized naming system used to identify computers reachable through the Internet or other Internet Protocol (IP) networks. The resource records contained in the DNS associate domain names with other forms of information. These are most commonly used to map human-friendly domain names to the numerical IP addresses computers need to locate services and devices using the underlying network protocols, but have been extended over time to perform many other functions as well. The Domain Name System has been an essential component of the functionality of the Internet since 1985.
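
An application typically resolves a name through the operating system's stub resolver, which queries the DNS on its behalf. A minimal Python sketch (the name is just an example):

    import socket

    # Ask the system's stub resolver (which in turn queries the DNS)
    # for the addresses behind a name; one name may map to several records.
    for family, _, _, _, sockaddr in socket.getaddrinfo("example.com", 80):
        print(family.name, sockaddr[0])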

Hypertext Transfer Protocol

The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, where hypertext documents include hyperlinks to other resources that the user can easily access, for example by a mouse click or by tapping the screen in a web browser.
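
A sketch of one HTTP request/response cycle using Python's standard library (the URL is illustrative):

    from urllib.request import urlopen

    # Fetch a hypertext document and show the status code and
    # media type the server reported.
    with urlopen("https://example.com/") as response:
        print(response.status, response.headers["Content-Type"])
        body = response.read()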

Web server

A web server is computer software and underlying hardware that accepts requests via HTTP or its secure variant HTTPS. A user agent, commonly a web browser or web crawler, initiates communication by making a request for a web page or other resource using HTTP, and the server responds with the content of that resource or an error message. A web server can also accept and store resources sent from the user agent if configured to do so.
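
As a sketch of that request/response role, Python's standard library can serve files from the current directory in a few lines (port 8000 is arbitrary):

    from http.server import HTTPServer, SimpleHTTPRequestHandler

    # Accept HTTP requests on port 8000 and serve files from the
    # current directory; failures become HTTP error responses.
    HTTPServer(("", 8000), SimpleHTTPRequestHandler).serve_forever()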

Network address translation

Network address translation (NAT) is a method of mapping an IP address space into another by modifying network address information in the IP header of packets while they are in transit across a traffic routing device. The technique was originally used to avoid the need to assign a new address to every host when a network was moved, or when the upstream Internet service provider was replaced, but could not route the network's address space. It has become a popular and essential tool in conserving global address space in the face of IPv4 address exhaustion. One Internet-routable IP address of a NAT gateway can be used for an entire private network.

Proxy server

In computer networking, a proxy server is a server application that acts as an intermediary between a client requesting a resource and the server providing that resource.

SOCKS is an Internet protocol that exchanges network packets between a client and server through a proxy server. SOCKS5 optionally provides authentication so only authorized users may access a server. Practically, a SOCKS server proxies TCP connections to an arbitrary IP address, and provides a means for UDP packets to be forwarded.
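
The sketch below hand-encodes the first two messages of a SOCKS5 session as defined in RFC 1928, assuming (hypothetically) an unauthenticated SOCKS5 proxy at localhost:1080:

    import socket

    with socket.create_connection(("localhost", 1080)) as sock:
        # Greeting: SOCKS version 5, offering one auth method: 0x00 (none).
        sock.sendall(b"\x05\x01\x00")
        assert sock.recv(2) == b"\x05\x00"  # proxy accepted "no auth"
        # CONNECT (0x01) to example.com:80, address type 0x03 (domain name).
        host = b"example.com"
        sock.sendall(b"\x05\x01\x00\x03" + bytes([len(host)]) + host
                     + (80).to_bytes(2, "big"))
        reply = sock.recv(10)  # byte 1 == 0x00 means the proxy connected
        print("connected" if reply[1] == 0 else f"failed: {reply[1]}")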

The Internet Printing Protocol (IPP) is a specialized Internet protocol for communication between client devices and printers. It allows clients to submit one or more print jobs to the printer or print server, and perform tasks such as querying the status of a printer, obtaining the status of print jobs, or cancelling individual print jobs.

Content delivery network

A content delivery network, or content distribution network (CDN), is a geographically distributed network of proxy servers and their data centers. The goal is to provide high availability and performance by distributing the service spatially relative to end users. CDNs came into existence in the late 1990s as a means for alleviating the performance bottlenecks of the Internet as the Internet was starting to become a mission-critical medium for people and enterprises. Since then, CDNs have grown to serve a large portion of the Internet content today, including web objects, downloadable objects, applications, live streaming media, on-demand streaming media, and social media sites.

In the context of an HTTP transaction, basic access authentication is a method for an HTTP user agent to provide a user name and password when making a request. In basic HTTP authentication, a request contains a header field in the form of Authorization: Basic <credentials>, where <credentials> is the Base64 encoding of the ID and password joined by a single colon (:).
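
For example, the header can be constructed exactly as described (the credentials here are purely illustrative):

    import base64

    # Base64-encode "id:password" and place it after the "Basic" scheme.
    user, password = "aladdin", "opensesame"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    print(f"Authorization: Basic {token}")
    # -> Authorization: Basic YWxhZGRpbjpvcGVuc2VzYW1l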

HTTP pipelining

HTTP pipelining is a feature of HTTP/1.1 which allows multiple HTTP requests to be sent over a single TCP connection without waiting for the corresponding responses. The HTTP/1.1 specification requires servers to respond to pipelined requests correctly, sending back non-pipelined but valid responses even if the server does not support HTTP pipelining. Despite this requirement, many legacy HTTP/1.1 servers do not support pipelining correctly, forcing most HTTP clients not to use HTTP pipelining in practice.
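
A sketch of what pipelining looks like on the wire, using a raw socket and an illustrative host (as noted above, real clients largely avoid this):

    import socket

    # Two requests written back to back; the client does not wait for
    # the first response before sending the second request.
    requests = (
        "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
        "GET /other HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n"
    )

    with socket.create_connection(("example.com", 80)) as sock:
        sock.sendall(requests.encode("ascii"))
        while chunk := sock.recv(4096):  # responses arrive in request order
            print(chunk.decode("ascii", errors="replace"), end="")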

HTTP compression

HTTP compression is a capability that can be built into web servers and web clients to improve transfer speed and bandwidth utilization.
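
A typical exchange negotiates compression through the Accept-Encoding and Content-Encoding headers. A sketch in Python (the URL is illustrative, and the server may decline to compress):

    import gzip
    from urllib.request import Request, urlopen

    # Offer gzip; if the server accepts, Content-Encoding says so and the
    # body must be decompressed client-side (urllib does not do this).
    req = Request("https://example.com/", headers={"Accept-Encoding": "gzip"})
    with urlopen(req) as response:
        body = response.read()
        if response.headers.get("Content-Encoding") == "gzip":
            body = gzip.decompress(body)
    print(len(body), "bytes after decoding")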

X-Forwarded-For

The X-Forwarded-For (XFF) HTTP header field is a common method for identifying the originating IP address of a client connecting to a web server through an HTTP proxy or load balancer.
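
Each proxy or load balancer appends the peer address it received the request from, so the field accumulates a comma-separated list whose leftmost entry is the original client (and also the easiest for a client to forge). A sketch of parsing it:

    def client_ip(xff_value: str) -> str:
        """Leftmost entry of an X-Forwarded-For value: the original client."""
        return xff_value.split(",")[0].strip()

    # Each hop appended the address it saw the request arrive from:
    print(client_ip("203.0.113.7, 70.41.3.18, 150.172.238.178"))  # 203.0.113.7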

Oracle iPlanet Web Proxy Server

The Oracle iPlanet Web Proxy Server (OiWPS), formerly known as Sun Java System Web Proxy Server (SJSWPS), is a proxy server software developed by Sun Microsystems.

WebSocket

WebSocket is a computer communications protocol, providing full-duplex communication channels over a single TCP connection. The WebSocket protocol was standardized by the IETF as RFC 6455 in 2011. The WebSocket API, which allows web applications to use the protocol, is maintained as a living standard by the Web Hypertext Application Technology Working Group (WHATWG), a consortium of the major browser vendors.
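
One concrete piece of the RFC 6455 opening handshake is the server's Sec-WebSocket-Accept computation, sketched below (the sample key comes from the RFC itself):

    import base64
    import hashlib

    # GUID fixed by RFC 6455: the server concatenates it with the client's
    # Sec-WebSocket-Key, hashes with SHA-1, and Base64-encodes the digest.
    GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

    def accept_value(sec_websocket_key: str) -> str:
        digest = hashlib.sha1((sec_websocket_key + GUID).encode("ascii")).digest()
        return base64.b64encode(digest).decode("ascii")

    # Sample key from RFC 6455, section 1.3:
    print(accept_value("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=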

Deep content inspection (DCI) is a form of network filtering that examines an entire file or MIME object as it passes an inspection point, searching for viruses, spam, data loss, key words or other content-level criteria. Deep content inspection is considered the evolution of deep packet inspection, with the ability to look at what the actual content contains instead of focusing on individual or multiple packets. Deep content inspection allows services to keep track of content across multiple packets, so that the signatures they are searching for can cross packet boundaries and still be found. It is an exhaustive form of network traffic inspection in which Internet traffic is examined across all seven layers of the OSI model, most importantly the application layer.

Constrained Application Protocol (CoAP) is a specialized Internet application protocol for constrained devices, as defined in RFC 7252. It enables constrained devices, called "nodes", to communicate with the wider Internet using similar protocols. CoAP is designed for use between devices on the same constrained network, between devices and general nodes on the Internet, and between devices on different constrained networks both joined by an internet. CoAP is also used via other mechanisms, such as SMS on mobile communication networks.

HTTP/2 is a major revision of the HTTP network protocol used by the World Wide Web. It was derived from the earlier experimental SPDY protocol, originally developed by Google. HTTP/2 was developed by the HTTP Working Group of the Internet Engineering Task Force (IETF). HTTP/2 is the first new version of HTTP since HTTP/1.1, which was standardized in RFC 2068 in 1997. The Working Group presented HTTP/2 to the Internet Engineering Steering Group (IESG) for consideration as a Proposed Standard in December 2014, and the IESG approved it for publication as a Proposed Standard on February 17, 2015. The HTTP/2 specification was published as RFC 7540 on May 14, 2015.

DNS over HTTPS (DoH) is a protocol for performing remote Domain Name System (DNS) resolution via the HTTPS protocol. A goal of the method is to increase user privacy and security by preventing eavesdropping and manipulation of DNS data in man-in-the-middle attacks: HTTPS encrypts the data between the DoH client and the DoH-based DNS resolver. By March 2018, Google and the Mozilla Foundation had started testing versions of DNS over HTTPS. In February 2020, Firefox switched to DNS over HTTPS by default for users in the United States.
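
As a sketch, here is a lookup against Google's JSON resolver endpoint; the endpoint and query are illustrative, and the wire format standardized in RFC 8484 carries binary DNS messages over HTTPS rather than JSON:

    import json
    from urllib.request import urlopen

    # Query the resolver's JSON API for the A records of a name.
    with urlopen("https://dns.google/resolve?name=example.com&type=A") as resp:
        answer = json.load(resp)

    for record in answer.get("Answer", []):
        print(record["name"], record["data"])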

References

  1. J. Elson; A. Cerpa (2003). Internet Content Adaptation Protocol (ICAP). IETF. doi:10.17487/RFC3507. RFC 3507.
  2. "Internet Content Adaptation Protocol (ICAP)" (PDF). NetApp. 2001-07-30.
  3. Gillies, Donald. "ICAP Installation Instructions". UBC ECE Dept. Retrieved 2016-01-04.