Squid (software)

Last updated

Squid
Developer(s) Duane Wessels, Henrik Nordström, Amos Jeffries, Alex Rousskov, Francesco Chemolli, Robert Collins, Guido Serassio and volunteers [2]
Initial releaseJuly 1996;27 years ago (1996-07)
Stable release
6.7 [3]   OOjs UI icon edit-ltr-progressive.svg / 4 February 2024
Repository github.com/squid-cache/squid
Written in C++ [4]
Operating system BSD, Linux, Unix, Windows [5]
Type Proxy server
License GPL 2.0 or later [6]
Website www.squid-cache.org
The LAMP stack with Squid as web cache. LAMP software bundle.svg
The LAMP stack with Squid as web cache.

Squid is a caching and forwarding HTTP web proxy. It has a wide variety of uses, including speeding up a web server by caching repeated requests, caching World Wide Web (WWW), Domain Name System (DNS), and other network lookups for a group of people sharing network resources, and aiding security by filtering traffic. Although used for mainly HTTP and File Transfer Protocol (FTP), Squid includes limited support for several other protocols including Internet Gopher, Secure Sockets Layer (SSL), [7] Transport Layer Security (TLS), and Hypertext Transfer Protocol Secure (HTTPS). Squid does not support the SOCKS protocol, unlike Privoxy, with which Squid can be used in order to provide SOCKS support.

Contents

Squid was originally designed to run as a daemon on Unix-like systems. A Windows port was maintained up to version 2.7. New versions available on Windows use the Cygwin environment. [8] [9] Squid is free software released under the GNU General Public License.

History

Squid was originally developed as the Harvest object cache, [7] part of the Harvest project at the University of Colorado Boulder. [10] [11] Further work on the program was completed at the University of California, San Diego and funded via two grants from the National Science Foundation. [12] Duane Wessels forked the "last pre-commercial version of Harvest" and renamed it to Squid to avoid confusion with the commercial fork called Cached 2.0, which became NetCache. [13] [14] Squid version 1.0.0 was released in July 1996. [13] SquidNT, a port of the Squid proxy server was merged into the main Squid project in September 2006. [15]

Squid is now developed almost exclusively through volunteer efforts.

In October 2023, it was revealed that Squid continued to suffer from 35 security vulnerabilities which had not been fixed for two and a half years after their initial reporting. [16]

Basic functionality

After a Squid proxy server is installed, web browsers can be configured to use it as a proxy HTTP server, allowing Squid to retain copies of the documents returned, which, on repeated requests for the same documents, can reduce access time as well as bandwidth consumption. This is often useful for Internet service providers to increase speed to their customers, and LANs that share an Internet connection. Because the caching servers are controlled by the web service operator, caching proxies do not anonymize the user and should not be confused with anonymizing proxies.

A client program (e.g. browser) either has to specify explicitly the proxy server it wants to use (typical for ISP customers), or it could be using a proxy without any extra configuration: "transparent caching", in which case all outgoing HTTP requests are intercepted by Squid and all responses are cached. The latter is typically a corporate set-up (all clients are on the same LAN) and often introduces the privacy concerns mentioned above.

Squid has some features that can help anonymize connections, such as disabling or changing specific header fields in a client's HTTP requests. Whether these are set, and what they are set to do, is up to the person who controls the computer running Squid. People requesting pages through a network which transparently uses Squid may not know whether this information is being logged. [17] Within UK organisations at least, users should be informed if computers or internet connections are being monitored. [18]

Reverse proxy

The above setup, caching the contents of an unlimited number of webservers for a limited number of clients, is the classical one. Another setup is "reverse proxy" or "webserver acceleration" (using http_port 80 accel vhost). In this mode, the cache serves an unlimited number of clients for a limited number of—or just one—web servers.

As an example, if slow.example.com is a "real" web server, and www.example.com is the Squid cache server that "accelerates" it, the first time any page is requested from www.example.com, the cache server would get the actual page from slow.example.com, but later requests would get the stored copy directly from the accelerator (for a configurable period, after which the stored copy would be discarded). The end result, without any action by the clients, is less traffic to the source server, meaning less CPU and memory usage, and less need for bandwidth. This does, however, mean that the source server cannot accurately report on its traffic numbers without additional configuration, as all requests would seem to have come from the reverse proxy. A way to adapt the reporting on the source server is to use the X-Forwarded-For HTTP header reported by the reverse proxy, to get the real client's IP address.

It is possible for one Squid server to serve simultaneously as a normal and a reverse proxy. For example, a business might host its own website on a web server, with a Squid server acting as a reverse proxy between clients (customers accessing the website from outside the business) and the web server. The same Squid server could act as a classical web cache, caching HTTP requests from clients within the business (i.e., employees accessing the internet from their workstations), so accelerating web access and reducing bandwidth demands.

Media-range limits

For example, a feature of the HTTP protocol is to limit a request to the range of data in the resource being referenced. This feature is used extensively by video streaming websites such as YouTube, so that if a user clicks to the middle of the video progress bar, the server can begin to send data from the middle of the file, rather than sending the entire file from the beginning and the user waiting for the preceding data to finish loading.

Partial downloads are also extensively used by Microsoft Windows Update so that extremely large update packages can download in the background and pause halfway through the download, if the user turns off their computer or disconnects from the Internet.

The Metalink download format enables clients to do segmented downloads by issuing partial requests and spreading these over a number of mirrors.

Squid can relay partial requests to the origin web server. In order for a partial request to be satisfied at a fast speed from cache, Squid requires a full copy of the same object to already exist in its storage.

If a proxy video user is watching a video stream and browses to a different page before the video completely downloads, Squid cannot keep the partial download for reuse and simply discards the data. Special configuration is required to force such downloads to continue and be cached. [19]

Supported operating systems

Squid supports many operating systems, including:

See also

Related Research Articles

The Domain Name System (DNS) is a hierarchical and distributed naming system for computers, services, and other resources in the Internet or other Internet Protocol (IP) networks. It associates various information with domain names assigned to each of the associated entities. Most prominently, it translates readily memorized domain names to the numerical IP addresses needed for locating and identifying computer services and devices with the underlying network protocols. The Domain Name System has been an essential component of the functionality of the Internet since 1985.

<span class="mw-page-title-main">HTTP</span> Application protocol for distributed, collaborative, hypermedia information systems

The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, where hypertext documents include hyperlinks to other resources that the user can easily access, for example by a mouse click or by tapping the screen in a web browser.

<span class="mw-page-title-main">Web server</span> Computer software that distributes web pages

A web server is computer software and underlying hardware that accepts requests via HTTP or its secure variant HTTPS. A user agent, commonly a web browser or web crawler, initiates communication by making a request for a web page or other resource using HTTP, and the server responds with the content of that resource or an error message. A web server can also accept and store resources sent from the user agent if configured to do so.

<span class="mw-page-title-main">Proxy server</span> Computer server that makes and receives requests on behalf of a user

In computer networking, a proxy server is a server application that acts as an intermediary between a client requesting a resource and the server providing that resource. It improves privacy, security, and performance in the process.

<span class="mw-page-title-main">Privoxy</span> Non-caching proxy server

Privoxy is a free non-caching web proxy with filtering capabilities for enhancing privacy, manipulating cookies and modifying web page data and HTTP headers before the page is rendered by the browser. Privoxy is a "privacy enhancing proxy", filtering web pages and removing advertisements. Privoxy can be customized by users, for both stand-alone systems and multi-user networks. Privoxy can be chained to other proxies and is frequently used in combination with Squid among others and can be used to bypass Internet censorship.

SOCKS is an Internet protocol that exchanges network packets between a client and server through a proxy server. SOCKS5 optionally provides authentication so only authorized users may access a server. Practically, a SOCKS server proxies TCP connections to an arbitrary IP address, and provides a means for UDP packets to be forwarded.

A Web cache is a system for optimizing the World Wide Web. It is implemented both client-side and server-side. The caching of multimedia and other files can result in less overall delay when browsing the Web.

URL redirection, also called URL forwarding, is a World Wide Web technique for making a web page available under more than one URL address. When a web browser attempts to open a URL that has been redirected, a page with a different URL is opened. Similarly, domain redirection or domain forwarding is when all pages in a URL domain are redirected to a different domain, as when wikipedia.com and wikipedia.net are automatically redirected to wikipedia.org.

The Web Proxy Auto-Discovery (WPAD) Protocol is a method used by clients to locate the URL of a configuration file using DHCP and/or DNS discovery methods. Once detection and download of the configuration file is complete, it can be executed to determine the proxy for a specified URL.

<span class="mw-page-title-main">Reverse proxy</span> Type of proxy server

In computer networks, a reverse proxy is an application that sits in front of back-end servers and forwards client requests to those servers instead of having the client directly taking to the servers. Reverse proxies help increase scalability, performance, resilience and security. The resources returned to the client appear as if they originated from the web server itself.

A web accelerator is a proxy server that reduces website access time. They can be a self-contained hardware appliance or installable software.

This article presents a comparison of the features, platform support, and packaging of many independent implementations of Domain Name System (DNS) name server software.

A proxy auto-config (PAC) file defines how web browsers and other user agents can automatically choose the appropriate proxy server for fetching a given URL.

<span class="mw-page-title-main">HTTP compression</span> Capability that can be built into web servers and web clients

HTTP compression is a capability that can be built into web servers and web clients to improve transfer speed and bandwidth utilization.

<span class="mw-page-title-main">X-Forwarded-For</span> HTTP header field

The X-Forwarded-For (XFF) HTTP header field is a common method for identifying the originating IP address of a client connecting to a web server through an HTTP proxy or load balancer.

A home server is a computing server located in a private computing residence providing services to other devices inside or outside the household through a home network or the Internet. Such services may include file and printer serving, media center serving, home automation control, web serving, web caching, file sharing and synchronization, video surveillance and digital video recorder, calendar and contact sharing and synchronization, account authentication, and backup services.

<span class="mw-page-title-main">Polipo</span>

Polipo is a discontinued lightweight caching and forwarding web proxy server. It has a wide variety of uses, from aiding security by filtering traffic; to caching web, DNS and other computer network lookups for a group of people sharing network resources; to speeding up a web server by caching repeated requests. It can be configured to use on-disk cache and serve cached content when offline and perform various forms of content filtering.

<span class="mw-page-title-main">WebSocket</span> Computer network protocol

WebSocket is a computer communications protocol, providing simultaneous two-way communication channels over a single Transmission Control Protocol (TCP) connection. The WebSocket protocol was standardized by the IETF as RFC 6455 in 2011. The current specification allowing web applications to use this protocol is known as WebSockets. It is a living standard maintained by the WHATWG and a successor to The WebSocket API from the W3C.

References

  1. "Squid Project Logo" . Retrieved 6 July 2014.
  2. "Who looks after the Squid project?".
  3. "squid : Optimising Web Delivery". 4 February 2024. Retrieved 4 February 2024.
  4. squid-cache/squid, Squid, 27 July 2022, retrieved 27 July 2022
  5. "What is the Best OS for Squid?".
  6. "Squid License".
  7. 1 2 C.Mic Bowman, Peter B. Danzig, Darren R. Hardy, Udi Manper, Michael F. Schwartz, The Harvest information discovery and access system, Computer Networks and ISDN Systems, Volume 28, Issues 1–2, December 1995, Pages 119–125. doi:10.1016/0169-7552(95)00098-5
  8. "Squid for Windows". GitHub . February 2024. Current build is based on the latest Squid 4 build for Cygwin Windows 64 bit
  9. "Squid-cache.org Knowledge Base". Squid on Windows
  10. Squid intro, on the Squid website
  11. Harvest cache now available as an "httpd accelerator", by Mike Schwartz on the http-wg mailing list, Tue, 4 April 1995, as forwarded by Brian Behlendorf to the Apache HTTP Server developers' mailing list
  12. "Squid Sponsors". Archived from the original on 11 May 2007. Retrieved 13 February 2007. The NSF was the primary funding source for Squid development from 1996–2000. Two grants (#NCR-9616602, #NCR-9521745) received through the Advanced Networking Infrastructure and Research (ANIR) Division were administered by the University of California San Diego
  13. 1 2 Duane Wessels Squid and ICP: Past, Present, and Future, Proceedings of the Australian Unix Users Group. September 1997, Brisbane, Australia
  14. "netcache.com". Archived from the original on 12 November 1996. Retrieved 7 August 2012.
  15. "Squid FAQ: Does Squid run on Windows?".
  16. "55 Vulnerabilities in Squid Caching Proxy and 35 0days". 11 October 2023.
  17. See the documentation for header_access and header_replace for further details.
  18. See, for example, Computer Monitoring In The Workplace and Your Privacy
  19. "Squid Configuration Reference" . Retrieved 26 November 2012.
  20. OS/2 Ports by Paul Smedley, OS/2 Ports
  21. "KnowledgeBase/Windows - Squid Web Proxy Wiki".

Further reading