In computing, the User-Agent header is an HTTP header intended to identify the user agent responsible for making a given HTTP request. Whereas the character sequence User-Agent is the name of the header itself, the header value that a given user agent uses to identify itself is colloquially known as its user agent string. The user agent acting on behalf of the person accessing the Web has encoded, within the rules that govern its behavior, the knowledge of how to negotiate its half of a request-response transaction; the user agent thus plays the role of the client in a client–server system. Because it is often useful in a network to identify and distinguish the software facilitating a session, the User-Agent HTTP header exists to identify the client software to the responding server.
When a software agent operates in a network protocol, it often identifies itself, its application type, operating system, device model, software vendor, or software revision by submitting a characteristic identification string to its operating peer. In the HTTP, [1] SIP, [2] and NNTP [3] protocols, this identification is transmitted in the User-Agent header field. Bots, such as Web crawlers, often also include a URL and/or e-mail address so that the Webmaster can contact the operator of the bot.
In HTTP, the "user agent string" is often used for content negotiation, where the origin server selects suitable content or operating parameters for the response. For example, the user agent string might be used by a web server to choose variants based on the known capabilities of a particular version of client software. The concept of content tailoring is built into the HTTP standard in RFC 1945 "for the sake of tailoring responses to avoid particular user agent limitations".
The user agent string is one of the criteria by which Web crawlers may be excluded from accessing certain parts of a website using the Robots Exclusion Standard (robots.txt file).
As with many other HTTP request headers, the information in the user agent string adds to the details that the client reveals to the server, since the string can vary considerably from user to user. [4]
The user agent string format is currently specified by section 10.1.5 of HTTP Semantics (RFC 9110). The format is a list of product tokens (keywords) with optional comments. For example, if a user's product were called WikiBrowser, their user agent string might be WikiBrowser/1.0 Gecko/1.0. The "most important" product component is listed first.
The parts of this string are as follows: the product token WikiBrowser/1.0 gives the product name and version, and Gecko/1.0 gives the layout engine and the engine's version.
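As a rough illustration of this product-token-and-comment structure, the following Python sketch splits a user agent value into whitespace-separated product tokens and parenthesized comments. It is a simplification for illustration only, not a full parser of the grammar defined in HTTP Semantics.

import re

# Matches either a parenthesized comment or a whitespace-separated product token.
UA_PART = re.compile(r"\(([^)]*)\)|(\S+)")

def split_user_agent(ua: str) -> list[tuple[str, str]]:
    """Return ("product", text) and ("comment", text) parts in order of appearance."""
    parts = []
    for match in UA_PART.finditer(ua):
        if match.group(1) is not None:
            parts.append(("comment", match.group(1)))
        else:
            parts.append(("product", match.group(2)))
    return parts

print(split_user_agent("WikiBrowser/1.0 Gecko/1.0"))
# [('product', 'WikiBrowser/1.0'), ('product', 'Gecko/1.0')]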
During the first browser war, many web servers were configured to send web pages that required advanced features, including frames, only to clients that were identified as some version of Mozilla. [5] Other browsers were treated as older products such as Mosaic, Cello, or Samba, and were sent a bare-bones HTML document.
For this reason, most Web browsers use a user agent string value of the following form:
Mozilla/[version] ([system and browser information]) [platform] ([platform details]) [extensions]
For example, Safari on the iPad has used the following:
Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405
The components of this string are as follows: Mozilla/5.0 declares compatibility with Mozilla-based browsers, for the historical reasons described above; (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) gives the system and browser information (device, encryption-strength indicator, operating system version, and language); AppleWebKit/531.21.10 names the platform (the layout engine and its version); (KHTML, like Gecko) gives further platform details; and Mobile/7B405 is an extension identifying the mobile build.
Before migrating to the Chromium code base, Opera was the most widely used web browser that did not begin its user agent string with "Mozilla" (beginning it instead with "Opera"). Since July 15, 2013, [6] Opera's user agent string has begun with "Mozilla/5.0" and, to avoid triggering legacy server rules, no longer includes the word "Opera" (instead using the string "OPR" to denote the Opera version).
Automated web crawling tools can use a simplified form, in which an important field is contact information in case of problems. By convention, the word "bot" is included in the name of the agent. For example:
Googlebot/2.1 (+http://www.google.com/bot.html)
Automated agents are expected to follow rules in a special file called "robots.txt".
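As a minimal sketch of how a well-behaved crawler might combine these two conventions, the following Python code sends requests with a bot-style User-Agent value and consults the site's robots.txt before fetching a page. The bot name and contact URL are hypothetical placeholders, not a real crawler.

from urllib import parse, request, robotparser

# Hypothetical bot identity; by convention it names the bot and gives a
# contact URL so that site operators can reach whoever runs it.
BOT_USER_AGENT = "ExampleBot/1.0 (+https://example.org/bot.html)"

def fetch_if_allowed(url: str) -> bytes | None:
    """Fetch url only if the site's robots.txt permits this user agent."""
    parts = parse.urlsplit(url)
    robots = robotparser.RobotFileParser(f"{parts.scheme}://{parts.netloc}/robots.txt")
    robots.read()
    if not robots.can_fetch(BOT_USER_AGENT, url):
        return None  # disallowed by the Robots Exclusion Standard
    req = request.Request(url, headers={"User-Agent": BOT_USER_AGENT})
    with request.urlopen(req) as resp:
        return resp.read()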
Web browsers created in the United States, such as Netscape Navigator and Internet Explorer, previously used the letters U, I, and N to specify the encryption strength in the user agent string. Until 1996, when the United States government allowed encryption with keys longer than 40 bits to be exported, vendors shipped various browser versions with different encryption strengths. "U" stands for "USA" (for the version with 128-bit encryption), "I" stands for "International" – the browser has 40-bit encryption and can be used anywhere in the world – and "N" stands (de facto) for "None" (no encryption). [7] Following the lifting of export restrictions, most vendors supported 256-bit encryption.
The popularity of various Web browser products has varied throughout the Web's history, and this has influenced the design of websites in such a way that websites are sometimes designed to work well only with particular browsers, rather than according to uniform standards by the World Wide Web Consortium (W3C) or the Internet Engineering Task Force (IETF). Websites often include code that detects the browser version and adjusts the page design according to the user agent string received. This may mean that less-popular browsers are not sent complex content (even though they might be able to deal with it correctly) or, in extreme cases, are refused all content. [8] Thus, various browsers have a feature to cloak or spoof their identification to force certain server-side content. For example, the Android browser identifies itself as Safari (among other things) in order to aid compatibility. [9] [10] An Android browser might, for example, send a string such as the following:
Mozilla/5.0 (Linux; U; Android 2.2; en-sa; HTC_DesireHD_A9191 Build/FRF91) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1
Other HTTP client programs, like download managers and offline browsers, often have the ability to change the user agent string.
A result of user agent spoofing may be that collected statistics of Web browser usage are inaccurate.
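To illustrate how easily a client can present an arbitrary user agent string, the following Python sketch sends a request whose User-Agent header is set to a browser-like value instead of the library's default; the URL and the spoofed string are placeholders for illustration.

from urllib import request

# Placeholder value for illustration; any string can be sent here.
SPOOFED_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0"

req = request.Request(
    "https://example.org/",
    headers={"User-Agent": SPOOFED_UA},  # overrides urllib's default "Python-urllib/x.y"
)
with request.urlopen(req) as resp:
    body = resp.read()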
User agent sniffing is the practice of websites showing different or adjusted content when viewed with certain user agents. An example of this is Microsoft Exchange Server 2003's Outlook Web Access feature: when viewed with Internet Explorer 6 or newer, more functionality is displayed than when the same page is viewed with any other browser. User agent sniffing is considered poor practice, since it encourages browser-specific design and penalizes new browsers with unrecognized user agent identifications. Instead, the W3C recommends creating standard HTML markup, [11] allowing correct rendering in as many browsers as possible, and testing for specific browser features rather than particular browser versions or brands. [12]
Websites intended for display by mobile phones often rely on user agent sniffing, since mobile browsers often differ greatly from each other.
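For illustration, the following Python sketch shows the kind of server-side user agent sniffing described above: it chooses a page variant by searching the User-Agent value for substrings. The tokens and variant names are placeholders, and the fallback branch shows the weakness noted above, namely that a capable but unrecognized browser silently receives the reduced variant, which is why feature testing is recommended instead.

def choose_variant(user_agent: str) -> str:
    """Naive user agent sniffing: pick a page variant from the UA string."""
    ua = user_agent.lower()
    if "mobile" in ua or "android" in ua:
        return "mobile.html"
    if "firefox" in ua or "chrome" in ua or "safari" in ua:
        return "desktop.html"
    # A new or unrecognized browser lands here even if it supports everything.
    return "basic.html"

print(choose_variant(
    "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) "
    "AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405"
))  # -> mobile.html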
In 2020, Google announced that they would be freezing parts of the User-Agent header in their Chrome browser since it was no longer being used to determine browser capabilities and instead mainly being used for passive browser fingerprinting. [13] Google stated that a new feature called Client Hints would replace the functionality of the user agent string. [14]
Starting with Chrome 113, released in April 2023, the User-Agent header was partially frozen. The user agent string in newer versions of Chrome remains static except for the digits representing the major version of the browser being used. [15]
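As a rough sketch of how a server can rely on Client Hints rather than parsing the full user agent string, the following Python handler reads the low-entropy Sec-CH-UA and Sec-CH-UA-Platform request headers that Chromium-based browsers send by default, and advertises, via the Accept-CH response header, the higher-entropy hints it would like on subsequent requests. The header names are part of the Client Hints mechanism; the handler itself is only an illustration.

from http.server import BaseHTTPRequestHandler, HTTPServer

class HintAwareHandler(BaseHTTPRequestHandler):
    """Illustrative server that uses Client Hints instead of UA string parsing."""

    def do_GET(self):
        # Low-entropy hints sent by default by Chromium-based browsers.
        brands = self.headers.get("Sec-CH-UA", "")
        platform = self.headers.get("Sec-CH-UA-Platform", "")

        body = f"brands={brands} platform={platform}\n".encode()
        self.send_response(200)
        # Ask the browser to include higher-entropy hints on later requests.
        self.send_header("Accept-CH", "Sec-CH-UA-Full-Version-List, Sec-CH-UA-Model")
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8000), HintAwareHandler).serve_forever()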
Starting with Firefox 110, released in February 2023, [16] Mozilla temporarily froze portions of the browser's user agent string at version 109. This was done because several websites incorrectly recognized a development version of the browser, which identified itself with the string Mozilla/5.0 (Windows NT 10.0; Win64; rv:110.0) Gecko/20100101 Firefox/110.0, [17] as the deprecated Internet Explorer 11, which reports Mozilla/5.0 (Windows NT 10.0; Trident/7.0; rv:11.0) like Gecko. [18] The problem was expected to resolve itself after the release of Firefox 120, as only browsers identifying themselves as versions 110 through 119 were observed to be affected. [19]