URI fragment

In computer hypertext, a URI fragment is a string of characters that refers to a resource that is subordinate to another, primary resource. The primary resource is identified by a Uniform Resource Identifier (URI), and the fragment identifier points to the subordinate resource.

The fragment identifier, introduced by a hash mark #, is the optional last part of a URL for a document. It is typically used to identify a portion of that document. The generic syntax is specified in RFC 3986. [1] The hash mark separator is not itself part of the fragment identifier.

Basics

In a URI, a hash mark # introduces the optional fragment, which is always the last component. The generic RFC 3986 syntax for URIs also allows an optional query part introduced by a question mark ?. In a URI with both a query and a fragment, the fragment follows the query. Query parts depend on the URI scheme and are evaluated by the server; for example, http: supports queries, unlike ftp:. Fragments depend on the document MIME type and are evaluated by the client (web browser). Clients are not supposed to send URI fragments to servers when they retrieve a document. [1] [2]
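
The ordering of query and fragment can be checked with a short Python sketch. It uses the standard library's urllib.parse module; the URLs are hypothetical and serve only as illustration.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
url = "http://www.example.org/foo.html?lang=en#bar"

parts = urlsplit(url)
query = parts.query        # text between "?" and "#"
fragment = parts.fragment  # text after "#", without the "#" itself

print(query)     # lang=en
print(fragment)  # bar

# A "?" that appears after the "#" belongs to the fragment, not to a query.
swapped = urlsplit("http://www.example.org/foo.html#bar?lang=en")
print(repr(swapped.query))  # ''
print(swapped.fragment)     # bar?lang=en
```

Note that the parsed fragment does not include the hash mark, which acts only as a separator.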

A URI ending with # is permitted by the generic syntax and denotes an empty fragment. In MIME document types such as text/html or any XML type, however, identifiers may not be empty, so no element can match this syntactically legal construct. Web browsers typically display the top of the document for an empty fragment.

The fragment identifier functions differently from the rest of the URI: it is processed exclusively on the client side, with no participation from the web server, though the server typically helps to determine the MIME type, and the MIME type determines how fragments are processed. When an agent (such as a web browser) requests a web resource from a web server, the agent sends the URI to the server but does not send the fragment. Instead, the agent waits for the server to send the resource, and then processes the resource according to the document type and fragment value. [3]
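
As a rough illustration of this split, the following Python sketch separates a URI into the part an agent would send in an HTTP request and the fragment it keeps for itself. The function name split_for_request and the example URL are hypothetical. The sketch assumes an http: URI and ignores details such as proxies and percent-encoding.

```python
from urllib.parse import urldefrag, urlsplit


def split_for_request(url):
    """Return (request_target, fragment) for an http(s) URL.

    request_target is the path plus query that would appear on the
    request line; the fragment stays with the client.
    """
    without_fragment, fragment = urldefrag(url)
    parts = urlsplit(without_fragment)
    target = parts.path or "/"  # an empty path is requested as "/"
    if parts.query:
        target += "?" + parts.query
    return target, fragment


target, kept = split_for_request("http://www.example.org/foo.html?lang=en#bar")
print(target)  # /foo.html?lang=en
print(kept)    # bar
```

Only after the response arrives would the agent apply the retained fragment, according to the media type of the document it received.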

In an HTML web page, the agent looks for an anchor, that is, an HTML tag whose id= or name= attribute is equal to the fragment identifier.
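
The lookup can be approximated with Python's standard html.parser module. This is a simplified sketch, not the algorithm browsers actually implement: the class AnchorFinder and the function find_anchor are hypothetical names, and the code simply returns the first tag in document order whose id or name attribute equals the fragment.

```python
from html.parser import HTMLParser


class AnchorFinder(HTMLParser):
    """Record the first start tag whose id or name equals the fragment."""

    def __init__(self, fragment):
        super().__init__()
        self.fragment = fragment
        self.match = None  # tag name of the first matching element

    def handle_starttag(self, tag, attrs):
        if self.match is not None:
            return  # keep only the first match
        attributes = dict(attrs)
        if self.fragment in (attributes.get("id"), attributes.get("name")):
            self.match = tag


def find_anchor(html, fragment):
    """Return the tag name of the anchor for fragment, or None."""
    if not fragment:
        return None  # empty fragment: nothing to match (top of document)
    finder = AnchorFinder(fragment)
    finder.feed(html)
    finder.close()
    return finder.match


page = '<h1 id="top">Title</h1>\n<a name="bar">Section</a>\n<p id="baz">Text</p>'
print(find_anchor(page, "bar"))      # a
print(find_anchor(page, "baz"))      # p
print(find_anchor(page, "missing"))  # None
```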

Examples

Proposals

Several proposals have been made for fragment identifiers for use with plain text documents (which cannot store anchor metadata), or to refer to locations within HTML documents in which the author has not used anchor tags.

References

  1. "RFC 3986 Uniform Resource Identifier (URI): Generic Syntax". Internet Engineering Task Force. January 2005. Retrieved 2012-03-06.
  2. R. Fielding, Ed., Adobe; J. Reschke, Ed., greenbytes (June 2014). "Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing". Internet Engineering Task Force (IETF). Retrieved 2023-12-27. "The target URI excludes the reference's fragment component, if any, since fragment identifiers are reserved for client-side processing."
  3. "Representation types and fragment identifier semantics". Architecture of the World Wide Web, Volume One. W3C. 2004. Retrieved 2011-07-13.
  4. "Validity constraint: ID". XML 1.0 (Fifth Edition). W3C. 2008. Retrieved 2011-07-13.
  5. "xml:id Version 1.0". W3C. 2005. Retrieved 2011-07-13.
  6. "Issue 77024". Chromium. 2011. Retrieved 2011-07-13.
  7. "Media Type Review". W3C Media Fragments Working Group. 2009. Retrieved 2009-04-29.
  8. "New Feature: Link within a Video". 2006-07-19. Retrieved 2011-07-13.
  9. Link to Specific Content in Gmail, Google Blogoscoped, 2007-11-17
  10. Bryan, P (2013-04-02). "RFC 6901 – JavaScript Object Notation (JSON) Pointer". The Internet Society. Retrieved 2022-07-14.
  11. "Parameters for Opening PDF Files – Specifying parameters in a URL" (PDF). Adobe. April 2007. Retrieved 2017-09-20.
  12. Taft, E.; Pravetz, J.; Zilles, S.; Masinter, L. (May 2004). "RFC 3778 – The application/pdf Media Type". tools.ietf.org. The Internet Society. doi:10.17487/RFC3778. Retrieved 2017-09-20.
  13. "Linking – SVG 1.1 (Second Edition)".
  14. "Media Fragments URI 1.0 (basic) W3C Recommendation". Retrieved 2012-09-25.
  15. "Scroll to Text Fragment". Chrome Platform Status. Google Chrome. Retrieved 2020-05-18.
  16. Kelly, Gordon. "Google Chrome 80 Released With Controversial Deep Linking Upgrade". Forbes. Retrieved 2020-06-04.
  17. "WICG/scroll-to-text-fragment: Proposal to allow specifying a text snippet in a URL fragment". GitHub. WebPlatform.org Incubator Community Group at W3C. Retrieved 2020-05-18.
  18. "Pypi md5 check support". Retrieved 2011-07-13. "Pypi has the habit to append an md5 fragment to its egg urls, we'll use it to check the already present distribution files in the cache."
  19. "Hash URIs". W3C Blog. 2011-05-12. Retrieved 2011-07-13.
  20. "HTML 5.1 2nd Edition". W3C. 2017. Retrieved 2018-08-03.
  21. "Proposal for making AJAX crawlable". 2009-10-07. Retrieved 2011-07-13.
  22. "(Specifications) Making AJAX Applications Crawlable". Google Inc. Retrieved 2013-05-04.
  23. "Manipulating the browser history". Mozilla Developer Network. Retrieved 2017-02-23.
  24. "Deprecating our AJAX crawling scheme". Official Google Webmaster Central Blog. Retrieved 2017-02-23.
  25. Fragment Search, gerv.net
  26. Fragment identifiers for plain text files, Erik Wilde and Marcel Baschnagel, Swiss Federal Institute of Technology (ETH Zürich), Proceedings of the sixteenth ACM conference on Hypertext and hypermedia. doi:10.1145/1083356.1083398
  27. Text-Search Fragment Identifiers, K. Yee, Network Working Group, Foresight Institute, March 1998
  28. bmcquade; bokan; nburris (2022-03-24). "Feature: Scroll to Text Fragment". Chrome Platform Status. chromium.org. Retrieved 2022-05-03.
  29. LiveURLs project
  30. The technology behind LiveURLs, accessed 2011-03-13
  31. "Web Marker" Firefox add-on, accessed 2011-03-13
  32. "EPUB Canonical Fragment Identifiers 1.1". idpf.org. Retrieved 2020-06-03.