|Organisation||International DOI Foundation|
In computing, a digital object identifier (DOI) is a persistent identifier or handle used to identify objects uniquely, standardized by the International Organization for Standardization (ISO).An implementation of the Handle System, DOIs are in wide use mainly to identify academic, professional, and government information, such as journal articles, research reports and data sets, and official publications though they also have been used to identify other types of information resources, such as commercial videos.
A persistent identifier is a long-lasting reference to a document, file, web page, or other object.
In computer programming, a handle is an abstract reference to a resource. Handles are used when application software references blocks of memory or objects managed by another system, such as a database or an operating system. A resource handle can be an opaque identifier, in which case it is often an integer number, or it can be a pointer that allows access to further information.
The International Organization for Standardization is an international standard-setting body composed of representatives from various national standards organizations.
A DOI aims to be "resolvable", usually to some form of access to the information object to which the DOI refers. This is achieved by binding the DOI to metadata about the object, such as a URL, indicating where the object can be found. Thus, by being actionable and interoperable, a DOI differs from identifiers such as ISBNs and ISRCs which aim only to identify their referents uniquely. The DOI system uses the indecs Content Model for representing metadata.
Metadata is "data [information] that provides information about other data". Many distinct types of metadata exist, among these descriptive metadata, structural metadata, administrative metadata, reference metadata and statistical metadata.
A Uniform Resource Locator (URL), colloquially termed a web address, is a reference to a web resource that specifies its location on a computer network and a mechanism for retrieving it. A URL is a specific type of Uniform Resource Identifier (URI), although many people use the two terms interchangeably. Thus
http://www.example.com is a URL, while
www.example.com is not.</ref> URLs occur most commonly to reference web pages (http), but are also used for file transfer (ftp), email (mailto), database access (JDBC), and many other applications.
indecs was a project partly funded by the European Community Info 2000 initiative and by several organisations representing the music, rights, text publishing, authors, library and other sectors in 1998-2000, which has since been used in a number of metadata activities. A final report and related documents were published; the indecs Metadata Framework document is a concise summary.
The DOI for a document remains fixed over the lifetime of the document, whereas its location and other metadata may change. Referring to an online document by its DOI is supposed to provide a more stable link than simply using its URL. But every time a URL changes, the publisher has to update the metadata for the DOI to link to the new URL.It is the publisher's responsibility to update the DOI database. If they fail to do so, the DOI resolves to a dead link leaving the DOI useless.
Link rot is the process by which hyperlinks on individual websites or the Internet in general tend to point to web pages, servers or other resources that have become permanently unavailable. There is no reliable data on how long web pages and other resources survive: the estimates vary dramatically between different studies, as well as between different sets of links on which these studies are based.
The developer and administrator of the DOI system is the International DOI Foundation (IDF), which introduced it in 2000.Organizations that meet the contractual obligations of the DOI system and are willing to pay to become a member of the system can assign DOIs. The DOI system is implemented through a federation of registration agencies coordinated by the IDF. By late April 2011 more than 50 million DOI names had been assigned by some 4,000 organizations, and by April 2013 this number had grown to 85 million DOI names assigned through 9,500 organizations.
A DOI is a type of Handle System handle, which takes the form of a character string divided into two parts, a prefix and a suffix, separated by a slash.
The prefix identifies the registrant of the identifier, and the suffix is chosen by the registrant and identifies the specific object associated with that DOI. Most legal Unicode characters are allowed in these strings, which are interpreted in a case-insensitive manner. The prefix usually takes the form
NNNN is at least a four digit number greater than or equal to
1000, whose limit depends only on the total number of registrants. The prefix may be further subdivided with periods, like
Unicode is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard is maintained by the Unicode Consortium, and as of May 2019 the most recent version, Unicode 12.1, contains a repertoire of 137,994 characters covering 150 modern and historic scripts, as well as multiple symbol sets and emoji. The character repertoire of the Unicode Standard is synchronized with ISO/IEC 10646, and both are code-for-code identical.
For example, in the DOI name
10.1000/182, the prefix is
10.1000 and the suffix is
182. The "10." part of the prefix distinguishes the handle as part of the DOI namespace, as opposed to some other Handle System namespace, and the characters
1000 in the prefix identify the registrant; in this case the registrant is the International DOI Foundation itself.
182 is the suffix, or item ID, identifying a single object (in this case, the latest version of the DOI Handbook).
DOI names can identify creative works (such as texts, images, audio or video items, and software) in both electronic and physical forms, performances, and abstract workssuch as licenses, parties to a transaction, etc.
The names can refer to objects at varying levels of detail: thus DOI names can identify a journal, an individual issue of a journal, an individual article in the journal, or a single table in that article. The choice of level of detail is left to the assigner, but in the DOI system it must be declared as part of the metadata that is associated with a DOI name, using a data dictionary based on the indecs Content Model.
The official DOI Handbook explicitly states that DOIs should display on screens and in print in the format
Contrary to the DOI Handbook, CrossRef, a major DOI registration agency, recommends displaying a URL (for example,
https://doi.org/10.1000/182) instead of the officially specified format (for example,
doi:10.1000/182 ) This URL is persistent (there is a contract that ensures persistence in the DOI.ORG domain), so it is a PURL — providing the location of an HTTP proxy server which will redirect web accesses to the correct online location of the linked item.
The CrossRef recommendation is primarily based on the assumption that the DOI is being displayed without being hyperlinked to its appropriate URL – the argument being that without the hyperlink it is not as easy to copy-and-paste the full URL to actually bring up the page for the DOI, thus the entire URL should be displayed, allowing people viewing the page containing the DOI to copy-and-paste the URL, by hand, into a new window/tab in their browser in order to go to the appropriate page for the document the DOI represents.
Major applications of the DOI system currently include:
In the Organisation for Economic Co-operation and Development's publication service OECD iLibrary, each table or graph in an OECD publication is shown with a DOI name that leads to an Excel file of data underlying the tables and graphs. Further development of such services is planned.
Other registries include Crossref and the multilingual European DOI Registration Agency.
DOI and other special identifiers can help to unify information about references with different spelling in various language versions of Wikipedia.
The IDF designed the DOI system to provide a form of persistent identification, in which each DOI name permanently and unambiguously identifies the object to which it is associated. It also associates metadata with objects, allowing it to provide users with relevant pieces of information about the objects and their relationships. Included as part of this metadata are network actions that allow DOI names to be resolved to web locations where the objects they describe can be found. To achieve its goals, the DOI system combines the Handle System and the indecs Content Model with a social infrastructure.
The Handle System ensures that the DOI name for an object is not based on any changeable attributes of the object such as its physical location or ownership, that the attributes of the object are encoded in its metadata rather than in its DOI name, and that no two objects are assigned the same DOI name. Because DOI names are short character strings, they are human-readable, may be copied and pasted as text, and fit into the URI specification. The DOI name-resolution mechanism acts behind the scenes, so that users communicate with it in the same way as with any other web service; it is built on open architectures, incorporates trust mechanisms, and is engineered to operate reliably and flexibly so that it can be adapted to changing demands and new applications of the DOI system.DOI name-resolution may be used with OpenURL to select the most appropriate among multiple locations for a given object, according to the location of the user making the request. However, despite this ability, the DOI system has drawn criticism from librarians for directing users to non-free copies of documents that would have been available for no additional fee from alternative locations.
The indecs Content Model as used within the DOI system associates metadata with objects. A small kernel of common metadata is shared by all DOI names and can be optionally extended with other relevant data, which may be public or restricted. Registrants may update the metadata for their DOI names at any time, such as when publication information changes or when an object moves to a different URL.
The International DOI Foundation (IDF) oversees the integration of these technologies and operation of the system through a technical and social infrastructure. The social infrastructure of a federation of independent registration agencies offering DOI services was modelled on existing successful federated deployments of identifiers such as GS1 and ISBN.
A DOI name differs from commonly used Internet pointers to material, such as the Uniform Resource Locator (URL), in that it identifies an object itself as a first-class entity, rather than the specific place where the object is located at a certain time. It implements the Uniform Resource Identifier (Uniform Resource Name) concept and adds to it a data model and social infrastructure.
A DOI name also differs from standard identifier registries such as the ISBN, ISRC, etc. The purpose of an identifier registry is to manage a given collection of identifiers, whereas the primary purpose of the DOI system is to make a collection of identifiers actionable and interoperable, where that collection can include identifiers from many other controlled collections.
The DOI system offers persistent, semantically-interoperable resolution to related current data and is best suited to material that will be used in services outside the direct control of the issuing assigner (e.g., public citation or managing content of value). It uses a managed registry (providing social and technical infrastructure). It does not assume any specific business model for the provision of identifiers or services and enables other existing services to link to it in defined ways. Several approaches for making identifiers persistent have been proposed. The comparison of persistent identifier approaches is difficult because they are not all doing the same thing. Imprecisely referring to a set of schemes as "identifiers" doesn't mean that they can be compared easily. Other "identifier systems" may be enabling technologies with low barriers to entry, providing an easy to use labeling mechanism that allows anyone to set up a new instance (examples include Persistent Uniform Resource Locator (PURL), URLs, Globally Unique Identifiers (GUIDs), etc.), but may lack some of the functionality of a registry-controlled scheme and will usually lack accompanying metadata in a controlled scheme. The DOI system does not have this approach and should not be compared directly to such identifier schemes. Various applications using such enabling technologies with added features have been devised that meet some of the features offered by the DOI system for specific sectors (e.g., ARK).
A DOI name does not depend on the object's location and, in this way, is similar to a Uniform Resource Name (URN) or PURL but differs from an ordinary URL. URLs are often used as substitute identifiers for documents on the Internet although the same document at two different locations has two URLs. By contrast, persistent identifiers such as DOI names identify objects as first class entities: two instances of the same object would have the same DOI name.
DOI name resolution is provided through the Handle System, developed by Corporation for National Research Initiatives, and is freely available to any user encountering a DOI name. Resolution redirects the user from a DOI name to one or more pieces of typed data: URLs representing instances of the object, services such as e-mail, or one or more items of metadata. To the Handle System, a DOI name is a handle, and so has a set of values assigned to it and may be thought of as a record that consists of a group of fields. Each handle value must have a data type specified in its
<type> field, which defines the syntax and semantics of its data. While a DOI persistently and uniquely identifies the object to which it is assigned, DOI resolution may not be persistent, due to technical and administrative issues.
To resolve a DOI name, it may be input to a DOI resolver, such as doi.org.
Another approach, which avoids typing or cutting-and-pasting into a resolver is to include the DOI in a document as a URL which uses the resolver as an HTTP proxy, such as
https://doi.org/ (preferred) or
http://dx.doi.org/, both of which support HTTPS. For example, the DOI
10.1000/182 can be included in a reference or hyperlink as
https://doi.org/10.1000/182 . This approach allows users to click on the DOI as a normal hyperlink. Indeed, as previously mentioned, this is how CrossRef recommends that DOIs always be represented (preferring HTTPS over HTTP), so that if they are cut-and-pasted into other documents, emails, etc., they will be actionable.
Other DOI resolvers and HTTP Proxies include http://hdl.handle.net, and https://doi.pangaea.de/. At the beginning of the year 2016, a new class of alternative DOI resolvers was started by http://doai.io. This service is unusual in that it tries to find a non-paywalled version of a title and redirects you to that instead of the publisher's version.Since then, other open-access favoring DOI resolvers have been created, notably https://oadoi.org/ in October 2016. While traditional DOI resolvers solely rely on the Handle System, alternative DOI resolvers first consult open access resources such as BASE (Bielefeld Academic Search Engine).
An alternative to HTTP proxies is to use one of a number of add-ons and plug-ins for browsers, thereby avoiding the conversion of the DOIs to URLs, [ permanent dead link ], enables the browser to access Handle System handles or DOIs like
doi:10.1000/1 directly in the Firefox browser, using the native Handle System protocol. This plug-in can also replace references to web-to-handle proxy servers with native resolution. A disadvantage of this approach for publishers is that, at least at present, most users will be encountering the DOIs in a browser, mail reader, or other software which does not have one of these plug-ins installed.
The International DOI Foundation (IDF), a non-profit organisation created in 1998, is the governance body of the DOI system.It safeguards all intellectual property rights relating to the DOI system, manages common operational features, and supports the development and promotion of the DOI system. The IDF ensures that any improvements made to the DOI system (including creation, maintenance, registration, resolution and policymaking of DOI names) are available to any DOI registrant. It also prevents third parties from imposing additional licensing requirements beyond those of the IDF on users of the DOI system.
The IDF is controlled by a Board elected by the members of the Foundation, with an appointed Managing Agent who is responsible for co-ordinating and planning its activities. Membership is open to all organizations with an interest in electronic publishing and related enabling technologies. The IDF holds annual open meetings on the topics of DOI and related issues.
Registration agencies, appointed by the IDF, provide services to DOI registrants: they allocate DOI prefixes, register DOI names, and provide the necessary infrastructure to allow registrants to declare and maintain metadata and state data. Registration agencies are also expected to actively promote the widespread adoption of the DOI system, to cooperate with the IDF in the development of the DOI system as a whole, and to provide services on behalf of their specific user community. A list of current RAs is maintained by the International DOI Foundation. The IDF is recognized as one of the federated registrars for the Handle System by the DONA Foundation (of which the IDF is a board member), and is responsible for assigning Handle System prefixes under the top-level
Registration agencies generally charge a fee to assign a new DOI name; parts of these fees are used to support the IDF. The DOI system overall, through the IDF, operates on a not-for-profit cost recovery basis.
The DOI system is an international standard developed by the International Organization for Standardization in its technical committee on identification and description, TC46/SC9.The Draft International Standard ISO/DIS 26324, Information and documentation – Digital Object Identifier System met the ISO requirements for approval. The relevant ISO Working Group later submitted an edited version to ISO for distribution as an FDIS (Final Draft International Standard) ballot, which was approved by 100% of those voting in a ballot closing on 15 November 2010. The final standard was published on 23 April 2012.
DOI is a registered URI under the info URI scheme specified by IETF RFC 4452. info:doi/ is the infoURI Namespace of Digital Object Identifiers.
The DOI syntax is a NISO standard, first standardised in 2000, ANSI/NISO Z39.84-2005 Syntax for the Digital Object Identifier.
The maintainers of the DOI system have deliberately not registered a DOI namespace for URNs, stating that:
URN architecture assumes a DNS-based Resolution Discovery Service (RDS) to find the service appropriate to the given URN scheme. However no such widely deployed RDS schemes currently exist.... DOI is not registered as a URN namespace, despite fulfilling all the functional requirements, since URN registration appears to offer no advantage to the DOI System. It requires an additional layer of administration for defining DOI as a URN namespace (the string
urn:doi:10.1000/1rather than the simpler
doi:10.1000/1) and an additional step of unnecessary redirection to access the resolution service, already achieved through either http proxy or native resolution. If RDS mechanisms supporting URN specifications become widely available, DOI will be registered as a URN.
The International Standard Book Number (ISBN) is a numeric commercial book identifier which is intended to be unique. Publishers purchase ISBNs from an affiliate of the International ISBN Agency.
In computing, a namespace is a set of symbols that are used to organize objects of various kinds, so that these objects may be referred to by name. In Java, a namespace ensures that all the identifiers within it must have unique names so that they can be easily identified. In order to manage the namespace Java provides the mechanism of creating Java packages. Prominent examples include:
A Uniform Resource Identifier (URI) is a string of characters that unambiguously identifies a particular resource. To guarantee uniformity, all URIs follow a predefined set of syntax rules, but also maintain extensibility through a separately defined hierarchical naming scheme.
An International Standard Serial Number (ISSN) is an eight-digit serial number used to uniquely identify a serial publication, such as a magazine. The ISSN is especially helpful in distinguishing between serials with the same title. ISSN are used in ordering, cataloging, interlibrary loans, and other practices in connection with serial literature.
A Uniform Resource Name (URN) is a Uniform Resource Identifier (URI) that uses the
The Uniform Domain-Name Dispute-Resolution Policy (UDRP) is a process established by the Internet Corporation for Assigned Names and Numbers (ICANN) for the resolution of disputes regarding the registration of internet domain names. The UDRP currently applies to all generic top level domains, some country code top-level domains, and some older top level domains in specific circumstances.
A persistent uniform resource locator (PURL) is a uniform resource locator (URL) that is used to redirect to the location of the requested web resource. PURLs redirect HTTP clients using HTTP status codes.
Life Science Identifiers are a way to name and locate pieces of information on the web. Essentially, an LSID is a unique identifier for some data, and the LSID protocol specifies a standard way to locate the data. They are a little like DOIs used by many publishers.
Crossref is an official Digital Object Identifier (DOI) Registration Agency of the International DOI Foundation. It is run by the Publishers International Linking Association Inc. (PILA) and was launched in early 2000 as a cooperative effort among publishers to enable persistent cross-publisher citation linking in online academic journals.
The International Geo Sample Number or IGSN is a sample identification code of typically nine characters. As an active persistent identifier it can be resolved through the Handle System. The system is used in production by the System for Earth Sample Registration (SESAR), Geoscience Australia, Commonwealth Scientific and Industrial Research Organisation Mineral Resources, Australian Research Data Commons (ARDC), University of Bremen MARUM, and German Research Centre for Geosciences (GFZ). Other organisations are preparing the introduction of the IGSN.
A Formal Public Identifier (FPI) is a short piece of specially formatted text that may be used to uniquely identify a product, specification or document. One of their most common uses is as part of document type definitions, but they are also used in the vCard and iCalendar formats to identify the software product that has generated data.
The Handle System is the Corporation for National Research Initiatives's proprietary registry assigning persistent identifiers, or handles, to information resources, and for resolving "those handles into the information necessary to locate, access, and otherwise make use of the resources".
An Archival Resource Key (ARK) is a Uniform Resource Locator (URL) that is a multi-purpose persistent identifier for information objects of any type. An ARK contains the label ark: after the URL's hostname, which sets the expectation that, when submitted to a web browser, the URL terminated by '?' returns a brief metadata record, and the URL terminated by '??' returns metadata that includes a commitment statement from the current service provider. The ARK and its inflections provide access to three facets of a provider's ability to provide persistence.
The ORCID is a nonproprietary alphanumeric code to uniquely identify scientific and other academic authors and contributors. This addresses the problem that a particular author's contributions to the scientific literature or publications in the humanities can be hard to recognize as most personal names are not unique, they can change, have cultural differences in name order, contain inconsistent use of first-name abbreviations and employ different writing systems. It provides a persistent identity for humans, similar to that created for content-related entities on digital networks by digital object identifiers (DOIs).
An Extensible Resource Identifier is a scheme and resolution protocol for abstract identifiers compatible with Uniform Resource Identifiers and Internationalized Resource Identifiers, developed by the XRI Technical Committee at OASIS. The goal of XRI was a standard syntax and discovery format for abstract, structured identifiers that are domain-, location-, application-, and transport-independent, so they can be shared across any number of domains, directories, and interaction protocols.
EIDR, or the Entertainment Identifier Registry, is a global unique identifier system for a broad array of audio visual objects, including motion pictures, television, and radio programs. The identification system resolves an identifier to a metadata record that is associated with top-level titles, edits, DVDs, encodings, clips, and mash-ups. EIDR also provides identifiers for Video Service providers, such as broadcast and cable networks.
lex is a URN namespace, a type of Uniform Resource Name (URN), that allows accurate identification of laws and other legal norms.
The MIRIAM Registry, a by-product of the MIRIAM Guidelines, is a database of namespaces and associated information that is used in the creation of uniform resource identifiers. It contains the set of community-approved namespaces for databases and resources serving, primarily, the biological sciences domain. These shared namespaces, when combined with 'data collection' identifiers, can be used to create globally unique identifiers for knowledge held in data repositories. For more information on the use of URIs to annotate models, see the specification of SBML Level 2 Version 2.
Identifiers.org is a project providing stable and perennial identifiers for data records used in the Life Sciences. The identifiers are provided in the form of Uniform Resource Identifiers (URIs). Identifiers.org is also a resolving system, that relies on collections listed in the MIRIAM Registry to provide direct access to different instances of the identified records.
Assuming the publishers do their job of maintaining the databases, these centralized references, unlike current web links, should never become outdated or broken.
All DOI prefixes begin with "10" to distinguish the DOI from other implementations of the Handle System followed by a four-digit number or string (the prefix can be longer if necessary).
Over 18,000 DOI name prefixes within the DOI System
The registrant code may be further divided into sub-elements for administrative convenience if desired. Each sub-element of the registrant code shall be preceded by a full stop.
| Wikidata has the property: |