Identifiers.org

Identifiers.org is a project providing stable and perennial identifiers for data records used in the Life Sciences. The identifiers are provided in the form of Uniform Resource Identifiers (URIs). Identifiers.org is also a resolving system that relies on the collections listed in the MIRIAM Registry to provide direct access to the different instances of the identified records.

Identifiers.org URIs and resolving system

The Identifiers.org URIs [1] [2] are perennial identifiers. Each one is a single, unique, resolvable URI that specifies both the data collection, using the namespaces of the Registry, and the record identifier within that collection. The Identifiers.org resolving system is built upon the information stored in the MIRIAM Registry, [3] a database that stores the namespaces assigned to commonly used data collections (databases and ontologies) for the Life Sciences. The resolving system transforms an Identifiers.org URI into the URLs of the individual instances of the record identified by the URI. Identifiers.org is part of the ELIXIR Interoperability Platform.
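The resolution step described above can be illustrated with a minimal sketch. It assumes the URI layout `http://identifiers.org/<namespace>/<record identifier>` and replaces the Registry with a small in-memory dictionary. The `REGISTRY` contents, the `example.org` hosts and the `resolve` function are hypothetical and serve only to show the idea; they are not the actual resolver implementation.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the Registry: each namespace maps to the URL
# templates of the instances that serve the collection. The hosts are
# placeholders (assumptions), not real Registry entries.
REGISTRY = {
    "uniprot": [
        "https://primary.example.org/uniprot/{id}",
        "https://mirror.example.org/uniprot/{id}",
    ],
    "pubmed": ["https://primary.example.org/pubmed/{id}"],
}


def resolve(uri):
    """Return the URL of every known instance of the identified record."""
    parsed = urlparse(uri)
    if parsed.netloc != "identifiers.org":
        raise ValueError(f"not an Identifiers.org URI: {uri}")
    # The first path segment is the namespace; the remainder is the record id.
    namespace, _, record_id = parsed.path.lstrip("/").partition("/")
    if not namespace or not record_id:
        raise ValueError(f"missing namespace or record identifier: {uri}")
    templates = REGISTRY.get(namespace)
    if templates is None:
        raise LookupError(f"unknown namespace: {namespace}")
    return [template.format(id=record_id) for template in templates]


if __name__ == "__main__":
    for url in resolve("http://identifiers.org/uniprot/P12345"):
        print(url)
```

A single URI yields one URL per instance, which is what allows a client to pick an alternative location when one instance is unavailable.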

Identifier structure

An Identifiers.org URI is formed of several parts:

Figure: Structure and examples of Identifiers.org URIs.
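As a rough illustration of this structure, the sketch below assembles and splits a URI, assuming the layout `http://identifiers.org/<namespace>/<record identifier>`. The helper names `make_uri` and `split_uri` are hypothetical, and the `doi` namespace is used only as an example of a record identifier that itself contains slashes.

```python
BASE = "http://identifiers.org/"


def make_uri(namespace, record_id):
    """Join a collection namespace and a record identifier into one URI."""
    if not namespace or "/" in namespace:
        raise ValueError("namespace must be a single non-empty path segment")
    if not record_id:
        raise ValueError("record identifier must not be empty")
    return f"{BASE}{namespace}/{record_id}"


def split_uri(uri):
    """Split a URI back into (namespace, record identifier)."""
    if not uri.startswith(BASE):
        raise ValueError(f"not an Identifiers.org URI: {uri}")
    # Only the first slash separates the two parts, so identifiers that
    # contain slashes themselves are preserved intact.
    namespace, sep, record_id = uri[len(BASE):].partition("/")
    if not (namespace and sep and record_id):
        raise ValueError(f"missing namespace or record identifier: {uri}")
    return namespace, record_id


if __name__ == "__main__":
    uri = make_uri("doi", "10.1093/nar/gkr1097")
    print(uri)
    print(split_uri(uri))
```

Splitting on the first slash only is a design choice of this sketch: the record identifier is reused as assigned by the data provider, so it should not be assumed to have any particular internal shape.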

Usage

The system allows a consistent and uniform annotation of datasets, which in turn facilitates data alignment and integration. Identifiers.org URIs are used to encode the metadata in the standard formats of the COMBINE initiative, [4] such as SBML. In particular, databases such as BioModels Database and Reactome export their data in SBML with cross-references encoded using Identifiers.org URIs. These URIs are also used in various semantic web projects such as Bio2RDF, Open PHACTS and the EBI RDF platform. [5] Identifiers.org is part of the Interoperability Platform of the European life-sciences Infrastructure for biological Information (ELIXIR).
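To give an idea of what such a cross-reference can look like, the sketch below builds an RDF fragment of the kind commonly embedded in SBML annotations, using only the Python standard library. The layout (an `rdf:Description` containing a `bqbiol` qualifier and an `rdf:Bag` of `rdf:li` resources) is an assumption about the usual annotation convention, and the `annotation` helper, the `meta_1` identifier and the example URI are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"

# Use the conventional prefixes when the tree is serialized.
ET.register_namespace("rdf", RDF)
ET.register_namespace("bqbiol", BQBIOL)


def annotation(meta_id, uris, qualifier="is"):
    """Build an RDF element linking a model component to external records."""
    root = ET.Element(f"{{{RDF}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": f"#{meta_id}"}
    )
    relation = ET.SubElement(description, f"{{{BQBIOL}}}{qualifier}")
    bag = ET.SubElement(relation, f"{{{RDF}}}Bag")
    for uri in uris:
        # Each cross-reference is one resolvable Identifiers.org URI.
        ET.SubElement(bag, f"{{{RDF}}}li", {f"{{{RDF}}}resource": uri})
    return root


if __name__ == "__main__":
    element = annotation("meta_1", ["http://identifiers.org/uniprot/P12345"])
    print(ET.tostring(element, encoding="unicode"))
```

Because every cross-reference has the same form, a consumer can extract the `rdf:resource` values and handle all collections uniformly.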

Comparison with other URI systems

Identifiers.org URIs have been developed since 2011 as a resolvable version of the MIRIAM identifiers, which had been developed since 2005 in URN form and were not directly resolvable. Identifiers.org URIs are similar to PURLs, but can provide alternative resolutions for collections with several instances. They are also similar to DOIs, but use human-readable collection names and reuse the record identifier assigned by the data provider.
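The relationship between the two forms can be sketched as a simple conversion. The example assumes that a MIRIAM URN has the layout `urn:miriam:<namespace>:<identifier>` with the identifier percent-encoded, and that the corresponding URI has the layout `http://identifiers.org/<namespace>/<identifier>`. The function name `miriam_urn_to_uri` and the sample identifiers are hypothetical.

```python
from urllib.parse import unquote

URN_PREFIX = "urn:miriam:"
URI_BASE = "http://identifiers.org/"


def miriam_urn_to_uri(urn):
    """Convert a MIRIAM URN into the corresponding resolvable URI."""
    if not urn.startswith(URN_PREFIX):
        raise ValueError(f"not a MIRIAM URN: {urn}")
    # The first colon after the prefix separates namespace and identifier.
    namespace, sep, encoded_id = urn[len(URN_PREFIX):].partition(":")
    if not (namespace and sep and encoded_id):
        raise ValueError(f"missing namespace or identifier: {urn}")
    # Assumption: reserved characters in the identifier (such as ':') are
    # percent-encoded in the URN and appear decoded in the URI.
    return f"{URI_BASE}{namespace}/{unquote(encoded_id)}"


if __name__ == "__main__":
    print(miriam_urn_to_uri("urn:miriam:uniprot:P12345"))
    print(miriam_urn_to_uri("urn:miriam:obo.go:GO%3A0006915"))
```

The namespace and the provider-assigned identifier carry over unchanged; only the wrapper differs, which is what makes the URI form directly usable in a browser or HTTP client.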


References

  1. Juty, N; Le Novère, N; Laibe, C (2012). "Identifiers.org and MIRIAM Registry: Community resources to provide persistent identification". Nucleic Acids Research. 40 (Database issue): D580–6. doi:10.1093/nar/gkr1097. PMC 3245029. PMID 22140103.
  2. Identifiers.org website. http://identifiers.org/
  3. Laibe, C; Le Novère, N (2007). "MIRIAM Resources: tools to generate and resolve robust cross-references in Systems Biology". BMC Systems Biology. 1: 58. doi:10.1186/1752-0509-1-58. PMC 2259379. PMID 18078503.
  4. COmputational Modeling in BIology NEtwork website. http://co.mbine.org/
  5. Jupp, S; Malone, J; Bolleman, J; Brandizi, M; Davies, M; Garcia, L; Gaulton, A; Gehant, S; Laibe, C; Redaschi, N; Wimalaratne, SM; Martin, M; Le Novère, N; Parkinson, H; Birney, E; Jenkinson, AM (2014). "The EBI RDF Platform: Linked Open Data for the Life Sciences". Bioinformatics. doi:10.1093/bioinformatics/btt765.