Darwin Core

Darwin Core (often abbreviated to DwC) is an extension of Dublin Core for biodiversity informatics. It is meant to provide a stable standard reference for sharing information on biological diversity (biodiversity). [1] The terms described in this standard are a part of a larger set of vocabularies and technical specifications under development and maintained by Biodiversity Information Standards (TDWG) (formerly the Taxonomic Databases Working Group).

Description

The Darwin Core is a body of standards intended to facilitate the sharing of information about biological diversity. The DwC includes a glossary of terms, and documentation providing reference definitions, examples, and commentary. An overview of the currently adopted terms and concepts can be found in the Darwin Core quick reference guide maintained by TDWG.

The DwC operational unit is primarily based on taxa, their occurrence in nature as documented by observations, specimens, and samples, and related information. Included in the standard are documents describing how these terms are managed, how the set of terms can be extended for new purposes, and how the terms can be used.

Each DwC term includes a definition and discussions meant to promote the consistent use of the terms across applications and disciplines. In other contexts, such terms might be called properties, elements, fields, columns, attributes, or concepts. Though the data types and constraints are not provided in the term definitions, recommendations are made about how to restrict the values where appropriate, for instance by suggesting the use of controlled vocabularies.
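As a rough illustration of restricting a term's values with a controlled vocabulary, the following Python sketch checks the `basisOfRecord` term against a small set of permitted values. The set shown here is an illustrative subset chosen for this example, not the normative vocabulary, and `check_record` is a hypothetical helper name.

```python
# Illustrative subset of values for the basisOfRecord term.
# This set is an assumption for the sketch, not the normative vocabulary.
ALLOWED_BASIS_OF_RECORD = {
    "PreservedSpecimen",
    "FossilSpecimen",
    "LivingSpecimen",
    "HumanObservation",
    "MachineObservation",
}


def check_record(record):
    """Return a list of problems found in one record (a dict of term -> value)."""
    problems = []
    basis = record.get("basisOfRecord")
    if basis not in ALLOWED_BASIS_OF_RECORD:
        problems.append(f"basisOfRecord: unexpected value {basis!r}")
    return problems


# Hypothetical records used only to exercise the check.
good_record = {"basisOfRecord": "PreservedSpecimen", "scientificName": "Puma concolor"}
bad_record = {"basisOfRecord": "specimen", "scientificName": "Puma concolor"}

print(check_record(good_record))
print(check_record(bad_record))
```

Because the term definitions themselves do not impose data types, a check of this kind belongs to the application that consumes the data rather than to the standard.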

DwC standards are versioned and continue to evolve. Working groups frequently add practical examples to the documentation that discuss, refine, and expand the normative definitions of each term. This approach to documentation allows the standard to adapt to new purposes without disrupting existing applications.

In practice, Darwin Core decouples the definition and semantics of individual terms from application of these terms in different technologies. Darwin Core provides separate guidelines on how to encode the terms as RDF, [2] XML [3] or text files. [4]
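The separation between a term and its encoding can be sketched in Python: the same term/value pairs are written once as XML elements qualified by the `http://rs.tdwg.org/dwc/terms/` namespace and once as a tab-delimited text row. This is a minimal sketch, not the layout prescribed by the XML or text guidelines; the wrapper element `record`, the helper names, and the sample values are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the ratified Darwin Core terms.
DWC_NS = "http://rs.tdwg.org/dwc/terms/"

# Hypothetical sample values keyed by Darwin Core term names.
record = {"scientificName": "Puma concolor", "countryCode": "US"}


def term_iri(name):
    """Build the full IRI of a term from its local name."""
    return DWC_NS + name


def to_xml(rec):
    """Encode the record as XML, one namespaced element per term."""
    ET.register_namespace("dwc", DWC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in rec.items():
        child = ET.SubElement(root, f"{{{DWC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


def to_text(rec):
    """Encode the record as a header line plus one tab-delimited row."""
    names = list(rec)
    header = "\t".join(names)
    row = "\t".join(rec[name] for name in names)
    return header + "\n" + row + "\n"


print(term_iri("scientificName"))
print(to_xml(record))
print(to_text(record))
```

In both encodings the meaning of `scientificName` comes from the term definition, not from the file format that carries it.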

The Simple Darwin Core [5] is a specification for one particular way to use the terms and to share data about taxa and their occurrences in a simply structured way. It is most likely what is meant when someone suggests "formatting your data according to the Darwin Core".
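A simply structured dataset of this kind can be pictured as a flat table with one record per row and Darwin Core term names as column headers. The Python sketch below writes such a table as CSV. The choice of columns, the identifiers, and all values are hypothetical and serve only to show the shape of the data.

```python
import csv
import io

# Column headers are Darwin Core term names; this selection is illustrative.
FIELDS = [
    "occurrenceID",
    "basisOfRecord",
    "scientificName",
    "eventDate",
    "decimalLatitude",
    "decimalLongitude",
]

# Hypothetical occurrence records, one dict per row.
rows = [
    {
        "occurrenceID": "urn:example:occurrence:1",
        "basisOfRecord": "HumanObservation",
        "scientificName": "Puma concolor",
        "eventDate": "2009-10-09",
        "decimalLatitude": "38.95",
        "decimalLongitude": "-95.25",
    },
    {
        "occurrenceID": "urn:example:occurrence:2",
        "basisOfRecord": "PreservedSpecimen",
        "scientificName": "Lynx rufus",
        "eventDate": "1998-05-01",
        "decimalLatitude": "39.10",
        "decimalLongitude": "-94.58",
    },
]


def to_flat_csv(records):
    """Write the records as CSV text with one header line and one line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_flat_csv(rows))
```

Each row stands on its own, with no nesting or repeated fields, which is what makes this layout easy to exchange between spreadsheets and databases.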

History

Darwin Core was originally created as a Z39.50 profile by the Z39.50 Biology Implementers Group (ZBIG), supported by funding from a U.S. National Science Foundation award. [6] The name "Darwin Core" was coined by Allen Allison at the first meeting of the ZBIG, held at the University of Kansas in 1998, while commenting on the profile's conceptual similarity to Dublin Core. The Darwin Core profile was later expressed as an XML Schema document for use by the Distributed Generic Information Retrieval (DiGIR) protocol. A TDWG task group was created to revise the Darwin Core, and a ratified metadata standard was officially released on 9 October 2009.

Although Biodiversity Information Standards (TDWG) ratified Darwin Core as a standard only in 2009, numerous earlier versions were already in production use. The published standard contains a normative term list [7] with the complete history of the term versions leading to the current standard.

Darwin Core versions
| Name | Namespace | Number of terms | XML Schema | Date Issued |
| --- | --- | --- | --- | --- |
| Darwin Core 1.0 | Not Applicable | 24 | (Z39.50 GRS-1) | 1998 |
| Darwin Core 1.2 (Classic) | http://digir.net/schema/conceptual/darwin/2003/1.0/ (archived 2008-08-18 at the Wayback Machine) [8] | 46 | | 2001-09-11 |
| Darwin Core 1.21 (MaNIS/HerpNet/ORNIS/FishNet2) | http://digir.net/schema/conceptual/darwin/2003/1.0/ (archived 2008-08-18 at the Wayback Machine) [9] | 63 | | 2003-03-15 |
| Darwin Core OBIS | http://www.iobis.org/obis/ [10] | 27 | | 2005-07-10 |
| Darwin Core 1.4 (Draft Standard) | http://rs.tdwg.org/dwc/dwcore/ [11] | 46 | | 2007-04-23 |
| Darwin Core Standard | http://rs.tdwg.org/dwc/terms/ | 169 | | 2009-10-09 |

Key projects using Darwin Core

See also


References

  1. Wieczorek, John; D. Bloom; R. Guralnick; S. Blum; M. Döring; R. De Giovanni; T. Robertson; D. Vieglais (2012). "Darwin Core: An Evolving Community-developed Biodiversity Data Standard". PLoS ONE. 7 (1): e29715. Bibcode:2012PLoSO...729715W. doi:10.1371/journal.pone.0029715. PMC 3253084. PMID 22238640.
  2. Darwin Core RDF Guide
  3. Darwin Core XML Guide
  4. Darwin Core Text Guide
  5. The Simple Darwin Core
  6. An Experimental Z39.50 Information Retrieval Protocol Test Bed for Biological Collection and Taxonomic Data, #9811443
  7. Darwin Core Normative Term List
  8. Darwin Core 1.2
  9. Darwin Core 1.21
  10. Darwin Core OBIS
  11. Darwin Core 1.4
  12. "Darwin Core". Global Biodiversity Information Facility. Archived from the original on April 12, 2011. Retrieved April 12, 2011.
  13. "Data Schema and metadata". Ocean Biogeographic Information System. Archived from the original on October 8, 2010. Retrieved April 12, 2011.
  14. "Data Ingestion Guidance". iDigBio. Retrieved 26 September 2016.
  15. "Getting your data out there: Data publishing & data standards with iDigBio". iDigBio. Retrieved 26 September 2016.

  1. Darwin Core task group, Biodiversity Information Standards (TDWG) (2014-11-08), Darwin Core: 2014-11-08, doi:10.5281/zenodo.592792, retrieved 2020-10-22