The Dublin Core vocabulary, also known as the Dublin Core Metadata Terms (DCMT), is a general purpose metadata vocabulary for describing resources of any type. It was first developed for describing web content in the early days of the World Wide Web. The Dublin Core Metadata Initiative (DCMI) is responsible for maintaining the Dublin Core vocabulary.
Initially developed as fifteen terms in 1998, the set of elements has grown over time and in 2008 was redefined as a Resource Description Framework (RDF) vocabulary. [1]
Designed with minimal constraints, each Dublin Core element is optional and may be repeated. There is no prescribed order in Dublin Core for presenting or using the elements.
The Dublin Core Element Set was a response to concerns about accurately finding resources on the Web, with some early assumptions that this would be a library function. In particular, it anticipated a future in which scholarly materials would be searchable on the World Wide Web. Whereas HTML was being used to mark up the structure of documents, metadata was needed to mark up their contents. Given the great number of documents on, and soon to be on, the World Wide Web, it was proposed that "self-identifying" documents would be necessary. [7] [8]
To this end, the Dublin Core Metadata Workshop met beginning in 1995 to develop a vocabulary that could be used to insert consistent metadata into Web documents. [9] Originally defined as 15 metadata elements, the Dublin Core Element Set gave authors of web pages a vocabulary and method for creating simple metadata for their works. [10] It provided a simple, flat element set that could be used
Qualified Dublin Core was developed in the late 1990s to provide an extension mechanism to the vocabulary of 15 elements. This was a response to communities whose metadata needs required additional detail. [11]
In 2012, the DCMI Metadata Terms was created using an RDF data model. [12] This expanded element set incorporates the original 15 elements and many of the qualifiers of Qualified Dublin Core as RDF properties. The full set of elements is found under the namespace http://purl.org/dc/terms/. There is a separate namespace for the original 15 elements as previously defined: http://purl.org/dc/elements/1.1/. [13]
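As a minimal sketch of how the two namespaces relate, the snippet below builds a term URI by appending a local name to one of the namespace URIs quoted above. The helper name `term_uri` and its `legacy` flag are illustrative assumptions, not part of any DCMI tooling.

```python
# The two namespace URIs quoted in the text.
DC_TERMS_NS = "http://purl.org/dc/terms/"
DC_ELEMENTS_NS = "http://purl.org/dc/elements/1.1/"


def term_uri(local_name, legacy=False):
    """Return the full URI for a term name (hypothetical helper).

    legacy=True selects the namespace of the original 15 elements.
    """
    base = DC_ELEMENTS_NS if legacy else DC_TERMS_NS
    return base + local_name


print(term_uri("title"))               # http://purl.org/dc/terms/title
print(term_uri("title", legacy=True))  # http://purl.org/dc/elements/1.1/title
```

The same local name can therefore appear under either namespace; which one a record uses depends on whether it targets the original element set or the expanded terms.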
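The Dublin Core vocabulary published in 1999 consisted of 15 terms: Contributor, Coverage, Creator, Date, Description, Format, Identifier, Language, Publisher, Relation, Rights, Source, Subject, Title, and Type.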
The vocabulary was commonly expressed in HTML "meta" tags in the "<head>" section of an HTML-encoded page. [14]
<head>
  <meta name="DC.title" content="Services to Government">
  <meta name="DC.date" content="1997-07">
</head>
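Metadata embedded this way can be read back with an ordinary HTML parser. The following is a rough sketch using Python's standard `html.parser` module; the class name `DCMetaParser`, the function `extract_dc`, and the rule of matching a `DC.` name prefix are assumptions made for illustration, not a prescribed extraction method.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect (element, value) pairs from <meta name="DC.*"> tags."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name") or ""
        # Assumption: Dublin Core entries are marked by a "DC." prefix.
        if name.upper().startswith("DC."):
            self.records.append((name[3:], attr_map.get("content") or ""))


def extract_dc(html_text):
    parser = DCMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.records


sample = (
    '<head> <meta name="DC.title" content="Services to Government" > '
    '<meta name="DC.date" content="1997-07" > </head>'
)
print(extract_dc(sample))
# [('title', 'Services to Government'), ('date', '1997-07')]
```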
The vocabulary could be used in any metadata serialization including key/value pairs and XML. [15]
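To illustrate the point about serialization independence, the sketch below takes a list of key/value pairs and writes the same metadata as XML with Python's standard `xml.etree.ElementTree`. The wrapper element name `metadata` and the function name `to_xml` are assumptions chosen for the example; real application profiles define their own container elements.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# Ask ElementTree to use the conventional "dc" prefix when serializing.
ET.register_namespace("dc", DC_NS)


def to_xml(pairs):
    """Serialize (element, value) pairs as namespaced XML child elements."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for element, value in pairs:
        child = ET.SubElement(root, "{%s}%s" % (DC_NS, element))
        child.text = value
    return ET.tostring(root, encoding="unicode")


pairs = [("title", "Services to Government"), ("date", "1997-07")]
xml_text = to_xml(pairs)
print(xml_text)
```

Because every element is optional and repeatable, the function accepts a list of pairs rather than a dictionary, so the same element may occur more than once.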
Subsequent to the specification of the original 15 elements, Qualified Dublin Core was developed to provide an extension mechanism to be used when the primary 15 terms were not sufficient. A set of common refinements was provided in the documentation, along with recommended encoding schemes to aid interpretation of element values. These schemes include controlled vocabularies and formal notations or parsing rules. [16] Qualified Dublin Core was not limited to these specific refinements, allowing communities to create extended metadata terms to meet their needs. [11]
The guiding principle for the qualification of Dublin Core elements, colloquially known as the Dumb-Down Principle, [17] states that an application that does not understand a specific element refinement term should be able to ignore the qualifier and treat the metadata value as if it were an unqualified (broader) element. While this may result in some loss of specificity, the remaining element value (without the qualifier) should continue to be generally correct and useful for discovery.
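The Dumb-Down Principle can be sketched as a lookup that replaces each refinement with the broader element it refines. The mapping below is a partial, illustrative subset of Qualified Dublin Core refinements, and the names `REFINES` and `dumb_down` are assumptions for this example only.

```python
# Partial map from an element refinement to the broader element it refines.
REFINES = {
    "alternative": "title",
    "tableOfContents": "description",
    "abstract": "description",
    "created": "date",
    "valid": "date",
    "available": "date",
    "issued": "date",
    "modified": "date",
    "extent": "format",
    "medium": "format",
    "spatial": "coverage",
    "temporal": "coverage",
}


def dumb_down(record):
    """Fold refined terms into their unqualified elements.

    `record` maps a term name to a list of values. Terms that are not
    known refinements are kept under their own name.
    """
    simple = {}
    for term, values in record.items():
        element = REFINES.get(term, term)
        simple.setdefault(element, []).extend(values)
    return simple


qualified = {
    "title": ["D-Lib Magazine"],
    "alternative": ["Digital Library Magazine"],
}
print(dumb_down(qualified))
# {'title': ['D-Lib Magazine', 'Digital Library Magazine']}
```

The result loses the distinction between the main and alternative title, but both values remain valid titles, which is the loss of specificity the principle accepts.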
Qualified Dublin Core added qualifiers to these elements:
Element | Qualifier |
---|---|
Title | Alternative |
Description | Table Of Contents |
" | Abstract |
Date | Created |
" | Valid |
" | Available |
" | Issued |
" | Modified |
Format | Extent |
" | Medium |
Relation | Is Version Of |
" | Has Version |
" | Is Replaced By |
" | Replaces |
" | Is Required By |
" | Requires |
" | Is Part Of |
" | Has Part |
" | Is Referenced By |
" | References |
" | Is Format Of |
" | Has Format |
Coverage | Spatial |
" | Temporal |
It also added three elements not in the base 15: Audience, Provenance, and RightsHolder.
Qualified Dublin Core is often used with a "dot syntax", with a period separating the element and the qualifier(s). This is shown in this excerpted example provided by Chan and Hodges: [11]
Title: D-Lib Magazine
Title.alternative: Digital Library Magazine
Identifier.ISSN: 1082-9873
Publisher: Corporation for National Research Initiatives
Publisher.place: Reston, VA.
Subject.topical.LCSH: Digital libraries - Periodicals
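The dot syntax in the example above is simple enough to split mechanically. The sketch below assumes one `Key: value` statement per line, with the first colon separating key from value and periods separating the element from its qualifiers; the function name `parse_dot_syntax` and the output layout are illustrative choices.

```python
def parse_dot_syntax(text):
    """Parse 'Element.qualifier: value' lines into a list of dicts."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition(":")
        element, *qualifiers = key.strip().split(".")
        records.append(
            {"element": element, "qualifiers": qualifiers, "value": value.strip()}
        )
    return records


example = """\
Title: D-Lib Magazine
Title.alternative: Digital Library Magazine
Identifier.ISSN: 1082-9873
Publisher: Corporation for National Research Initiatives
Publisher.place: Reston, VA.
Subject.topical.LCSH: Digital libraries - Periodicals
"""

for rec in parse_dot_syntax(example):
    print(rec["element"], rec["qualifiers"], rec["value"])
```

An application that ignores the `qualifiers` list and keeps only `element` and `value` ends up with simple, unqualified statements.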
The DCMI Metadata Terms lists the current set of the Dublin Core vocabulary. [12] This set includes the fifteen terms of the Dublin Core Metadata Element Set (DCMES), as well as many of the qualified terms. Each term has a unique URI in the namespace http://purl.org/dc/terms/, and all are defined as RDF properties.
It also includes these RDF classes which are used as domains and ranges of some properties:
Changes to the Dublin Core standard are reviewed by a DCMI Usage Board within the context of a DCMI Namespace Policy. [18] This policy describes how terms are assigned and also limits the editorial changes allowed to the labels, definitions, and usage comments. [19]
The Dublin Core Metadata Terms vocabulary has been formally standardized internationally as ISO 15836 by the International Organization for Standardization (ISO) [20] and as IETF RFC 5013 by the Internet Engineering Task Force (IETF), [21] as well as in the U.S. as ANSI/NISO Z39.85 by the National Information Standards Organization (NISO). [22]
Syntax choices for metadata expressed with the Dublin Core elements depend on context. Dublin Core concepts and semantics are designed to be syntax independent and apply to a variety of contexts, as long as the metadata is in a form suitable for interpretation by both machines and people.
One Document Type Definition based on Dublin Core is the Open Source Metadata Framework (OMF) specification. [23] OMF is in turn used by Rarian (superseding ScrollKeeper), which is used by the GNOME desktop and KDE help browsers and the ScrollServer documentation server.
PBCore is also based on Dublin Core. [24] The Zope CMF's Metadata products, used by the Plone, ERP5, and Nuxeo CPS content management systems, as well as SimpleDL and Fedora Commons, also implement Dublin Core. The EPUB e-book format uses Dublin Core metadata in the OPF file. [25] Qualified Dublin Core is used in the DSpace archival management software. [26]
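As one concrete case, the Dublin Core entries in an EPUB package (OPF) file can be pulled out with a namespace-aware XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the sample package content, including its title and identifier values, is invented for illustration, and `read_opf_dc` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical, heavily trimmed OPF package document (values are made up).
OPF_SAMPLE = """
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:example:book-1</dc:identifier>
    <dc:title>An Example Book</dc:title>
    <dc:language>en</dc:language>
  </metadata>
</package>
""".strip()


def read_opf_dc(opf_text):
    """Return (element, value) pairs for every Dublin Core element found."""
    root = ET.fromstring(opf_text)
    prefix = "{" + DC_NS + "}"
    found = []
    for node in root.iter():
        if node.tag.startswith(prefix):
            found.append((node.tag[len(prefix):], (node.text or "").strip()))
    return found


print(read_opf_dc(OPF_SAMPLE))
# [('identifier', 'urn:example:book-1'), ('title', 'An Example Book'), ('language', 'en')]
```

Matching on the namespace URI rather than the `dc:` prefix keeps the code working if a file binds the namespace to a different prefix.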
The Australian Government Locator Service (AGLS) metadata standard is an application profile of Dublin Core. [27] : 5
The Resource Description Framework (RDF) is a method to describe and exchange graph data. It was originally designed as a data model for metadata by the World Wide Web Consortium (W3C). It provides a variety of syntax notations and data serialization formats, of which the most widely used is Turtle.
The Open Archives Initiative (OAI) was an informal organization, centered on Herbert Van de Sompel, Carl Lagoze, Michael L. Nelson and Simeon Warner, that developed and applied technical interoperability standards for archives to share catalogue information (metadata). The group came together in the late 1990s and was active for around twenty years. OAI coordinated three specification activities in particular: OAI-PMH, OAI-ORE and ResourceSync. Throughout, the group worked towards building a "low-barrier interoperability framework" for archives containing digital content, to allow people to harvest metadata. Such sets of metadata have since been harvested to provide "value-added services", often by combining different data sets.
OLAC, the Open Language Archives Community, is an initiative to create a unified means of searching online databases of language resources for linguistic research. The information about resources is stored in XML format for easy searching. OLAC was founded in 2000, and is hosted at the Linguistic Data Consortium webserver at the University of Pennsylvania.
The PBCore metadata standard was created by the public broadcasting community in the United States of America for use by public broadcasters and related communities that manage audiovisual assets, including libraries, archives, independent producers, etc. PBCore is organized as a set of specified fields that can be used in database applications, and it can be used as a data model for media cataloging and asset management systems. As an XML schema, PBCore enables data exchange between media collections, systems and organizations.
Learning Object Metadata is a data model, usually encoded in XML, used to describe a learning object and similar digital resources used to support learning. The purpose of learning object metadata is to support the reusability of learning objects, to aid discoverability, and to facilitate their interoperability, usually in the context of online learning management systems (LMS).
RDF Schema (Resource Description Framework Schema, variously abbreviated as RDFS, RDF(S), RDF-S, or RDF/S) is a set of classes with certain properties using the RDF extensible knowledge representation data model, providing basic elements for the description of ontologies. It uses various forms of RDF vocabularies, intended to structure RDF resources. RDF and RDFS can be saved in a triplestore, then one can extract some knowledge from them using a query language, like SPARQL.
The Extensible Metadata Platform (XMP) is an ISO standard, originally created by Adobe Systems Inc., for the creation, processing and interchange of standardized and custom metadata for digital documents and data sets.
RDFa or Resource Description Framework in Attributes is a W3C Recommendation that adds a set of attribute-level extensions to HTML, XHTML and various XML-based document types for embedding rich metadata within Web documents. The Resource Description Framework (RDF) data-model mapping enables its use for embedding RDF subject-predicate-object expressions within XHTML documents. It also enables the extraction of RDF model triples by compliant user agents.
Simple Knowledge Organization System (SKOS) is a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards built upon RDF and RDFS, and its main objective is to enable easy publication and use of such vocabularies as linked data.
The AgMES initiative was developed by the Food and Agriculture Organization (FAO) of the United Nations and aims to encompass issues of semantic standards in the domain of agriculture with respect to description, resource discovery, interoperability, and data exchange for different types of information resources.
Agricultural Information Management Standards (AIMS) is a web site managed by the Food and Agriculture Organization of the United Nations (FAO) for accessing and discussing agricultural information management standards, tools and methodologies connecting information workers worldwide to build a global community of practice. Information management standards, tools and good practices can be found on AIMS:
Metadata is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including:
A metadata standard is a requirement which is intended to establish a common understanding of the meaning or semantics of the data, to ensure correct and proper use and interpretation of the data by its owners and users. To achieve this common understanding, a number of characteristics, or attributes of the data have to be defined, also known as metadata.
The Grey Literature International Steering Committee (GLISC) was established in 2006 after the 7th International Conference on Grey Literature (GL7) held in Nancy (France) on 5–6 December 2005.
The Office Open XML file formats are a set of file formats that can be used to represent electronic office documents. There are formats for word processing documents, spreadsheets and presentations as well as specific formats for material such as mathematical formulas, graphics, bibliographies etc.
The Publishing Requirements for Industry Standard Metadata (PRISM) for the Internet, computing, and computer science, is a specification that defines a set of XML metadata vocabularies for syndicating, aggregating, post-processing and multi-purposing content.
ISO/IEC 19788 Information technology – Learning, education and training – Metadata for learning resources is a multi-part standard prepared by subcommittee SC 36 of the joint technical committee ISO/IEC JTC 1, Information Technology for Learning, Education and Training.
The Asset Description Metadata Schema (ADMS) is a common metadata vocabulary to describe standards, so-called interoperability assets, on the Web.
In the context of information retrieval, a thesaurus is a form of controlled vocabulary that seeks to dictate semantic manifestations of metadata in the indexing of content objects. A thesaurus serves to minimise semantic ambiguity by ensuring uniformity and consistency in the storage and retrieval of the manifestations of content objects. ANSI/NISO Z39.19-2005 defines a content object as "any item that is to be described for inclusion in an information retrieval system, website, or other source of information". The thesaurus aids the assignment of preferred terms to convey semantic metadata associated with the content object.
The Journal Article Tag Suite (JATS) is an XML format used to describe scientific literature published online. It is a technical standard developed by the National Information Standards Organization (NISO) and approved by the American National Standards Institute with the code Z39.96-2012.
PBCore is built on the foundation of the Dublin Core (ISO 15836), an international standard for resource discovery.