Indecs Content Model

indecs [1] (an acronym of "interoperability of data in e-commerce systems"; written in lower case) was a project that ran from 1998 to 2000, partly funded by the European Community Info 2000 initiative and by several organisations representing the music, rights, text publishing, authors, library and other sectors. Its results have since been used in a number of metadata activities. A final report and related documents were published; the indecs Metadata Framework document [2] is a concise summary.

indecs provided an analysis of the requirements for metadata for e-commerce of content (intellectual property) in the network environment, focusing on semantic interoperability. Semantic interoperability deals with the question of how one computer system knows what the terms from another computer system mean (e.g. if A says "owner" and B says "owner", are they referring to the same thing? If A says "released" and B says "disseminated", do they mean different things?).

indecs was built from a simple generic model of commerce (the "model of making"): a model of the life cycle of any kind of content from conception to the final physical or digital copies. The top-level model is summarised as "people make stuff; people use stuff; and (for commerce to take place) people make deals about the stuff". If secure machine-to-machine management of commerce is to be possible, the stuff, the people and the deals must all be securely identified and described in standardised ways that machines can interpret and use. Central to the analysis is the assumption that it is possible to produce a generic mechanism to handle complex metadata for all different types of content. So, for example, sound carriers, books, videos and photographs are not treated as fundamentally different things with different (if similar) characteristics; they are all recognised as creations with different values of the same higher-level attributes, whose metadata can be supported in a common environment.
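The idea that different content types are all creations sharing the same higher-level attributes can be sketched as a small data structure. The Python below is illustrative only: the class and attribute names (`Party`, `Creation`, `creation_type`, `makers`) and the sample identifiers are assumptions made for this example, not terms or structures defined by indecs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Party:
    """A person or organisation (hypothetical name, not an indecs term)."""
    identifier: str
    name: str


@dataclass
class Creation:
    """Any kind of content, described by the same higher-level attributes."""
    identifier: str
    creation_type: str  # e.g. "book", "sound carrier", "photograph"
    title: str
    makers: list[Party] = field(default_factory=list)


def made_by(creations, party):
    """Return the creations that list the given party as a maker."""
    return [c for c in creations if party in c.makers]


author = Party("party:1", "A. Author")
catalogue = [
    Creation("creation:1", "book", "An Example Novel", [author]),
    Creation("creation:2", "sound carrier", "An Example Album", [author]),
    Creation("creation:3", "photograph", "An Example Portrait"),
]

# One query works across all content types because they share a structure.
print([c.creation_type for c in made_by(catalogue, author)])
```

Because a book and a sound carrier differ only in the value of `creation_type`, the same function serves both; a per-type design would need one query per content type.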

Framework

The indecs analysis supports interoperability of at least five different types:

The indecs project developed a framework, described in detail in the final project documents, within which such interoperability could be achieved. indecs proposed four principles as key to the management of identification:

indecs also produced a definition of metadata:

The indecs framework stresses the significance of relationships, which lie at the heart of the indecs analysis. It underlines the importance of unique identification of all entities (since otherwise expressing relationships between them is of little practical utility). Finally, it raises the question of authority: the identification of the person making the claim is as significant as the identification of any other entity.

(Note: describing metadata as linking two referents may seem unusual. The point is that an unambiguous piece of metadata has to relate to sufficiently precise things (referents) at each end of the link, e.g. "my CAR is GREEN", to make a useful statement. "Precise enough" is contextual. "Green" might be a perfectly adequate referent if the namespace it comes from (that is, the source being referred to and the application of interest) deals with the question "what colour is your car: green, red, blue, black, or white...?"; it is not adequate if the aim is to specify a green colour precisely to a garage that will respray your car following an accident, when you would need to say e.g. "Ford Colour ref 3456/2009 Metallic Green".)
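The car example can be made concrete with a minimal sketch in which each referent carries the namespace that gives it meaning, and each piece of metadata records who is making the claim. All names here (`Referent`, `MetadataClaim`, `is_precise_enough`, and the `example:` namespaces) are hypothetical and chosen for illustration; they are not taken from the indecs documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Referent:
    """A value qualified by the namespace that gives it its meaning."""
    namespace: str
    value: str


@dataclass(frozen=True)
class MetadataClaim:
    """A relationship that a claimant asserts between two referents."""
    subject: Referent
    relator: str
    obj: Referent
    claimant: Referent  # who makes the claim; identified like any other entity


def is_precise_enough(referent, accepted_namespaces):
    """A referent is usable only if the application accepts its namespace."""
    return referent.namespace in accepted_namespaces


owner = Referent("example:parties", "party-42")
car = Referent("example:vehicles", "my-car")
simple_green = Referent("example:basic-colours", "green")
paint_green = Referent("example:paint-codes", "3456/2009 Metallic Green")

claim = MetadataClaim(car, "has_colour", simple_green, claimant=owner)

# A colour survey accepts the basic colour list, so "green" is enough.
print(is_precise_enough(claim.obj, {"example:basic-colours", "example:paint-codes"}))
# A respray needs a paint code, so the same referent is too vague.
print(is_precise_enough(claim.obj, {"example:paint-codes"}))
```

The same claim is acceptable to one application and insufficient for another, which is the sense in which precision is contextual. Replacing `simple_green` with `paint_green` would satisfy both.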

The underlying assumptions or axioms of the indecs approach are: (1) metadata is critical; (2) stuff is complex; (3) metadata is modular; and (4) transactions need automation.

Use

The indecs Framework does not presuppose any specific business model or legal framework; it can be used to describe transactions of copyrighted, open source, or freely available material.

The framework has been developed further as a generic ontology-based approach dealing with defined types of entity and attribute, and the relators that link them within a contextual model structure (where context is defined as an intersection of time and place, in which entities may play roles). Its main use to date has been in applications of commercial transactions of content and in some library-related applications. Examples of applications using this approach include:

One of the deliverables of the indecs project was a specification for a Directory of Parties. This led to a subsequent project, InterParty [3], funded under the European Commission's Information Society Technologies Programme, to design and specify a network to support interoperability of party identification (for both natural and corporate names) across different domains, building on the indecs principles. InterParty was not proposed as a replacement for existing schemes for the identification of participants in the intellectual property domain (e.g. national library name authority files or systems oriented towards the needs of rights licensing) but as a means of enabling them to interoperate. Some of its conclusions have been used elsewhere, e.g. in the work on the proposed ISO ISNI [4] (International Standard Name Identifier).

Other developments are continuing, notably through the OntologyX semantic engineering tools and services from Rightscom. The approach also has much in common with the CIDOC Conceptual Reference Model (CRM), an ontology for cultural heritage information, and the Functional Requirements for Bibliographic Records (FRBR) model in the library world.

In June 2009 a new initiative, the Vocabulary Mapping Framework (VMF), was announced by a consortium of partners. Funded by JISC, the initiative delivered in November 2009 (as the first phase of an ongoing programme of work) an extensive and authoritative mapping of vocabularies from nine major content metadata standards, creating a downloadable tool to support interoperability across communities. The mapping is also extensible to other standards. The work builds on the principles of interoperability established in the indecs Content Model, and is an expansion of the existing RDA/ONIX Framework for Resource Categorization into a comprehensive vocabulary of resource relators and categories, which will be a superset of those used in major standards from the publisher/producer, education and bibliographic/heritage communities. The International DOI Foundation, which fully endorses this work, is to provide a web hosting facility for the Framework as part of its commitment to promoting the wider use of interoperable metadata, and will use the vocabulary mapping wherever possible to support the association of metadata with DOI names [5].

Intellectual property rights

indecs uses one common underlying structure which may be considered in three views: the general view; the commerce view; and the intellectual property (legal) view. An intellectual property right is a legal concept, with terms defined in a series of international conventions and treaties and under national law. The precise characteristics by which recognition of intellectual property rights is secured are elusive and are settled by editorial, commercial or, ultimately, legal judgement. indecs does not attempt to replace such legal considerations, though a specific set of legal elements might be included in an indecs-based structure, and the indecs framework specifically includes some definitions from major international treaties such as the Berne Convention and the WIPO Copyright Treaty.

Mapping of terms

Different models of the life cycle of content may have important differences, not least in the specific meanings attached to the terms they employ. FRBR, indecs and CRM were each informed by different functional requirements, and so evolved different mechanisms for dealing with the issues that seemed most important to them. Each is a particular view on the "universe of discourse" of resources and relationships: there are many valid views. Broadly, they are compatible, and effective integration of metadata from schemes based on them should be achievable, but they must be handled with care. As an example: the terms abstraction, manifestation, item and expression are often used in considering content life cycles (e.g. a sound recording is the expression of a musical work during a recording session at a particular place and time, and is distinct from, say, the master tape made, which is a manifestation). These terms were dealt with in indecs, but may have slightly different meanings in other schemes. The meaning of a term from another scheme can be analysed in indecs by mapping its definition onto terms that are precisely defined within the framework. indecs and other frameworks based on it continue to be developed and refined through the process of implementation.
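Mapping scheme-specific terms onto precisely defined framework terms can be sketched as a lookup keyed by scheme and term. This is a minimal illustration under stated assumptions: the scheme names, the `concept:` identifiers and the `same_meaning` function are all invented for the example, and the mappings shown do not come from indecs or any real standard.

```python
# Hypothetical mapping from (scheme, term) to a shared framework concept.
TERM_TO_CONCEPT = {
    ("scheme_a", "owner"): "concept:rights_holder",
    ("scheme_b", "owner"): "concept:possessor_of_copy",
    ("scheme_a", "released"): "concept:made_available",
    ("scheme_b", "disseminated"): "concept:made_available",
}


def same_meaning(term_a, term_b, mapping=TERM_TO_CONCEPT):
    """Return True or False if both terms are mapped, None if either is not."""
    concept_a = mapping.get(term_a)
    concept_b = mapping.get(term_b)
    if concept_a is None or concept_b is None:
        return None  # unknown: no mapping has been recorded for a term
    return concept_a == concept_b


# Identical words from two schemes need not mean the same thing...
print(same_meaning(("scheme_a", "owner"), ("scheme_b", "owner")))
# ...and different words may share one meaning.
print(same_meaning(("scheme_a", "released"), ("scheme_b", "disseminated")))
# An unmapped term yields None instead of a guess.
print(same_meaning(("scheme_a", "owner"), ("scheme_c", "holder")))
```

Returning `None` for unmapped terms keeps "not yet analysed" distinct from "known to differ", which matters when integrating metadata from schemes that have only been partly mapped.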

References

  1. "CORDIS | European Commission".
  2. "Principles, model and data dictionary". http://www.doi.org/topics/indecs/indecs_framework_2000.pdf
  3. "Home". interparty.org.
  4. "ISNI".
  5. "Launch of 'Vocabulary Mapping Framework'". DOI News.