The Open Archives Initiative (OAI) was an informal organization, formed around the colleagues Herbert Van de Sompel, Carl Lagoze, Michael L. Nelson and Simeon Warner, [1] to develop and apply technical interoperability standards that let archives share catalogue information (metadata). [2] [3] [4] [5] The group came together in the late 1990s [6] and was active for around twenty years. OAI coordinated three specification activities in particular: OAI-PMH, [7] OAI-ORE [8] and ResourceSync. [9] [10] [11] Throughout, the group worked towards building a "low-barrier interoperability framework" for archives (institutional repositories) containing digital content (digital libraries), allowing people (service providers) to harvest metadata (from data providers). [12] [13] Such sets of metadata have since been harvested to provide "value-added services", often by combining different data sets. [14] [15] [16] [17]
OAI was involved in developing a technological framework and interoperability standards for enhancing access to eprint archives, which make scholarly communications such as academic journals available and are associated with the open access publishing movement. The relevant technology and standards are applicable beyond scholarly publishing.
The OAI technical infrastructure, specified in the Protocol for Metadata Harvesting (OAI-PMH) version 2.0, [7] defines a mechanism for data providers to expose their metadata. The protocol mandates that individual archives map their metadata to Dublin Core, a common metadata set, so that every item is described in at least one shared format. Object Reuse and Exchange (OAI-ORE) [8] defines standards for the description and exchange of aggregations of web resources.
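As an illustration of how a data provider exposes its metadata, an OAI-PMH request is an ordinary HTTP request whose query string carries a `verb` and its arguments. The following minimal sketch only builds such a URL; the endpoint `https://repository.example.org/oai` and the helper name `build_request` are hypothetical, while the six verbs and the `metadataPrefix` and `from` arguments come from OAI-PMH 2.0.

```python
from urllib.parse import urlencode

# The six request types ("verbs") defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_request(base_url: str, verb: str, **arguments: str) -> str:
    """Return the URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for name, value in arguments.items():
        # "from" is a Python keyword, so callers pass it as from_.
        params[name.rstrip("_")] = value
    return f"{base_url}?{urlencode(params)}"


BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint

# Ask for all records changed since a given date, described in Dublin Core.
url = build_request(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc", from_="2024-01-01"
)
print(url)
```

The `oai_dc` prefix selects the unqualified Dublin Core representation that every repository is required to offer.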
Funding for the initiative came from the Andrew W. Mellon Foundation, Coalition for Networked Information (CNI), Digital Library Federation (DLF), National Science Foundation (NSF), the Alfred P. Sloan Foundation, and other organizations. [5]
The Dublin Core vocabulary, also known as the Dublin Core Metadata Terms (DCMT), is a general purpose metadata vocabulary for describing resources of any type. It was first developed for describing web content in the early days of the World Wide Web. The Dublin Core Metadata Initiative (DCMI) is responsible for maintaining the Dublin Core vocabulary.
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives. An implementation of OAI-PMH must support representing metadata in Dublin Core, but may also support additional representations.
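A harvester typically issues `ListRecords` requests and keeps following the `resumptionToken` that the repository returns until the list is complete. The sketch below shows that loop under stated assumptions: the XML pages are trimmed, made-up responses (real ones carry more elements, such as `responseDate` and `request`), and `fake_fetch` is a stand-in for an HTTP call so the example runs without a network. The namespaces are those of OAI-PMH 2.0 and Dublin Core.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator, List, Optional, Tuple

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Trimmed, illustrative response page; not taken from a real repository.
PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>{identifier}</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>{title}</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>{token}</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def parse_page(xml_text: str) -> Tuple[List[dict], Optional[str]]:
    """Extract records and the resumption token from one response page."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [el.text for el in rec.iterfind(".//dc:title", NS)],
        })
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    # An empty or missing token means the list is complete.
    return records, ((token or "").strip() or None)


def harvest(fetch: Callable[[Dict[str, str]], str]) -> Iterator[dict]:
    """Yield every record, following resumption tokens page by page."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    while True:
        records, token = parse_page(fetch(params))
        yield from records
        if token is None:
            return
        # A resumptionToken replaces all other arguments except the verb.
        params = {"verb": "ListRecords", "resumptionToken": token}


def fake_fetch(params: Dict[str, str]) -> str:
    """Stand-in for an HTTP GET; serves two canned pages."""
    if "resumptionToken" in params:
        return PAGE.format(identifier="oai:example.org:2", title="Second record", token="")
    return PAGE.format(identifier="oai:example.org:1", title="First record", token="page-2")


records = list(harvest(fake_fetch))
for record in records:
    print(record["identifier"], record["titles"])
```

Passing the fetch function in as a parameter keeps the paging logic separate from the transport, so the same loop could be pointed at a real endpoint by swapping in a function that performs the HTTP request.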
Z39.50 is an international standard client–server, application layer communications protocol for searching and retrieving information from a database over a TCP/IP computer network, developed and maintained by the Library of Congress. It is covered by ANSI/NISO standard Z39.50, and ISO standard 23950.
An institutional repository (IR) is an archive for collecting, preserving, and disseminating digital copies of the intellectual output of an institution, particularly a research institution. Academics also use their IRs to archive published works, increasing their visibility and their collaboration with other academics. However, most of the outputs produced by universities are not effectively accessed and shared by researchers and other stakeholders. As a result, academics should be involved in the implementation and development of an IR project so that they can learn the benefits and purpose of building an IR.
The National Information Standards Organization is a United States non-profit standards organization that develops, maintains and publishes technical standards related to publishing, bibliographic and library applications. It was founded in 1939 as the Z39 Committee, chaired from 1963 to 1977 by Jerrold Orne, incorporated as a not-for-profit education association in 1983, and assumed its current name in 1984.
Fedora is a digital asset management (DAM) content repository architecture upon which institutional repositories, digital archives, and digital library systems might be built. Fedora is the underlying architecture for a digital repository, and is not a complete management, indexing, discovery, and delivery application. It is a modular architecture built on the principle that interoperability and extensibility are best achieved by the integration of data, interfaces, and mechanisms as clearly defined modules.
mod_oai is an Apache module that allows web crawlers to efficiently discover new, modified, and deleted web resources from a web server by using OAI-PMH, a protocol which is widely used in the digital libraries community. mod_oai also allows harvesters to obtain "archive-ready" resources from a web server.
ContextObjects in Spans (COinS) is a method to embed bibliographic metadata in the HTML code of web pages. This allows bibliographic software to publish machine-readable bibliographic items and client reference management software to retrieve bibliographic metadata. The metadata can also be sent to an OpenURL resolver. This allows, for instance, searching for a copy of a book at a specific library.
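A COinS entry is an empty HTML `span` with the class `Z3988` whose `title` attribute holds a URL-encoded OpenURL ContextObject. The following minimal sketch generates one for a book; the helper name `coins_span` is hypothetical, and the field keys (`btitle`, `au`) are assumed to follow the OpenURL key/encoded-value book format.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(fields: dict) -> str:
    """Build a COinS span describing a book (hypothetical helper)."""
    context = {
        "ctx_ver": "Z39.88-2004",                   # OpenURL version
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",  # referent is a book
    }
    # Referent metadata keys carry the "rft." prefix.
    context.update({f"rft.{key}": value for key, value in fields.items()})
    # URL-encode the ContextObject, then escape it for use in an attribute.
    title = escape(urlencode(context), quote=True)
    return f'<span class="Z3988" title="{title}"></span>'


span = coins_span({"btitle": "Moby Dick", "au": "Herman Melville"})
print(span)
```

Reference management software that recognizes the `Z3988` class can decode the `title` attribute to recover the metadata, or append it to the base URL of an OpenURL resolver.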
BASE is a multi-disciplinary search engine for scholarly internet resources, created by Bielefeld University Library in Bielefeld, Germany. It is based on free and open-source software such as Apache Solr and VuFind. It harvests OAI metadata from institutional repositories and other academic digital libraries that implement the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), and then normalizes and indexes the data for searching. In addition to OAI metadata, the library indexes selected web sites and local data collections, all of which can be searched via a single search interface.
Preservation metadata is item-level information that describes the context and structure of a digital object. It provides background details pertaining to a digital object's provenance, authenticity, and environment. Preservation metadata is a specific type of metadata that works to maintain a digital object's viability while ensuring continued access by providing contextual information, usage details, and rights.
PREservation Metadata: Implementation Strategies (PREMIS) is the de facto digital preservation metadata standard.
The Open Archives Initiative Object Reuse and Exchange (OAI-ORE) defines standards for the description and exchange of aggregations of web resources. The OAI-ORE specification implements the ORE Model which introduces the resource map (ReM) that makes it possible to associate an identity with aggregations of resources and make assertions about their structure and semantics.
A resource map (ReM) is a concept of the ORE Model for associating an identity with compound digital objects and making assertions about their structure and semantics. Compound objects combine distributed resources, including multiple media types.
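The relationship between a resource map, the aggregation it describes, and the aggregated resources can be written as a handful of subject-predicate-object statements. The sketch below is only an illustration: all `example.org` URIs and both function names are hypothetical, the `describes` and `aggregates` terms are taken from the ORE vocabulary namespace, and a real resource map would normally carry further metadata (such as its creator and modification date) in a serialization like RDF/XML or Atom.

```python
from typing import List, Tuple

# Namespace of the ORE vocabulary terms.
ORE = "http://www.openarchives.org/ore/terms/"

Triple = Tuple[str, str, str]


def resource_map(rem_uri: str, aggregation_uri: str,
                 resource_uris: List[str]) -> List[Triple]:
    """Return the core statements of a resource map as triples."""
    # The resource map describes exactly one aggregation...
    triples = [(rem_uri, ORE + "describes", aggregation_uri)]
    # ...and the aggregation lists each of its constituent resources.
    triples += [(aggregation_uri, ORE + "aggregates", uri) for uri in resource_uris]
    return triples


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize URI-only triples in N-Triples syntax."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# A compound object made of an article PDF and its data set (hypothetical URIs).
triples = resource_map(
    "https://example.org/rem/1",
    "https://example.org/aggregation/1",
    ["https://example.org/article.pdf", "https://example.org/data.csv"],
)
print(to_ntriples(triples))
```

Giving the aggregation its own URI, distinct from the URI of the resource map, is what lets other documents refer to the compound object as a whole.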
A digital library is an online database of digital objects that can include text, still images, audio, video, digital documents, or other digital media formats, or a library accessible through the internet. Objects can consist of digitized content like print or photographs, as well as originally produced digital content like word processor files or social media posts. In addition to storing content, digital libraries provide means for organizing, searching, and retrieving the content contained in the collection. Digital libraries can vary immensely in size and scope, and can be maintained by individuals or organizations. The digital content may be stored locally, or accessed remotely via computer networks. These information retrieval systems are able to exchange information with each other through interoperability and sustainability.
The Digital Library Federation (DLF) is a program of the Council on Library and Information Resources (CLIR) that brings together a consortium of college and university libraries, public libraries, museums, and related institutions with the stated mission of "advanc[ing] research, learning, social justice, and the public good through digital library technologies." It was formed in 1995.
Herbert Van de Sompel is a Belgian librarian, computer scientist, and musician, most known for his role in the development of the Open Archives Initiative (OAI) and standards such as OpenURL, Object Reuse and Exchange, and the OAI Protocol for Metadata Harvesting.
An open repository or open-access repository is a digital platform that holds research output and provides free, immediate and permanent access to research results for anyone to use, download and distribute. To facilitate open access, such repositories must be interoperable according to the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Search engines harvest the content of open access repositories, constructing a database of research that is available worldwide free of charge. Data repositories are the cornerstone of FAIR data practices and are widely used within the scientific community.
OPUS is an open-source software package under the GNU General Public License used for creating Open Access repositories that are compliant with the Open Archives Initiative Protocol for Metadata Harvesting. It provides tools for creating collections of digital resources, as well as for their storage and dissemination. It is usually used at universities, libraries and research institutes as a platform for institutional repositories.
MyCoRe is an open source repository software framework for building disciplinary or institutional repositories, digital archives, digital libraries, and scientific journals. The software is developed at various German university libraries and computer centers. Although most MyCoRe web applications are located in Germany, there are English-language applications, such as "The International Treasury of Islamic Manuscripts" at the University of Cambridge (UK).
The Open Knowledge Repository is the official open-access repository of the World Bank and features research content about development. It was launched in 2012, alongside the World Bank's Open Access Policy and its adoption of the Creative Commons Attribution license for all research and knowledge products that it publishes, which collectively made the World Bank the first international organization to completely embrace open access. The repository collects the intellectual output of the World Bank in digital form, disseminates it, and preserves it long-term.