Open Archival Information System

The term Open Archival Information System (OAIS) refers to the ISO Reference Model for an Open Archival Information System. This reference model is defined by recommendation CCSDS 650.0-M-2-S of the Consultative Committee for Space Data Systems; [1] its text is identical to ISO 14721:2012. The CCSDS's purview is space agencies, but the OAIS model it developed has proved useful to other organizations and institutions with digital archiving needs. OAIS, first issued as ISO 14721:2003, is widely accepted and used by organizations and disciplines, both national and international, and was designed to ensure long-term preservation. The OAIS standard is widely regarded as a sound basis for creating and maintaining a digital repository over a long period of time.

The OAIS model can be applied to various archives, e.g., open access, closed, restricted, "dark", or proprietary. [2] [3]

The information being maintained has been deemed to need "long term preservation", even if the OAIS itself is not permanent. The OAIS defines "long term" as any period long enough to be affected by changing technologies, including support for new media and data formats, or by a changing "Designated Community", i.e., the group of consumers who should be able to understand the information. This period may extend indefinitely. The archive defines the Designated Community, and that definition is not fixed. [2]

The "O" in OAIS represents the "open way the standard was developed", and does not represent "open access", [2] or the usage of the term open in the Open Definition or Open Archives Initiative. The "I" in OAIS represents "information", meaning data that can be shared or exchanged. [2]

In this reference model there is a particular focus on digital information, both as the primary form of information held and as supporting information for digitally and physically archived materials. The model therefore accommodates information that is inherently non-digital (e.g., a physical sample), but the modeling and preservation of such information is not addressed in detail. As a strictly conceptual framework, the OAIS model does not require the use of any particular computing platform, system environment, system design paradigm, system development methodology, database management system, database design paradigm, data definition language, command language, system interface, user interface, technology, or media for an archive to be compliant. Its aim is to set the standard for the activities involved in preserving a digital archive rather than the method for carrying out those activities.

The acronym OAIS should not be confused with OAI, which is the Open Archives Initiative.

The OAIS environment and information model

[Figure: OAIS environment diagram]

[Figure: OAIS information package diagram]

The OAIS environment involves the interaction of four entities: producers of information, consumers of information (or the designated community), management, and the archive itself. The management component of the OAIS environment is not an entity that carries out day-to-day maintenance of an archive but a person or group that sets policies for the content contained in the archive.

The OAIS model also defines an information model. Physical or digital items which contain information are known as data objects. Members of the Designated Community for an archive should be able to interpret and understand the information contained in a data object either because of their established knowledge base or with the assistance of supplementary "representation information" that is included with the data object.

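An information package includes the following information objects: Content Information (the data object together with its representation information), Preservation Description Information (the information needed to preserve the content, such as provenance and fixity), Packaging Information (which binds the components of the package together), and Descriptive Information (metadata that allows the package to be located later).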

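There are three types of information package in the OAIS reference model: the Submission Information Package (SIP), which a producer delivers to the archive; the Archival Information Package (AIP), which the archive stores and preserves; and the Dissemination Information Package (DIP), which the archive delivers to a consumer in response to a request.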

These three information packages may or may not be identical to each other.
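Because OAIS is a conceptual model and prescribes no implementation, the following Python sketch is only one hypothetical way to represent an information package. The class names, field names, and the `to_aip` helper are assumptions made for illustration and do not come from the standard; the SHA-256 checksum stands in for whatever fixity information an archive might actually record.

```python
import hashlib
from dataclasses import dataclass, replace
from enum import Enum


class PackageType(Enum):
    SIP = "Submission Information Package"     # producer -> archive
    AIP = "Archival Information Package"       # held by the archive
    DIP = "Dissemination Information Package"  # archive -> consumer


@dataclass(frozen=True)
class InformationPackage:
    """Hypothetical container for the information objects of a package."""
    package_type: PackageType
    content_information: bytes      # the data object (representation info omitted here)
    preservation_description: dict  # e.g. provenance, fixity
    packaging_information: str      # how the components are bound together
    descriptive_information: dict   # metadata used to locate the package


def to_aip(sip: InformationPackage) -> InformationPackage:
    """Derive an AIP from a SIP, recording a SHA-256 checksum as fixity.

    This is an illustrative step only; a real archive may transform the
    package in other ways or leave it unchanged.
    """
    if sip.package_type is not PackageType.SIP:
        raise ValueError("expected a SIP")
    fixity = hashlib.sha256(sip.content_information).hexdigest()
    pdi = {**sip.preservation_description, "fixity": fixity}
    return replace(sip, package_type=PackageType.AIP, preservation_description=pdi)
```

In this sketch the packages are immutable, so deriving an AIP returns a new object and leaves the submitted package untouched, which mirrors the idea that the packages may or may not be identical to each other.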

The functional model

[Figure: OAIS functional model diagram]

There are six functional entities in an OAIS: Ingest, Archival Storage, Data Management, Administration, Preservation Planning, and Access.
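As a rough illustration of how some of these entities could cooperate in software, the Python sketch below enumerates the functional entities and models a toy in-memory archive. `ToyArchive` and its methods are hypothetical names chosen for this example; the sketch ignores Administration and Preservation Planning entirely and is not a compliant implementation.

```python
from __future__ import annotations

from enum import Enum


class FunctionalEntity(Enum):
    INGEST = "Ingest"
    ARCHIVAL_STORAGE = "Archival Storage"
    DATA_MANAGEMENT = "Data Management"
    ADMINISTRATION = "Administration"
    PRESERVATION_PLANNING = "Preservation Planning"
    ACCESS = "Access"


class ToyArchive:
    """In-memory stand-in for an archive (illustrative assumption only)."""

    def __init__(self) -> None:
        self._storage = {}    # Archival Storage: package id -> content bytes
        self._catalogue = {}  # Data Management: package id -> description

    def ingest(self, package_id: str, content: bytes, description: str) -> None:
        """Ingest: accept a submission and pass it to storage and the catalogue."""
        if package_id in self._storage:
            raise KeyError(f"{package_id} already ingested")
        self._storage[package_id] = content
        self._catalogue[package_id] = description

    def find(self, term: str) -> list[str]:
        """Access: return the sorted ids whose description contains the term."""
        needle = term.lower()
        return sorted(
            pid for pid, desc in self._catalogue.items() if needle in desc.lower()
        )

    def disseminate(self, package_id: str) -> bytes:
        """Access: return the stored content for a consumer request."""
        return self._storage[package_id]
```

A producer would call `ingest`, while a consumer would call `find` followed by `disseminate`; keeping the catalogue separate from the stored content reflects the split between Data Management and Archival Storage.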

Adoption

The OAIS was originally developed by the Consultative Committee for Space Data Systems, a body that develops data and information systems standards for space agencies. As digital preservation has become a discipline unto itself, the OAIS has become the standard model for digital preservation systems at many institutions and organizations. OAIS compliance has been a stated fundamental design requirement for major digital preservation and repository development efforts at the National Archives and Records Administration, the Library of Congress, the British Library, the Bibliothèque nationale de France, the National Library of the Netherlands, the Digital Curation Centre in the UK, OCLC (the Online Computer Library Center), and the JSTOR (Journal Storage) scholarly journal archive, as well as several university library systems. The Centre of Excellence for Digital Preservation at C-DAC, India, has implemented OAIS for the National Cultural Audiovisual Archive (NCAA), which was certified as a Trusted Digital Repository under ISO 16363:2012 in November 2017. This initiative was part of the Indian National Digital Preservation Program (NDPP).

The OAIS has been the basis of numerous prominent digital preservation initiatives and standards, including the Preservation Metadata: Implementation Strategies working group and the Trustworthy Repositories Audit & Certification (TRAC) document from OCLC. [5] TRAC was an initial draft of, and was subsequently superseded by, CCSDS 652.1-M-2 of the Consultative Committee for Space Data Systems; [1] that text is identical to ISO 16363:2012, which forms the basis of the ISO audit and certification of trustworthy repositories, described at iso16363.org. ISO 19165-1:2018 recommends the use of the Open Packaging Conventions to implement the Geospatial Package.

Software architecture model

On 22 September 2020, as part of #WeMissiPres, Frank Obermeit, a computer scientist at the State Archives of Saxony-Anhalt, Germany, presented a software architecture model that fully implements the OAIS reference model. [6] An appliance built on this architecture model has been available since October 2020. The architecture model is based exclusively on de facto and de jure standards, and the appliance was realised exclusively with open source products. The three main standards are Business Process Model and Notation (BPMN), Representational State Transfer (REST), and OpenID Connect (OIDC). Scalability, distributability, and extensibility are further essential features that allow its use in organisations of different sizes.

References

  1. "CCSDS.org - All Active Publications". public.ccsds.org. Retrieved 2021-11-22.
  2. Giaretta, David (20 January 2011). "Introduction to OAIS Concepts and Terminology". Advanced Digital Preservation. Springer Berlin Heidelberg. doi:10.1007/978-3-642-16809-3_3.
  3. Consultative Committee for Space Data Systems (June 2012). Reference Model for an Open Archival Information System (OAIS) (PDF). Recommended Practice, issue 2, CCSDS 650.0-M-2. NASA. Retrieved 25 March 2022.
  4. Lavoie, Brian. "Meeting the challenges of digital preservation: The OAIS reference model". OCLC. Archived from the original on 2014-05-12. Retrieved 2014-05-19.
  5. Lee, Christopher A. (2005). Defining Digital Preservation Work: A Case Study of the Development of the Reference Model for an Open Archival Information System (Thesis). The University of Michigan. hdl:2027.42/39372.
  6. #WeMissiPres Day One Session 3: Picking up the threads of 2019. Retrieved 2021-11-22.