Open Archival Information System


An Open Archival Information System (or OAIS) is an archive, consisting of an organization of people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community. [1] The OAIS model can be applied to various archives, e.g., open access, closed, restricted, "dark", or proprietary. [2]


The term OAIS also refers, by extension, to the ISO OAIS Reference Model for an OAIS. This reference model is defined by recommendation CCSDS 650.0-M-2 of the Consultative Committee for Space Data Systems; [3] its text is identical to ISO 14721:2012. The CCSDS's purview is space agencies, but the OAIS model it developed has proved useful to other organizations and institutions with digital archiving needs. First published by ISO as ISO 14721:2003, OAIS is widely accepted and used by organizations and disciplines, both national and international, and was designed to ensure long-term preservation. It is widely regarded as a leading standard for creating and maintaining a digital repository over a long period of time.

The information being maintained has been deemed to need "long term preservation", even if the OAIS itself is not permanent. The OAIS defines "long term" as a period long enough to be affected by changing technologies, including support for new media and data formats, or by a changing "Designated Community", i.e., the group of consumers who should be able to understand the information. This period may extend indefinitely. The archive defines the Designated Community, and that definition is not fixed. [2]

The "O" in OAIS represents the "open way the standard was developed", and does not represent "open access", [2] or the usage of the term open in the Open Definition or Open Archives Initiative. The "I" in OAIS represents "information", meaning data that can be shared or exchanged. [2]

In this reference model there is a particular focus on digital information, both as the primary form of information held and as supporting information for both digitally and physically archived materials. The model therefore accommodates information that is inherently non-digital (e.g., a physical sample), but the modeling and preservation of such information is not addressed in detail. As strictly a conceptual framework, the OAIS model does not require the use of any particular computing platform, system environment, system design paradigm, system development methodology, database management system, database design paradigm, data definition language, command language, system interface, user interface, technology, or media for an archive to be compliant. Its aim is to set the standard for the activities involved in preserving a digital archive rather than the method for carrying out those activities.

The acronym OAIS should not be confused with OAI, which is the Open Archives Initiative.

The reference model


Requirements of the system

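The reference model (ISO 14721:2003) sets out responsibilities that an OAIS archive must abide by: negotiating for and accepting appropriate information from producers; obtaining sufficient control of that information to ensure long-term preservation; determining which communities make up the Designated Community; ensuring that the information is independently understandable to that community; following documented policies and procedures that protect the information against reasonable contingencies; and making the preserved information available to the Designated Community.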

The OAIS environment and information model

Figure: OAIS environment diagram
Figure: OAIS information package diagram

The OAIS environment involves the interaction of four entities: producers of information, consumers of information (or the designated community), management, and the archive itself. The management component of the OAIS environment is not an entity that carries out day-to-day maintenance of an archive but a person or group that sets policies for the content contained in the archive.

The OAIS model also defines an information model. Physical or digital items which contain information are known as data objects. Members of the Designated Community for an archive should be able to interpret and understand the information contained in a data object either because of their established knowledge base or with the assistance of supplementary "representation information" that is included with the data object.

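An information package includes the following information objects: Content Information, Preservation Description Information, Packaging Information, and Descriptive Information.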

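There are three types of information package in the OAIS reference model: the Submission Information Package (SIP), which a producer sends to the archive; the Archival Information Package (AIP), which the archive stores and preserves; and the Dissemination Information Package (DIP), which the archive delivers to a consumer in response to a request.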

These three information packages may or may not be identical to each other.
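The information model can be pictured as a small data structure. The following is a minimal sketch only, not part of the standard: the class names, the `data_format` label, and the `is_interpretable` helper are hypothetical, chosen to illustrate how a data object travels with representation information and how the package types (SIP, AIP, DIP) might be tagged.

```python
from dataclasses import dataclass, field
from enum import Enum


class PackageType(Enum):
    # The three package types named by the OAIS reference model.
    SIP = "Submission Information Package"
    AIP = "Archival Information Package"
    DIP = "Dissemination Information Package"


@dataclass
class ContentInformation:
    data_object: bytes                     # the item being preserved
    data_format: str                       # hypothetical format label, e.g. "text/csv"
    representation_information: str = ""   # notes needed to interpret the data object


@dataclass
class InformationPackage:
    # Field names are illustrative; OAIS does not prescribe an encoding.
    package_type: PackageType
    content: ContentInformation
    preservation_description: dict = field(default_factory=dict)
    packaging_information: dict = field(default_factory=dict)
    descriptive_information: dict = field(default_factory=dict)


def is_interpretable(package, knowledge_base):
    """Return True if the community already knows the format, or if the
    package carries representation information that explains it."""
    content = package.content
    return content.data_format in knowledge_base or bool(content.representation_information)


sip = InformationPackage(
    package_type=PackageType.SIP,
    content=ContentInformation(
        data_object=b"t,temp\n0,21.5\n",
        data_format="text/csv",
        representation_information="Columns: t = seconds since start, temp = degrees Celsius",
    ),
    descriptive_information={"title": "Example sensor log"},
)

# The assumed knowledge base of a Designated Community: formats it can read unaided.
community_knowledge = {"text/plain"}
print(is_interpretable(sip, community_knowledge))
```

In this sketch the package stays interpretable even though the community's assumed knowledge base lacks the format, because the representation information travels with the data object.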

The functional model

Figure: OAIS functional model diagram

There are six functional entities in an OAIS: Ingest, Archival Storage, Data Management, Administration, Preservation Planning, and Access.
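The flow of packages through the functional entities can be sketched in a few lines. This is a toy illustration under stated assumptions, not a compliant implementation: the `MiniArchive` class, its dictionary layout, and the SHA-256 fixity value are all hypothetical. It models only Ingest, Archival Storage, Data Management, and Access, and leaves Administration and Preservation Planning out.

```python
import hashlib


class MiniArchive:
    """Toy in-memory stand-in for an OAIS; names and structure are illustrative only."""

    def __init__(self):
        self.archival_storage = {}   # AIP id -> AIP (Archival Storage)
        self.data_management = {}    # AIP id -> descriptive information (Data Management)

    def ingest(self, sip):
        """Ingest: accept a SIP from a producer and turn it into a stored AIP."""
        fixity = hashlib.sha256(sip["data"]).hexdigest()
        aip_id = fixity[:12]         # hypothetical identifier scheme
        self.archival_storage[aip_id] = {
            "type": "AIP",
            "data": sip["data"],
            "fixity_sha256": fixity,
        }
        self.data_management[aip_id] = {"title": sip["title"]}
        return aip_id

    def access(self, title):
        """Access: look up matching AIPs and hand DIPs to a consumer."""
        return [
            {
                "type": "DIP",
                "title": info["title"],
                "data": self.archival_storage[aip_id]["data"],
            }
            for aip_id, info in self.data_management.items()
            if info["title"] == title
        ]


archive = MiniArchive()
aip_id = archive.ingest({"type": "SIP", "title": "Telemetry 1999", "data": b"raw bytes"})
dips = archive.access("Telemetry 1999")
```

Keeping the descriptive index separate from the stored packages mirrors the split between Data Management and Archival Storage, so a search never needs to read the preserved data itself.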

Adoption

The OAIS was originally developed by the Consultative Committee for Space Data Systems, a standards body for space agencies. As digital preservation has become a discipline in its own right, the OAIS has become the standard model for digital preservation systems at many institutions and organizations. OAIS compliance has been a stated fundamental design requirement for major digital preservation and repository development efforts at the National Archives and Records Administration, Library of Congress, British Library, Bibliothèque nationale de France, National Library of the Netherlands, the Digital Curation Centre in the UK, OCLC (the Online Computer Library Center), the JSTOR (Journal Storage) scholarly journal archive, and several university library systems.

In India, the Centre of Excellence for Digital Preservation at C-DAC implemented OAIS for the National Cultural Audiovisual Archive (NCAA), which was certified as a Trusted Digital Repository under ISO 16363:2012 in November 2017. This initiative was part of the Indian National Digital Preservation Program (NDPP).

The OAIS has been the basis of numerous prominent digital preservation initiatives and standards, including the Preservation Metadata: Implementation Strategies (PREMIS) working group and the Trustworthy Repositories Audit & Certification (TRAC) document from OCLC. [6] TRAC was an initial draft of, and was subsequently superseded by, CCSDS 652.0-M-1 of the Consultative Committee for Space Data Systems; [3] this text is identical to ISO 16363:2012, which forms the basis of the ISO audit and certification of trustworthy repositories. ISO 19165-1:2018 recommends the use of the Open Packaging Conventions to implement the Geospatial Package.

Software architecture model

On 22 September 2020, as part of #WeMissiPres, Frank Obermeit, a computer scientist at the State Archives of Saxony-Anhalt, Germany, presented a software architecture model that fully implements the Open Archival Information System (OAIS) reference model. [7] An appliance built on this architecture model has been available since October 2020. The architecture model is based exclusively on de facto and de jure standards, and the appliance was realised exclusively with open source products. The three main standards are Business Process Model and Notation (BPMN), Representational State Transfer (REST) and OpenID Connect (OIDC). Scalability, distributability and extensibility are further essential features that enable its use in organisations of different sizes.


Related Research Articles

Dublin Core: Standardized set of metadata elements

The Dublin Core, also known as the Dublin Core Metadata Element Set (DCMES), is a set of fifteen main metadata items for describing digital or physical resources. The Dublin Core Metadata Initiative (DCMI) is responsible for formulating the Dublin Core; DCMI is a project of the Association for Information Science and Technology (ASIS&T), a non-profit organization.

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives. An implementation of OAI-PMH must support representing metadata in Dublin Core, but may also support additional representations.

Digital obsolescence

Digital obsolescence is the risk of data loss caused by the inability to access digital assets, because the hardware or software required for information retrieval is repeatedly replaced by newer devices and systems, resulting in increasingly incompatible formats. While the threat of an eventual "digital dark age" was met with little concern until the 1990s, modern digital preservation efforts in the information and archival fields have implemented protocols and strategies such as data migration and technical audits, while the salvage and emulation of antiquated hardware and software address digital obsolescence to limit the potential damage to long-term information access.

In library and archival science, digital preservation is a formal endeavor to ensure that digital information of continuing value remains accessible and usable. It involves planning, resource allocation, and application of preservation methods and technologies, and it combines policies, strategies and actions to ensure access to reformatted and "born-digital" content, regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time. The Association for Library Collections and Technical Services Preservation and Reformatting Section of the American Library Association defined digital preservation as a combination of "policies, strategies and actions that ensure access to digital content over time." According to Harrod's Librarian Glossary, digital preservation is the method of keeping digital material alive so that it remains usable as technological advances render original hardware and software specifications obsolete.

Fedora Commons

Fedora is a digital asset management (DAM) content repository architecture upon which institutional repositories, digital archives, and digital library systems might be built. Fedora is the underlying architecture for a digital repository, and is not a complete management, indexing, discovery, and delivery application. It is a modular architecture built on the principle that interoperability and extensibility are best achieved by the integration of data, interfaces, and mechanisms as clearly defined modules.

The Metadata Encoding and Transmission Standard (METS) is a metadata standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium (W3C). The standard is maintained as part of the MARC standards of the Library of Congress, and is being developed as an initiative of the Digital Library Federation (DLF).

Preservation metadata is item-level information that describes the context and structure of a digital object. It provides background details pertaining to a digital object's provenance, authenticity, and environment. Preservation metadata is a specific type of metadata that works to maintain a digital object's viability while ensuring continued access by providing contextual information, usage details, and rights.

Trustworthy Repositories Audit & Certification (TRAC) is a document describing the metrics of an OAIS-compliant digital repository that developed from work done by the OCLC/RLG Programs and National Archives and Records Administration (NARA) task force initiative.

Digital curation is the selection, preservation, maintenance, collection, and archiving of digital assets. Digital curation establishes, maintains, and adds value to repositories of digital data for present and future use. This is often accomplished by archivists, librarians, scientists, historians, and scholars. Enterprises are starting to use digital curation to improve the quality of information and data within their operational and strategic processes. Successful digital curation will mitigate digital obsolescence, keeping the information accessible to users indefinitely. Digital curation includes digital asset management, data curation, digital preservation, and electronic records management.

PREservation Metadata: Implementation Strategies (PREMIS) is the de facto digital preservation metadata standard.

Database preservation usually involves converting the information stored in a database to a form likely to be accessible in the long term as technology changes, without losing the initial characteristics of the data.

Data curation is the organization and integration of data collected from various sources. It involves annotation, publication and presentation of the data such that the value of the data is maintained over time, and the data remains available for reuse and preservation. Data curation includes "all the processes needed for principled and controlled data creation, maintenance, and management, together with the capacity to add value to data". In science, data curation may indicate the process of extraction of important information from scientific texts, such as research articles by experts, to be converted into an electronic format, such as an entry of a biological database.

The MetaArchive Cooperative is an international digital preservation network composed of libraries, archives, and other memory institutions. As of August 2011, the MetaArchive preservation network is composed of 24 secure servers in four countries with a collective capacity of over 300TB. Forty-eight institutions are actively preserving their digital collections in the network.

Islandora is a free and open-source digital repository system based on Drupal and integrating with additional applications, including Fedora Commons. Islandora was originally developed at the University of Prince Edward Island by the Robertson Library and is now maintained by the Islandora Foundation, whose mission is to "promote collaboration through transparency and consensus building among Islandora community members, and to steward their shared vision for digital curation features through a body of software and knowledge."

In computing, a data definition specification (DDS) is a guideline to ensure comprehensive and consistent data definition. It represents the attributes required to quantify data definition. A comprehensive data definition specification encompasses enterprise data, the hierarchy of data management, prescribed guidance enforcement and criteria to determine compliance.

In information and archival communities, a designated community is an identified group of potential consumers who should be able to understand a particular set of information. These consumers may consist of multiple communities, are designated by the archive, and may change over time.

The Consultative Committee for Space Data Systems (CCSDS) was founded in 1982 for governmental and quasi-governmental space agencies to discuss and develop standards for space data and information systems. Currently composed of "eleven member agencies, twenty-eight observer agencies, and over 140 industrial associates," the CCSDS works to support collaboration and interoperability between member agencies through the establishment of data and system standards. According to the organisation's website, more than 1000 space missions have utilized data and systems standards created by CCSDS. The activities of the CCSDS are organized around six topic areas and composed of many working groups within the overall Collaborative Working Group Environment (CWE).

In 2008, anticipating rapidly changing technologies and rampant digital obsolescence, the R & D in IT Group of the Ministry of Electronics and Information Technology, Government of India, envisaged an Indian digital preservation initiative. To learn from the experience of developed nations, C-DAC, Pune organized an Indo-US Workshop on International Trends in Digital Preservation on March 24–25, 2009, with sponsorship from the Indo-US Science & Technology Forum, which led to more constructive developments towards the formulation of the national program.

National Digital Repository for Museums of India is a C-DAC-led project to create seamless access to collections and artifacts organized according to themes, regardless of the physical and geographical locations of the museums that house them. The first public version developed on DSpace was released in 2002. The initial draft of the Open Archival Information System (OAIS) was released in 2003. Museums need to be transformed to achieve greater relevance and application in modern society. Therefore, focusing on the needs of Indian museums, Dr. Dinesh Katre, Senior Director at C-DAC, initiated the development in 2001 of e-curator software named JATAN (जतन): Virtual Museum Builder, which was released in 2004. Subsequently, the JATAN (जतन) software was deployed in the Chhatrapati Shivaji Maharaj Museum, Mumbai; the Raja Dinkar Kelkar Museum, Pune; and The Baroda Museum & Picture Gallery in Vadodara. Although the response from museums was lukewarm, C-DAC continued developing the JATAN (जतन) software into a comprehensive digital collection management system for museums. As part of this research, early visions of using crowdsourcing for metadata enrichment of museum artefacts and of a unified virtual catalogue for Indian museums were presented in 2005.

References

  1. Consultative Committee for Space Data Systems (June 2012). Reference Model for an Open Archival Information System (OAIS) (PDF). Recommended Practice, issue 2, CCSDS 650.0-M-2. NASA. Retrieved 25 March 2022.
  2. Giaretta, David (20 January 2011). "Introduction to OAIS Concepts and Terminology". Advanced Digital Preservation. Springer Berlin Heidelberg. doi:10.1007/978-3-642-16809-3_3.
  3. "CCSDS.org - All Active Publications". public.ccsds.org. Retrieved 22 November 2021.
  4. "Reference Model for an Open Archival Information System (OAIS)". The Consultative Committee for Space Data Systems. June 2012. Retrieved 11 April 2022.
  5. Lavoie, Brian. "Meeting the challenges of digital preservation: The OAIS reference model". OCLC. Archived from the original on 12 May 2014. Retrieved 19 May 2014.
  6. Lee, Christopher A. (2005). Defining Digital Preservation Work: A Case Study of the Development of the Reference Model for an Open Archival Information System (Thesis). The University of Michigan. hdl:2027.42/39372.
  7. #WeMissiPres Day One Session 3: Picking up the threads of 2019. Retrieved 22 November 2021.