File fixity

Last updated

File fixity is a digital preservation term referring to the property of a digital file being fixed, or unchanged. [1] Fixity checking is the process of verifying that a digital object has not been altered or corrupted. [2] During transfer, a repository may run a fixity check to ensure a transmitted file has not been altered en route. Within the repository, fixity checking is used to ensure that digital files have not been affected by data rot, or other digital preservation dangers. [3] By itself, fixity checking does not ensure the preservation of a digital file. Instead, it allows a repository to identify which corrupted files to replace with a clean copy from the producer or from a backup.

In practice, a fixity check is most often accomplished by computing checksum or cryptographic hash function values for a file and comparing them to a stored value or through digital signatures. [4] File fixity figures prominently in Preservation Metadata: Implementation Strategies (PREMIS), the Government Printing Office's work on the Authenticity of Electronic Federal Government Publications, [5] and fixity checking practices are used by a range of cultural heritage organizations. [6]

Related Research Articles

<span class="mw-page-title-main">Checksum</span> Data used to detect errors in other data

A checksum is a small-sized block of data derived from another block of digital data for the purpose of detecting errors that may have been introduced during its transmission or storage. By themselves, checksums are often used to verify data integrity but are not relied upon to verify data authenticity.

<span class="mw-page-title-main">Error detection and correction</span> Techniques that enable reliable delivery of digital data over unreliable communication channels

In information theory and coding theory with applications in computer science and telecommunication, error detection and correction (EDAC) or error control are techniques that enable reliable delivery of digital data over unreliable communication channels. Many communication channels are subject to channel noise, and thus errors may be introduced during transmission from the source to a receiver. Error detection techniques allow detecting such errors, while error correction enables reconstruction of the original data in many cases.

The MD5 message-digest algorithm is a widely used hash function producing a 128-bit hash value. MD5 was designed by Ronald Rivest in 1991 to replace an earlier hash function MD4, and was specified in 1992 as RFC 1321.

<span class="mw-page-title-main">Digital signature</span> Mathematical scheme for verifying the authenticity of digital documents

A digital signature is a mathematical scheme for verifying the authenticity of digital messages or documents. A valid digital signature on a message gives a recipient confidence that the message came from a sender known to the recipient.

Electronic business is any kind of business or commercial transaction that includes sharing information across the internet. Commerce constitutes the exchange of products and services between businesses, groups, and individuals and can be seen as one of the essential activities of any business.

<span class="mw-page-title-main">Cryptographic hash function</span> Hash function that is suitable for use in cryptography

A cryptographic hash function (CHF) is a hash algorithm that has special properties desirable for a cryptographic application:

In cryptography, a message authentication code (MAC), sometimes known as an authentication tag, is a short piece of information used for authenticating and integrity checking a message. In other words, to confirm that the message came from the stated sender and has not been changed. The MAC value allows verifiers to detect any changes to the message content.

Simple file verification (SFV) is a file format for storing CRC32 checksums of files to verify the integrity of files. SFV is used to verify that a file has not been corrupted, but it does not otherwise verify the file's authenticity. The .sfv file extension is usually used for SFV files.

File verification is the process of using an algorithm for verifying the integrity of a computer file, usually by checksum. This can be done by comparing two files bit-by-bit, but requires two copies of the same file, and may miss systematic corruptions which might occur to both files. A more popular approach is to generate a hash of the copied file and comparing that to the hash of the original file.

In library and archival science, digital preservation is a formal endeavor to ensure that digital information of continuing value remains accessible and usable. It involves planning, resource allocation, and application of preservation methods and technologies, and it combines policies, strategies and actions to ensure access to reformatted and "born-digital" content, regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time. The Association for Library Collections and Technical Services Preservation and Reformatting Section of the American Library Association, defined digital preservation as combination of "policies, strategies and actions that ensure access to digital content over time." According to the Harrod's Librarian Glossary, digital preservation is the method of keeping digital material alive so that they remain usable as technological advances render original hardware and software specification obsolete.

<span class="mw-page-title-main">National Software Reference Library</span>

The National Software Reference Library (NSRL), is a project of the National Institute of Standards and Technology (NIST) which maintains a repository of known software, file profiles and file signatures for use by law enforcement and other organizations involved with computer forensic investigations. The project is supported by the United States Department of Justice's National Institute of Justice, the Federal Bureau of Investigation (FBI), Defense Computer Forensics Laboratory (DCFL), the U.S. Customs Service, software vendors, and state and local law enforcement. It also provides a research environment for computational analysis of large sets of files.

Enterprise content management (ECM) extends the concept of content management by adding a timeline for each content item and, possibly, enforcing processes for its creation, approval, and distribution. Systems using ECM generally provide a secure repository for managed items, analog or digital. They also include one methods for importing content to bring manage new items, and several presentation methods to make items available for use. Although ECM content may be protected by digital rights management (DRM), it is not required. ECM is distinguished from general content management by its cognizance of the processes and procedures of the enterprise for which it is created.

<span class="mw-page-title-main">Adobe LiveCycle</span> Java EE server software

Adobe LiveCycle Enterprise Suite (ES4) is a service-oriented architecture Java EE server software product from Adobe Systems used to build applications that automate a broad range of business processes for enterprises and government agencies. LiveCycle ES4 is an enterprise document and form platform that allows capturing and processing information, delivering personalized communications, and protecting and tracking sensitive information. It is used for purposes such as account opening, services, and benefits enrollment, correspondence management, requests for proposal processes, and other manual-based workflows. LiveCycle ES4 incorporates new features with a particular focus on mobile devices. LiveCycle applications also function in both online and offline environments. These capabilities are enabled through the use of Adobe Reader, HTML/PhoneGap, and Flash Player clients to reach desktop computers and mobile devices.

FFV1 is a lossless intra-frame video coding format. It can use either variable-length coding or arithmetic coding for entropy coding. The encoder and decoder are part of the free, open-source library libavcodec in the project FFmpeg since June 2003. FFV1 is also included in ffdshow and LAV Filters, which makes the video codec available to Microsoft Windows applications that support system-wide codecs over Video for Windows (VfW) or DirectShow. FFV1 is particularly popular for its performance regarding speed and size, compared to other lossless preservation codecs, such as M-JPEG2000. The European Broadcasting Union (EBU) lists FFV1 under the codec-family index "31" in their combined list of video codec references.

Preservation metadata is item level information that describes the context and structure of a digital object. It provides background details pertaining to a digital object's provenance, authenticity, and environment. Preservation metadata, is a specific type of metadata that works to maintain a digital object's viability while ensuring continued access by providing contextual information, usage details, and rights.

Digital artifactual value, a preservation term, is the intrinsic value of a digital object, rather than the informational content of the object. Though standards are lacking, born-digital objects and digital representations of physical objects may have a value attributed to them as artifacts.

PREservation Metadata: Implementation Strategies (PREMIS) is the de facto digital preservation metadata standard.

BagIt is a set of hierarchical file system conventions designed to support disk-based storage and network transfer of arbitrary digital content. A "bag" consists of a "payload" and "tags," which are metadata files intended to document the storage and transfer of the bag. A required tag file contains a manifest listing every file in the payload together with its corresponding checksum. The name, BagIt, is inspired by the "enclose and deposit" method, sometimes referred to as "bag it and tag it."

An advanced electronic signature is an electronic signature that has met the requirements set forth under EU Regulation No 910/2014 (eIDAS-regulation) on electronic identification and trust services for electronic transactions in the European Single Market.

Keeping the foresight of rapidly changing technologies and rampant digital obsolescence, in 2008, the R & D in IT Group, Ministry of Electronics and Information Technology, Government of India envisaged to evolve Indian digital preservation initiative. In order to learn from the experience of developed nations, during March 24–25, 2009, an Indo-US Workshop on International Trends in Digital Preservation was organized by C-DAC, Pune with sponsorship from Indo-US Science & Technology Forum, which lead to more constructive developments towards formulation of the national program.

References

  1. "Fixity Checks: Checksums, Message Digests and Digital Signatures Audrey Novak, ILTS" (PDF). November 2006. Archived from the original (PDF) on 2015-01-11.
  2. "fixity check". PREMIS 2.0 Preservation Events Collection. Library of Congress Standards & Research Data Values Registry.
  3. Giaretta, David (2011). Advanced Digital Preservation. ISBN   9783642168086.
  4. Novak, Audrey. "Fixity Checks: Checksums, Message Digests and Digital Signatures" (PDF). ILTS Digital Preservation Committee.
  5. "Authenticity of Electronic Federal Government Publications" (PDF). U.S. Government Printing Office.
  6. Bailey, Jefferson. "File Fixity and Digital Preservation Storage: More Results from the NDSA Storage Survey". The Signal: Digital Preservation Blog. The Library of Congress.