Digital permanence

Digital permanence addresses the history and development of digital storage techniques, specifically quantifying the expected lifetime of data stored on various digital media and the factors that influence the permanence of digital data. Permanence usually depends on two things: the data itself must be retained on a particular form of media, and the technology needed to read it must remain viable. Where possible, this article describes both expected lifetimes and the factors affecting data retention, including potential technology issues.

Since the inception of automatic computers, a key difference between them and other calculating machines has been their ability to store information. Over the years, various hardware devices have been designed to store ever larger quantities of data. With the development of the Internet, the quantity of information available has continued to grow at an ever-increasing rate, often characterised as an information explosion. As information is increasingly stored on electronic media rather than traditional media such as hand-written documents, printed books, and photographic images, humanity's social and cultural legacy to future generations will depend increasingly on the permanence of these new media.

However, not all of this information is worth saving; sometimes its value is short-lived. Other data, such as legal contracts, literature, and scientific studies, are expected to last for centuries. This article describes how reliably different types of storage media retain data over time and the factors affecting this reliability.

Librarians and archivists responsible for large repositories of information take a deeper view of electronic archives, considering the following issues.

Data format
Data must be stored in a format which can be meaningfully accessed now and in the future.
Technology reliance
If data requires a special program to view it, say, as an image, then software must be available both to interpret the basic data file and to render it appropriately. In some cases, this might also require special hardware.
Archival strategy
Data must remain available in the long term.
At present, a growing problem is the time taken to reproduce an archive, for instance following a hardware or system upgrade. Because the volume of archive data continues to grow, new hardware is always required to maintain the archive, and so data must regularly be migrated to a new system. The time taken to migrate data is starting to approach the interval between system upgrades, such that archive transfer will become a continuous, never-ending process. [1]
Digital rights management
Maintaining digital information in an accurate and accessible format over an extended retention period must also address the authors' digital rights.
In many cases, the data may include proprietary information that should be accessible only to a defined group of users who understand, or have legally agreed, that they may use the information only in limited ways, so as to protect the proprietary rights of the original authoring team. Maintaining this requirement over decades can be a challenge that requires processes and tools to ensure total compliance.
Reproducibility
It must be possible to reproduce digital information as it was originally intended or made available.
This is especially significant where the original data was produced on technology less capable than what is currently available. For example, archivists try to maintain the distinction between listening to a gramophone record played on a gramophone and listening to a digitally cleaned version of the same recording through a modern hi-fi system.
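
The migration problem described under archival strategy above can be made concrete with a rough calculation. The following is a minimal sketch, not a model taken from the cited source: the function names, the fixed sustained throughput, and the constant yearly growth rate are all illustrative assumptions.

```python
def migration_days(archive_tb: float, throughput_mb_s: float) -> float:
    """Days needed to copy an archive at a sustained throughput.

    Assumes decimal units (1 TB = 1,000,000 MB) and no downtime.
    """
    seconds = archive_tb * 1_000_000 / throughput_mb_s
    return seconds / 86_400  # seconds per day


def years_until_continuous(archive_tb, growth_per_year, throughput_mb_s,
                           upgrade_interval_days, max_years=100):
    """First year in which one full migration takes at least as long as
    the interval between upgrades, or None if that never happens within
    max_years. Throughput is assumed constant, which is a simplification.
    """
    for year in range(max_years + 1):
        size_tb = archive_tb * (1 + growth_per_year) ** year
        if migration_days(size_tb, throughput_mb_s) >= upgrade_interval_days:
            return year
    return None


if __name__ == "__main__":
    # Hypothetical archive: 1,000 TB growing 50% a year, copied at
    # 100 MB/s, with hardware replaced every three years (1,095 days).
    print(migration_days(1000, 100))
    print(years_until_continuous(1000, 0.5, 100, 1095))
```

Under these assumed figures the migration time overtakes the upgrade interval within a few years, which is the point at which archive transfer becomes continuous. In practice throughput also improves over time, so the crossover depends on whether archive growth outpaces it.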

Given that individuals' personal data has been growing at a rapid rate in the 21st century, [2] these archiving issues affecting professional repositories will soon be manifest in small organisations and even the home.

Types of storage

Solid-state memory devices

Digital computers make use of two forms of memory, known as RAM and ROM. Although the most common form today is RAM, designed to retain data while the computer is powered on, this was not always the case. Nor is active memory the only form used; passive memory devices are now in common use in digital cameras.

Magnetic media

7-inch (18 cm) reel of 1/4-inch wide (6 mm) audio recording tape, typical of consumer use in the 1950s–70s

Magnetic tapes consist of narrow bands of a magnetic medium bonded to a paper or plastic backing. The magnetic medium passes across a semi-fixed head which reads or writes data. Typically, magnetic media have a maximum lifetime of about 50 years, [3] although this assumes optimal storage conditions; life expectancy can fall rapidly under poorer storage conditions and also depends on the resilience and reliability of hardware components.

Magnetic disks and drums include a rotating magnetic medium combined with a movable read/write head.

Non-magnetic media

Printing technology

Printing hard copies of documents and images is a popular means of representing digital data, and a printed copy may acquire the qualities associated with original documents, especially their potential for endurance. Recent advances in printer technology have raised the quality of photographic images in particular. Unfortunately, the permanence of printed documents cannot easily be discerned from the documents themselves.

Financially driven resources

Digital content can also be preserved by means of financial trusts. The data is backed by a financial investment, typically assigned to a trust company, which pays traditional storage providers to house the data for long periods using the interest earned on the principal. In 2008, companies such as LivingStory.com and Orysa.com started offering these services to store point-in-time accounting data and provide consumer archive services.

Soft storage technology

The shortcomings of some storage media are already well recognised, and various attempts have been made to supplement the permanence of an underlying technology. These "soft storage technologies" enhance their base technology by applying software or system techniques, often within quite narrow fields of data storage and not always with the explicit intention of improving digital permanence.
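
The text does not name specific soft storage techniques. One widely used software technique of this general kind, chosen here purely as an illustration, is to store a checksum alongside each item so that silent corruption of the underlying medium can be detected later. The sketch below is a minimal example under that assumption; the in-memory `archive` dictionary and the function names are hypothetical stand-ins for a real storage system.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def store(archive: dict, name: str, data: bytes) -> None:
    """Keep the data together with the checksum computed at write time."""
    archive[name] = (data, checksum(data))


def verify(archive: dict) -> list:
    """Return the names of items whose data no longer matches its checksum."""
    return sorted(
        name
        for name, (data, digest) in archive.items()
        if checksum(data) != digest
    )


if __name__ == "__main__":
    archive = {}
    store(archive, "contract.txt", b"terms agreed in 2008")
    store(archive, "study.csv", b"sample,value\n1,0.5\n")

    # Simulate a medium fault by changing one stored item behind the
    # system's back while leaving its recorded checksum untouched.
    _, digest = archive["study.csv"]
    archive["study.csv"] = (b"sample,value\n1,0.7\n", digest)

    print(verify(archive))
```

A check like this only detects damage; recovering the data additionally requires a second copy or some form of error-correcting redundancy.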


References

  1. Burk, Alan; James Kerr; and Andy Pope. "The Credibility of Electronic Publishing". Available at web.mala.bc.ca.
  2. Sweeney, Latanya. "Information Explosion". Available at privacy.cs.cmu.edu. Archived 2010-01-18 at the Wayback Machine.
  3. Adelstein, Peter Z. "Permanence of Digital Information". Available at www.ica.org. Archived 2004-09-26 at the Wayback Machine.