Digital dark age

A computer terminal set up with a laserdisc containing information from the 1986 BBC Domesday Project. The original Domesday Book is 900 years old and still legible, while the laserdisc is already considered obsolete and difficult to read.

The digital dark age is a lack of historical information in the digital age as a direct result of outdated file formats, software, or hardware that become corrupt, scarce, or inaccessible as technologies evolve and data decays. [1] Future generations may find it difficult or impossible to retrieve electronic documents and multimedia because they were recorded in an obsolete or obscure file format, or on an obsolete physical medium such as a floppy disk. The name derives from the term Dark Ages, in the sense that there could be a relative lack of records from the digital age as documents are transferred to digital formats and original copies are lost. An early mention of the term was at a conference of the International Federation of Library Associations and Institutions (IFLA) in 1997. [2] The term was also mentioned in 1998 at the Time and Bits conference, [3] [4] which was co-sponsored by the Long Now Foundation and the Getty Conservation Institute.

Proprietary and obsolete file formats

The problem is not limited to text documents, but applies equally to photos, video, audio and other kinds of electronic documents. One concern behind the term is that documents are stored on physical media that require special hardware to be read, and that this hardware may no longer be available a few decades after the document was created.

The digital dark age also applies to the problems which arise due to obsolete file formats. In such a case, it is the lack of necessary software which causes problems when retrieving stored documents. This is especially problematic when proprietary formats are used, in which case it might be impossible to write appropriate software to read the file.
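As a small illustration of why format knowledge matters, the sketch below guesses a file's format from its leading "magic" bytes, which is often a first step when examining a file of unknown origin. The signatures listed are the published ones for PDF, PNG, GIF and ZIP; the function name `identify_format` and the table layout are illustrative assumptions, not part of any particular preservation tool.

```python
# Well-known leading byte signatures for a handful of openly documented formats.
# The table is deliberately tiny; real identification tools track far more.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    # ZIP is also the container used by OpenDocument and Office Open XML files.
    (b"PK\x03\x04", "ZIP"),
]


def identify_format(data: bytes) -> str:
    """Return a format name if the data starts with a known signature (hypothetical helper)."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    # Without a published signature or specification, the bytes stay opaque.
    return "unknown"


if __name__ == "__main__":
    print(identify_format(b"%PDF-1.7\n..."))
    print(identify_format(b"\x00\x01\x02\x03"))
```

A file that falls through to "unknown" is in the position described above: the bytes are intact, but nothing says how to interpret them.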

Archiving the internet

The Internet Archive has stated that one of its goals is to prevent the digital dark age. [5]

Vinton Cerf, Vice President of Google, voiced his concerns about data preservation at the annual meeting of the American Association for the Advancement of Science in 2015: "As the way that we store information about ourselves develops, memories stored in files that use older technology are becoming harder to access. That could mean that historians of the future are unable to learn about our lives". His suggested solution consists of preserving a sample of every piece of software and hardware that has ever existed so that it never becomes obsolete. He proposed taking an X-ray snapshot of the content, the application and the operating system along with a description of the machine. This information should then be stored on servers in the cloud rather than in a museum. [6]
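The source does not specify how such a snapshot would be recorded. Purely as a sketch, the four elements named above (content, application, operating system, machine description) could be captured in a small manifest stored alongside the data. Every field name, the `SnapshotManifest` class and the example values below are hypothetical and are not taken from Cerf's proposal.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class SnapshotManifest:
    """Hypothetical record describing what is needed to read a piece of content."""
    content_sha256: str       # fingerprint of the preserved bytes
    content_format: str       # e.g. a format name or MIME type
    application: str          # software that created or can open the content
    operating_system: str     # environment the application ran on
    machine_description: str  # free-text description of the hardware


def build_manifest(content: bytes, content_format: str, application: str,
                   operating_system: str, machine_description: str) -> SnapshotManifest:
    # The checksum lets a future reader confirm the bytes have not decayed.
    digest = hashlib.sha256(content).hexdigest()
    return SnapshotManifest(digest, content_format, application,
                            operating_system, machine_description)


def to_json(manifest: SnapshotManifest) -> str:
    # Plain JSON text is used so the manifest itself stays easy to read.
    return json.dumps(asdict(manifest), indent=2, sort_keys=True)


if __name__ == "__main__":
    # All values below are made-up placeholders for illustration.
    manifest = build_manifest(
        b"example document bytes",
        "text/plain",
        "ExampleWriter 1.0",
        "ExampleOS 3",
        "Generic desktop computer",
    )
    print(to_json(manifest))
```

Keeping the manifest as plain text is a deliberate choice here: the description of how to read the data should not itself depend on software that may disappear.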

Historical examples

A famous example is NASA, whose early space records have suffered from a dark age issue more than once. For over a decade, magnetic tapes from the 1976 Viking Mars landing were unprocessed. When later analyzed, the data was unreadable as it was in an unknown format and the original programmers had either died or left NASA. The images were eventually extracted following many months of puzzling through the data and examining how the recording machines functioned. [7] The original Moon landing tapes were likely erased and reused, with the only video recordings of the Apollo 11 landings being later copies. [8]

Another example is the BBC Domesday Project in which a survey of the nation was compiled 900 years after the Domesday Book was published. While the original Domesday Book of 1086 is still readable today, there were great fears that the discs of the 1986 Domesday Project would become unreadable as software and disk drives capable of reading the format became rarer and rarer. However, in 2002 the CAMiLEON project migrated the information to a system called DomesEm, allowing it to be accessed on modern computers. [9] More recently, the Domesday86 Project has continued this preservation effort by developing a digitizer for the original LaserDiscs and emulation software for the original BBC Domesday computer system.

Encryption and data preservation

Encryption may exacerbate the problem of preserving data, since decoding adds complexity even when the relevant software is available. [10] Historically, encrypted data is quite rare, but even the very simple means available throughout history have provided many examples of documents that can only be read with great effort. For example, it took the capacity of a distributed computing project to break the mechanically generated code of a single brief World War II submarine tactical message. [11]

Open source file formats

As more records are stored in digital form, there have been several measures to standardize electronic file formats so software to read them is widely available and can be re-implemented on new platforms if necessary.

PDF/A is an open standard based on Adobe Systems' PDF format. [12] It has been widely adopted by governments and archives around the world, such as those of the United Kingdom. [13]

The Open Document Format for Office Applications (OpenDocument) was standardized by OASIS in 2005, and by ISO in 2006. Since then, support for OpenDocument has been implemented in a large number of open source and proprietary software packages. Using OpenDocument is therefore one option for archiving editable documents from office applications. More broadly, the use of open source software is a prevention measure. [14] Since the source code for reading and writing a file format is open, the code can be used as a base for future implementations. In 2007, the chief information officer of the UK's National Archives stated "We welcome open-source software because it makes our lives easier". [15]
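To make the re-implementation argument concrete: an OpenDocument text file is a ZIP package whose main text lives in an XML file named content.xml, so a basic reader can be written with nothing but a standard library. The sketch below builds a deliberately simplified package in memory (a real file also carries a manifest, styles and metadata) and then extracts its paragraphs. The helper names `make_sample_package` and `extract_paragraphs` are illustrative assumptions; the namespace URIs are those published in the OpenDocument specification.

```python
import io
import xml.etree.ElementTree as ET
import zipfile
from xml.sax.saxutils import escape

# XML namespaces defined by the OpenDocument specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def make_sample_package(paragraphs: list[str]) -> bytes:
    """Build a simplified OpenDocument-style package (not a complete, valid .odt)."""
    body = "".join(f"<text:p>{escape(p)}</text:p>" for p in paragraphs)
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
        f"<office:body><office:text>{body}</office:text></office:body>"
        f"</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # The mimetype entry comes first and is stored uncompressed.
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text",
                         compress_type=zipfile.ZIP_STORED)
        package.writestr("content.xml", content, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def extract_paragraphs(package_bytes: bytes) -> list[str]:
    """Return the plain text of every text:p element in content.xml."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


if __name__ == "__main__":
    sample = make_sample_package(["First paragraph.", "Second paragraph."])
    for line in extract_paragraphs(sample):
        print(line)
```

The reader ignores formatting, tables and embedded objects, but it shows the point made above: because both the container and the markup are openly specified, a minimal reader can be rebuilt on a new platform without the original application.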

Data storage standardization

In July 2007, Microsoft created a partnership with the UK's National Archives to prevent the digital dark age and "unlock millions of unreadable stored computer files". [16] [17] [18] The UK's National Archives now accepts various file formats for long-term preservation, including Office Open XML, PDF and OpenDocument. [19]

Criticism

The notion of the digital dark age has been criticized by some scholars. Some of these, such as David Anderson and Jon Tilbury, view it as alarmist rhetoric, maintaining that the notion of a "dark age" misrepresents the current condition. They argue that there has been significant progress in digital preservation, as evidenced by the way organizations continue to find and reuse critical long-term digital information while finding new ways of sharing it with the public. [20] Some historians also fault proponents of the digital dark age for historical inaccuracies. These include Marilyn Deegan and Simon Tanner's claim that the Gutenberg printing revolution led Europe out of the Dark Ages, a period said to be marked by the loss of the learning of the ancient Greeks and Romans. [21] Critics argue that classical learning had already been recovered during the Middle Ages, and that this recovery was due not mainly to the printing revolution but largely to the intellectual exchange between Islamic and Christian cultures. [22]

References

  1. Young, Lauren J. (15 December 2017). "Data Reawakening". Science Friday. Retrieved 1 March 2018.
  2. Kuny, Terry (September 1997). "A Digital Dark Ages? Challenges in the Preservation of Electronic Information" (PDF). 63rd IFLA (International Federation of Library Associations and Institutions) Council and General Conference. Archived (PDF) from the original on 9 May 2008. Retrieved 7 November 2011.
  3. MacLean, Margaret (1999). MacLean, Margaret; Davis, Ben (eds.). Time and Bits, Managing Digital Continuity. Getty. ISBN 978-0-89236-583-8.
  4. Brand, Stewart (1 February 1999). "Escaping The Digital Dark Age". Library Journal. 124 (2): 46–69. ISSN 0363-0277. Archived from the original on 23 September 2005.
  5. "About the Internet Archive". Archived from the original on 2 October 2013. Retrieved 5 October 2013.
  6. Ghosh, Pallab (13 February 2015). "Google's Vint Cerf warns of 'digital Dark Age'". BBC News.
  7. Blakeslee, Sandra (20 March 1990). "Lost on Earth: Wealth of Data Found in Space". The New York Times. Archived from the original on 9 November 2012. Retrieved 7 July 2013.
  8. "Not-Unsolved Mysteries: The "Lost" Apollo 11 Tapes". 3 July 2019. Archived from the original on 25 December 2019.
  9. McKie, Robin; Thorpe, Vanessa (3 March 2002). "Digital Domesday Book lasts 15 years not 1000". The Observer. Archived from the original on 20 January 2013.
  10. Digital Preservation Coalition (2012). "Media and Formats - Compression and Encryption". Digital Preservation Handbook. Archived from the original on 29 July 2012. Retrieved 17 August 2013.
  11. Wearden, Graeme (27 February 2006). "Distributed computing cracks Enigma code". CNET News. Archived from the original on 10 August 2011.
  12. "Adobe Acrobat Engineering:PDF Standards". Adobe. 12 March 2013. Archived from the original on 7 July 2013. Retrieved 7 July 2013.
  13. "Viewing government documents". GOV.UK. Cabinet Office. 6 August 2015. Retrieved 10 September 2015.
  14. Cassia, Fernando (28 March 2007). "Open Source, the only weapon against 'planned obsolescence'". The Inquirer. Archived from the original on 20 January 2011. Retrieved 2 August 2012.
  15. Donoghue, Andrew (19 July 2007). "Defending against the digital dark age". ZDNet. Archived from the original on 23 October 2012.
  16. Kennedy, Maev (4 July 2007). "National Archive project to avert digital dark age". The Guardian. Archived from the original on 17 July 2010. Retrieved 7 October 2009.
  17. Ferguson, Tim (5 July 2007). "Microsoft Helps Archives Save the Past". Business Week. Archived from the original on 10 July 2007. Retrieved 7 October 2009.
  18. Colvile, Robert (5 July 2007). "How to stave off a digital 'dark age'". The Telegraph. Archived from the original on 24 April 2012. Retrieved 7 October 2009.
  19. "File formats for transfer - The National Archives".
  20. Harvey, Ross; Weatherburn, Jaye (2018). Preserving Digital Materials (Third ed.). Lanham, MD: Rowman & Littlefield. p. 25. ISBN 9781538102961.
  21. Weller, Toni (2013). History in the Digital Age. Oxon: Routledge. p. 58. ISBN 9780415666961.
  22. Tredinnick, Luke (2008). Digital Information Culture: The Individual and Society in the Digital Age. Oxford: Chandos Publishing. p. 155. ISBN 9781843341703.

External videos
Digital Dark Age (Computer History Museum, 2011)