Dvdisaster

dvdisaster
Developer(s): Carsten Gnörlich
Initial release: 0.40.7 (22 August 2004)
Stable release: 0.79.10 (25 September 2021)
Written in: C
Operating system: FreeBSD, Linux, NetBSD (cross-platform until version 0.72.3)
Available in: English
Type: Data recovery
License: GPL
Website: dvdisaster.jcea.es

dvdisaster is a computer program designed to improve data survivability on optical discs by creating error detection and correction data, which is later used for data recovery. dvdisaster works exclusively at the image level. It can either generate error-correcting code (ECC) data from an existing medium or augment an ISO image with ECC data before the image is written to a medium. dvdisaster is free software available under the GNU General Public License.

Recovery modes

When an optical disc is physically damaged (such as by scratching), or has begun to deteriorate, some parts of the data on the disc may become unreadable. By utilizing the ECC data previously generated by dvdisaster, damaged parts of the disc data can be recovered.

The two modes of ECC data generation in dvdisaster make use of Reed–Solomon codes. In RS01 mode, the generated data is created from a disc image and is stored in a separate file, which must be written on some other medium. Alternatively, in RS02 mode, the ECC data is appended to the end of the disc image before the image is burned to disc.
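The difference between the two storage layouts can be illustrated with a toy sketch. It is not dvdisaster's algorithm: a single XOR parity sector stands in for the Reed–Solomon code, so it can rebuild only one lost sector, and all function names are hypothetical. The image length is assumed to be a multiple of the sector size.

```python
from functools import reduce

SECTOR_SIZE = 2048  # bytes of user data per CD-ROM/DVD-ROM sector


def xor_parity(sectors):
    """XOR all sectors together byte by byte (toy stand-in for Reed-Solomon)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*sectors))


def split_sectors(image):
    """Split an image into fixed-size sectors."""
    return [image[i:i + SECTOR_SIZE] for i in range(0, len(image), SECTOR_SIZE)]


def make_separate_ecc(image):
    """RS01-like layout: ECC data is returned for storage in a separate file."""
    return xor_parity(split_sectors(image))


def augment_image(image):
    """RS02-like layout: ECC data is appended to the end of the image."""
    return image + xor_parity(split_sectors(image))


def recover_sector(sectors, lost_index, ecc):
    """Rebuild one lost sector from the surviving sectors and the parity."""
    survivors = [s for i, s in enumerate(sectors) if i != lost_index]
    return xor_parity(survivors + [ecc])


# A three-sector image whose sectors are filled with 1s, 2s and 7s.
image = bytes([1]) * SECTOR_SIZE + bytes([2]) * SECTOR_SIZE + bytes([7]) * SECTOR_SIZE
ecc_file = make_separate_ecc(image)   # kept on some other medium
augmented = augment_image(image)      # burned to disc as one image
```

In both layouts the same parity data is produced; only where it is stored differs. Calling `recover_sector(split_sectors(image), 1, ecc_file)` rebuilds the second sector from the other two and the parity.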

When a CD or DVD has been augmented in RS02 mode, the augmented section of the data remains invisible to the normal user, and the disc remains fully readable on computers that do not have dvdisaster installed. A damaged disc may then be fully recoverable by installing the software, reading the Reed–Solomon error-correcting code with dvdisaster, and rebuilding the image on hard disk.

dvdisaster can also help recover the contents of a damaged disc when no ECC data is available. The entire disc can be read into an image, skipping damaged parts. dvdisaster can then repeatedly rescan just the missing parts, attempting to retrieve correct data.
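The multi-pass reading strategy described above can be sketched as follows. This is a minimal illustration, not dvdisaster's implementation: `read_sector` is a hypothetical callable that returns a sector's bytes or raises `OSError` when the drive cannot read it, and the drive itself is simulated.

```python
def read_image(read_sector, total_sectors, max_passes=3):
    """Read all sectors, then re-read only those that are still missing."""
    image = {}
    missing = list(range(total_sectors))
    for _ in range(max_passes):
        still_missing = []
        for n in missing:
            try:
                image[n] = read_sector(n)
            except OSError:
                still_missing.append(n)  # skip the damaged sector for now
        missing = still_missing
        if not missing:
            break
    return image, missing


attempts = {}


def flaky_drive(n):
    """Simulated drive: sector 3 succeeds on the second try, sector 5 never."""
    attempts[n] = attempts.get(n, 0) + 1
    if n == 5 or (n == 3 and attempts[n] < 2):
        raise OSError(f"unreadable sector {n}")
    return bytes([n]) * 2048


image, missing = read_image(flaky_drive, total_sectors=8)
print(missing)  # prints [5]
```

Readable sectors are read only once; later passes touch only the sectors that failed before, which keeps repeated attempts on a damaged disc short.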

Differences from other Reed–Solomon implementations

dvdisaster takes an image-based approach to data recovery rather than a file-based one. Reading a defective medium at the file level means trying to read as much data as possible from each file. This approach breaks down when the damaged sectors hold bookkeeping data for the file system: the list of files on the medium may be truncated, or the mapping of data sectors to files may be incomplete. Files or parts of files may therefore be lost even though the respective data sectors would still be readable by the hardware. In contrast, reading at the image level uses direct communication with the drive hardware to access the data sectors.

Each unit of ECC data that dvdisaster places at the end of the image is calculated from sectors spread throughout the original image. Each group of original data sectors and its added ECC sector(s) forms a "cluster". Any part of a cluster can be recovered as long as the amount of damage in that cluster is smaller than the amount of ECC data added to it. The location of the ECC data on the disc therefore does not matter.
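The effect of spreading each cluster across the image can be sketched with a small model. The layout is an assumption made for illustration (sector `n` belongs to cluster `n % n_clusters`) and is not taken from dvdisaster; the recoverability test follows the rule as stated above, and all names and numbers are hypothetical.

```python
from collections import Counter


def cluster_of(sector, n_clusters):
    """Assumed layout: consecutive sectors belong to different clusters."""
    return sector % n_clusters


def unrecoverable_clusters(damaged_sectors, n_clusters, ecc_per_cluster):
    """Return clusters whose damage is not smaller than their ECC budget."""
    damage = Counter(cluster_of(s, n_clusters) for s in damaged_sectors)
    return sorted(c for c, count in damage.items() if count >= ecc_per_cluster)


# A scratch destroys 100 consecutive sectors of an image split into 50 clusters:
# every cluster loses only 2 sectors, below its budget of 4 ECC sectors.
scratch = range(1000, 1100)
print(unrecoverable_clusters(scratch, n_clusters=50, ecc_per_cluster=4))  # prints []

# The same budget fails if 4 damaged sectors happen to fall into one cluster.
unlucky = [0, 50, 100, 150]
print(unrecoverable_clusters(unlucky, n_clusters=50, ecc_per_cluster=4))  # prints [0]
```

Under this model a long contiguous defect is shared among many clusters, while a much smaller amount of damage can still be fatal if it is concentrated in a single cluster.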

Clusters work differently in Parchive, where each file is treated as a single block. With dvdisaster, data loss begins when one of the clusters has more than about 15% of errors (unlikely, but theoretically possible with a few KiB of data [1]). Parchive can recover any error, provided that the PAR2 files are intact and that the number of corrupted files (regardless of how badly each is corrupted) is smaller than the number of available ECC files. dvdisaster also has a mode with separate ECC files. [2]

References

  1. "Technical properties of the error correction". Archived from the original on 2014-10-13. Retrieved 2015-02-07.
  2. "The RS01, RS02 and RS03 methods". Archived from the original on 2015-02-07. Retrieved 2015-02-07.