Data recovery

In computing, data recovery is a process of retrieving deleted, inaccessible, lost, corrupted, damaged, or formatted data from secondary storage, removable media or files, when the data stored in them cannot be accessed in a usual way. [1] The data is most often salvaged from storage media such as internal or external hard disk drives (HDDs), solid-state drives (SSDs), USB flash drives, magnetic tapes, CDs, DVDs, RAID subsystems, and other electronic devices. Recovery may be required due to physical damage to the storage devices or logical damage to the file system that prevents it from being mounted by the host operating system (OS). [2]

Logical failures occur when the hard drive is functional but the user or the operating system cannot retrieve or access the data stored on it. Logical failures can occur due to corruption of the engineering chip, lost partitions, firmware failure, or failures during formatting or re-installation. [3] [4]

Data recovery can range from a very simple task to a significant technical challenge, which is why some software companies specialize in this field. [5]

About

The most common data recovery scenarios involve an operating system failure, malfunction of a storage device, logical failure of storage devices, or accidental damage or deletion (typically on a single-drive, single-partition, single-OS system), in which case the ultimate goal is simply to copy all important files from the damaged media to another new drive. This can be accomplished using a Live CD or DVD, booting directly from the disc or from a USB drive instead of the corrupted drive in question. Many Live CDs or DVDs provide a means to mount the system drive and backup drives or removable media, and to move the files from the system drive to the backup media with a file manager or optical disc authoring software. Such cases can often be mitigated by disk partitioning and consistently storing valuable data files (or copies of them) on a different partition from the replaceable OS system files.

Another scenario involves a drive-level failure, such as a compromised file system or drive partition, or a hard disk drive failure. In any of these cases, the data is not easily read from the media devices. Depending on the situation, solutions involve repairing the logical file system, partition table, or master boot record, updating the firmware, or applying drive recovery techniques. These techniques range from software-based recovery of corrupted data, to hardware- and software-based recovery of damaged service areas (also known as the hard disk drive's "firmware"), to hardware replacement on a physically damaged drive, which allows the data to be extracted to a new drive. If a drive recovery is necessary, the drive itself has typically failed permanently, and the focus is rather on a one-time recovery, salvaging whatever data can be read.

In a third scenario, files have been accidentally "deleted" from a storage medium by the users. Typically, the contents of deleted files are not removed immediately from the physical drive; instead, references to them in the directory structure are removed, and the space the deleted data occupy is made available for later overwriting. From the end user's point of view, deleted files are no longer discoverable through a standard file manager, but the deleted data still technically exist on the physical drive. In the meantime, the original file contents remain, often in several disconnected fragments, and may be recoverable if not overwritten by other data files.
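The deletion behavior described above can be illustrated with a toy model. The following is a minimal sketch, not a real file system: the `ToyVolume` class, its block size, and its "lowest free block first" allocation policy are all assumptions made for illustration.

```python
class ToyVolume:
    """Toy volume: a directory maps file names to lists of block indexes."""

    def __init__(self, num_blocks, block_size=4):
        self.block_size = block_size
        self.blocks = [b"\x00" * block_size for _ in range(num_blocks)]
        self.free = set(range(num_blocks))
        self.directory = {}

    def write_file(self, name, data):
        # Split the data into block-sized chunks and store each in a free block.
        size = self.block_size
        chunks = [data[i:i + size] for i in range(0, len(data), size)]
        allocated = []
        for chunk in chunks:
            index = min(self.free)  # assumed policy: lowest-numbered free block
            self.free.remove(index)
            self.blocks[index] = chunk.ljust(size, b"\x00")
            allocated.append(index)
        self.directory[name] = allocated

    def delete_file(self, name):
        # Only the directory reference is removed; block contents are untouched.
        allocated = self.directory.pop(name)
        self.free.update(allocated)

    def raw_scan(self):
        # What a recovery tool reading the raw medium would see.
        return b"".join(self.blocks)


volume = ToyVolume(num_blocks=8)
volume.write_file("report", b"SECRET DATA")
volume.delete_file("report")
print("report" in volume.directory)        # the file manager no longer lists it
print(b"SECRET DATA" in volume.raw_scan()) # the bytes are still on the medium
```

In this model, writing a new file after the deletion reuses the freed blocks and destroys part of the old content, which is why recovery becomes less likely the longer a volume stays in use.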

The term "data recovery" is also used in the context of forensic applications or espionage, where data that have been encrypted, hidden, or deleted, rather than damaged, are recovered. Sometimes data on a computer become encrypted or hidden because of a virus attack or a similar cause, and can only be recovered by computer forensic experts.

Physical damage

A wide variety of failures can cause physical damage to storage media, which may result from human errors and natural disasters. CD-ROMs can have their metallic substrate or dye layer scratched off; hard disks can suffer from a multitude of mechanical failures, such as head crashes, PCB failure, and failed motors; tapes can simply break.

Physical damage to a hard drive, even in cases where a head crash has occurred, does not necessarily mean there will be a permanent loss of data. The techniques employed by many professional data recovery companies can typically salvage most, if not all, of the data that had been lost when the failure occurred.

Of course, there are exceptions to this, such as cases where severe damage to the hard drive platters may have occurred. However, if the hard drive can be repaired and a full image or clone created, then the logical file structure can be rebuilt in most instances.

Most physical damage cannot be repaired by end users. For example, opening a hard disk drive in a normal environment can allow airborne dust to settle on the platter and become caught between the platter and the read/write head. During normal operation, read/write heads float 3 to 6 nanometers above the platter surface, and the average dust particles found in a normal environment are typically around 30,000 nanometers in diameter. [6] When these dust particles get caught between the read/write heads and the platter, they can cause new head crashes that further damage the platter and thus compromise the recovery process. Furthermore, end users generally do not have the hardware or technical expertise required to make these repairs. Consequently, data recovery companies are often employed to salvage important data with the more reputable ones using class 100 dust- and static-free cleanrooms. [7]
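The scale mismatch between a dust particle and the head gap can be made concrete with a short calculation. This sketch uses only the figures quoted above (a 3 to 6 nanometer fly height and a 30,000 nanometer particle); the function and constant names are illustrative.

```python
FLY_HEIGHT_NM = (3, 6)      # head-to-platter gap range quoted in the text
DUST_DIAMETER_NM = 30_000   # typical dust particle diameter quoted in the text


def dust_to_gap_ratios(dust_nm=DUST_DIAMETER_NM, gap_range_nm=FLY_HEIGHT_NM):
    """Return (smallest, largest) ratio of particle diameter to head gap."""
    low_gap, high_gap = gap_range_nm
    return dust_nm / high_gap, dust_nm / low_gap


smallest, largest = dust_to_gap_ratios()
print(f"A dust particle is {smallest:,.0f} to {largest:,.0f} times the head gap")
```

On these figures the particle is roughly 5,000 to 10,000 times larger than the gap it would have to pass through.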

Recovery techniques

Recovering data from physically damaged hardware can involve multiple techniques. Some damage can be repaired by replacing parts in the hard disk. This alone may make the disk usable, but there may still be logical damage. A specialized disk-imaging procedure is used to recover every readable bit from the surface. Once this image is acquired and saved on a reliable medium, the image can be safely analyzed for logical damage and will possibly allow much of the original file system to be reconstructed.

Hardware repair

Media that has suffered a catastrophic electronic failure requires data recovery in order to salvage its contents.

A common misconception is that a damaged printed circuit board (PCB) may be simply replaced during recovery procedures by an identical PCB from a healthy drive. While this may work in rare circumstances on hard disk drives manufactured before 2003, it will not work on newer drives. Electronics boards of modern drives usually contain drive-specific adaptation data (generally a map of bad sectors and tuning parameters) and other information required to properly access data on the drive. Replacement boards often need this information to effectively recover all of the data. The replacement board may need to be reprogrammed. Some manufacturers (Seagate, for example) store this information on a serial EEPROM chip, which can be removed and transferred to the replacement board. [8] [9]

Each hard disk drive has what is called a system area or service area. This portion of the drive, which is not directly accessible to the end user, usually contains the drive's firmware and adaptive data that help the drive operate within normal parameters. [10] One function of the system area is to log defective sectors within the drive, essentially telling the drive where it can and cannot write data.

The sector lists are also stored on various chips attached to the PCB, and they are unique to each hard disk drive. If the data on the PCB do not match what is stored on the platter, then the drive will not calibrate properly. [11] In most cases the drive heads will click because they are unable to find the data matching what is stored on the PCB.

Logical damage

Result of a failed data recovery from a hard disk drive.

The term "logical damage" refers to situations in which the error is not a problem in the hardware and requires software-level solutions.

Corrupt partitions and file systems, media errors

In some cases, data on a hard disk drive can be unreadable due to damage to the partition table or file system, or to (intermittent) media errors. In the majority of these cases, at least a portion of the original data can be recovered by repairing the damaged partition table or file system using specialized data recovery software such as TestDisk. Software like ddrescue can image media despite intermittent errors, and can image raw data when there is partition table or file system damage. This type of data recovery can be performed by people without expertise in drive hardware, as it requires no special physical equipment or access to platters.
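The idea of imaging media despite read errors can be sketched as a loop that copies block by block, substitutes zeros for anything unreadable, and records where the gaps are. This is a simplified illustration and not how ddrescue itself is implemented; the `read_block` callable and the `image_skipping_errors` function are hypothetical names.

```python
import io


def image_skipping_errors(read_block, total_size, out, block_size=512):
    """Copy total_size bytes to out, zero-filling blocks that fail to read.

    read_block(offset, size) is a hypothetical callable that returns bytes
    or raises OSError for an unreadable region. The return value is a list
    of (offset, length) tuples for the regions that could not be read.
    """
    bad_regions = []
    offset = 0
    while offset < total_size:
        size = min(block_size, total_size - offset)
        try:
            data = read_block(offset, size)
            if len(data) != size:
                raise OSError("short read")
        except OSError:
            # Keep the image aligned with the source by padding with zeros.
            data = b"\x00" * size
            bad_regions.append((offset, size))
        out.write(data)
        offset += size
    return bad_regions


# Usage with a simulated medium whose second block is unreadable.
source = bytes(range(256)) * 8


def flaky_read(offset, size):
    if offset == 512:
        raise OSError("simulated media error")
    return source[offset:offset + size]


image = io.BytesIO()
print(image_skipping_errors(flaky_read, len(source), image))
```

Keeping the image the same length as the source, with a separate list of bad regions, means later analysis can work on the copy and the failed regions can be retried without rereading the good ones.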

Sometimes data can be recovered using relatively simple methods and tools; [12] more serious cases can require expert intervention, particularly if parts of files are irrecoverable. Data carving is the recovery of parts of damaged files using knowledge of their structure.
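Data carving can be illustrated with a signature search over a raw image. The sketch below looks for the JPEG start-of-image and end-of-image markers; it assumes each file is stored contiguously, and the `carve_jpegs` function and its size limit are illustrative choices rather than the method of any particular tool.

```python
JPEG_START = b"\xff\xd8\xff"  # JPEG start-of-image marker plus next marker byte
JPEG_END = b"\xff\xd9"        # JPEG end-of-image marker


def carve_jpegs(raw, max_size=20 * 1024 * 1024):
    """Return (offset, data) pairs for byte runs that look like JPEG files."""
    carved = []
    pos = 0
    while True:
        start = raw.find(JPEG_START, pos)
        if start == -1:
            break
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break
        end += len(JPEG_END)
        if end - start <= max_size:
            carved.append((start, raw[start:end]))
            pos = end
        else:
            # Implausibly large candidate: skip this start marker.
            pos = start + 1
    return carved


raw_image = (b"junk" + b"\xff\xd8\xff\xe0AAAA\xff\xd9"
             + b"more" + b"\xff\xd8\xff\xe1BB\xff\xd9" + b"tail")
for offset, data in carve_jpegs(raw_image):
    print(offset, len(data))
```

A search this simple is defeated by fragmented files and by embedded thumbnails that carry their own end marker, which is one reason more serious cases need structure-aware tools or expert intervention.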

Overwritten data

After data has been physically overwritten on a hard disk drive, it is generally assumed that the previous data are no longer possible to recover. In 1996, Peter Gutmann, a computer scientist, presented a paper that suggested overwritten data could be recovered through the use of magnetic force microscopy. [13] In 2001, he presented another paper on a similar topic. [14] To guard against this type of data recovery, Gutmann and Colin Plumb designed a method of irreversibly scrubbing data, known as the Gutmann method and used by several disk-scrubbing software packages.

Substantial criticism has followed, primarily dealing with the lack of any concrete examples of significant amounts of overwritten data being recovered. [15] Although Gutmann's theory may be correct, there is no practical evidence that overwritten data can be recovered, while research supports the conclusion that overwritten data cannot be recovered. [16] [17] [18]

Solid-state drives (SSDs) overwrite data differently from hard disk drives (HDDs), which makes at least some of their data easier to recover. Most SSDs use flash memory to store data in pages and blocks, referenced by logical block addresses (LBAs) which are managed by the flash translation layer (FTL). When the FTL modifies a sector, it writes the new data to another location and updates the map so the new data appear at the target LBA. This leaves the pre-modification data in place, possibly in many generations, and recoverable by data recovery software.
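The out-of-place update performed by the FTL can be sketched with a toy mapping table. This is a simplified model under stated assumptions: the `ToyFTL` class is hypothetical, pages are allocated strictly in order, and garbage collection and TRIM, which in real drives eventually erase stale pages, are not modeled.

```python
class ToyFTL:
    """Toy flash translation layer with out-of-place writes."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages  # physical pages: (lba, data) or None
        self.lba_map = {}                # LBA -> index of the current page
        self.next_free = 0

    def write(self, lba, data):
        if self.next_free >= len(self.pages):
            raise RuntimeError("no free pages (garbage collection not modeled)")
        # New data go to a fresh page; the old page is left as it was.
        self.pages[self.next_free] = (lba, data)
        self.lba_map[lba] = self.next_free
        self.next_free += 1

    def read(self, lba):
        return self.pages[self.lba_map[lba]][1]

    def stale_versions(self, lba):
        """Older generations of an LBA still present in physical pages."""
        current = self.lba_map.get(lba)
        return [page[1] for index, page in enumerate(self.pages)
                if page is not None and page[0] == lba and index != current]


ftl = ToyFTL(num_pages=8)
for version in (b"v1", b"v2", b"v3"):
    ftl.write(5, version)
print(ftl.read(5))
print(ftl.stale_versions(5))
```

The host only ever sees the current mapping through `read`, while the earlier generations remain in physical pages until something erases them.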

Lost, deleted, and formatted data

Sometimes, data present on physical drives (internal or external hard disks, pen drives, etc.) are lost, deleted, or formatted due to circumstances such as a virus attack, accidental deletion, or accidental use of SHIFT+DELETE. In these cases, data recovery software is used to recover or restore the data files.

Logical bad sector

Among the logical failures of hard disks, a logical bad sector is the most common fault that makes data unreadable. Sometimes it is possible to sidestep error detection even in software, and perhaps with repeated reading and statistical analysis recover at least some of the underlying stored data. Sometimes prior knowledge of the data stored and the error detection and correction codes can be used to recover even erroneous data. However, if the underlying physical drive is degraded badly enough, at least the hardware surrounding the data must be replaced, or it might even be necessary to apply laboratory techniques to the physical recording medium. Each of these approaches is progressively more expensive, and as such progressively more rarely sought.
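One simple form of the "repeated reading and statistical analysis" mentioned above is a per-byte majority vote across several reads of the same sector. The sketch below assumes the raw reads have already been obtained with error detection bypassed; `majority_vote` is an illustrative name, and real tools may use more elaborate statistics.

```python
from collections import Counter


def majority_vote(reads):
    """Combine several equal-length reads by taking the most common byte
    value at each position."""
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all reads must have the same length")
    result = bytearray(length)
    for i in range(length):
        counts = Counter(read[i] for read in reads)
        result[i] = counts.most_common(1)[0][0]
    return bytes(result)


# Three reads of the same sector, each corrupted in a different place.
reads = [b"hellX world", b"hello wYrld", b"hello world"]
print(majority_vote(reads))
```

The vote only helps when errors are intermittent and fall in different places on different reads; a byte that is consistently wrong is reproduced as is.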

Eventually, if the final, physical storage medium has indeed been disturbed badly enough, recovery will not be possible using any means; the information has irreversibly been lost.

Remote data recovery

Recovery experts do not always need to have physical access to the damaged hardware. When the lost data can be recovered by software techniques, they can often perform the recovery using remote access software over the Internet, LAN or other connection to the physical location of the damaged media. The process is essentially no different from what the end user could perform by themselves. [19]

Remote recovery requires a stable connection with an adequate bandwidth. However, it is not applicable where access to the hardware is required, as in cases of physical damage.

Four phases of data recovery

Usually, there are four phases when it comes to successful data recovery, though that can vary depending on the type of data corruption and recovery required. [20]

Phase 1
Repair the hard disk drive
The hard drive is repaired in order to get it running in some form, or at least in a state suitable for reading the data from it. For example, if heads are bad they need to be changed; if the PCB is faulty then it needs to be fixed or replaced; if the spindle motor is bad the platters and heads should be moved to a new drive.
Phase 2
Image the drive to a new drive or a disk image file
When a hard disk drive fails, getting the data off the drive is the top priority. The longer a faulty drive is used, the more likely further data loss becomes. Creating an image of the drive ensures that there is a secondary copy of the data on another device, on which it is safe to perform testing and recovery procedures without harming the source.
Phase 3
Logical recovery of files, partition, MBR and filesystem structures
After the drive has been cloned to a new drive, it is suitable to attempt the retrieval of lost data. If the drive has failed logically, there can be a number of reasons. Using the clone, it may be possible to repair the partition table or master boot record (MBR) in order to read the file system's data structure and retrieve stored data.
Phase 4
Repair damaged files that were retrieved
Data damage can be caused when, for example, a file is written to a sector on the drive that has been damaged. This is the most common cause of file damage in a failing drive, and it means that the data need to be reconstructed to become readable. Corrupted documents can be recovered by several software methods or by manually reconstructing the document using a hex editor.
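The partition table referred to in Phase 3 can be inspected directly in a disk image. The following sketch parses the classic MBR layout (four 16-byte entries starting at byte 446 of the first sector, followed by the 0x55 0xAA boot signature); the `parse_mbr` function and the dictionary keys are illustrative, and GPT disks are not handled.

```python
import struct


def parse_mbr(sector):
    """Parse the four primary partition entries of a classic MBR sector."""
    if len(sector) < 512:
        raise ValueError("an MBR sector is 512 bytes")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")
    partitions = []
    for i in range(4):
        entry = sector[446 + 16 * i:446 + 16 * (i + 1)]
        # Fields: status, CHS start, type, CHS end, LBA start, sector count.
        status, _chs_start, ptype, _chs_end, lba_start, num_sectors = (
            struct.unpack("<B3sB3sII", entry))
        if ptype == 0:
            continue  # type 0 marks an unused entry
        partitions.append({
            "index": i,
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "num_sectors": num_sectors,
        })
    return partitions


# Build a sample sector with one bootable Linux (type 0x83) partition.
sample = bytearray(512)
sample[446:462] = struct.pack("<B3sB3sII", 0x80, b"\x00" * 3, 0x83,
                              b"\x00" * 3, 2048, 204800)
sample[510:512] = b"\x55\xaa"
print(parse_mbr(bytes(sample)))
```

Working on the image rather than the original drive means a mistaken repair of these structures can be undone by taking a fresh copy of the image.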

Restore disk

The Windows operating system can be reinstalled on a computer that is already licensed for it. The reinstallation can be done by downloading the operating system or by using a "restore disk" provided by the computer manufacturer. Eric Lundgren was fined and sentenced to U.S. federal prison in April 2018 for producing 28,000 restore disks and intending to distribute them for about 25 cents each as a convenience to computer repair shops. [21]

List of data recovery software

Bootable

Data recovery cannot always be done on a running system. In such cases, a boot disk, live CD, live USB, or any other type of live distro containing a minimal operating system is used instead.

Related Research Articles

Disk partitioning: Creation of separate accessible storage areas on a secondary computer storage device

Disk partitioning or disk slicing is the creation of one or more regions on secondary storage, so that each region can be managed separately. These regions are called partitions. It is typically the first step of preparing a newly installed disk, before any file system is created. The disk stores the information about the partitions' locations and sizes in an area known as the partition table that the operating system reads before any other part of the disk. Each partition then appears to the operating system as a distinct "logical" disk that uses part of the actual disk. System administrators use a program called a partition editor to create, resize, delete, and manipulate the partitions. Partitioning allows the use of different filesystems to be installed for different kinds of files. Separating user data from system data can prevent the system partition from becoming full and rendering the system unusable. Partitioning can also make backing up easier. A disadvantage is that it can be difficult to properly size partitions, resulting in having one partition with too much free space and another nearly totally allocated.

In cryptography, plaintext usually means unencrypted information pending input into cryptographic algorithms, usually encryption algorithms. This usually refers to data that is transmitted or stored unencrypted.

Disk formatting is the process of preparing a data storage device such as a hard disk drive, solid-state drive, floppy disk, memory card or USB flash drive for initial use. In some cases, the formatting operation may also create one or more new file systems. The first part of the formatting process that performs basic medium preparation is often referred to as "low-level formatting". Partitioning is the common term for the second part of the process, dividing the device into several sub-devices and, in some cases, writing information to the device allowing an operating system to be booted from it. The third part of the process, usually termed "high-level formatting" most often refers to the process of generating a new file system. In some operating systems all or parts of these three processes can be combined or repeated at different levels and the term "format" is understood to mean an operation in which a new disk medium is fully prepared to store files. Some formatting utilities allow distinguishing between a quick format, which does not erase all existing data and a long option that does erase all existing data.

Defragmentation: Rearrangement of sectors on a hard disk into contiguous units

In the maintenance of file systems, defragmentation is a process that reduces the degree of fragmentation. It does this by physically organizing the contents of the mass storage device used to store files into the smallest number of contiguous regions. It also attempts to create larger regions of free space using compaction to impede the return of fragmentation. Some defragmentation utilities try to keep smaller files within a single directory together, as they are often accessed in sequence.

SystemRescue: Linux distribution

SystemRescue is a Linux distribution for x86-64 and IA-32 computers. The primary purpose of SystemRescue is to repair unbootable or otherwise damaged computer systems after a system crash. SystemRescue is not intended to be used as a permanent operating system. It runs from a Live CD, a USB flash drive or any type of hard drive. It was designed by a team led by François Dupoux, and is based on Arch Linux since version 6.0. Starting with version 6.0, it has systemd as its init system.

Disk editor: Computer software

A disk editor is a computer program that allows its user to read, edit, and write raw data on disk drives. Disk editors are sometimes called sector editors, since the read/write routines built into the electronics of most disk drives read and write data in chunks of sectors. Many disk editors can also be used to edit the contents of a running computer's memory or a disk image.

Data remanence is the residual representation of digital data that remains even after attempts have been made to remove or erase the data. This residue may result from data being left intact by a nominal file deletion operation, by reformatting of storage media that does not remove data previously written to the media, or through physical properties of the storage media that allow previously written data to be recovered. Data remanence may make inadvertent disclosure of sensitive information possible should the storage media be released into an uncontrolled environment.

Data loss is an error condition in information systems in which information is destroyed by failures or neglect in storage, transmission, or processing. Information systems implement backup and disaster recovery equipment and processes to prevent data loss or restore lost data. Data loss can also occur if the physical medium containing the data is lost or stolen.

The Gutmann method is an algorithm for securely erasing the contents of computer hard disk drives, such as files. Devised by Peter Gutmann and Colin Plumb and presented in the paper Secure Deletion of Data from Magnetic and Solid-State Memory in July 1996, it involved writing a series of 35 patterns over the region to be erased.

Undeletion is a feature for restoring computer files which have been removed from a file system by file deletion. Deleted data can be recovered on many file systems, but not all file systems provide an undeletion feature. Recovering data without an undeletion facility is usually called data recovery, rather than undeletion. Undeletion can both help prevent users from accidentally losing data, or can pose a computer security risk, since users may not be aware that deleted files remain accessible.

Disk Utility: Software for Apple macOS

Disk Utility is a system utility for performing disk and disk volume-related tasks on the macOS operating system by Apple Inc.

TestDisk: Data recovery utility

TestDisk is a free and open-source data recovery utility that helps users recover lost partitions or repair corrupted filesystems. TestDisk can collect detailed information about a corrupted drive, which can then be sent to a technician for further analysis. TestDisk supports DOS, Microsoft Windows, Linux, FreeBSD, NetBSD, OpenBSD, SunOS, and MacOS. TestDisk handles non-partitioned and partitioned media. In particular, it recognizes the GUID Partition Table (GPT), Apple partition map, PC/Intel BIOS partition tables, Sun Solaris slice and Xbox fixed partitioning scheme. TestDisk uses a command line user interface. TestDisk can recover deleted files with 97% accuracy.

PhotoRec: Open source data recovery software

PhotoRec is a free and open-source data recovery utility with a text-based user interface. It uses data carving techniques and is designed to recover lost files from digital camera memory, hard disks, and CD-ROMs. It can recover files with more than 480 file extensions. It is also possible to add a custom file signature to detect less well-known file types.

A bad sector in computing is a disk sector on a disk storage unit that is unreadable. Upon taking damage, all information stored on that sector is lost. When a bad sector is found and marked, an operating system such as Windows or Linux will skip it in the future. Bad sectors are a threat to information security in the sense of data remanence.

gpart is a software utility which scans a storage device, examining the data in order to detect partitions which may exist but are absent from the disk's partition tables. Gpart was written by Michail Brzitwa of Germany. The release on the author's website is now older than the releases some distributions are using. It appears that Michail Brzitwa does not actively maintain the code, instead the various distributions appear to maintain their own versions.

Data erasure is a software-based method of data sanitization that aims to completely destroy all electronic data residing on a hard disk drive or other digital media by overwriting data onto all sectors of the device in an irreversible process. By overwriting the data on the storage device, the data is rendered irrecoverable.

shred is a command on Unix-like operating systems that can be used to securely delete files and devices so that it is extremely difficult to recover them, even with specialized hardware and technology; assuming it's even possible to recover the file at all. It is a part of GNU Core Utilities. Being based on the Gutmann method paper, it suffers from the same criticisms and possible shortcomings.

Disk Drill is a data recovery utility for Windows and macOS developed by Cleverfiles. It was introduced in 2010, and is primarily designed to recover deleted or lost files from hard disk drives, USB flash drives and SSD drives with the help of Recovery Vault technology. While Disk Drill was originally exclusive to the Mac, a Windows version was released in 2015.

Hetman Partition Recovery is a shareware program for recovery of deleted data from hard drive partitions and other storage media. The utility supports both functioning disks and damaged logical partitions and recovers data from both reformatted disks and disks which have had their file system changed from FAT to NTFS or vice versa. In addition to working on existing partitions the tool can also find deleted logical drives, displaying them to the user for further search and recovery of deleted files as well as correcting errors in logical partition design. Hetman Partition Recovery supports reading of regular, zipped, and encrypted files, from disks formatted under NTFS and/or FAT file systems.

DMDE is a data recovery and disk editing tool for hard drives and other storage media. It can work with physical devices, logical disks, disk images, as well as RAID-arrays and recovers files that have been accidentally deleted or lost due to other incidents.

References

  1. "Data Recovery Explained". www.ibm.com. Archived from the original on 28 August 2022. Retrieved 28 August 2022.
  2. "Data Recovery Explained". www.ibm.com. Archived from the original on 28 August 2022. Retrieved 1 December 2022.
  3. "What is logical failure?". Disklabs Digital Forensics and Data Recovery. Archived from the original on 1 December 2022. Retrieved 1 December 2022.
  4. "What Happens When Drives Experience Logical Failure?". www.streetdirectory.com. Archived from the original on 1 December 2022. Retrieved 1 December 2022.
  5. "Data Recovery – Backup Technology". www.dell.com. Archived from the original on 1 December 2022. Retrieved 1 December 2022.
  6. "Data Recovery On A 3TB Seagate Hard Drive". acsdata.com. Archived from the original on 13 February 2017.
  7. Vasconcelos, Pedro. "DIY data recovery could mean "bye-bye"". The Ontrack Data Recovery Blog. Ontrack Data Recovery. Archived from the original on 26 July 2019. Retrieved 26 July 2019.
  8. "Hard Drive Circuit Board Replacement Guide or How To Swap HDD PCB". donordrives.com. Archived from the original on 27 May 2015. Retrieved 27 May 2015.
  9. "Firmware Adaptation Service - ROM Swap". pcb4you.com. Archived from the original on 29 March 2013. Retrieved 27 May 2015.
  10. Ariel Berkman (14 February 2013). "Hiding Data in Hard Drive's Service Areas" (PDF). recover.co.il. Archived from the original (PDF) on 26 February 2015. Retrieved 23 January 2015.
  11. "Data Recovery Report - Read Before Choosing A Data Recovery Company". 16 April 2013. Archived from the original on 16 April 2013.
  12. "Data Recovery Software". Archived 17 October 2016 at the Wayback Machine.
  13. Peter Gutmann. "Secure Deletion of Data from Magnetic and Solid-State Memory". Department of Computer Science, University of Auckland. Archived 9 December 2007 at the Wayback Machine.
  14. Peter Gutmann. "Data Remanence in Semiconductor Devices". IBM T.J. Watson Research Center. Archived 21 February 2007 at the Wayback Machine.
  15. Feenberg, Daniel (14 May 2004). "Can Intelligence Agencies Read Overwritten Data? A response to Gutmann". National Bureau of Economic Research. Archived from the original on 9 May 2008. Retrieved 21 May 2008.
  16. "Disk Wiping – One Pass is Enough". anti-forensics.com. 17 March 2009. Archived from the original on 2 September 2012.
  17. "Disk Wiping – One Pass is Enough – Part 2 (this time with screenshots)". anti-forensics.com. 18 March 2009. Archived from the original on 27 November 2012.
  18. Wright, Dr. Craig (15 January 2009). "Overwriting Hard Drive Data". Archived from the original on 23 May 2010.
  19. Barton, Andre (17 December 2012). "Data Recovery Over the Internet". Data Recovery Digest. Archived from the original on 27 May 2015. Retrieved 29 April 2015.
  20. Stanley Morgan (28 December 2012). "[Infographic] Four Phases Of Data Recovery". dolphindatalab.com. Archived from the original on 2 April 2015. Retrieved 23 March 2015.
  21. Washington Post (26 April 2018). "Electronics-recycling innovator is going to prison for trying to extend computers' lives". Washington Post. Archived from the original on 2 May 2018. Retrieved 2 May 2018.
