Data remanence is the residual representation of digital data that remains even after attempts have been made to remove or erase the data. This residue may result from data being left intact by a nominal file deletion operation, by reformatting of storage media that does not remove data previously written to the media, or through physical properties of the storage media that allow previously written data to be recovered. Data remanence may make inadvertent disclosure of sensitive information possible should the storage media be released into an uncontrolled environment (e.g., thrown in the bin (trash) or lost).
Various techniques have been developed to counter data remanence. These techniques are classified as clearing, purging/sanitizing, or destruction. Specific methods include overwriting, degaussing, encryption, and media destruction.
Effective application of countermeasures can be complicated by several factors, including media that are inaccessible, media that cannot effectively be erased, advanced storage systems that maintain histories of data throughout the data's life cycle, and persistence of data in memory that is typically considered volatile.
Several standards exist for the secure removal of data and the elimination of data remanence.
Many operating systems, file managers, and other software provide a facility where a file is not immediately deleted when the user requests that action. Instead, the file is moved to a holding area (i.e. the "trash"), making it easy for the user to undo a mistake. Similarly, many software products automatically create backup copies of files that are being edited, to allow the user to restore the original version, or to recover from a possible crash ( autosave feature).
Even when an explicit deleted file retention facility is not provided or when the user does not use it, operating systems do not actually remove the contents of a file when it is deleted unless they are aware that explicit erasure commands are required, like on a solid-state drive. (In such cases, the operating system will issue the Serial ATA TRIM command or the SCSI UNMAP command to let the drive know to no longer maintain the deleted data.) Instead, they simply remove the file's entry from the file system directory because this requires less work and is therefore faster, and the contents of the file—the actual data—remain on the storage medium. The data will remain there until the operating system reuses the space for new data. In some systems, enough filesystem metadata are also left behind to enable easy undeletion by commonly available utility software. Even when undelete has become impossible, the data, until it has been overwritten, can be read by software that reads disk sectors directly. Computer forensics often employs such software.
Likewise, reformatting, repartitioning, or reimaging a system is unlikely to write to every area of the disk, though all will cause the disk to appear empty or, in the case of reimaging, empty except for the files present in the image, to most software.
Finally, even when the storage media is overwritten, physical properties of the media may permit recovery of the previous contents. In most cases however, this recovery is not possible by just reading from the storage device in the usual way, but requires using laboratory techniques such as disassembling the device and directly accessing/reading from its components.
§ Complications below gives further explanations for causes of data remanence.
There are three levels commonly recognized for eliminating remnant data:
Clearing is the removal of sensitive data from storage devices in such a way that there is assurance that the data may not be reconstructed using normal system functions or software file/data recovery utilities. The data may still be recoverable, but not without special laboratory techniques. [1]
Clearing is typically an administrative protection against accidental disclosure within an organization. For example, before a hard drive is re-used within an organization, its contents may be cleared to prevent their accidental disclosure to the next user.
Purging or sanitizing is the physical rewrite of sensitive data from a system or storage device done with the specific intent of rendering the data unrecoverable at a later time. [2] Purging, proportional to the sensitivity of the data, is generally done before releasing media beyond control, such as before discarding old media, or moving media to a computer with different security requirements.
The storage media is made unusable for conventional equipment. Effectiveness of destroying the media varies by medium and method. Depending on recording density of the media, and/or the destruction technique, this may leave data recoverable by laboratory methods. Conversely, destruction using appropriate techniques is the most secure method of preventing retrieval.
A common method used to counter data remanence is to overwrite the storage media with new data. This is often called wiping or shredding a disk or file, by analogy to common methods of destroying print media, although the mechanism bears no similarity to these. Because such a method can often be implemented in software alone, and may be able to selectively target only part of the media, it is a popular, low-cost option for some applications. Overwriting is generally an acceptable method of clearing, as long as the media is writable and not damaged.
The simplest overwrite technique writes the same data everywhere—often just a pattern of all zeros. At a minimum, this will prevent the data from being retrieved simply by reading from the media again using standard system functions.
In an attempt to counter more advanced data recovery techniques, specific overwrite patterns and multiple passes have often been prescribed. These may be generic patterns intended to eradicate any trace signatures; an example is the seven-pass pattern 0xF6, 0x00, 0xFF, <random byte>, 0x00, 0xFF, <random byte>, sometimes erroneously attributed to US standard DOD 5220.22-M.
One challenge with overwriting is that some areas of the disk may be inaccessible, due to media degradation or other errors. Software overwrite may also be problematic in high-security environments, which require stronger controls on data commingling than can be provided by the software in use. The use of advanced storage technologies may also make file-based overwrite ineffective (see the related discussion below under ).
There are specialized machines and software that are capable of doing overwriting. The software can sometimes be a standalone operating system specifically designed for data destruction. There are also machines specifically designed to wipe hard drives to the department of defense specifications DOD 5220.22-M. [3]
Peter Gutmann investigated data recovery from nominally overwritten media in the mid-1990s. He suggested magnetic force microscopy may be able to recover such data, and developed specific patterns, for specific drive technologies, designed to counter such. [4] These patterns have come to be known as the Gutmann method. Gutmann's belief in the possibility of data recovery is based on many questionable assumptions and factual errors that indicate a low level of understanding of how hard drives work. [5]
Daniel Feenberg, an economist at the private National Bureau of Economic Research, claims that the chances of overwritten data being recovered from a modern hard drive amount to "urban legend". [6] He also points to the "18+1⁄2-minute gap" Rose Mary Woods created on a tape of Richard Nixon discussing the Watergate break-in. Erased information in the gap has not been recovered, and Feenberg claims doing so would be an easy task compared to recovery of a modern high density digital signal.
As of November 2007, the United States Department of Defense considers overwriting acceptable for clearing magnetic media within the same security area/zone, but not as a sanitization method. Only degaussing or physical destruction is acceptable for the latter. [7]
On the other hand, according to the 2014 NIST Special Publication 800-88 Rev. 1 (p. 7): "For storage devices containing magnetic media, a single overwrite pass with a fixed pattern such as binary zeros typically hinders recovery of data even if state of the art laboratory techniques are applied to attempt to retrieve the data." [8] An analysis by Wright et al. of recovery techniques, including magnetic force microscopy, also concludes that a single wipe is all that is required for modern drives. They point out that the long time required for multiple wipes "has created a situation where many organizations ignore the issue [altogether] – resulting in data leaks and loss." [9]
Degaussing is the removal or reduction of a magnetic field of a disk or drive, using a device called a degausser that has been designed for the media being erased. Applied to magnetic media, degaussing may purge an entire media element quickly and effectively.
Degaussing often renders hard disks inoperable, as it erases low-level formatting that is only done at the factory during manufacturing. In some cases, it is possible to return the drive to a functional state by having it serviced at the manufacturer. However, some modern degaussers use such a strong magnetic pulse that the motor that spins the platters may be destroyed in the degaussing process, and servicing may not be cost-effective. Degaussed computer tape such as DLT can generally be reformatted and reused with standard consumer hardware.
In some high-security environments, one may be required to use a degausser that has been approved for the task. For example, in US government and military jurisdictions, one may be required to use a degausser from the NSA's "Evaluated Products List". [10]
Encrypting data before it is stored on the media may mitigate concerns about data remanence. If the decryption key is strong and carefully controlled, it may effectively make any data on the media unrecoverable. Even if the key is stored on the media, it may prove easier or quicker to overwrite just the key, versus the entire disk. This process is called crypto-shredding.
Encryption may be done on a file-by-file basis, or on the whole disk. Cold boot attacks are one of the few possible methods for subverting a full-disk encryption method, as there is no possibility of storing the plain text key in an unencrypted section of the medium. See the section Complications: Data in RAM for further discussion.
Other side-channel attacks (such as keyloggers, acquisition of a written note containing the decryption key, or rubber-hose cryptanalysis) may offer a greater chance of success, but do not rely on weaknesses in the cryptographic method employed. As such, their relevance for this article is minor.
Thorough destruction of the underlying storage media is the most certain way to counter data remanence. However, the process is generally time-consuming, cumbersome, and may require extremely thorough methods, as even a small fragment of the media may contain large amounts of data.
Specific destruction techniques include:
Storage media may have areas which become inaccessible by normal means. For example, magnetic disks may develop new bad sectors after data has been written, and tapes require inter-record gaps. Modern hard disks often feature reallocation of marginal sectors or tracks, automated in a way that the operating system would not need to work with it. The problem is especially significant in solid-state drives (SSDs) that rely on relatively large relocated bad block tables. Attempts to counter data remanence by overwriting may not be successful in such situations, as data remnants may persist in such nominally inaccessible areas.
Data storage systems with more sophisticated features may make overwrite ineffective, especially on a per-file basis. For example, journaling file systems increase the integrity of data by recording write operations in multiple locations, and applying transaction-like semantics; on such systems, data remnants may exist in locations "outside" the nominal file storage location. Some file systems also implement copy-on-write or built-in revision control, with the intent that writing to a file never overwrites data in-place. Furthermore, technologies such as RAID and anti-fragmentation techniques may result in file data being written to multiple locations, either by design (for fault tolerance), or as data remnants.
Wear leveling can also defeat data erasure, by relocating blocks between the time when they are originally written and the time when they are overwritten. For this reason, some security protocols tailored to operating systems or other software featuring automatic wear leveling recommend conducting a free-space wipe of a given drive and then copying many small, easily identifiable "junk" files or files containing other nonsensitive data to fill as much of that drive as possible, leaving only the amount of free space necessary for satisfactory operation of system hardware and software. As storage and system demands grow, the "junk data" files can be deleted as necessary to free up space; even if the deletion of "junk data" files is not secure, their initial nonsensitivity reduces to near zero the consequences of recovery of data remanent from them.[ citation needed ]
As optical media are not magnetic, they are not erased by conventional degaussing. Write-once optical media (CD-R, DVD-R, etc.) also cannot be purged by overwriting. Rewritable optical media, such as CD-RW and DVD-RW, may be receptive to overwriting. Methods for successfully sanitizing optical discs include delaminating or abrading the metallic data layer, shredding, incinerating, destructive electrical arcing (as by exposure to microwave energy), and submersion in a polycarbonate solvent (e.g., acetone).
Research from the Center for Magnetic Recording and Research, University of California, San Diego has uncovered problems inherent in erasing data stored on solid-state drives (SSDs). Researchers discovered three problems with file storage on SSDs: [11]
First, built-in commands are effective, but manufacturers sometimes implement them incorrectly. Second, overwriting the entire visible address space of an SSD twice is usually, but not always, sufficient to sanitize the drive. Third, none of the existing hard drive-oriented techniques for individual file sanitization are effective on SSDs. [11] : 1
Solid-state drives, which are flash-based, differ from hard-disk drives in two ways: first, in the way data is stored; and second, in the way the algorithms are used to manage and access that data. These differences can be exploited to recover previously erased data. SSDs maintain a layer of indirection between the logical addresses used by computer systems to access data and the internal addresses that identify physical storage. This layer of indirection hides idiosyncratic media interfaces and enhances SSD performance, reliability, and lifespan (see wear leveling), but it can also produce copies of the data that are invisible to the user and that a sophisticated attacker could recover. For sanitizing entire disks, sanitize commands built into the SSD hardware have been found to be effective when implemented correctly, and software-only techniques for sanitizing entire disks have been found to work most, but not all, of the time. [11] : section 5 In testing, none of the software techniques were effective for sanitizing individual files. These included well-known algorithms such as the Gutmann method, US DoD 5220.22-M, RCMP TSSIT OPS-II, Schneier 7 Pass, and Secure Empty Trash on macOS (a feature included in versions OS X 10.3-10.9). [11] : section 5
The TRIM feature in many SSD devices, if properly implemented, will eventually erase data after it is deleted [12] [ citation needed ], but the process can take some time, typically several minutes. Many older operating systems do not support this feature, and not all combinations of drives and operating systems work. [13]
Data remanence has been observed in static random-access memory (SRAM), which is typically considered volatile (i.e., the contents degrade with loss of external power). In one study, data retention was observed even at room temperature. [14]
Data remanence has also been observed in dynamic random-access memory (DRAM). Modern DRAM chips have a built-in self-refresh module, as they not only require a power supply to retain data, but must also be periodically refreshed to prevent their data contents from fading away from the capacitors in their integrated circuits. A study found data remanence in DRAM with data retention of seconds to minutes at room temperature and "a full week without refresh when cooled with liquid nitrogen." [15] The study authors were able to use a cold boot attack to recover cryptographic keys for several popular full disk encryption systems, including Microsoft BitLocker, Apple FileVault, dm-crypt for Linux, and TrueCrypt. [15] : 12
Despite some memory degradation, authors of the above described study were able to take advantage of redundancy in the way keys are stored after they have been expanded for efficient use, such as in key scheduling. The authors recommend that computers be powered down, rather than be left in a "sleep" state, when not in physical control of the owner. In some cases, such as certain modes of the software program BitLocker, the authors recommend that a boot password or a key on a removable USB device be used. [15] : 12 TRESOR is a kernel patch for Linux specifically intended to prevent cold boot attacks on RAM by ensuring that encryption keys are not accessible from user space and are stored in the CPU rather than system RAM whenever possible. Newer versions of the disk encryption software VeraCrypt can encrypt in-RAM keys and passwords on 64-bit Windows. [16]
Computer data storage or digital data storage is a technology consisting of computer components and recording media that are used to retain digital data. It is a core function and fundamental component of computers.
In cryptography, plaintext usually means unencrypted information pending input into cryptographic algorithms, usually encryption algorithms. This usually refers to data that is transmitted or stored unencrypted.
Disk formatting is the process of preparing a data storage device such as a hard disk drive, solid-state drive, floppy disk, memory card or USB flash drive for initial use. In some cases, the formatting operation may also create one or more new file systems. The first part of the formatting process that performs basic medium preparation is often referred to as "low-level formatting". Partitioning is the common term for the second part of the process, dividing the device into several sub-devices and, in some cases, writing information to the device allowing an operating system to be booted from it. The third part of the process, usually termed "high-level formatting" most often refers to the process of generating a new file system. In some operating systems all or parts of these three processes can be combined or repeated at different levels and the term "format" is understood to mean an operation in which a new disk medium is fully prepared to store files. Some formatting utilities allow distinguishing between a quick format, which does not erase all existing data and a long option that does erase all existing data.
In information technology, a backup, or data backup is a copy of computer data taken and stored elsewhere so that it may be used to restore the original after a data loss event. The verb form, referring to the process of doing so, is "back up", whereas the noun and adjective form is "backup". Backups can be used to recover data after its loss from data deletion or corruption, or to recover data from an earlier time. Backups provide a simple form of IT disaster recovery; however not all backup systems are able to reconstitute a computer system or other complex configuration such as a computer cluster, active directory server, or database server.
Degaussing is the process of decreasing or eliminating a remnant magnetic field. It is named after the gauss, a unit of magnetism, which in turn was named after Carl Friedrich Gauss. Due to magnetic hysteresis, it is generally not possible to reduce a magnetic field completely to zero, so degaussing typically induces a very small "known" field referred to as bias. Degaussing was originally applied to reduce ships' magnetic signatures during World War II. Degaussing is also used to reduce magnetic fields in cathode ray tube monitors and to destroy data held on magnetic storage.
File deletion is the removal of a file from a computer's file system.
The Gutmann method is an algorithm for securely erasing the contents of computer hard disk drives, such as files. Devised by Peter Gutmann and Colin Plumb and presented in the paper Secure Deletion of Data from Magnetic and Solid-State Memory in July 1996, it involved writing a series of 35 patterns over the region to be erased.
Disk encryption software is a computer security software that protects the confidentiality of data stored on computer media by using disk encryption.
In computing, data recovery is a process of retrieving deleted, inaccessible, lost, corrupted, damaged, or formatted data from secondary storage, removable media or files, when the data stored in them cannot be accessed in a usual way. The data is most often salvaged from storage media such as internal or external hard disk drives (HDDs), solid-state drives (SSDs), USB flash drives, magnetic tapes, CDs, DVDs, RAID subsystems, and other electronic devices. Recovery may be required due to physical damage to the storage devices or logical damage to the file system that prevents it from being mounted by the host operating system (OS).
Redaction or sanitization is the process of removing sensitive information from a document so that it may be distributed to a broader audience. It is intended to allow the selective disclosure of information. Typically, the result is a document that is suitable for publication or for dissemination to others rather than the intended audience of the original document.
Disk encryption is a technology which protects information by converting it into code that cannot be deciphered easily by unauthorized people or processes. Disk encryption uses disk encryption software or hardware to encrypt every bit of data that goes on a disk or disk volume. It is used to prevent unauthorized access to data storage.
Anti–computer forensics or counter-forensics are techniques used to obstruct forensic analysis.
Hardware-based full disk encryption (FDE) is available from many hard disk drive (HDD/SSD) vendors, including: Hitachi, Integral Memory, iStorage Limited, Micron, Seagate Technology, Samsung, Toshiba, Viasat UK, Western Digital. The symmetric encryption key is maintained independently from the computer's CPU, thus allowing the complete data store to be encrypted and removing computer memory as a potential attack vector.
Data erasure is a software-based method of data sanitization that aims to completely destroy all electronic data residing on a hard disk drive or other digital media by overwriting data onto all sectors of the device in an irreversible process. By overwriting the data on the storage device, the data is rendered irrecoverable.
Secure USB flash drives protect the data stored on them from access by unauthorized users. USB flash drive products have been on the market since 2000, and their use is increasing exponentially. As businesses have increased demand for these drives, manufacturers are producing faster devices with greater data storage capacities.
shred is a command on Unix-like operating systems that can be used to securely delete files and devices so that it is extremely difficult to recover them, even with specialized hardware and technology; assuming recovery is possible at all, which is not always the case. It is a part of GNU Core Utilities. Being based on the Gutmann method paper, it suffers from the same criticisms and possible shortcomings.
A trim command allows an operating system to inform a solid-state drive (SSD) which blocks of data are no longer considered to be "in use" and therefore can be erased internally.
HMG Infosec Standard 5, or IS5, is a data destruction standard used by the British government.
Crypto-shredding is the practice of rendering encrypted data unusable by deliberately deleting or overwriting the encryption keys: assuming the key is not later recovered and the encryption is not broken, the data should become irrecoverable, effectively permanently deleted or "shredded". This requires that the data have been encrypted.
Data sanitization involves the secure and permanent erasure of sensitive data from datasets and media to guarantee that no residual data can be recovered even through extensive forensic analysis. Data sanitization has a wide range of applications but is mainly used for clearing out end-of-life electronic devices or for the sharing and use of large datasets that contain sensitive information. The main strategies for erasing personal data from devices are physical destruction, cryptographic erasure, and data erasure. While the term data sanitization may lead some to believe that it only includes data on electronic media, the term also broadly covers physical media, such as paper copies. These data types are termed soft for electronic files and hard for physical media paper copies. Data sanitization methods are also applied for the cleaning of sensitive data, such as through heuristic-based methods, machine-learning based methods, and k-source anonymity.
{{cite book}}
: CS1 maint: others (link){{cite journal}}
: Cite journal requires |journal=
(help){{cite journal}}
: Cite journal requires |journal=
(help){{cite journal}}
: Cite journal requires |journal=
(help){{cite journal}}
: Cite journal requires |journal=
(help){{cite journal}}
: Cite journal requires |journal=
(help){{cite journal}}
: Cite journal requires |journal=
(help){{cite journal}}
: Cite journal requires |journal=
(help)