Data loss

Data loss is an error condition in information systems in which information is destroyed by failures (such as failed spindle motors or head crashes on hard drives) or neglect (such as careless handling or storage under unsuitable conditions) in storage, transmission, or processing. Information systems implement backup and disaster recovery equipment and processes to prevent data loss or restore lost data. [1] Data loss can also occur if the physical medium containing the data is lost or stolen.

Data loss is distinguished from data unavailability, which may arise from a network outage. Although the two have substantially similar consequences for users, data unavailability is temporary, while data loss may be permanent. Data loss is also distinct from data breach, an incident where data falls into the wrong hands, although the term data loss has been used in those incidents. [2]

Types

Studies show hardware failure and human error are the two most common causes of data loss, accounting for roughly three quarters of all incidents. [3] Another cause is natural disaster, a risk that depends on where the hardware is located. While the probability of data loss due to natural disaster is small, the only way to prepare for such an event is to store backup data in a separate physical location. As such, the best backup plans always include at least one copy stored off-site. [4]
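The off-site requirement can be checked mechanically. The sketch below tests a backup plan against the "3-2-1 rule" named in the title of reference [4], as that rule is commonly stated: at least three copies, on at least two kinds of media, with at least one copy off-site. The `BackupCopy` type, its field names, and the media labels are assumptions made for illustration and do not come from any particular backup product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical record type for this sketch)."""
    medium: str    # illustrative label such as "hdd", "tape", or "cloud"
    offsite: bool  # True if the copy is stored away from the primary site


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """Return True if there are >= 3 copies, >= 2 media types, >= 1 off-site."""
    media_types = {copy.medium for copy in copies}
    has_offsite = any(copy.offsite for copy in copies)
    return len(copies) >= 3 and len(media_types) >= 2 and has_offsite


if __name__ == "__main__":
    plan = [
        BackupCopy("hdd", offsite=False),    # working copy
        BackupCopy("hdd", offsite=False),    # local backup
        BackupCopy("cloud", offsite=True),   # off-site backup
    ]
    print("plan satisfies 3-2-1:", satisfies_3_2_1(plan))
```

A plan whose copies all sit in one building fails the check no matter how many copies it holds, which is the point the paragraph above makes about natural disasters.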

Cost

The cost of a data loss event is directly related to the value of the data and the length of time that it is unavailable yet needed. For an enterprise in particular, the definition of cost extends beyond the financial and can also include time. Consider:

Prevention

The frequency and impact of data loss can be greatly reduced by taking proper precautions, although the precautions needed vary with the type of data loss. For example, multiple power circuits with battery backup and a generator protect only against power failures, though an uninterruptible power supply can also protect a drive against sudden power spikes. Similarly, a journaling file system and RAID storage protect only against certain types of software and hardware failure. [5]

For hard disk drives, which are a physical storage medium, minimizing vibration and movement helps protect the internal components from damage, as does maintaining a suitable drive temperature. [6]

Regular data backups are an important asset when recovering from a data loss event, but they do not prevent user errors or system failures. As such, a data backup plan needs to be established and run in unison with a disaster recovery plan in order to lower risk. [7]

Data recovery

Media that's suffered a catastrophic electronic failure requires data recovery in order to salvage its contents.

Data recovery is often performed by specialized commercial services that have developed methods, often proprietary, to recover data from physically damaged media. Service costs at data recovery labs usually depend on the type of damage and the type of storage medium, as well as the required security or cleanroom procedures.

File system corruption can frequently be repaired by the user or the system administrator. For example, a deleted file is typically not immediately overwritten on disk, but more often simply has its entry deleted from the file system index. In such a case, the deletion can be easily reversed.
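The deletion behaviour described above can be illustrated with a toy model. The following sketch is not how any real file system is implemented; the `ToyFileSystem` class, its block layout, and the `tombstones` table are all assumptions introduced for illustration. It shows only the principle: deleting removes the index entry, the data stays in its block, and the deletion can be reversed until that block is reused.

```python
from typing import Optional


class ToyFileSystem:
    """Toy model: an index maps file names to blocks; blocks hold the data."""

    def __init__(self, block_count: int) -> None:
        self.blocks: list[Optional[bytes]] = [None] * block_count
        self.index: dict[str, int] = {}       # live files: name -> block number
        self.tombstones: dict[str, int] = {}  # deleted files: name -> old block

    def _first_free_block(self) -> int:
        used = set(self.index.values())
        for number in range(len(self.blocks)):
            if number not in used:
                return number
        raise OSError("no free blocks")

    def write(self, name: str, data: bytes) -> None:
        block = self.index.get(name)
        if block is None:
            block = self._first_free_block()
        self.blocks[block] = data
        self.index[name] = block
        # A deleted file whose data lived in this block is now unrecoverable.
        self.tombstones = {n: b for n, b in self.tombstones.items() if b != block}

    def read(self, name: str) -> Optional[bytes]:
        return self.blocks[self.index[name]]

    def delete(self, name: str) -> None:
        # Only the index entry is removed; the block contents are left alone.
        self.tombstones[name] = self.index.pop(name)

    def undelete(self, name: str) -> bool:
        """Restore the index entry if the old block has not been reused."""
        if name not in self.tombstones:
            return False
        self.index[name] = self.tombstones.pop(name)
        return True


if __name__ == "__main__":
    fs = ToyFileSystem(block_count=2)
    fs.write("notes.txt", b"hello")
    fs.delete("notes.txt")
    print("recovered before reuse:", fs.undelete("notes.txt"))
    fs.delete("notes.txt")
    fs.write("other.txt", b"new data")  # reuses the freed block
    print("recovered after reuse:", fs.undelete("notes.txt"))
```

The second recovery attempt fails because the new file was written into the freed block, which is why writes to the affected device should be avoided after an accidental deletion.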

Successful recovery from data loss generally requires an effective backup strategy. Without one, recovery requires reinstallation of programs and regeneration of data. Even with an effective backup strategy, restoring a system to the precise state it was in before the data loss event is extremely difficult. Some compromise between granularity of recoverability and cost is necessary. Furthermore, a data loss event may not be immediately apparent, so an effective backup strategy must also consider the cost of maintaining the ability to recover lost data for long periods of time.

A highly effective backup system would keep duplicate copies of every file and program, immediately accessible whenever a data loss event was noticed. However, in most situations, there is an inverse correlation between the value of a unit of data and the length of time it takes to notice the loss of that data. Taking this into consideration, many backup strategies decrease the granularity of restorability as the time since the potential data loss event increases. By this logic, recovery from recent data loss events is easier and more complete than recovery from events that happened further in the past.
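One way to picture decreasing granularity is a retention schedule that thins out older backups. The sketch below is a minimal illustration, assuming a policy that keeps every backup from the last week, one per ISO week for the following weeks, and one per calendar month after that. The function name `backups_to_keep` and the thresholds `daily_days` and `weekly_days` are hypothetical choices, not values taken from the text or from any backup tool.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates: list[date], today: date,
                    daily_days: int = 7, weekly_days: int = 56) -> set[date]:
    """Keep every recent backup, then one per ISO week, then one per month."""
    keep: set[date] = set()
    seen_buckets: set[tuple] = set()
    for day in sorted(backup_dates, reverse=True):  # newest first
        age = (today - day).days
        if age < daily_days:
            bucket = ("day", day)                   # full daily granularity
        elif age < weekly_days:
            iso = day.isocalendar()
            bucket = ("week", iso[0], iso[1])       # ISO year and week number
        else:
            bucket = ("month", day.year, day.month)
        if bucket not in seen_buckets:              # newest backup wins its bucket
            seen_buckets.add(bucket)
            keep.add(day)
    return keep


if __name__ == "__main__":
    today = date(2024, 3, 31)
    dailies = [today - timedelta(days=n) for n in range(90)]
    kept = backups_to_keep(dailies, today)
    print(f"kept {len(kept)} of {len(dailies)} daily backups")
```

Under such a policy a file lost yesterday can be restored to within a day of its last state, while a loss noticed months later can only be restored to the nearest retained monthly copy.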

Recovery also depends on the type of data loss event. Recovering a single lost file is substantially different from recovering an entire system that was destroyed in a disaster. An effective backup regimen has some proportionality between the magnitude of data loss and the magnitude of effort required to recover. For example, it should be far easier to restore the single lost file than to recover the entire system.

Initial steps upon data loss

If data loss occurs, a successful recovery must ensure that the deleted data is not overwritten. For this reason, write operations to the affected storage device should be avoided. This includes not starting the system to which the affected device is connected, because many operating systems create temporary files in order to boot, and these may overwrite areas of lost data, rendering it unrecoverable. Viewing web pages has the same effect, potentially overwriting lost files with the temporary HTML and image files created when viewing a web page. File operations such as copying, editing, or deleting should also be avoided.

Upon realizing data loss has occurred, it is often best to shut down the computer and remove the drive in question from the unit. Re-attach the drive to a secondary computer through a write blocker device and then attempt to recover the lost data. If possible, create an image of the drive in order to establish a secondary copy of the data. Recovery can then be attempted on the image, eliminating the risk of harming the source data. [citation needed]
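The imaging step can be sketched as a read-only, chunked copy that is verified by checksum. This is a minimal illustration under stated assumptions: the source is opened only for reading, an ordinary file stands in for the drive, and the names `create_image`, `sha256_of`, and `CHUNK_SIZE` are hypothetical. It does not handle read errors from damaged sectors and does not replace a hardware write blocker.

```python
import hashlib
import tempfile
from pathlib import Path

CHUNK_SIZE = 1 << 20  # read 1 MiB at a time


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_image(source: Path, image: Path) -> str:
    """Copy source to image and return the SHA-256 of the bytes that were read."""
    digest = hashlib.sha256()
    # "rb" opens the source read-only; "xb" refuses to overwrite an existing image.
    with source.open("rb") as src, image.open("xb") as dst:
        for chunk in iter(lambda: src.read(CHUNK_SIZE), b""):
            digest.update(chunk)
            dst.write(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "affected_drive.bin"  # stand-in for a real device
        source.write_bytes(b"lost data" * 1000)
        image = Path(tmp) / "drive.img"
        checksum = create_image(source, image)
        # Recovery work should use the image only after the checksums match.
        print("image verified:", checksum == sha256_of(image))
```

Comparing the digest computed while reading the source with the digest of the finished image confirms that the working copy matches what was read, so recovery tools can be run against the image rather than the original drive.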

See also

Related Research Articles

In computing, a computer file is a resource for recording data on a computer storage device, primarily identified by its filename. Just as words can be written on paper, so can data be written to a computer file. Files can be shared with and transferred between computers and mobile devices via removable media, networks, or the Internet.

RAID is a data storage virtualization technology that combines multiple physical disk drive components into one or more logical units for the purposes of data redundancy, performance improvement, or both. This is in contrast to the previous concept of highly reliable mainframe disk drives referred to as "single large expensive disk" (SLED).

In database systems, durability is the ACID property that guarantees that the effects of transactions that have been committed will survive permanently, even in case of failures, including incidents and catastrophic events. For example, if a flight booking reports that a seat has successfully been booked, then the seat will remain booked even if the system crashes.

In information technology, a backup, or data backup is a copy of computer data taken and stored elsewhere so that it may be used to restore the original after a data loss event. The verb form, referring to the process of doing so, is "back up", whereas the noun and adjective form is "backup". Backups can be used to recover data after its loss from data deletion or corruption, or to recover data from an earlier time. Backups provide a simple form of disaster recovery; however not all backup systems are able to reconstitute a computer system or other complex configuration such as a computer cluster, active directory server, or database server.

Disaster recovery is the process of maintaining or reestablishing vital infrastructure and systems following a natural or human-induced disaster, such as a storm or battle. It employs policies, tools, and procedures. Disaster recovery focuses on the information technology (IT) or technology systems supporting critical business functions, as opposed to business continuity, which involves keeping all essential aspects of a business functioning despite significant disruptive events; disaster recovery can therefore be considered a subset of business continuity. Disaster recovery assumes that the primary site is not immediately recoverable and restores data and services to a secondary site.

Utility software is a program specifically designed to help manage and tune system or application software. It is used to support the computer infrastructure, in contrast to application software, which is aimed at directly performing tasks that benefit ordinary users. However, utilities often form part of the application systems. For example, a batch job may run user-written code to update a database and may then include a step that runs a utility to back up the database, or a job may run a utility to compress a disk before copying files.

Data corruption: Errors in computer data that introduce unintended changes to the original data

Data corruption refers to errors in computer data that occur during writing, reading, storage, transmission, or processing, which introduce unintended changes to the original data. Computer, transmission, and storage systems use a number of measures to provide end-to-end data integrity, or lack of errors.

Data remanence is the residual representation of digital data that remains even after attempts have been made to remove or erase the data. This residue may result from data being left intact by a nominal file deletion operation, by reformatting of storage media that does not remove data previously written to the media, or through physical properties of the storage media that allow previously written data to be recovered. Data remanence may make inadvertent disclosure of sensitive information possible should the storage media be released into an uncontrolled environment.

Shadow Copy: Microsoft technology for storage snapshots

Shadow Copy is a technology included in Microsoft Windows that can create backup copies or snapshots of computer files or volumes, even when they are in use. It is implemented as a Windows service called the Volume Shadow Copy service. A software VSS provider service is also included as part of Windows to be used by Windows applications. Shadow Copy technology requires either the Windows NTFS or ReFS filesystems in order to create and store shadow copies. Shadow Copies can be created on local and external volumes by any Windows component that uses this technology, such as when creating a scheduled Windows Backup or automatic System Restore point.

A remote, online, or managed backup service, sometimes marketed as cloud backup or backup-as-a-service, is a service that provides users with a system for the backup, storage, and recovery of computer files. Online backup providers are companies that provide this type of service to end users. Such backup services are considered a form of cloud computing.

Given organizations' increasing dependency on information technology to run their operations, business continuity planning covers the entire organization, and disaster recovery focuses on IT.

Undeletion is a feature for restoring computer files which have been removed from a file system by file deletion. Deleted data can be recovered on many file systems, but not all file systems provide an undeletion feature. Recovering data without an undeletion facility is usually called data recovery, rather than undeletion. Undeletion can help prevent users from accidentally losing data, but it can also pose a computer security risk, since users may not be aware that deleted files remain accessible.

In computing, data recovery is a process of retrieving deleted, inaccessible, lost, corrupted, damaged, or formatted data from secondary storage, removable media or files, when the data stored in them cannot be accessed in a usual way. The data is most often salvaged from storage media such as internal or external hard disk drives (HDDs), solid-state drives (SSDs), USB flash drives, magnetic tapes, CDs, DVDs, RAID subsystems, and other electronic devices. Recovery may be required due to physical damage to the storage devices or logical damage to the file system that prevents it from being mounted by the host operating system (OS).

Recovery disc: Media intended to restore the operating system

The terms recovery disc, rescue disk/disc and emergency disk all refer to the capability to boot from an external device, possibly a thumb drive, that includes a self-running operating system. Such a boot disk runs independently of an internal hard drive that may be failing or that, for some other reason, does not hold the operating system to be run.

Hard disk drive failure: Being electromechanical devices, hard disk drives (HDDs) have a finite lifetime, usually 5-7 years

A hard disk drive failure occurs when a hard disk drive malfunctions and the stored information cannot be accessed with a properly configured computer.

Disk encryption is a technology which protects information by converting it into code that cannot be deciphered easily by unauthorized people or processes. Disk encryption uses disk encryption software or hardware to encrypt every bit of data that goes on a disk or disk volume. It is used to prevent unauthorized access to data storage.

Hardware virtualization is the virtualization of computers as complete hardware platforms, certain logical abstractions of their componentry, or only the functionality required to run various operating systems. Virtualization hides the physical characteristics of a computing platform from the users, presenting instead an abstract computing platform. At its origins, the software that controlled virtualization was called a "control program", but the terms "hypervisor" or "virtual machine monitor" became preferred over time.

Data erasure is a software-based method of data sanitization that aims to completely destroy all electronic data residing on a hard disk drive or other digital media by overwriting data onto all sectors of the device in an irreversible process. By overwriting the data on the storage device, the data is rendered irrecoverable.

In information technology, real-time recovery (RTR) is the ability to recover a piece of IT infrastructure, such as a server, from an infrastructure failure or human-induced error in a time frame that has minimal impact on business operations. Real-time recovery focuses on the most appropriate technology for restores, reducing the Recovery Time Objective (RTO) to minutes and the Recovery Point Objective (RPO) to within the last 15 minutes, and minimizing the Test Recovery Objective (TRO), which is the ability to test and validate that backups have occurred correctly without impacting production systems.

Photo recovery is the process of salvaging digital photographs from damaged, failed, corrupted, or inaccessible secondary storage media when it cannot be accessed normally. Photo recovery can be considered a subset of the overall data recovery field.

References

  1. Photopoulos, Constantine (2008). Managing catastrophic loss of sensitive data: a guide for IT and security professionals. Rockland, Mass.: Syngress. ISBN 9781597492393. OCLC 228148168.
  2. "Data Spill Management Guide". asd.gov.au. December 24, 2014. Archived from the original on January 23, 2015. Retrieved January 23, 2015. A data spill is sometimes referred to as unintentional information disclosure or a data leak.
  3. The cost of lost data - Graziadio Business Report
  4. Leopando, Jonathan (2 April 2013). "World Backup Day: The 3-2-1 Rule". TrendLabs Security Intelligence Blog. Trend Micro. Retrieved 29 April 2015.
  5. "Preventing data loss in a perilous digital age". TechRadar. Retrieved 2018-10-26.
  6. Connor, Chris (2 November 2013). "Data Loss Prevention: 10 Tips to Prevent Hard Drive Failure". Data Storage Digest. Retrieved 29 April 2015.
  7. Leonard, Prisley (14 June 2017). "Backup Software, Professional Backup Solution can prevent data loss". minitool.com. Retrieved 25 October 2018.