Differential backup

A differential backup is a type of data backup that saves only the data that has changed since the last full backup. The rationale is that changes are generally few compared to the total amount of data in the repository, so a differential backup takes less time than performing a full backup every time the organization or data owner wishes to capture those changes. Another advantage, at least compared to the incremental backup method, is that at most two backup media are ever needed at restoration time to restore all the data. This simplifies restores and makes a shorter restoration time more likely.

Meaning

A differential backup is a cumulative backup of all changes made since the last full backup, i.e., the differences since the last full backup. The advantage is quicker recovery: only the full backup and the last differential backup are needed to restore the entire data repository. The disadvantage is that with each day that elapses since the last full backup, more data needs to be backed up, especially if a significant proportion of the data has changed, so backup time increases compared to the incremental backup method.
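The selection rule can be sketched in a few lines of Python. This is a minimal illustration, not how any particular backup product works: it assumes that a file's modification time is a reliable change signal, and the function name `select_differential`, the file names and the day-number timestamps are all hypothetical.

```python
def select_differential(mtimes: dict[str, float], last_full_time: float) -> list[str]:
    """Return the paths modified after the last full backup, sorted by name.

    mtimes maps each path to its last modification time; the unit is
    arbitrary as long as last_full_time uses the same one.
    """
    return sorted(path for path, mtime in mtimes.items() if mtime > last_full_time)


# Hypothetical data: times are day numbers, the full backup ran at day 0.
LAST_FULL = 0.0
monday = {"a.txt": -3.0, "b.txt": 0.5}
tuesday = {"a.txt": -3.0, "b.txt": 0.5, "c.txt": 1.5}

# The baseline never moves forward, so Tuesday's differential still
# contains Monday's change in addition to Tuesday's.
print(select_differential(monday, LAST_FULL))
print(select_differential(tuesday, LAST_FULL))
```

Because the baseline stays fixed at the last full backup, every later differential contains at least the files selected by the earlier ones, which is why the backup grows until the next full backup resets it.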

It is important to use the terms "differential backup" and "incremental backup" correctly. The two terms are widely used in the industry, and their meaning is largely standardized. [1] A differential backup includes the differences since the last full backup, while an incremental backup contains only the changes since the last incremental backup (or since the last full backup, if the incremental backup in question is the first one taken after that full backup). All the major data backup vendors have standardized on these definitions. [2] [3]
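The distinction comes down to which earlier backup serves as the baseline. The following Python sketch makes that explicit; the `Backup` record and the `baseline_time` function are hypothetical names introduced for illustration, and the sketch assumes a schedule that uses only one kind of partial backup between full backups.

```python
from typing import NamedTuple


class Backup(NamedTuple):
    kind: str  # "full", "differential" or "incremental"
    time: int  # when the backup ran, in any increasing unit (here: day number)


def baseline_time(history: list[Backup], kind: str) -> int:
    """Return the point in time from which the next backup must capture changes."""
    fulls = [b.time for b in history if b.kind == "full"]
    if not fulls:
        raise ValueError("a full backup must exist before any partial backup")
    if kind == "differential":
        # Always measured from the most recent full backup.
        return max(fulls)
    if kind == "incremental":
        # Measured from the most recent backup, whether full or incremental.
        return max(b.time for b in history)
    raise ValueError(f"unknown backup kind: {kind!r}")


history = [Backup("full", 0), Backup("incremental", 1), Backup("incremental", 2)]
print(baseline_time(history, "incremental"))   # baseline is the day-2 incremental
print(baseline_time(history, "differential"))  # baseline is the day-0 full backup
```

Immediately after a full backup the two baselines coincide, which matches the parenthetical case in the definition above.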

Illustration

The difference between incremental and differential backups can be illustrated as follows: [1]

Incremental backups:
| Day | Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday |
|---|---|---|---|---|---|---|---|---|
| Backup type | Full | Incremental | Incremental | Incremental | Incremental | Incremental | Incremental | Full |
| Effect | N/A | Changes since Sunday | Changes since Monday | Changes since Tuesday | Changes since Wednesday | Changes since Thursday | Changes since Friday | N/A |

The above assumes that backups are done daily. Otherwise, the “Changes since” entry must be modified to refer to the last backup (whether such last backup was full or incremental). It also assumes a weekly rotation.

Differential backups:
| Day | Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday |
|---|---|---|---|---|---|---|---|---|
| Backup type | Full | Differential | Differential | Differential | Differential | Differential | Differential | Full |
| Effect | N/A | Changes since Sunday | Changes since Sunday | Changes since Sunday | Changes since Sunday | Changes since Sunday | Changes since Sunday | N/A |
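The two weekly schedules above also determine which backups are needed for a restore. The Python sketch below computes that restore chain for a given day; `restore_chain` and the schedule lists are hypothetical names, and the sketch deliberately rejects schedules that mix incremental and differential backups, since products differ in how they handle that case.

```python
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday", "Sunday"]

INCREMENTAL_WEEK = ["full"] + ["incremental"] * 6 + ["full"]
DIFFERENTIAL_WEEK = ["full"] + ["differential"] * 6 + ["full"]


def restore_chain(schedule: list[str], day: int) -> list[int]:
    """Return the indices of the backups needed to restore the state as of `day`."""
    fulls = [i for i in range(day + 1) if schedule[i] == "full"]
    if not fulls:
        raise ValueError("no full backup on or before the requested day")
    last_full = fulls[-1]
    later = list(range(last_full + 1, day + 1))
    kinds = {schedule[i] for i in later}
    if kinds <= {"incremental"}:
        # Every incremental since the full backup is required, in order.
        return [last_full] + later
    if kinds == {"differential"}:
        # Only the most recent differential is required.
        return [last_full, later[-1]]
    raise ValueError("mixed or unknown backup kinds are outside this sketch")


# Restoring Thursday's state (index 4) under each scheme.
for name, week in [("incremental", INCREMENTAL_WEEK), ("differential", DIFFERENTIAL_WEEK)]:
    chain = restore_chain(week, 4)
    print(name, [DAYS[i] for i in chain])
```

For Thursday, the incremental scheme needs Sunday's full backup plus the four backups from Monday through Thursday, while the differential scheme needs only Sunday's full backup and Thursday's differential, which is the "at most two" property described earlier.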

It is important to remember the industry-standard meaning of these two terms because, although the definitions above are in very wide use, some writers have been known to reverse them. For example, as of May 14, 2015, Oracle Corporation's documentation for its database product described differential backups in the reverse sense:

"Differential incremental backups - In a differential level 1 backup, RMAN backs up all blocks that have changed since the most recent cumulative or differential incremental backup, whether at level 1 or level 0. RMAN determines which level 1 backup occurred most recently and backs up all blocks modified after that backup. If no level 1 is available, RMAN copies all blocks changed since the level 0 backup." [4]

References

  1. "SQL Server differential backups". Carlos Rojas. EMC Community Network. EMC Corporation. 2 March 2011. Retrieved 21 August 2012.
  2. "Description of Full, Incremental, and Differential Backups". Microsoft Support. Retrieved 21 August 2012.
  3. "What are the differences between Differential and Incremental backups?". Symantec Enterprise Technical Support. Article: TECH7665. Created: 27 January 2000; Updated: 12 May 2012. Retrieved 21 August 2012.
  4. "RMAN Incremental Backups". Oracle. Retrieved 14 May 2015.
