Cold data

Last updated

In computer storage, cold data refers to data that is rarely accessed, therefore considered "cold".

Contents

Cold data is the opposite of hot data, which is data that is frequently accessed. [1]

Uses

To optimize storage costs, cold data can be stored on lower performing and less expensive storage media. [2] For example, solid state disks may be used for storing hot data, while cold data can be moved to hard drives, optical discs, tapes, or migrated to cloud storage. [3] [4]

See also

Related Research Articles

<span class="mw-page-title-main">Cache (computing)</span> Additional storage that enables faster access to main storage

In computing, a cache is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere. A cache hit occurs when the requested data can be found in a cache, while a cache miss occurs when it cannot. Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store; thus, the more requests that can be served from the cache, the faster the system performs.

A disk image is a snapshot of a storage device's structure and data typically stored in one or more computer files on another storage device. Traditionally, disk images were bit-by-bit copies of every sector on a hard disk often created for digital forensic purposes, but it is now common to only copy allocated data to reduce storage space. Compression and deduplication are commonly used to reduce the size of the image file set. Disk imaging is done for a variety of purposes including digital forensics, cloud computing, system administration, as part of a backup strategy, and legacy emulation as part of a digital preservation strategy. Disk images can be made in a variety of formats depending on the purpose. Virtual disk images are intended to be used for cloud computing, ISO images are intended to emulate optical media and raw disk images are used for forensic purposes. Proprietary formats are typically used by disk imaging software. Despite the benefits of disk imaging the storage costs can be high, management can be difficult and they can be time consuming to create.

In information technology, a backup, or data backup is a copy of computer data taken and stored elsewhere so that it may be used to restore the original after a data loss event. The verb form, referring to the process of doing so, is "back up", whereas the noun and adjective form is "backup". Backups can be used to recover data after its loss from data deletion or corruption, or to recover data from an earlier time. Backups provide a simple form of disaster recovery; however not all backup systems are able to reconstitute a computer system or other complex configuration such as a computer cluster, active directory server, or database server.

A file-hosting service, also known as cloud-storage service, online file-storage provider, or cyberlocker is an internet hosting service specifically designed to host user files. These services allows users to upload files that can be accessed over the internet after providing a username and password or other authentication. Typically, file hosting services allow HTTP access, and in some cases, FTP access. Other related services include content-displaying hosting services, virtual storage, and remote backup solutions.

An in-memory database is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems that employ a disk storage mechanism. In-memory databases are faster than disk-optimized databases because disk access is slower than memory access and the internal optimization algorithms are simpler and execute fewer CPU instructions. Accessing data in memory eliminates seek time when querying the data, which provides faster and more predictable performance than disk.

Hierarchical storage management (HSM), also known as Tiered storage, is a data storage and Data management technique that automatically moves data between high-cost and low-cost storage media. HSM systems exist because high-speed storage devices, such as solid state drive arrays, are more expensive than slower devices, such as hard disk drives, optical discs and magnetic tape drives. While it would be ideal to have all data available on high-speed devices all the time, this is prohibitively expensive for many organizations. Instead, HSM systems store the bulk of the enterprise's data on slower devices, and then copy data to faster disk drives when needed. The HSM system monitors the way data is used and makes best guesses as to which data can safely be moved to slower devices and which data should stay on the fast devices.

In computing, external storage refers to non-volatile (secondary) data storage outside a computer's own internal hardware, and thus can be readily disconnected and accessed elsewhere. Such storage devices may refer to removable media, compact flash drives, portable storage devices, or network-attached storage. Web-based cloud storage is the latest technology for external storage.

Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Hadoop was originally designed for computer clusters built from commodity hardware, which is still the common use. It has since also found use on clusters of higher-end hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common occurrences and should be automatically handled by the framework.

Amazon S3 or Amazon Simple Storage Service is a service offered by Amazon Web Services (AWS) that provides object storage through a web service interface. Amazon S3 uses the same scalable storage infrastructure that Amazon.com uses to run its e-commerce network. Amazon S3 can store any type of object, which allows uses like storage for Internet applications, backups, disaster recovery, data archives, data lakes for analytics, and hybrid cloud storage. AWS launched Amazon S3 in the United States on March 14, 2006, then in Europe in November 2007.

Cloud storage is a model of computer data storage in which the digital data is stored in logical pools, said to be on "the cloud". The physical storage spans multiple servers, and the physical environment is typically owned and managed by a hosting company. These cloud storage providers are responsible for keeping the data available and accessible, and the physical environment secured, protected, and running. People and organizations buy or lease storage capacity from the providers to store user, organization, or application data.

This is a comparison of online backup services.

<span class="mw-page-title-main">Cloud computing</span> Form of shared Internet-based computing

Cloud computing is the on-demand availability of computer system resources, especially data storage and computing power, without direct active management by the user. Large clouds often have functions distributed over multiple locations, each of which is a data center. Cloud computing relies on sharing of resources to achieve coherence and typically uses a pay-as-you-go model, which can help in reducing capital expenses but may also lead to unexpected operating expenses for users.

<span class="mw-page-title-main">Cloud computing security</span> Methods used to protect cloud based assets

Cloud computing security or, more simply, cloud security, refers to a broad set of policies, technologies, applications, and controls utilized to protect virtualized IP, data, applications, services, and the associated infrastructure of cloud computing. It is a sub-domain of computer security, network security, and, more broadly, information security.

iCloud Cloud storage and cloud computing service by Apple

iCloud is a cloud service developed by Apple Inc. Launched on October 12, 2011, iCloud enables users to store and sync data across devices, including Apple Mail, Apple Calendar, Apple Photos, Apple Notes, contacts, settings, backups, and files, to collaborate with other users, and track assets through Find My. It is built into iOS, iPadOS, watchOS, tvOS and macOS and may additionally be accessed through a limited web interface and Windows application.

<span class="mw-page-title-main">Cloud computing architecture</span> Overview about the cloud computing architecture

Cloud computing architecture refers to the components and subcomponents required for cloud computing. These components typically consist of a front end platform, back end platforms, a cloud based delivery, and a network. Combined, these components make up cloud computing architecture.

Object storage is a computer data storage that manages data as objects, as opposed to other storage architectures like file systems which manages data as a file hierarchy, and block storage which manages data as blocks within sectors and tracks. Each object typically includes the data itself, a variable amount of metadata, and a globally unique identifier. Object storage can be implemented at multiple levels, including the device level, the system level, and the interface level. In each case, object storage seeks to enable capabilities not addressed by other storage architectures, like interfaces that are directly programmable by the application, a namespace that can span multiple instances of physical hardware, and data-management functions like data replication and data distribution at object-level granularity.

A distributed file system for cloud is a file system that allows many clients to have access to data and supports operations on that data. Each data file may be partitioned into several parts called chunks. Each chunk may be stored on different remote machines, facilitating the parallel execution of applications. Typically, data is stored in files in a hierarchical tree, where the nodes represent directories. There are several ways to share files in a distributed architecture: each solution must be suitable for a certain type of application, depending on how complex the application is. Meanwhile, the security of the system must be ensured. Confidentiality, availability and integrity are the main keys for a secure system.

Client-side encryption is the cryptographic technique of encrypting data on the sender's side, before it is transmitted to a server such as a cloud storage service. Client-side encryption features an encryption key that is not available to the service provider, making it difficult or impossible for service providers to decrypt hosted data. Client-side encryption allows for the creation of applications whose providers cannot access the data its users have stored, thus offering a high level of privacy. Those applications are sometimes marketed under the misleading term "zero-knowledge".

Dew computing is an information technology (IT) paradigm that combines the core concept of cloud computing with the capabilities of end devices. It is used to enhance the experience for the end user in comparison to only using cloud computing. Dew computing attempts to solve major problems related to cloud computing technology, such as reliance on internet access. Dropbox is an example of the dew computing paradigm, as it provides access to the files and folders in the cloud in addition to keeping copies on local devices. This allows the user to access files during times without an internet connection; when a connection is established again, files and folders are synchronized back to the cloud server.

<span class="mw-page-title-main">Hybrid cloud storage</span>

Hybrid cloud storage, in data storage, is a term for a storage infrastructure that uses a combination of on-premises storage resources with a public cloud storage provider. The on-premises storage is usually managed by the organization, while the public cloud storage provider is responsible for the management and security of the data stored in the cloud.

References

  1. Cai, Zhipeng; Wang, Chaokun; Cheng, Siyao; Wang, Hongzhi; Gao, Hong (18 June 2014). Wireless Algorithms, Systems, and Applications: 9th International Conference, WASA 2014, Harbin, China, June 23-25, 2014, Proceedings. Springer. ISBN   978-3-319-07782-6.
  2. Abbasi, Asif (20 November 2020). AWS Certified Data Analytics Study Guide: Specialty (DAS-C01) Exam. John Wiley & Sons. ISBN   978-1-119-64944-1.
  3. Cold Data, Techopedia
  4. Cold Cloud Data Storage, Enterprise Storage Forum