Cold data

In computer storage, cold data is data that is rarely accessed and is therefore considered "cold".

Cold data is the opposite of hot data, which is data that is frequently accessed. [1]

Uses

To reduce storage costs, cold data can be stored on lower-performing, less expensive storage media. [2] For example, solid-state drives may be used for hot data, while cold data can be moved to hard disk drives, optical discs, or magnetic tape, or migrated to cloud storage. [3] [4]
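
The placement decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 90-day threshold, the `Item` record, and the function names are hypothetical and do not come from any particular storage product, and real systems often weigh access frequency, object size, and retrieval cost as well.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed threshold: data untouched for 90 days is treated as cold.
COLD_AFTER = timedelta(days=90)


@dataclass
class Item:
    """Hypothetical record describing one stored object."""
    name: str
    last_access: datetime


def choose_tier(item: Item, now: datetime,
                cold_after: timedelta = COLD_AFTER) -> str:
    """Return 'cold' if the item has not been accessed within cold_after."""
    return "cold" if now - item.last_access >= cold_after else "hot"


def plan_placement(items, now):
    """Map each item name to the tier it would be placed on."""
    return {item.name: choose_tier(item, now) for item in items}


now = datetime(2024, 1, 1)
items = [
    Item("orders_today.db", now - timedelta(hours=2)),   # recently read
    Item("logs_2021.tar", now - timedelta(days=400)),    # long untouched
]
placement = plan_placement(items, now)
print(placement)
```

In this sketch the "hot" tier would correspond to fast media such as solid-state drives, and the "cold" tier to cheaper media such as hard disk drives, tape, or an archival cloud storage class.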

References

  1. Cai, Zhipeng; Wang, Chaokun; Cheng, Siyao; Wang, Hongzhi; Gao, Hong (18 June 2014). Wireless Algorithms, Systems, and Applications: 9th International Conference, WASA 2014, Harbin, China, June 23-25, 2014, Proceedings. Springer. ISBN 978-3-319-07782-6.
  2. Abbasi, Asif (20 November 2020). AWS Certified Data Analytics Study Guide: Specialty (DAS-C01) Exam. John Wiley & Sons. ISBN 978-1-119-64944-1.
  3. Cold Data, Techopedia
  4. Cold Cloud Data Storage, Enterprise Storage Forum