Amazon Glacier

Type of site: Online backup service
Available in: English
Owner: Amazon.com
URL: aws.amazon.com/glacier/
Commercial: Yes
Registration: Required
Launched: August 21, 2012 [1]
Current status: Active

Amazon Glacier is an online file storage web service that provides storage for data archiving and backup. [2]

Glacier is part of the Amazon Web Services suite of cloud computing services. It is designed for long-term storage of data that is infrequently accessed and for which retrieval latencies of 3 to 5 hours are acceptable. Storage costs a flat $0.004 per gigabyte per month, which is substantially cheaper than Amazon's own Simple Storage Service (S3). [3]

Amazon hopes this service will move businesses from on-premises tape backup drives to cloud-based backup storage. [4]

Storage

ZDNet reported that, according to a private e-mail, Glacier runs on "inexpensive commodity hardware components". [4] In 2012, ZDNet quoted a former Amazon employee as saying that Glacier is based on custom low-RPM hard drives attached to custom logic boards, where only a percentage of a rack's drives can be spun at full speed at any one time. [5] [6] Similar technology is also used by Facebook. [7]

Some users believe that the underlying hardware used for Glacier storage is tape-based, because Amazon has positioned Glacier as a direct competitor to tape backup services (both on-premises and cloud-based). [8] This confusion is exacerbated by Glacier's archive retrieval delays (3–5 hours before archives are available), which are similar to those of tape-based systems, and by a pricing model that discourages frequent data retrieval. [9]

The Register claimed that Glacier runs on Spectra T-Finity tape libraries with LTO-6 tapes. [10] [11] Others have conjectured that Amazon uses off-line shingled magnetic recording hard drives, multi-layer Blu-ray optical discs, or an alternative proprietary storage technology. [12]

Data storage consultant Robin Harris speculated, based on hints from public sources, that the storage is based on cheap optical discs such as Blu-ray. [13]

Cost

Glacier has two costs, one for storage and one for retrieval. Uploading data to Glacier is free. Storage pricing is simple: it currently costs 0.4 cents per gigabyte per month, which is 82% cheaper than S3 Standard. When Glacier launched in 2012, the storage charge was set to 1 cent per gigabyte per month. This was reduced to 0.7 cents in September 2015 and to the current 0.4 cents in December 2016. [14]

Glacier originally charged for retrievals based on the peak retrieval rate reached during the month. Ignoring the free tier, downloading four gigabytes in four hours cost the same as downloading 720 gigabytes over the 720 hours of a 30-day month. Spreading retrievals over a long period was therefore cheaper, and failing to do so could result in a surprisingly large bill. In one case, a user stored 15 GB of data in Glacier, retrieved 693 MB for testing, and was charged for 126 GB because of how the retrieval rate was calculated. [15] This pricing policy was widely regarded as a time bomb set to go off on retrieval. [16]
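The equivalence described above can be illustrated with a short calculation. This is a minimal sketch, assuming the billed volume was simply the peak hourly rate multiplied by the number of hours in the month. It ignores the free tier, and the function and constant names are hypothetical, not part of any AWS API.

```python
HOURS_IN_30_DAY_MONTH = 720  # 30 days x 24 hours


def billable_gb_peak_model(peak_gb_per_hour, hours_in_month=HOURS_IN_30_DAY_MONTH):
    """Gigabytes billed under the old model (assumed form): the peak hourly
    rate is charged as if it had been sustained for every hour of the month."""
    return peak_gb_per_hour * hours_in_month


# A short burst and a month-long steady download share the same peak rate
# of 1 GB per hour, so both are billed for the same volume.
burst = billable_gb_peak_model(4 / 4)        # 4 GB retrieved over 4 hours
steady = billable_gb_peak_model(720 / 720)   # 720 GB retrieved over 720 hours
print(burst, steady)
```

Under this assumed model the user who retrieved only 4 GB is billed for the same 720 GB as the user who actually downloaded that much, which is why short bursts were so expensive.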

In 2016, AWS revised its retrieval pricing model. [17] The new model bases the retrieval fee on the number of gigabytes retrieved. This can amount to a 99% price cut for users who perform only one Glacier retrieval in a month. At the same time, AWS introduced new retrieval methods that take different amounts of time. An expedited retrieval costs one cent per request and three cents per gigabyte, and can retrieve data in one to five minutes. A standard retrieval costs five cents per thousand requests and one cent per gigabyte, and takes three to five hours. A bulk retrieval costs 2.5 cents per thousand requests and 0.25 cents per gigabyte, and takes seven to twelve hours. AWS also introduced provisioned capacity for expedited retrievals, each unit of which costs $100 per month and guarantees at least three expedited retrievals every five minutes and up to 150 MB/s of retrieval bandwidth. Without provisioned capacity, expedited retrievals are done on a capacity-available basis. [18]
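The three retrieval tiers can be compared with a small calculator. This is a sketch using only the per-request and per-gigabyte figures quoted above. It leaves out provisioned capacity and any bandwidth charges, and the names `TIERS` and `retrieval_cost` are illustrative.

```python
from decimal import Decimal

# (fee per request, fee per gigabyte) in US dollars, taken from the figures above.
TIERS = {
    "expedited": (Decimal("0.01"), Decimal("0.03")),
    "standard": (Decimal("0.05") / 1000, Decimal("0.01")),
    "bulk": (Decimal("0.025") / 1000, Decimal("0.0025")),
}


def retrieval_cost(tier, gigabytes, requests=1):
    """Retrieval fee in dollars for one tier: request fees plus per-gigabyte fees."""
    per_request, per_gb = TIERS[tier]
    return per_request * requests + per_gb * Decimal(str(gigabytes))


# Compare the tiers for 100 GB spread over 1,000 retrieval requests.
for tier in TIERS:
    print(tier, retrieval_cost(tier, 100, requests=1000))
```

For many small archives the per-request fee dominates the expedited tier, while for the standard and bulk tiers the per-gigabyte fee makes up most of the cost.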

Data deleted from Glacier less than 90 days after being stored incurs a charge equal to the cost of storage for the remainder of the 90 days. (In effect, the user pays for 90 days minimum.) This move was designed to discourage the service's use in cases where Amazon's other storage offerings (e.g. S3) are more appropriate for real-time access. After 90 days, deletion from Glacier is free.
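The early-deletion charge can be estimated as follows. This is a rough sketch that assumes the charge is prorated by day over 30-day months at the 0.4 cents per gigabyte per month storage price; the exact proration used on an actual bill is not stated here, and the function name is hypothetical.

```python
def early_deletion_fee(gigabytes, days_stored, price_per_gb_month=0.004,
                       minimum_days=90, days_per_month=30):
    """Estimated fee in dollars for deleting data before the 90-day minimum.

    Assumption: the remaining days are billed at the normal storage price,
    prorated by day over a 30-day month.
    """
    remaining_days = max(0, minimum_days - days_stored)
    return gigabytes * price_per_gb_month * remaining_days / days_per_month


# 100 GB deleted after 30 days: the remaining 60 days (two months) are charged.
print(early_deletion_fee(100, 30))
# Data kept for the full 90 days or longer is deleted at no charge.
print(early_deletion_fee(100, 120))
```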

Retrieving data from Glacier is a two-step process. The first step is to retrieve the data into a staging area, where it stays for 24 hours. [19] The second step is to download the data from the staging area, which may incur bandwidth charges. [20]
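The two steps might look like the following in code. This is a sketch, not a definitive implementation: it assumes `client` behaves like the Glacier client from the boto3 library (`boto3.client("glacier")`) with its `initiate_job`, `describe_job`, and `get_job_output` operations. The helper name `retrieve_archive`, the polling interval, and the choice to poll rather than wait for a notification are all illustrative.

```python
import time


def retrieve_archive(client, vault_name, archive_id, tier="Standard",
                     poll_seconds=900, sleep=time.sleep):
    """Stage an archive and then download it; returns the archive bytes.

    `client` is assumed to be a boto3-style Glacier client.
    """
    # Step 1: ask Glacier to retrieve the archive into the staging area.
    job = client.initiate_job(
        accountId="-",  # "-" means the account that owns the credentials
        vaultName=vault_name,
        jobParameters={
            "Type": "archive-retrieval",
            "ArchiveId": archive_id,
            "Tier": tier,
        },
    )
    job_id = job["jobId"]

    # Wait until the retrieval job reports completion (hours for Standard).
    while not client.describe_job(
        accountId="-", vaultName=vault_name, jobId=job_id
    )["Completed"]:
        sleep(poll_seconds)

    # Step 2: download the staged data; bandwidth charges may apply here.
    output = client.get_job_output(
        accountId="-", vaultName=vault_name, jobId=job_id
    )
    return output["body"].read()
```

Because the staged copy stays available for a limited time, a caller would typically run the download step promptly after the job completes.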

Glacier is also available as a storage class in S3. [21] Objects can only be put into Glacier by lifecycle rules, which can be configured to move objects to Glacier once they have reached a certain age. Pricing is the same, but there is no staging area; instead, retrieved objects are stored simultaneously in Glacier and in the Reduced Redundancy storage class for a number of days that the user specifies.

References

  1. Barr, Jeff (August 21, 2012). "Amazon Glacier: Archival Storage for One Penny Per GB Per Month". AWS Blog. Retrieved November 29, 2016.
  2. Mlot, Stephanie (August 21, 2012). "Amazon Launches Glacier Cloud Storage Service". PCMag.com. Ziff Davis, Inc. Retrieved August 21, 2012.
  3. "Pricing". Aws.amazon.com. Retrieved June 18, 2015.
  4. Clark, Jack (August 21, 2012). "Amazon launches Glacier cloud storage, hopes enterprise will go cold on tape use". ZDNet. CBS Interactive. Retrieved August 21, 2012.
  5. Clark, Jack (August 24, 2012). "Could the tech beneath Amazon's Glacier revolutionise data storage?". ZDNet. Retrieved June 18, 2015.
  6. "Former S3 employee here. I was on my way out of the company just after the stora... | Hacker News". News.ycombinator.com. Retrieved June 18, 2015.
  7. Gallagher, Sean (November 9, 2015). "How Facebook puts petabytes of old cat pix on ice in the name of sustainability". Ars Technica.
  8. "Amazon Glacier: 99.999999999% durability long-term storage, for a penny a gig". ExtremeTech. August 21, 2012. Retrieved June 18, 2015.
  9. Cooper, Paul (November 9, 2013). "One of tech's most elusive mysteries: The secret of Amazon Glacier". IT ProPortal.
  10. "Insider 'fesses up: Amazon's Glacier cloud is made of ... TAPE". Theregister.co.uk. Retrieved June 18, 2015.
  11. "Spectra: Tape is dead? We installed 550PB of the stuff in 6 months". Theregister.co.uk. Retrieved June 18, 2015.
  12. "Amazon's Glacier secret: BDXL". Storagemojo.com. Retrieved June 18, 2015.
  13. Harris, Robin. "Amazon's Glacier secret: BDXL | StorageMojo".
  14. "The cloud price war continues: Amazon cuts its cloud storage prices, again". zdnet.com. Retrieved September 4, 2019.
  15. "FastGlacier surprising Retrieval Fee". AWS Developer Forums. Aws.amazon.com. September 21, 2012. Retrieved January 30, 2013.
  16. Finley, Klint (August 21, 2012). "Is There a Landmine Hidden in Amazon's Glacier?". Wired.
  17. "AWS Storage Update – S3 & Glacier Price Reductions + Additional Retrieval Options for Glacier". aws.amazon.com. November 21, 2016. Retrieved February 1, 2018.
  18. "Glacier FAQ: Data Retrievals". aws.amazon.com. Retrieved February 1, 2018.
  19. "Retrieving Amazon Glacier Archives". aws.amazon.com. Retrieved February 1, 2018.
  20. "Amazon Glacier Pricing". aws.amazon.com. Retrieved February 1, 2018.
  21. "Amazon S3 Storage Classes". aws.amazon.com. Retrieved February 1, 2018.