PetaBox, also stylized Petabox, is a storage unit from Capricorn Technologies and the Internet Archive. [1] [2] It was designed by the staff of the Internet Archive and C. R. Saikley to store and process one petabyte (a million gigabytes) of information. [3]
Design goals of the Petabox included: [3]
The first rack, with a capacity of 100 terabytes, became operational in Amsterdam at the Internet Archive's European arm, the Stichting Internet Archive (SIA), in June 2004. A second rack, with 80 terabytes, became operational at the Archive's main San Francisco location that same year. The Internet Archive then spun off its Petabox production to the newly formed company Capricorn Technologies. [3]
Between 2004 and 2007, Capricorn replicated the Internet Archive's deployment of the Petabox for major academic institutions, digital preservationists, government agencies, high-performance computing (HPC) and major research sites, medical imaging providers, digital image repositories, storage outsourcing sites, and other enterprises. Its largest product of that period used 750-gigabyte disks. In 2007, the Internet Archive data center housed approximately three petabytes of Petabox storage.
In 2010, the fourth version of the Petabox began operation. Each Petabox allowed for 480 TB of raw storage (240 disks of 2 TB each, set up with 24 disks per 4U high rack units and with 10 units per rack) running on Linux. [4] [5]
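The raw-capacity figure follows directly from the disk counts quoted above. A minimal sketch of that arithmetic, using only the numbers in the text (the function and variable names are illustrative, and the rack-space figure is a derived value rather than one stated by the source):

```python
# Figures quoted in the text for the fourth-generation Petabox rack.
DISKS_PER_UNIT = 24   # disks in each 4U rack unit
UNITS_PER_RACK = 10   # 4U units per rack
TB_PER_DISK = 2       # capacity of each disk, in terabytes


def raw_capacity_tb(disks_per_unit, units_per_rack, tb_per_disk):
    """Return (disk count, raw capacity in TB) for one rack."""
    disks = disks_per_unit * units_per_rack
    return disks, disks * tb_per_disk


disks, raw_tb = raw_capacity_tb(DISKS_PER_UNIT, UNITS_PER_RACK, TB_PER_DISK)
rack_height_u = UNITS_PER_RACK * 4  # rack space taken by the storage units (derived)
print(f"{disks} disks, {raw_tb} TB raw, {rack_height_u}U of rack space")
```

The result is raw capacity only; usable capacity after any redundancy or file system overhead is not given in the text and is not modelled here.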
As of December 2021, the Internet Archive's Petabox storage system consists of four data centers, 745 nodes, and 28,000 spinning disks. The Wayback Machine contains 57 petabytes of information, and the book, music and video collections contain a further 42 petabytes; together these account for 99 petabytes of "unique data", while the total storage in use is 212 petabytes. [3]
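As a quick consistency check on the December 2021 figures, the sketch below sums the two collection sizes and compares the result with the quoted unique-data and total figures. The variable names are illustrative, and the ratio of total to unique storage is a derived number, not a figure stated by the source.

```python
# Storage figures quoted in the text, in petabytes (December 2021).
wayback_pb = 57       # Wayback Machine
collections_pb = 42   # book, music and video collections
unique_pb = 99        # quoted "unique data" figure
total_pb = 212        # quoted total storage

# The two collection figures add up to the quoted unique-data figure.
collections_sum_pb = wayback_pb + collections_pb
print(collections_sum_pb == unique_pb)

# Derived ratio of total storage to unique data (an illustration only).
ratio = total_pb / unique_pb
print(f"{ratio:.2f}")
```

A ratio a little above two would be consistent with the unique data being stored more than once, although the text does not say how the remaining capacity is used.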
A hard disk drive (HDD), hard disk, hard drive, or fixed disk is an electro-mechanical data storage device that stores and retrieves digital data using magnetic storage with one or more rigid rapidly rotating platters coated with magnetic material. The platters are paired with magnetic heads, usually arranged on a moving actuator arm, which read and write data to the platter surfaces. Data is accessed in a random-access manner, meaning that individual blocks of data can be stored and retrieved in any order. HDDs are a type of non-volatile storage, retaining stored data when powered off. Modern HDDs are typically in the form of a small rectangular box.
Sneakernet, also called sneaker net, is an informal term for the transfer of electronic information by physically moving media such as magnetic tape, floppy disks, optical discs, USB flash drives or external hard drives between computers, rather than transmitting it over a computer network. Sneakernets enable data transfer through physical means and offer a solution in the presence of network connections that lack reliability; however, a consequence of this physical transfer is high latency. The term, a tongue-in-cheek play on net(work) as in Internet or Ethernet, refers to walking in sneakers as the transport mechanism. Alternative terms may be floppy net, train net, or pigeon net.
Density is a measure of the quantity of information bits that can be stored on a given physical space of a computer storage medium. It can be expressed in three ways: per length of track, per area of the surface, or per given volume.
MareNostrum is the main supercomputer in the Barcelona Supercomputing Center. It is the most powerful supercomputer in Spain, one of thirteen supercomputers in the Spanish Supercomputing Network and one of the seven supercomputers of the European infrastructure PRACE.
Perpendicular recording (or perpendicular magnetic recording, PMR), also known as conventional magnetic recording (CMR), is a technology for data recording on magnetic media, particularly hard disks. It was first proven advantageous in 1976 by Shun-ichi Iwasaki, then a professor at Tohoku University in Japan, and was first commercially implemented in 2005. The first industry-standard demonstration showing an unprecedented advantage of PMR over longitudinal magnetic recording (LMR) at nanoscale dimensions was made in 1998 at the IBM Almaden Research Center in collaboration with researchers of the Data Storage Systems Center (DSSC), a National Science Foundation (NSF) Engineering Research Center (ERC) at Carnegie Mellon University (CMU).
An optical jukebox is a robotic data storage device that can automatically load and unload optical discs, such as Compact Disc, DVD, Ultra Density Optical or Blu-ray, and can provide terabytes (TB) or petabytes (PB) of tertiary storage. The devices are often called optical disk libraries, "optical storage archives", robotic drives, or autochangers. Jukebox devices may have up to 2,000 slots for discs, and usually have a picking device that traverses the slots and drives. Zerras Inc. provides a removable capsule that holds up to 200 discs per library, which can be scaled out to manage 1,600 discs per 42U rack. The arrangement of the slots and picking devices affects performance and maintenance costs, depending on the robotics design and the space between a disc and the picking device. Seek times and transfer rates vary depending upon the optical technology used.
Heat-assisted magnetic recording (HAMR) is a magnetic storage technology for greatly increasing the amount of data that can be stored on a magnetic device such as a hard disk drive by temporarily heating the disk material during writing, which makes it much more receptive to magnetic effects and allows writing to much smaller regions.
QFS is a filesystem from Oracle. It is tightly integrated with SAM, the Storage and Archive Manager, and hence is often referred to as SAM-QFS. SAM provides the functionality of a hierarchical storage manager.
A solid-state drive (SSD) is a type of solid-state storage device that uses integrated circuits to store data persistently. It is sometimes called a semiconductor storage device, solid-state device, or solid-state disk.
Sun Modular Datacenter (Sun MD) is a portable data center built into a standard 20-foot intermodal container, manufactured and marketed by Sun Microsystems. An external chiller and power were required to operate a Sun MD. A data center of up to 280 servers could be rapidly deployed by shipping the container by conventional means to locations that might not be suitable for a building or another structure, and connecting it to the required infrastructure. Sun stated that the system could be made operational for 1% of the cost of building a traditional data center.
The IBM Storage product portfolio includes disk, flash, tape, NAS storage products, storage software and services. IBM's approach is to focus on data management.
High Performance Storage System (HPSS) is a flexible, scalable, policy-based, software-defined hierarchical storage management (HSM) product developed by the HPSS Collaboration. It provides scalable HSM, archive, and file system services using cluster, LAN and storage area network (SAN) technologies to aggregate the capacity and performance of many computers, disks, disk systems, tape drives, and tape libraries.
This timeline of binary prefixes lists events in the history of the evolution, development, and use of units of measure that are germane to the definition of the binary prefixes by the International Electrotechnical Commission (IEC) in 1998, used primarily with units of information such as the bit and the byte.
The Worldwide LHC Computing Grid (WLCG), formerly the LHC Computing Grid (LCG), is an international collaborative project that consists of a grid-based computer network infrastructure incorporating over 170 computing centers in 42 countries, as of 2017. It was designed by CERN to handle the prodigious volume of data produced by Large Hadron Collider (LHC) experiments.
The National Institute for Computational Sciences (NICS) is funded by the National Science Foundation and managed by the University of Tennessee. NICS was home to Kraken, the most powerful computer in the world managed by academia. The NICS petascale scientific computing environment is housed at Oak Ridge National Laboratory (ORNL), home to the world's most powerful computing complex. The mission of NICS, a member of the Extreme Science and Engineering Discovery Environment (XSEDE - formerly TeraGrid), is to enable the scientific discoveries of researchers nationwide by providing leading-edge computational resources, together with support for their effective use, and leveraging extensive partnership opportunities.
The National Computational Infrastructure is a high-performance computing and data services facility, located at the Australian National University (ANU) in Canberra, Australian Capital Territory. The NCI is supported by the Australian Government's National Collaborative Research Infrastructure Strategy (NCRIS), with operational funding provided through a formal collaboration incorporating CSIRO, the Bureau of Meteorology, the Australian National University, Geoscience Australia, the Australian Research Council, and a number of research-intensive universities and medical research institutes.
Virtual Storage Platform is the brand name for a Hitachi Data Systems line of computer data storage systems for data centers. Model numbers include G200, G400, G600, G800, G1000, G1500 and G5500.
Exalogic is a computer appliance made by Oracle Corporation, commercially available since 2010. It is a cluster of x86-64 servers with Oracle Linux or Solaris preinstalled.
The NCAR-Wyoming Supercomputing Center (NWSC) is a high-performance computing (HPC) and data archival facility located in Cheyenne, Wyoming, that provides advanced computing services to researchers in the Earth system sciences.
Archival Disc (AD) is the trademarked name of a discontinued optical disc storage medium designed by Sony and Panasonic for long-term digital storage. First announced on 10 March 2014 and introduced in the second quarter of 2015, the discs were intended to withstand changes in temperature and humidity, in addition to dust and water, ensuring that the disc would be readable for at least 50 years. The agreement between Sony and Panasonic to jointly develop the next-generation optical media standard was first announced on 29 July 2013. The discs were mass-produced by Panasonic in 2016. The product is discontinued as of 2024. The two companies have since collaborated on the development of another format, Optical Disc Archive.