| Abbreviation | OCP |
| --- | --- |
| Formation | 2011 |
| Type | Industry trade group |
| Purpose | Sharing designs of data center products |
| Region | Worldwide |
| Membership | 50+ corporations |
| Website | opencompute.org |
The Open Compute Project (OCP) is an organization that facilitates the sharing of data center product designs and industry best practices among companies. [1] [2] Founded in 2011, OCP has significantly influenced the design and operation of large-scale computing facilities worldwide. [1]
As of July 2024, over 300 companies across the world are members of OCP, including Arm, Meta, IBM, Wiwynn, Intel, Nokia, Google, Microsoft, Seagate Technology, Dell, Rackspace, Hewlett Packard Enterprise, NVIDIA, Cisco, Goldman Sachs, Fidelity, Lenovo, Accton Technology Corporation and Alibaba Group. [1] [3] [2]
The Open Compute Project Foundation is a 501(c)(6) non-profit incorporated in the state of Delaware, United States. OCP has multiple committees, including the board of directors, advisory board and steering committee to govern its operations.
As of July 2020, seven members serve on the board of directors, which is made up of one individual member and six organizational members. Mark Roenigk (Facebook) is the Foundation's president and chairman. Andy Bechtolsheim is the individual member. In addition to Mark Roenigk, who represents Facebook, other organizations on the Open Compute board of directors include Intel (Rebecca Weekly), Microsoft (Kushagra Vaid), Google (Partha Ranganathan), and Rackspace (Jim Hawkins). [4]
A current list of members can be found on the opencompute.org website.
The Open Compute Project began at Facebook as an internal project in 2009 called "Project Freedom". The hardware designs and engineering team were led by Amir Michael (Manager, Hardware Design) [5] [6] [7] and sponsored by Jonathan Heiliger (VP, Technical Operations) and Frank Frankovsky (Director, Hardware Design and Infrastructure). The three would later open source the designs of Project Freedom and co-found the Open Compute Project. [8] [9] The project was announced at a press event at Facebook's headquarters in Palo Alto on April 7, 2011. [10]
The Open Compute Project Foundation maintains a number of OCP projects, including server, accelerator module, rack and power, storage, data center and networking designs, described below.
Two years after the Open Compute Project started, it was acknowledged that, with regard to a more modular server design, "the new design is still a long way from live data centers". [11] However, some aspects published were used in Facebook's Prineville data center to improve energy efficiency, as measured by the power usage effectiveness index defined by The Green Grid. [12]
Efforts to advance server compute node designs included one for Intel processors and one for AMD processors. In 2013, Calxeda contributed a design with ARM architecture processors. [13] Since then, several generations of OCP server designs have been deployed: Wildcat (Intel), Spitfire (AMD), Windmill (Intel E5-2600), Watermark (AMD), Winterfell (Intel E5-2600 v2) and Leopard (Intel E5-2600 v3). [14] [15]
OCP Accelerator Module (OAM) is a design specification for hardware architectures that implement artificial intelligence systems that require high module-to-module bandwidth. [16]
OAM is used in some of AMD's Instinct accelerator modules.
The designs for a mechanical mounting system have been published, so that open racks have the same outside width (600 mm) and depth as standard 19-inch racks, but are designed to mount wider chassis, 537 mm (about 21 inches) wide. This allows more equipment to fit in the same volume and improves air flow. Compute chassis sizes are defined in multiples of an OpenU or OU, which is 48 mm, slightly taller than the typical 44.45 mm rack unit. The most current base mechanical specifications were defined and published by Meta as the Open Rack V3 Base Specification in 2022, with significant contributions from Google and Rittal. [17]
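As an illustration only, the sketch below compares the 48 mm OpenU with the standard 44.45 mm rack unit and counts how many whole units fit in a given interior height; the usable height used here is a hypothetical figure, not a value from the Open Rack specification.

```python
# Illustrative comparison of OpenU (OU) spacing versus the standard EIA
# rack unit. The usable interior height is a placeholder assumption;
# actual heights vary by Open Rack version and vendor.

OU_MM = 48.0        # OpenU height per the Open Rack specification
EIA_U_MM = 44.45    # standard 19-inch rack unit (1.75 in)

def units_that_fit(usable_height_mm: float, unit_mm: float) -> int:
    """Whole equipment units that fit in the given usable height."""
    return int(usable_height_mm // unit_mm)

if __name__ == "__main__":
    usable_mm = 2100.0  # hypothetical usable height, not from the spec
    print(f"OpenU slots:   {units_that_fit(usable_mm, OU_MM)}")
    print(f"EIA-310 slots: {units_that_fit(usable_mm, EIA_U_MM)}")
    print(f"An OU is {OU_MM - EIA_U_MM:.2f} mm taller than a standard U")
```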
At the time the base specification was released, Meta also defined in greater depth the specifications for the rectifiers and power shelf. [18] [19] Specifications for the power monitoring interface (PMI), a communications interface enabling upstream communications between the rectifiers and battery backup unit (BBU), were published by Meta that same year, with Delta Electronics as the main technical contributor to the BBU specification. [20]
Since 2022, however, the power demands of AI workloads in the data center have risen with each generation of more powerful processors. Meta is currently updating its Open Rack V3 rectifier, power shelf, battery backup and power management interface specifications to support these newer, more power-hungry AI architectures.
In May 2024, at an Open Compute regional summit, Meta and Rittal outlined their plans for developing a High Power Rack (HPR) ecosystem in conjunction with rack, power and cable partners, increasing rack power capacity to 92 kilowatts or more to meet the power needs of the latest generation of processors. [21] At the same meeting, Delta Electronics and Advanced Energy presented their progress on new Open Compute standards specifying power shelf and rectifier designs for these HPR applications. [22] Rittal also outlined its collaboration with Meta on airflow containment, busbar designs and grounding schemes for the new HPR requirements. [23]
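The following back-of-the-envelope sketch shows why higher rack power budgets stress the power distribution design: it computes the ideal DC bus bar current implied by a given rack power, assuming the 48 V DC bus bar distribution used in Open Rack; the actual HPR electrical design may differ.

```python
# Back-of-the-envelope sketch: bus bar current implied by a rack power
# budget. Assumes 48 V DC distribution as in Open Rack and ignores
# conversion losses; the real HPR design may use different parameters.

def busbar_current_amps(rack_power_w: float, busbar_voltage_v: float = 48.0) -> float:
    """Ideal DC current drawn from the bus bars for a given rack power."""
    return rack_power_w / busbar_voltage_v

if __name__ == "__main__":
    for rack_kw in (30, 92):  # 92 kW is the HPR target mentioned above
        amps = busbar_current_amps(rack_kw * 1000)
        print(f"{rack_kw:>3} kW rack -> about {amps:,.0f} A at 48 V DC")
```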
Open Vault storage building blocks offer high disk densities, with 30 drives in a 2U Open Rack chassis designed for easy disk drive replacement. The 3.5-inch disks are stored in two drawers, five across and three deep in each drawer, with connections via serial attached SCSI. [24] This storage is also called Knox, and there is also a cold storage variant in which idle disks power down to reduce energy consumption. [25] Another design concept was contributed by Hyve Solutions, a division of Synnex, in 2012. [26] [27] At the OCP Summit 2016, Facebook, together with Wiwynn (a spin-off of Taiwanese ODM Wistron), introduced Lightning, a flexible NVMe JBOF (just a bunch of flash) based on the existing Open Vault (Knox) design. [28] [29]
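A minimal sketch of the drive layout arithmetic described above follows; the per-drive capacity is a hypothetical placeholder, not part of the Open Vault specification.

```python
# Illustrative arithmetic for the Open Vault (Knox) layout: two drawers,
# each holding drives five across and three deep, in one chassis.
# The per-drive capacity below is a placeholder assumption.

DRAWERS = 2
DRIVES_ACROSS = 5
DRIVES_DEEP = 3

drives_per_chassis = DRAWERS * DRIVES_ACROSS * DRIVES_DEEP  # 30 drives
drive_tb = 4  # hypothetical 3.5-inch drive capacity in TB

print(f"Drives per chassis: {drives_per_chassis}")
print(f"Raw capacity at {drive_tb} TB/drive: {drives_per_chassis * drive_tb} TB")
```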
The OCP has published data center designs for energy efficiency. These include power distribution at 277 VAC, which eliminates one transformer stage in typical data centers, a single voltage (12.5 VDC) power supply designed to work with 277 VAC input, and 48 VDC battery backup. [12]
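The usual rationale for 277 VAC distribution, sketched below, is that 277 V is the line-to-neutral voltage of a 480 V three-phase feed common in North American facilities, so equipment can be fed per phase without first stepping down to 208/120 V; exact site electrical designs vary, and this is an illustration rather than part of the published specification.

```python
# Sketch of the 277 VAC rationale: 277 V is the line-to-neutral voltage
# of a 480 V three-phase feed, which is why an intermediate step-down
# transformer stage can be omitted. Site designs vary.
import math

LINE_TO_LINE_V = 480.0
line_to_neutral = LINE_TO_LINE_V / math.sqrt(3)
print(f"480 V line-to-line -> {line_to_neutral:.0f} V line-to-neutral")
```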
On May 8, 2013, an effort to define an open network switch was announced. [30] The plan was to allow Facebook to load its own operating system software onto the switch. Press reports predicted that more expensive, higher-performance switches would continue to be popular, while less expensive "top-of-rack" products treated more like commodities might adopt the proposal. [31]
Facebook's first open networking switch, called Wedge, was designed together with Taiwanese ODM Accton using the Broadcom Trident II chip; the Linux-based operating system it runs is called FBOSS. [32] [33] [34] Later switch contributions include "6-pack" and Wedge-100, based on Broadcom Tomahawk chips. [35] Similar switch hardware designs have been contributed by Accton Technology Corporation (and its Edgecore Networks subsidiary), Mellanox Technologies, Interface Masters Technologies and Agema Systems. [36] These switches are capable of running Open Network Install Environment (ONIE)-compatible network operating systems such as Cumulus Linux, Switch Light OS by Big Switch Networks, or PICOS by Pica8. [37] A similar project for a custom switch for the Google platform had been rumored, and evolved to use the OpenFlow protocol. [38] [39]
Within the mezzanine NIC sub-project, the OCP NIC 3.0 specification 1v00 was released in late 2019, establishing three form factors: SFF, TSFF, and LFF. [40] [41]
In March 2015, [42] BladeRoom Group Limited and Bripco (UK) Limited sued Facebook, Emerson Electric Co. and others, alleging that Facebook had disclosed BladeRoom and Bripco's trade secrets for prefabricated data centers in the Open Compute Project. [43] Facebook petitioned for the lawsuit to be dismissed, [44] but this was rejected in 2017. [45] A confidential mid-trial settlement was agreed in April 2018. [46]
A data center is a building, a dedicated space within a building, or a group of buildings used to house computer systems and associated components, such as telecommunications and storage systems.
A blade server is a stripped-down server computer with a modular design optimized to minimize the use of physical space and energy. Blade servers have many components removed to save space, minimize power consumption and other considerations, while still having all the functional components to be considered a computer. Unlike a rack-mount server, a blade server fits inside a blade enclosure, which can hold multiple blade servers, providing services such as power, cooling, networking, various interconnects and management. Together, blades and the blade enclosure form a blade system, which may itself be rack-mounted. Different blade providers have differing principles regarding what to include in the blade itself, and in the blade system as a whole.
Google data centers are the large data center facilities Google uses to provide their services, which combine large drives, computer nodes organized in aisles of racks, internal and external networking, environmental controls, and operations software.
The Texas Advanced Computing Center (TACC) at the University of Texas at Austin, United States, is an advanced computing research center that provides comprehensive advanced computing resources and support services to researchers in Texas and across the U.S. The mission of TACC is to enable discoveries that advance science and society through the application of advanced computing technologies. Specializing in high-performance computing, scientific visualization, data analysis and storage systems, software, research and development, and portal interfaces, TACC deploys and operates advanced computational infrastructure to enable the research activities of faculty, staff, and students of UT Austin. TACC also provides consulting, technical documentation, and training to support researchers who use these resources. TACC staff members conduct research and development in applications and algorithms, computing systems design/architecture, and programming tools and environments.
The IBM Intelligent Cluster was a cluster solution for x86-based high-performance computing composed primarily of IBM components, integrated with network switches from various vendors and optional high-performance InfiniBand interconnects.
BladeSystem is a line of blade server machines from Hewlett Packard Enterprise that was introduced in June 2006.
Accton Technology Corporation is a Taiwanese company in the electronics industry that primarily engages in the development and manufacture of networking and communication solutions, as an original equipment manufacturer (OEM) or original design manufacturer (ODM) partner. Accton has manufacturing plants in Taiwan (Hsinchu), China (Shenzhen), and Vietnam, supported by research and development centers in Taiwan, Shanghai, and California. Its products include 100G, 400G, and 800G switches designed for data center applications, along with wireless devices and artificial intelligence acceleration hardware.
Exalogic is a computer appliance made by Oracle Corporation, commercially available since 2010. It is a cluster of x86-64 servers with Oracle Linux or Solaris preinstalled.
Cisco Unified Computing System (UCS) is a data center server computer product line composed of server hardware, virtualization support, switching fabric, and management software, introduced in 2009 by Cisco Systems. The products are marketed for scalability by integrating many components of a data center that can be managed as a single unit.
Virtual Computing Environment Company (VCE) was a division of EMC Corporation that manufactured converged infrastructure appliances for enterprise environments. Founded in 2009 under the name Acadia, it was originally a joint venture between EMC and Cisco Systems, with additional investments by Intel and EMC subsidiary VMware. EMC acquired a 90% controlling stake in VCE from Cisco in October 2014, giving it majority ownership. VCE ended in 2016 after an internal division realignment, followed by the sale of EMC to Dell.
The OpenPOWER Foundation is a collaboration around Power ISA-based products initiated by IBM and announced as the "OpenPOWER Consortium" on August 6, 2013. IBM's focus is to open up technology surrounding their Power Architecture offerings, such as processor specifications, firmware, and software with a liberal license, and will be using a collaborative development model with their partners.
A microDataCenter contains compute, storage, power, cooling and networking in a very small volume, sometimes also called a "DataCenter-in-a-box". The term has been used to describe various incarnations of this idea over the past 20 years. In late 2017, a very tightly integrated version, the DOME microDataCenter, was shown at the SuperComputing 2017 conference. Key features are its hot-water cooling, fully solid-state design, and construction from commodity components and standards only.
QPACE 2 is a massively parallel and scalable supercomputer. It was designed for applications in lattice quantum chromodynamics but is also suitable for a wider range of applications.
Immersion cooling is an IT cooling practice by which servers are completely or partially immersed in a dielectric fluid that has significantly higher thermal conductivity than air. Heat is removed from the system by putting the coolant in direct contact with hot components, and circulating the heated liquid through heat exchangers. This practice is highly effective as liquid coolants can absorb more heat from the system than air. Immersion cooling has many benefits, including but not limited to: sustainability, performance, reliability, and cost.
MiTAC Holdings Corporation was formed on September 12, 2013, through a stock swap from MiTAC International Corp. As part of a restructuring aimed at future operational objectives, the Group established MiTAC Computing Technology Corporation on September 1, 2014, to focus on designing and manufacturing servers for data centers and enterprises, offering solutions from edge to cloud computing, including hyperscale data centers, AI/HPC systems, and energy-efficient technologies such as liquid cooling.
QCT is a provider of data center hardware and cloud solutions that are used by hyperscale data center operators.
Nvidia DGX is a series of servers and workstations designed by Nvidia, primarily geared towards enhancing deep learning applications through the use of general-purpose computing on graphics processing units (GPGPU). These systems typically come in a rackmount format featuring high-performance x86 server CPUs on the motherboard.
Open Rack is an Open Compute Project standard for a new rack and power delivery architecture and an efficient, scalable alternative to the EIA-310 19-inch rack. It differs from the traditional EIA-310 rack in that it was designed specifically for large-scale cloud deployments. There are four key features that make this rack design more efficient to deploy, support, and operate. The power to all of the compute, storage, or network devices is supplied by a pair of bus bars located in the rear of the rack. The bus bars are supplied with 48 V DC by a shelf of power supplies which provides efficient conversion from the local AC mains supply. The IT equipment that fits into Open Rack is 21 inches or 533 millimetres wide. This is a 15% increase in frontal area that provides more airflow to the IT devices, enabling the data center to reduce cooling costs. The vertical spacing is also taller to accommodate better airflow and structurally better enclosures that do not sag and interfere with adjacent equipment. All of the cables and interconnects are made from the front of the rack and the IT equipment is hot-pluggable and serviceable from the front of the rack. Service personnel no longer access the rear of the rack or need to work in the hot aisle.
Composable disaggregated infrastructure (CDI), sometimes stylized as composable/disaggregated infrastructure, is a technology that allows enterprise data center operators to achieve the cost and availability benefits of cloud computing using on-premises networking equipment. It is considered a class of converged infrastructure, and uses management software to combine compute, storage and network elements. It is similar to public cloud, except the equipment sits on premises in an enterprise data center.
Inspur Server Series is a series of server computers introduced in 1993 by Inspur, an information technology company, and later expanded to international markets. The servers were likely among the first originally manufactured by a Chinese company. The line is currently developed by Inspur Information and its San Francisco-based subsidiary, Inspur Systems, both Inspur spin-off companies. The product line includes GPU servers, rack-mounted servers, open computing servers and multi-node servers.