Open Compute Project

Abbreviation: OCP
Formation: 2011
Type: Industry trade group
Purpose: Sharing designs of data center products
Region: Worldwide
Membership: 50+ corporations
Website: opencompute.org

The Open Compute Project (OCP) is an organization that facilitates the sharing of data center product designs and industry best practices among companies. [1] [2] Founded in 2011, OCP has significantly influenced the design and operation of large-scale computing facilities worldwide. [1]

As of July 2024, over 300 companies across the world are members of OCP, including Arm, Meta, IBM, Wiwynn, Intel, Nokia, Google, Microsoft, Seagate Technology, Dell, Rackspace, Hewlett Packard Enterprise, NVIDIA, Cisco, Goldman Sachs, Fidelity, Lenovo, Accton Technology Corporation and Alibaba Group. [1] [3] [2]

Structure

Image: Open Compute V2 Server
Image: Open Compute V2 Drive Tray, second lower tray extended

The Open Compute Project Foundation is a 501(c)(6) non-profit incorporated in the state of Delaware, United States. Its operations are governed by several bodies, including a board of directors, an advisory board and a steering committee.

As of July 2020, the board of directors has seven members: one individual member and six organizational members. Mark Roenigk (Facebook) is the Foundation's president and chairman, and Andy Bechtolsheim is the individual member. In addition to Facebook, represented by Mark Roenigk, the organizations on the Open Compute board of directors include Intel (Rebecca Weekly), Microsoft (Kushagra Vaid), Google (Partha Ranganathan), and Rackspace (Jim Hawkins). [4]

A current list of members can be found on the opencompute.org website.

History

The Open Compute Project began at Facebook in 2009 as an internal project called "Project Freedom". The hardware designs and engineering team were led by Amir Michael (Manager, Hardware Design) [5] [6] [7] and sponsored by Jonathan Heiliger (VP, Technical Operations) and Frank Frankovsky (Director, Hardware Design and Infrastructure). The three later open-sourced the Project Freedom designs and co-founded the Open Compute Project. [8] [9] The project was announced at a press event at Facebook's headquarters in Palo Alto on April 7, 2011. [10]

OCP projects

The Open Compute Project Foundation maintains a number of OCP projects, such as:

Server designs

Two years after the Open Compute Project started, it was acknowledged that, with regard to a more modular server design, "the new design is still a long way from live data centers". [11] However, some of the published aspects were used in Facebook's Prineville data center to improve energy efficiency, as measured by the power usage effectiveness index defined by The Green Grid. [12]
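
Power usage effectiveness (PUE) is the ratio of total facility power to the power delivered to IT equipment, with 1.0 being the ideal. A minimal sketch of that calculation follows; the figures in it are hypothetical examples, not reported measurements from any facility.

    # Illustrative PUE calculation. The input figures are hypothetical,
    # not measurements from Facebook's Prineville data center.
    def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
        """Power usage effectiveness = total facility power / IT equipment power."""
        return total_facility_kw / it_equipment_kw

    # Example: a facility drawing 1,200 kW in total to run a 1,000 kW IT load.
    print(pue(1200.0, 1000.0))  # 1.2 -> 200 kW spent on cooling, conversion losses, etc.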

Efforts to advance server compute node designs included one for Intel processors and one for AMD processors. In 2013, Calxeda contributed a design with ARM architecture processors. [13] Since then, several generations of OCP server designs have been deployed: Wildcat (Intel), Spitfire (AMD), Windmill (Intel E5-2600), Watermark (AMD), Winterfell (Intel E5-2600 v2) and Leopard (Intel E5-2600 v3). [14] [15]

OCP Accelerator Module

OCP Accelerator Module (OAM) is a design specification for hardware architectures that implement artificial intelligence systems that require high module-to-module bandwidth. [16]

OAM is used in some of AMD's Instinct accelerator modules.

Rack and Power designs

The designs for a mechanical mounting system have been published, so that open racks have the same outside width (600 mm) and depth as standard 19-inch racks, but are designed to mount wider chassis with a 537 mm (about 21 inch) width. This allows more equipment to fit in the same volume and improves air flow. Compute chassis sizes are defined in multiples of an OpenU or OU, which is 48 mm, slightly taller than the standard 44.45 mm (1.75 inch) rack unit. The most current base mechanical specifications were defined and published by Meta as the Open Rack V3 Base Specification in 2022, with significant contributions from Google and Rittal. [17]
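
As a rough illustration of the unit arithmetic (a sketch only; the rack height used below is a hypothetical example, and real capacity also depends on power shelves and other reserved space):

    # Compare rack-unit pitch: Open Rack "OpenU" (48 mm) versus the standard
    # EIA-310 rack unit (1.75 in = 44.45 mm). Rack height is a hypothetical example.
    OPENU_MM = 48.0
    RU_MM = 25.4 * 1.75   # 44.45 mm

    usable_height_mm = 2000.0   # hypothetical usable interior height

    open_units = int(usable_height_mm // OPENU_MM)   # 41 OpenU
    std_units = int(usable_height_mm // RU_MM)       # 44 standard rack units

    print(f"{open_units} OpenU vs {std_units} standard rack units "
          f"in {usable_height_mm:.0f} mm of usable height")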

At the time the base specification was released, Meta also defined in greater depth the specifications for the rectifiers and power shelf. [18] [19] Specifications for the power monitoring interface (PMI), a communications interface enabling upstream communications between the rectifiers and the battery backup unit (BBU), were published by Meta that same year, with Delta Electronics as the main technical contributor to the BBU specification. [20]

Since 2022, however, AI workloads in the data center have driven up power requirements, as newer data center processors draw substantially more power. Meta is currently updating its Open Rack V3 rectifier, power shelf, battery backup and power management interface specifications to accommodate these more power-hungry AI architectures.

In May 2024, at an Open Compute regional summit, Meta and Rittal outlined their plans for developing a High Power Rack (HPR) ecosystem in conjunction with rack, power and cable partners, increasing rack power capacity to 92 kilowatts or more to meet the needs of the latest generation of processors. [21] At the same meeting, Delta Electronics and Advanced Energy presented their progress in developing new Open Compute standards specifying power shelf and rectifier designs for these HPR applications. [22] Rittal also outlined its collaboration with Meta on airflow containment, busbar designs and grounding schemes that meet the new HPR requirements. [23]

Data storage

Open Vault storage building blocks offer high disk densities, with 30 drives in a 2U Open Rack chassis designed for easy disk drive replacement. The 3.5-inch disks are stored in two drawers, five across and three deep in each drawer, with connections via Serial Attached SCSI. [24] This storage is also called Knox, and there is a cold storage variant in which idle disks power down to reduce energy consumption. [25] Another design concept was contributed in 2012 by Hyve Solutions, a division of Synnex. [26] [27] At the OCP Summit 2016, Facebook, together with Wiwynn, a spin-off of the Taiwanese ODM Wistron, introduced Lightning, a flexible NVMe JBOF (just a bunch of flash) based on the existing Open Vault (Knox) design. [28] [29]
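
The drive count follows directly from the drawer layout; a small sketch of that arithmetic is below (the rack-level figure is a hypothetical example, not part of the specification):

    # Open Vault ("Knox") drive arithmetic: two drawers per 2U chassis,
    # each drawer holding 5 x 3 = 15 drives. The chassis-per-rack count is
    # a hypothetical example for illustration only.
    drawers_per_chassis = 2
    drives_per_drawer = 5 * 3                                     # five across, three deep
    drives_per_chassis = drawers_per_chassis * drives_per_drawer  # 30 drives per 2U chassis

    chassis_per_rack = 10   # hypothetical
    print(drives_per_chassis)                      # 30
    print(drives_per_chassis * chassis_per_rack)   # 300 drives per rack (example)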

Energy efficient data centers

The OCP has published data center designs for energy efficiency. These include power distribution at 277 VAC, which eliminates one transformer stage in typical data centers, a single voltage (12.5 VDC) power supply designed to work with 277 VAC input, and 48 VDC battery backup. [12]
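
The 277 VAC figure is the line-to-neutral voltage of a common North American 480 V three-phase feed (480 V divided by the square root of 3), which is why supplying the power shelves at 277 VAC can skip the step-down transformer found in typical facilities. A small sketch of that arithmetic:

    import math

    # Line-to-neutral voltage of a 480 V three-phase distribution feed.
    # Feeding server power supplies directly from this ~277 V leg is what
    # allows the design to omit the usual 480 V -> 208/120 V transformer stage.
    line_to_line_v = 480.0
    line_to_neutral_v = line_to_line_v / math.sqrt(3)

    print(f"{line_to_neutral_v:.0f} V")  # ~277 V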

Open networking switches

On May 8, 2013, an effort to define an open network switch was announced. [30] The plan was to allow Facebook to load its own operating system software onto the switch. Press reports predicted that more expensive and higher-performance switches would continue to be popular, while less expensive products treated more like a commodity (using the buzzword "top-of-rack") might adopt the proposal. [31]

Facebook's first open networking switch, called Wedge, was designed together with Taiwanese ODM Accton using a Broadcom Trident II chip; the Linux-based operating system it runs is called FBOSS. [32] [33] [34] Later switch contributions include "6-pack" and Wedge-100, based on Broadcom Tomahawk chips. [35] Similar switch hardware designs have been contributed by Accton Technology Corporation (and its Edgecore Networks subsidiary), Mellanox Technologies, Interface Masters Technologies and Agema Systems. [36] These switches are capable of running Open Network Install Environment (ONIE)-compatible network operating systems such as Cumulus Linux, Switch Light OS by Big Switch Networks, or PICOS by Pica8. [37] A similar project for a custom switch for the Google platform had been rumored, and evolved to use the OpenFlow protocol. [38] [39]

Servers

Within the mezzanine NIC sub-project, the OCP NIC 3.0 specification 1v00 was released in late 2019, establishing three form factors: SFF, TSFF, and LFF. [40] [41]

Litigation

In March 2015, [42] BladeRoom Group Limited and Bripco (UK) Limited sued Facebook, Emerson Electric Co. and others, alleging that Facebook had disclosed BladeRoom and Bripco's trade secrets for prefabricated data centers in the Open Compute Project. [43] Facebook petitioned for the lawsuit to be dismissed, [44] but this was rejected in 2017. [45] A confidential mid-trial settlement was agreed in April 2018. [46]

See also

Related Research Articles

<span class="mw-page-title-main">Data center</span> Building or room used to house computer servers and related equipment

A data center is a building, a dedicated space within a building, or a group of buildings used to house computer systems and associated components, such as telecommunications and storage systems.

<span class="mw-page-title-main">Blade server</span> Server computer that uses less energy and space than a conventional server

A blade server is a stripped-down server computer with a modular design optimized to minimize the use of physical space and energy. Blade servers have many components removed to save space, minimize power consumption and other considerations, while still having all the functional components to be considered a computer. Unlike a rack-mount server, a blade server fits inside a blade enclosure, which can hold multiple blade servers, providing services such as power, cooling, networking, various interconnects and management. Together, blades and the blade enclosure form a blade system, which may itself be rack-mounted. Different blade providers have differing principles regarding what to include in the blade itself, and in the blade system as a whole.

<span class="mw-page-title-main">Google data centers</span> Facilities containing Google servers

Google data centers are the large data center facilities Google uses to provide their services, which combine large drives, computer nodes organized in aisles of racks, internal and external networking, environmental controls, and operations software.

The Texas Advanced Computing Center (TACC) at the University of Texas at Austin, United States, is an advanced computing research center that provides comprehensive advanced computing resources and support services to researchers in Texas and across the U.S. The mission of TACC is to enable discoveries that advance science and society through the application of advanced computing technologies. Specializing in high-performance computing, scientific visualization, data analysis and storage systems, software, research and development, and portal interfaces, TACC deploys and operates advanced computational infrastructure to enable the research activities of faculty, staff, and students of UT Austin. TACC also provides consulting, technical documentation, and training to support researchers who use these resources. TACC staff members conduct research and development in applications and algorithms, computing systems design/architecture, and programming tools and environments.

The IBM Intelligent Cluster was a cluster solution for x86-based high-performance computing composed primarily of IBM components, integrated with network switches from various vendors and optional high-performance InfiniBand interconnects.

<span class="mw-page-title-main">HPE BladeSystem</span> Line of blade server machines by Hewlett Packard Enterprise

BladeSystem is a line of blade server machines from Hewlett Packard Enterprise that was introduced in June 2006.

<span class="mw-page-title-main">Accton Technology Corporation</span> Taiwanese electronics company

Accton Technology Corporation is a Taiwanese company in the electronics industry that primarily engages in the development and manufacture of networking and communication solutions, as an original equipment manufacturer (OEM) or original design manufacturer (ODM) partner. Accton has manufacturing plants in Taiwan (Hsinchu), China (Shenzhen), and Vietnam, supported by research and development centers in Taiwan, Shanghai, and California. Its products include 100G, 400G, and 800G switches designed for data center applications, along with wireless devices and artificial intelligence acceleration hardware.

Exalogic is a computer appliance made by Oracle Corporation, commercially available since 2010. It is a cluster of x86-64 servers with Oracle Linux or Solaris preinstalled.

Cisco Unified Computing System (UCS) is a data center server computer product line composed of server hardware, virtualization support, switching fabric, and management software, introduced in 2009 by Cisco Systems. The products are marketed for scalability by integrating many components of a data center that can be managed as a single unit.

<span class="mw-page-title-main">Virtual Computing Environment</span> American computer hardware brand

Virtual Computing Environment Company (VCE) was a division of EMC Corporation that manufactured converged infrastructure appliances for enterprise environments. Founded in 2009 under the name Acadia, it was originally a joint venture between EMC and Cisco Systems, with additional investments by Intel and EMC subsidiary VMware. EMC acquired a 90% controlling stake in VCE from Cisco in October 2014, giving it majority ownership. VCE ended in 2016 after an internal division realignment, followed by the sale of EMC to Dell.

The OpenPOWER Foundation is a collaboration around Power ISA-based products initiated by IBM and announced as the "OpenPOWER Consortium" on August 6, 2013. IBM's focus is to open up technology surrounding its Power Architecture offerings, such as processor specifications, firmware, and software, under a liberal license, using a collaborative development model with its partners.

<span class="mw-page-title-main">DOME MicroDataCenter</span>

A microDataCenter contains compute, storage, power, cooling and networking in a very small volume, and is sometimes also called a "DataCenter-in-a-box". The term has been used to describe various incarnations of this idea over the past 20 years. In late 2017, a very tightly integrated version, the DOME MicroDataCenter, was shown at the Supercomputing 2017 conference. Its key features are hot-water cooling, a fully solid-state design, and construction from commodity components and standards only.

<span class="mw-page-title-main">QPACE2</span> Massively parallel and scalable supercomputer

QPACE 2 is a massively parallel and scalable supercomputer. It was designed for applications in lattice quantum chromodynamics but is also suitable for a wider range of applications.

<span class="mw-page-title-main">Immersion cooling</span> IT cooling practice

Immersion cooling is an IT cooling practice by which servers are completely or partially immersed in a dielectric fluid that has significantly higher thermal conductivity than air. Heat is removed from the system by putting the coolant in direct contact with hot components, and circulating the heated liquid through heat exchangers. This practice is highly effective as liquid coolants can absorb more heat from the system than air. Immersion cooling has many benefits, including but not limited to: sustainability, performance, reliability, and cost.

MiTAC Holdings Corporation was formed on September 12, 2013, through a stock swap from MiTAC International Corp. As part of a restructuring aimed at future operational objectives, the Group established MiTAC Computing Technology Corporation on September 1, 2014, to focus on designing and manufacturing servers for data centers and enterprises, offering solutions from edge to cloud computing, including hyperscale data centers, AI/HPC systems, and energy-efficient technologies like liquid cooling.

QCT is a provider of data center hardware and cloud solutions that are used by hyperscale data center operators.

<span class="mw-page-title-main">Nvidia DGX</span> Line of Nvidia produced servers and workstations

Nvidia DGX is a line of servers and workstations designed by Nvidia, primarily geared towards enhancing deep learning applications through the use of general-purpose computing on graphics processing units (GPGPU). These systems typically come in a rackmount format featuring high-performance x86 server CPUs on the motherboard.

Open Rack is an Open Compute Project standard for a new rack and power delivery architecture and an efficient, scalable alternative to the EIA-310 19-inch rack. It differs from the traditional EIA-310 rack in that it was designed specifically for large-scale cloud deployments. There are four key features that make this rack design more efficient to deploy, support, and operate. The power to all of the compute, storage, or network devices is supplied by a pair of bus bars located in the rear of the rack. The bus bars are supplied with 48 V DC by a shelf of power supplies which provides efficient conversion from the local AC mains supply. The IT equipment that fits into Open Rack is 21 inches or 533 millimetres wide. This is a 15% increase in frontal area that provides more airflow to the IT devices, enabling the data center to reduce cooling costs. The vertical spacing is also taller to accommodate better airflow and structurally better enclosures that do not sag and interfere with adjacent equipment. All of the cables and interconnects are made from the front of the rack and the IT equipment is hot-pluggable and serviceable from the front of the rack. Service personnel no longer access the rear of the rack or need to work in the hot aisle.

Composable disaggregated infrastructure (CDI), sometimes stylized as composable/disaggregated infrastructure, is a technology that allows enterprise data center operators to achieve the cost and availability benefits of cloud computing using on-premises networking equipment. It is considered a class of converged infrastructure, and uses management software to combine compute, storage and network elements. It is similar to public cloud, except the equipment sits on premises in an enterprise data center.

Inspur Server Series is a series of server computers introduced in 1993 by Inspur, an information technology company, and later expanded to international markets. The servers were likely among the first originally manufactured by a Chinese company. The line is currently developed by Inspur Information and its San Francisco-based subsidiary, Inspur Systems, both spinoff companies of Inspur. The product line includes GPU servers, rack-mounted servers, open computing servers and multi-node servers.

References

  1. Metz, Cade (11 Apr 2015). "How Facebook Changed the Basic Tech That Runs the Internet". Wired.
  2. "Open Compute Project".
  3. "Incubation Committee". Open Compute. Retrieved 2016-08-19.
  4. "Organization and Board". Open Compute. Archived from the original on 2015-09-26. Retrieved 2015-09-12.
  5. "Facebook Follows Google to Data Center Savings". Data Center Knowledge. 2009-11-27. Retrieved 2020-12-13.
  6. "Oxide Computer Company: On the Metal: Amir Michael". Oxide Computer Company. Retrieved 2020-12-13.
  7. "Facebook Hacks Shipping Dock Into World-Class Server Lab". Wired. ISSN   1059-1028 . Retrieved 2020-12-13.
  8. "Why I Started the Open Compute Project – Vertex Ventures" . Retrieved 2020-12-13.
  9. "Introducing the Open Compute Project - YouTube". www.youtube.com. Retrieved 2020-12-13.
  10. "Facebook Opens its Server, Data Center Designs". Data Center Knowledge. 2011-04-07. Retrieved 2020-12-13.
  11. Metz, Cade (January 16, 2013). "Facebook Shatters the Computer Server Into Tiny Pieces". Wired. Retrieved July 9, 2013.
  12. Michael, Amir (February 15, 2012). "Facebook's Open Compute Project". Stanford EE Computer Systems Colloquium. Stanford University. (video archive)
  13. Schnell, Tom (January 16, 2013). "ARM Server Motherboard Design for Open Vault Chassis Hardware v0.3 MB-draco-hesperides-0.3" (PDF). Archived from the original (PDF) on October 23, 2014. Retrieved July 9, 2013.
  14. Data Center Knowledge (April 28, 2016). "Guide to Facebook's Open Source Data Center Hardware". Retrieved May 13, 2016.
  15. The Register (January 17, 2013). "Facebook rolls out new web and database server designs". The Register. Retrieved May 13, 2016.
  16. Ledin, Jim (2020-04-30). Modern Computer Architecture and Organization. Birmingham Mumbai: Packt Publishing Ltd. p. 361. ISBN 978-1-83898-710-7.
  17. Charest, Glenn; Mills, Steve; Vorreiter, Loren. "Open Rack V3 Base Specification". opencompute.org. Meta. Retrieved 25 September 2024.
  18. Keyhani, Hamid; Tang, Ted; Shapiro, Dmitriy; Fernandes, John; Kim, Ben; Jin, Tiffany; Mercado, Rommel. "Open Rack V3 48V PSU Specification Rev: 1.0". opencompute.org. Meta. Retrieved 25 September 2024.
  19. Keyhani, Hamid; Shapiro, Dmitriy; Fernandes, John; Kim, Ben; Jin, Tiffany; Mercado, Rommel. "Open Rack V3 Power Shelf Rev 1.0 Specification". opencompute.org. Meta. Retrieved 25 September 2024.
  20. Sun, David; Shapiro, Dmitriy; Kim, Ben; Athavale, Jayati; Mercado, Rommel. "Open Rack V3 48V BBU Specification Rev: 1.4". opencompute.org. Meta. Retrieved 25 September 2024.
  21. Open Compute Project. "ORv3 High Power Rack (HPR) Ecosystem Solution". YouTube. Retrieved 25 September 2024.
  22. Open Compute Project. "Requirements/Considerations of Next Generation ORv3 PSU and Power Shelves". YouTube. Retrieved 25 September 2024.
  23. Open Compute Project. "ORv3 High Power Rack (HPR) Ecosystem Solution". YouTube. Retrieved 25 September 2024.
  24. Mike Yan and Jon Ehlen (January 16, 2013). "Open Vault Storage Hardware V0.7 OR-draco-bueana-0.7" (PDF). Archived from the original (PDF) on May 21, 2013. Retrieved July 9, 2013.
  25. "Under the hood: Facebook's cold storage system". May 4, 2015. Retrieved May 13, 2016.
  26. "Hyve Solutions Contributes Storage Design Concept to OCP Community". News release. January 17, 2013. Archived from the original on April 14, 2013. Retrieved July 9, 2013.
  27. Malone, Conor (January 15, 2012). "Torpedo Design Concept Storage Server for Open Rack Hardware v0.3 ST-draco-chimera-0.3" (PDF). Archived from the original (PDF) on May 21, 2013. Retrieved July 9, 2013.
  28. Petersen, Chris (March 9, 2016). "Introducing Lightning: A flexible NVMe JBOF" . Retrieved May 13, 2016.
  29. "Wiwynn Showcases All-Flash Storage Product with Leading-edge NVMe Technology". March 9, 2016. Retrieved May 13, 2016.
  30. Jay Hauser for Frank Frankovsky (May 8, 2013). "Up next for the Open Compute Project: The Network". Open Compute blog. Retrieved June 16, 2019.
  31. Chernicoff, David (May 9, 2013). "Can Open Compute change network switching?". ZDNet. Retrieved July 9, 2013.
  32. "Facebook Open Switching System (FBOSS) from Facebook". SDxCentral . Archived from the original on October 1, 2018 via Internet Archive.
  33. "Introducing "Wedge" and "FBOSS," the next steps toward a disaggregated network". Meet the engineers who code Facebook. June 18, 2014. Retrieved 2016-05-13.
  34. "Facebook Open Switching System ("FBOSS") and Wedge in the open". Meet the engineers who code Facebook. March 10, 2015. Retrieved 2016-05-13.
  35. "Opening designs for 6-pack and Wedge 100". Meet the engineers who code Facebook. March 9, 2016. Retrieved 2016-05-13.
  36. "Accepted or shared hardware specifications". Open Compute. Retrieved 2016-05-13.
  37. "Current Network Operating System (NOS) List". Open Compute. Retrieved 2016-05-13.
  38. Metz, Cade (May 8, 2013). "Facebook Rattles Networking World With 'Open Source' Gear". Wired. Retrieved July 9, 2013.
  39. Levy, Steven (April 17, 2012). "Going With the Flow: Google's Secret Switch to the Next Wave of Networking". Wired. Retrieved July 9, 2013.
  40. "Server/Mezz - OpenCompute". www.opencompute.org. Retrieved 2022-11-09.
  41. Kumar, Rohit (2022-05-02). "OCP NIC 3.0 Form Factors The Quick Guide". ServeTheHome. Retrieved 2022-11-09.
  42. "BladeRoom Group Limited et al v. Facebook, Inc". Justia. Retrieved 18 February 2017.
  43. "ORDER granting in part and denying in part 128 Motion to Dismiss". Justia. Retrieved 18 February 2017.
  44. Greene, Kat (10 May 2016). "Facebook Wants Data Center Trade Secrets Suit Tossed". Law360. Retrieved 8 March 2017.
  45. Sverdlik, Yevgeniy (17 February 2017). "Court Throws Out Facebook's Motion to Dismiss Data Center Design Lawsuit". Data Center Knowledge. Retrieved 8 March 2017.
  46. "Facebook settles $365m modular datacentre IP theft case with UK-based BladeRoom Group". Computer Weekly. 11 April 2018. Retrieved 15 March 2019.