Omni-Path

Omni-Path Architecture (OPA) is a high-performance communication architecture developed by Intel. It aims for low communication latency, low power consumption, and high throughput, and it competes directly with InfiniBand. Intel planned to develop technology based on this architecture for exascale computing. [1] [2] [3] Omni-Path is currently owned by Cornelis Networks.

History

Computing node of TSUBAME 3.0 supercomputer with four Omni-Path connections

Production of Omni-Path products started in 2015, and deliveries began in the first quarter of 2016. In November 2015, adapters based on the 2-port "Wolf River" ASIC [4] were announced, using QSFP28 connectors with channel speeds of up to 100 Gbit/s. Switches based on the 48-port "Prairie River" ASIC were announced at the same time. [5] The first models of that series became available in 2015. [6]

In April 2016, implementation of the InfiniBand "verbs" interface for the Omni-Path fabric was discussed. [7]

In October 2016, IBM, Hewlett Packard Enterprise, Dell, Lenovo, Samsung, Seagate Technology, Micron Technology, Western Digital and SK Hynix announced a joint consortium called Gen-Z to develop an open specification and architecture for non-volatile storage and memory products (including Intel's 3D XPoint technology), which might in part compete against Omni-Path. [8] Intel also offered its Omni-Path products and components through other hardware vendors. For example, in 2017 Dell EMC offered Intel Omni-Path as the Dell Networking H-series, following the Dell Networking naming standard. [9]

In July 2019, Intel announced it would not continue development of Omni-Path networks and canceled the OPA 200 series (a 200 Gbit/s variant of Omni-Path). [10] [11]

In September 2020, Intel announced that the Omni-Path network products and technology would be spun out into a new venture, Cornelis Networks. Intel would continue to support legacy Omni-Path products, while Cornelis Networks would continue the product line, using existing Intel intellectual property related to the Omni-Path architecture. [12] [13] [14]

In 2021, Cornelis announced Omni-Path Express, which replaces the PSM2-based drivers and middleware for the existing Omni-Path hardware with a native libfabric provider. PSM2 traces back to PathScale's PSM, created in 2003. [15] [16]

References

  1. "Intel Architects High Performance Computing System Designs to Bring Power of Supercomputing Mainstream". Intel. 16 November 2015. Retrieved 28 March 2016.
  2. "Intel Reveals Details for Future High-Performance Computing System Building Blocks as Momentum Builds for Intel Xeon Phi Product". Intel. 17 November 2015. Retrieved 28 March 2016.
  3. Richard Chirgwin (17 November 2015). "Intel's Omni-Path InfiniBand-killer debuts at sizzling 100 Gb/sec". The Register. Retrieved 3 April 2016.
  4. Timothy Prickett Morgan (16 November 2015). "Intel Rounds Out Scalable Systems With Omni-Path". The Next Platform. Retrieved 10 December 2020.
  5. "Intel Announces New Details for Future HPC Products and Extended Industry Collaborations at ISC 2015" (PDF). Intel. 13 July 2015. Retrieved 28 March 2016.
  6. Intel OMNI-PATH EDGE SWITCH PRODUCTS: Intel Fabric Products
  7. Weiny, Ira (5 April 2016). "Extending RDMA for Alternative Fabrics" (PDF). 12th Annual OpenFabrics Workshop.
  8. Shah, Agam (11 October 2016). "Hardware makers unite to challenge Intel with Gen-Z spec". cio.com. CIO. Retrieved 11 October 2016.
  9. "Dell Networking H-Series Edge Switches based on the Intel Omni-Path Architecture". Archived from the original on 18 April 2017. Retrieved 25 September 2021.
  10. Martin, Dylan (31 July 2019). "Intel Kills 2nd-Gen Omni-Path Interconnect For HPC, AI Workloads". CRN. Retrieved 6 September 2019.
  11. "Intel Confirms Retreat on Omni-Path". HPCwire. 1 August 2019. Retrieved 6 September 2019.
  12. "Intel Omni-Path Business Spun Out as Cornelis Networks". HPCwire. 30 September 2020. Retrieved 6 November 2020.
  13. "Omni-Path HPC Interconnect Reemerges as Intel Spin-out with $20M Investment Round from Intel Capital, Others". HPCwire. 30 September 2020. Retrieved 6 November 2020.
  14. "Intel® Fabric Products". HPCwire. 30 September 2020. Retrieved 6 November 2020.
  15. "Cornelis Unveils Ambitious Omni-Path Interconnect Roadmap". The Next Platform. 24 August 2021. Retrieved 30 October 2023.
  16. Smith, Brian (22 August 2023). "Omni-Path and the Open Fabrics Interfaces" (PDF). 11th Annual MVAPICH User Group (MUG) Conference. Retrieved 30 October 2023.