SXM (socket)

Computing node of TSUBAME 3.0 supercomputer showing four NVIDIA Tesla P100 SXM modules
Bare SXM sockets next to sockets with GPUs installed

SXM (Server PCI Express Module) [1] is a high-bandwidth socket solution for connecting Nvidia compute accelerators to a system. Every generation of Nvidia Tesla since the P100, as well as the DGX computer series and the HGX boards, comes with an SXM socket type that provides high bandwidth and power delivery, among other functions, for the matching GPU daughter cards. [2] Nvidia offers these combinations as end-user products, for example in its DGX system series. Current socket generations are SXM for Pascal-based GPUs, SXM2 and SXM3 for Volta-based GPUs, SXM4 for Ampere-based GPUs, and SXM5 for Hopper-based GPUs. These sockets are used for specific models of these accelerators and offer higher performance per card than the PCIe equivalents. [2] The DGX-1 was the first system equipped with SXM2 sockets, and thus the first to carry form-factor-compatible SXM modules with P100 GPUs; it was later revealed that it could be upgraded to, or ordered pre-equipped with, SXM2 modules with V100 GPUs. [3] [4]

SXM boards are typically built with four or eight GPU slots, although some solutions, such as the Nvidia DGX-2, connect multiple boards to deliver higher performance. While third-party SXM boards exist, most system integrators, such as Supermicro, use prebuilt Nvidia HGX boards, which come in four- or eight-socket configurations. [5] This greatly lowers the cost and difficulty of building SXM-based GPU servers and enables compatibility and reliability across all boards of the same generation.

SXM modules, for example on HGX boards, may be connected through NVLink switches, particularly in recent generations, to allow faster GPU-to-GPU communication. This also reduces bottlenecks that would otherwise arise at the CPU and the PCIe bus. [2] [6] The GPUs on the daughter cards use NVLink as their main communication protocol. For example, a Hopper-based H100 SXM5 GPU can use up to 900 GB/s of bandwidth across 18 NVLink 4 channels, each contributing 50 GB/s; [7] by comparison, PCIe 5.0 can handle up to 64 GB/s within an x16 slot. [8] This high bandwidth also means that GPUs can share memory over the NVLink bus, allowing an entire HGX board to present itself to the host system as a single, massive GPU. [9]
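The bandwidth comparison above is simple arithmetic and can be checked with a short sketch. The constants are the figures quoted in the paragraph; the function and variable names are illustrative only and do not come from any Nvidia tool or API.

```python
# Figures quoted in the text; all names here are hypothetical.
NVLINK4_LINKS = 18           # NVLink 4 channels on an H100 SXM5 module
NVLINK4_GB_S_PER_LINK = 50   # GB/s contributed by each channel
PCIE5_X16_GB_S = 64          # GB/s available in a PCIe 5.0 x16 slot


def aggregate_bandwidth(links: int, gb_s_per_link: float) -> float:
    """Total bandwidth in GB/s when every link runs at its rated speed."""
    return links * gb_s_per_link


nvlink_total = aggregate_bandwidth(NVLINK4_LINKS, NVLINK4_GB_S_PER_LINK)
ratio = nvlink_total / PCIE5_X16_GB_S

print(f"NVLink 4 aggregate: {nvlink_total} GB/s")
print(f"Ratio to PCIe 5.0 x16: {ratio:.1f}x")
```

Under these figures the aggregate NVLink bandwidth works out to 900 GB/s, roughly fourteen times what a single PCIe 5.0 x16 slot provides. These are rated peak values, not measured throughput.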

Power delivery is also handled by the SXM socket, removing the need for the external power cables that PCIe-equivalent cards require. This, combined with the horizontal mounting, allows more efficient cooling, which in turn lets SXM-based GPUs operate at a much higher TDP. The Hopper-based H100, for example, can draw up to 700 W solely from the SXM socket. [10] The lack of cabling also makes assembling and repairing large systems much easier and reduces the number of possible points of failure. [2]
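To give a sense of scale, the sketch below multiplies the 700 W per-module figure by the four- and eight-socket board configurations mentioned earlier. It is a rough upper bound for the GPUs alone, assuming every module runs at its full TDP; it ignores CPUs, switches, fans and power-conversion losses, and the names used are hypothetical.

```python
# Per-module TDP quoted in the text for the H100 SXM5 (watts).
H100_SXM5_TDP_W = 700
# Socket counts of the HGX board configurations mentioned in the article.
HGX_SOCKET_COUNTS = (4, 8)


def board_gpu_power(sockets: int, tdp_watts: int) -> int:
    """Upper bound on GPU power for a fully populated board, in watts."""
    return sockets * tdp_watts


budgets = {n: board_gpu_power(n, H100_SXM5_TDP_W) for n in HGX_SOCKET_COUNTS}

for sockets, watts in budgets.items():
    print(f"{sockets} sockets: {watts} W ({watts / 1000:.1f} kW) for GPUs alone")
```

A fully populated eight-socket board would therefore need to deliver and dissipate up to 5.6 kW through the sockets and their cooling, which illustrates why cable-free power delivery matters at this scale.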

The early Nvidia Tegra evaluation board for automotive use, the 'Drive PX2', had two MXM (Mobile PCI Express Module) sockets on both sides of the card. This dual-MXM design can be considered a predecessor to the Nvidia Tesla implementation of the SXM socket.

Comparison of accelerators used in DGX: [11] [12] [13]

| Model | Architecture | Socket | FP32 CUDA cores | FP64 cores (excl. tensor) | Mixed INT32/FP32 cores | INT32 cores | Boost clock | Memory clock | Memory bus width | Memory bandwidth | VRAM | Single precision (FP32) | Double precision (FP64) | INT8 (non-tensor) | INT8 dense tensor | INT32 | FP4 dense tensor | FP16 | FP16 dense tensor | bfloat16 dense tensor | TensorFloat-32 (TF32) dense tensor | FP64 dense tensor | Interconnect (NVLink) | GPU | L1 Cache | L2 Cache | TDP | Die size | Transistor count | Process |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B200 | Blackwell | N/A | N/A | N/A | N/A | N/A | N/A | 8 Gbit/s HBM3e | 8192-bit | 8 TB/sec | 192 GB HBM3e | N/A | N/A | N/A | 4.5 POPS | N/A | 9 PFLOPS | N/A | 2.2 PFLOPS | 2.2 PFLOPS | 1.1 PFLOPS | 40 TFLOPS | 1.8 TB/sec | GB200 | N/A | N/A | 1000 W | N/A | 208 B | TSMC 4NP |
| B100 | Blackwell | N/A | N/A | N/A | N/A | N/A | N/A | 8 Gbit/s HBM3e | 8192-bit | 8 TB/sec | 192 GB HBM3e | N/A | N/A | N/A | 3.5 POPS | N/A | 7 PFLOPS | N/A | 1.8 PFLOPS | 1.8 PFLOPS | 900 TFLOPS | 30 TFLOPS | 1.8 TB/sec | GB100 | N/A | N/A | 700 W | N/A | 208 B | TSMC 4NP |
| H200 | Hopper | SXM5 | 16896 | 4608 | 16896 | N/A | 1980 MHz | 6.3 Gbit/s HBM3e | 6144-bit | 4.8 TB/sec | 141 GB HBM3e | 67 TFLOPS | 34 TFLOPS | N/A | 1.98 POPS | N/A | N/A | N/A | N/A | N/A | 989 TFLOPS | N/A | 900 GB/sec | GH100 | 25344 KB (192 KB × 132) | 51200 KB | 700 W | 814 mm² | 80 B | TSMC 4N |
| H100 | Hopper | SXM5 | 16896 | 4608 | 16896 | N/A | 1980 MHz | 5.2 Gbit/s HBM3 | 5120-bit | 3.35 TB/sec | 80 GB HBM3 | 67 TFLOPS | 34 TFLOPS | N/A | 1.98 POPS | N/A | N/A | N/A | 990 TFLOPS | 990 TFLOPS | 495 TFLOPS | 67 TFLOPS | 900 GB/sec | GH100 | 25344 KB (192 KB × 132) | 51200 KB | 700 W | 814 mm² | 80 B | TSMC 4N |
| A100 80GB | Ampere | SXM4 | 6912 | 3456 | 6912 | N/A | 1410 MHz | 3.2 Gbit/s HBM2e | 5120-bit | 1.52 TB/sec | 80 GB HBM2e | 19.5 TFLOPS | 9.7 TFLOPS | N/A | 624 TOPS | 19.5 TOPS | N/A | 78 TFLOPS | 312 TFLOPS | 312 TFLOPS | 156 TFLOPS | 19.5 TFLOPS | 600 GB/sec | GA100 | 20736 KB (192 KB × 108) | 40960 KB | 400 W | 826 mm² | 54.2 B | TSMC 7N |
| A100 40GB | Ampere | SXM4 | 6912 | 3456 | 6912 | N/A | 1410 MHz | 2.4 Gbit/s HBM2 | 5120-bit | 1.52 TB/sec | 40 GB HBM2 | 19.5 TFLOPS | 9.7 TFLOPS | N/A | 624 TOPS | 19.5 TOPS | N/A | 78 TFLOPS | 312 TFLOPS | 312 TFLOPS | 156 TFLOPS | 19.5 TFLOPS | 600 GB/sec | GA100 | 20736 KB (192 KB × 108) | 40960 KB | 400 W | 826 mm² | 54.2 B | TSMC 7N |
| V100 32GB | Volta | SXM3 | 5120 | 2560 | N/A | 5120 | 1530 MHz | 1.75 Gbit/s HBM2 | 4096-bit | 900 GB/sec | 32 GB HBM2 | 15.7 TFLOPS | 7.8 TFLOPS | 62 TOPS | N/A | 15.7 TOPS | N/A | 31.4 TFLOPS | 125 TFLOPS | N/A | N/A | N/A | 300 GB/sec | GV100 | 10240 KB (128 KB × 80) | 6144 KB | 350 W | 815 mm² | 21.1 B | TSMC 12 nm FFN |
| V100 16GB | Volta | SXM2 | 5120 | 2560 | N/A | 5120 | 1530 MHz | 1.75 Gbit/s HBM2 | 4096-bit | 900 GB/sec | 16 GB HBM2 | 15.7 TFLOPS | 7.8 TFLOPS | 62 TOPS | N/A | 15.7 TOPS | N/A | 31.4 TFLOPS | 125 TFLOPS | N/A | N/A | N/A | 300 GB/sec | GV100 | 10240 KB (128 KB × 80) | 6144 KB | 300 W | 815 mm² | 21.1 B | TSMC 12 nm FFN |
| P100 | Pascal | SXM/SXM2 | N/A | 1792 | 3584 | N/A | 1480 MHz | 1.4 Gbit/s HBM2 | 4096-bit | 720 GB/sec | 16 GB HBM2 | 10.6 TFLOPS | 5.3 TFLOPS | N/A | N/A | N/A | N/A | 21.2 TFLOPS | N/A | N/A | N/A | N/A | 160 GB/sec | GP100 | 1344 KB (24 KB × 56) | 4096 KB | 300 W | 610 mm² | 15.3 B | TSMC 16 nm FinFET+ |
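
Several columns in the comparison are related by simple formulas. The sketch below shows two of them, using values copied from the H100, V100 32GB and P100 rows: memory bandwidth as per-pin data rate times bus width, and peak FP32 throughput as cores times boost clock times two. The factor of two assumes one fused multiply-add (two floating-point operations) per core per clock, which is a common convention and an assumption here rather than something the table states. The function names and the dictionary layout are illustrative.

```python
def memory_bandwidth_gb_s(pin_rate_gbit_s: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: per-pin rate times bus width, bits to bytes."""
    return pin_rate_gbit_s * bus_width_bits / 8


def peak_fp32_tflops(cores: int, boost_mhz: float) -> float:
    """Peak FP32 TFLOPS, assuming 2 FLOPs (one FMA) per core per clock."""
    return 2 * cores * boost_mhz * 1e6 / 1e12


# Values taken from the table. For the P100 the core count comes from the
# "Mixed INT32/FP32 cores" column, since its FP32 column is listed as N/A.
rows = {
    "H100": {"pin": 5.2, "bus": 5120, "cores": 16896, "mhz": 1980},
    "V100 32GB": {"pin": 1.75, "bus": 4096, "cores": 5120, "mhz": 1530},
    "P100": {"pin": 1.4, "bus": 4096, "cores": 3584, "mhz": 1480},
}

for model, r in rows.items():
    bandwidth = memory_bandwidth_gb_s(r["pin"], r["bus"])
    fp32 = peak_fp32_tflops(r["cores"], r["mhz"])
    print(f"{model}: {bandwidth:.0f} GB/s memory bandwidth, {fp32:.1f} TFLOPS FP32")
```

For the H100 this gives about 3.3 TB/s and 66.9 TFLOPS, close to the 3.35 TB/sec and 67 TFLOPS listed. Small differences are expected because the listed clocks and data rates are themselves rounded.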

References

  1. Brown, W. Michael; et al. (2012). "An Evaluation of Molecular Dynamics Performance on the Hybrid Cray XK6 Supercomputer". Procedia Computer Science. 9: 186–195. doi:10.1016/j.procs.2012.04.020.
  2. Proud, Matt. "Achieving Maximum Compute Throughput: PCIe vs. SXM2". The Next Platform. Retrieved 2022-03-31.
  3. Volta architecture whitepaper. nvidia.com.
  4. DGX 1 User Guide. nvidia.com.
  5. servethehome (2020-05-14). "NVIDIA A100 4x GPU HGX Redstone Platform". ServeTheHome. Retrieved 2022-03-31.
  6. "NVLink & NVSwitch for Advanced Multi-GPU Communication". NVIDIA.
  7. "Nvidia's H100 – What It Is, What It Does, and Why It Matters". Data Center Knowledge. 2022-03-23. Retrieved 2022-03-31.
  8. "Is PCIe 5.0 Worth It? The Benefits of PCIe 5.0 (2022)". www.techreviewer.com. Retrieved 2022-03-31.
  9. "NVIDIA HGX A100: Powered by A100 GPUs and NVSwitch". NVIDIA. Retrieved 2022-03-31.
  10. "NVIDIA H100 GPU full details: TSMC N4, HBM3, PCIe 5.0, 700W TDP, more". TweakTown. 2022-03-23. Retrieved 2022-03-31.
  11. Smith, Ryan (March 22, 2022). "NVIDIA Hopper GPU Architecture and H100 Accelerator Announced: Working Smarter and Harder". AnandTech.
  12. Smith, Ryan (May 14, 2020). "NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator". AnandTech.
  13. "NVIDIA Tesla V100 tested: near unbelievable GPU power". TweakTown. September 17, 2017.