Blackwell (microarchitecture)

Blackwell
Launched	Q4 2024
Designed by	Nvidia
Manufactured by	TSMC
Fabrication process	TSMC 4NP (Datacenter) ; TSMC 4N (Consumer)
Codenames	GB100 ; GB20x
Product Series
Desktop	GeForce RTX 50 series
Professional/workstation	RTX PRO Blackwell series
Specifications
Memory support	GDDR7 (Consumer) ; HBM3E (Datacenter)
PCIe support	PCIe 5.0 (Consumer) ; PCIe 6.0 (Datacenter)
Supported Graphics APIs
DirectX	DirectX 12 Ultimate (Feature Level 12_2)
Direct3D	Direct3D 12
Shader Model	Shader Model 6.8
OpenGL	OpenGL 4.6
Vulkan	Vulkan 1.4
Supported Compute APIs
OpenCL	OpenCL 3.0 (64-bit only, 32-bit support removed)
CUDA	Compute Capability 10.x (64-bit only, 32-bit support removed); Compute Capability 12.x (64-bit only)
DirectCompute	Yes
Media Engine
Encoder supported	NVENC
History
Predecessor	Ada Lovelace (consumer); Hopper (datacenter)
Successor	Rubin

Last updated February 27, 2026

Blackwell is a graphics processing unit (GPU) microarchitecture developed by Nvidia as the successor to the Hopper and Ada Lovelace microarchitectures.

The architecture is named after statistician and mathematician David Blackwell. Its name was leaked in 2022, and the B40 and B100 accelerators were confirmed in October 2023 on an official Nvidia roadmap shown during an investor presentation.^[4] Nvidia officially announced Blackwell at its GTC 2024 keynote on March 18, 2024.^[5]

History

In March 2022, Nvidia announced the Hopper datacenter architecture for AI accelerators. Demand for Hopper products stayed high throughout the AI hype of 2023.^[6] The lead time from order to delivery of H100-based servers was between 36 and 52 weeks due to shortages and high demand.^[7] Nvidia reportedly sold 500,000 Hopper-based H100 accelerators in Q3 2023 alone.^[7] Nvidia's AI dominance with Hopper products helped lift the company's market capitalization to over $2 trillion, behind only Microsoft and Apple.^[8]

The Blackwell architecture is named after American mathematician David Blackwell, who was known for his contributions to game theory, probability theory, information theory, and statistics. These fields have influenced, or are implemented in, transformer-based generative AI model designs and their training algorithms. Blackwell was the first African American scholar to be inducted into the National Academy of Sciences.^[9]

In its October 2023 investor presentation, Nvidia updated its datacenter roadmap to include references to the B100 and B40 accelerators and the Blackwell architecture.^[10]^[11] Previously, the successor to Hopper had been named on roadmaps only as "Hopper-Next". The updated roadmap emphasized a move from a two-year release cadence for datacenter products to yearly releases targeted at x86 and ARM systems.

At the GPU Technology Conference (GTC) on March 18, 2024, Nvidia officially announced the Blackwell architecture, focusing on its B100 and B200 datacenter accelerators and associated products such as the eight-GPU HGX B200 board and the 72-GPU NVL72 rack-scale system.^[12] Nvidia CEO Jensen Huang said that with Blackwell, "we created a processor for the generative AI era" and emphasized the overall Blackwell platform, which combines Blackwell accelerators with Nvidia's ARM-based Grace CPU.^[13]^[14] Nvidia touted endorsements of Blackwell from the CEOs of Google, Meta, Microsoft, OpenAI and Oracle.^[14] The keynote did not mention gaming.

It was reported in October 2024 that there was a design flaw in the Blackwell architecture that had been fixed in collaboration with TSMC.^[15] According to Huang, the design flaw was "functional" and "caused the yield[s] to be low".^[16] By November 2024, Morgan Stanley was reporting that "the entire 2025 production" of Blackwell silicon was "already sold out".^[17]

During its CES 2025 keynote, Nvidia announced that the foundation models for Blackwell would include models from Black Forest Labs (Flux), Meta AI, Mistral AI, and Stability AI.^[18]

Architecture

Blackwell is an architecture designed both for datacenter compute applications and for gaming and workstation applications, with dedicated dies for each purpose.

Process node

Blackwell is fabricated on TSMC's custom 4NP process node for datacenter products and on the custom 4N process node for consumer products. 4NP is an enhancement of the 4N node used for the Hopper and Ada Lovelace architectures. The Nvidia-specific 4NP process likely adds metal layers to the standard TSMC N4P technology.^[19] The GB100 die contains 104 billion transistors, a 30% increase over the 80 billion transistors in the previous-generation Hopper GH100 die.^[20] Because Blackwell does not benefit from a major process node advancement, it must achieve its power efficiency and performance gains through architectural changes.^[21]

The GB100 die is at the reticle limit of semiconductor fabrication.^[22] The reticle limit is the maximum area that a lithography machine can pattern in a single exposure, which caps the size of a monolithic die. Nvidia had already come close to TSMC's reticle limit with GH100's 814 mm² die. To avoid being constrained by die size, Nvidia's B100 accelerator uses two GB100 dies in a single package, connected by a 10 TB/s link that Nvidia calls the NV-High Bandwidth Interface (NV-HBI). NV-HBI is based on the NVLink 7 protocol. Nvidia CEO Jensen Huang said in an interview with CNBC that Nvidia had spent around $10 billion on research and development for Blackwell's NV-HBI die interconnect. Veteran semiconductor engineer Jim Keller, who had worked on AMD's K7, K12 and Zen architectures, criticized this figure and claimed that the same outcome could be achieved for $1 billion by using Ultra Ethernet rather than the proprietary NVLink system.^[23] The two connected GB100 dies act like one large monolithic piece of silicon, with full cache coherency between them.^[24] The dual-die package totals 208 billion transistors.^[22] The two GB100 dies sit on a silicon interposer produced using TSMC's CoWoS-L 2.5D packaging technique.^[25]
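The transistor figures quoted in this section can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch in Python; the variable and function names are chosen here for the example and do not come from Nvidia or any cited source.

```python
# Transistor counts quoted in the text, in billions.
GH100_TRANSISTORS_BN = 80    # Hopper GH100 die
GB100_TRANSISTORS_BN = 104   # Blackwell GB100 die
DIES_PER_PACKAGE = 2         # the B100 package joins two GB100 dies


def percent_increase(old: float, new: float) -> float:
    """Relative growth from old to new, expressed in percent."""
    return (new - old) / old * 100.0


die_increase_pct = percent_increase(GH100_TRANSISTORS_BN, GB100_TRANSISTORS_BN)
package_transistors_bn = GB100_TRANSISTORS_BN * DIES_PER_PACKAGE
package_vs_gh100 = package_transistors_bn / GH100_TRANSISTORS_BN

print(f"GB100 vs GH100, per die: +{die_increase_pct:.0f}%")
print(f"Dual-die package: {package_transistors_bn} billion transistors")
print(f"Dual-die package vs one GH100 die: {package_vs_gh100:.1f}x")
```

The per-die increase and the dual-die total computed this way agree with the 30% and 208 billion figures given in the text.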

On the consumer side, Blackwell's largest die, GB202, measures 750 mm², roughly 20% larger than AD102, Ada Lovelace's largest die.^[26] GB202 contains a total of 24,576 CUDA cores, 33.3% more than the 18,432 CUDA cores in AD102. GB202 is the largest consumer die designed by Nvidia since the 754 mm² TU102 die of 2018, which was based on the Turing microarchitecture. The gap between GB202 and GB203 is also much wider than in previous generations: GB202 has more than twice as many CUDA cores as GB203, which was not the case for AD102 relative to AD103.
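The core-count comparisons in this paragraph follow directly from the published figures. As a minimal sketch, the Python below recomputes them from the AD102 and GB202 counts quoted above and the GB203 count listed in the consumer die table; the dictionary and helper names are illustrative only.

```python
# CUDA core counts for the full dies, as quoted in the article.
CUDA_CORES = {
    "AD102": 18_432,  # Ada Lovelace, largest die
    "GB202": 24_576,  # Blackwell, largest consumer die
    "GB203": 10_752,  # Blackwell, second-largest consumer die
}


def percent_more(new: int, old: int) -> float:
    """How many percent larger new is than old."""
    return (new - old) / old * 100.0


gb202_vs_ad102_pct = percent_more(CUDA_CORES["GB202"], CUDA_CORES["AD102"])
gb202_over_gb203 = CUDA_CORES["GB202"] / CUDA_CORES["GB203"]

print(f"GB202 vs AD102: +{gb202_vs_ad102_pct:.1f}% CUDA cores")
print(f"GB202 / GB203: {gb202_over_gb203:.2f}x CUDA cores")
```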

Streaming multiprocessor

CUDA cores

CUDA Compute Capability 10.0 and Compute Capability 12.0 are added with Blackwell.^[27]

Tensor Cores

The Blackwell architecture introduces fifth-generation Tensor Cores for AI compute and floating-point calculations. In the data center, Blackwell adds native support for sub-8-bit data types, including the new Open Compute Project (OCP) community-defined MXFP6 and MXFP4 microscaling formats, to improve efficiency and accuracy in low-precision computations.^[28]^[29]^[30]^[31]^[32] The previous Hopper architecture introduced the Transformer Engine, software that facilitates quantization of higher-precision models (e.g., FP32) to lower precision, for which Hopper has greater throughput. Blackwell's second-generation Transformer Engine adds support for MXFP4 and MXFP6. Using 4-bit data allows greater efficiency and throughput for model inference during generative AI training. Nvidia claims 20 petaflops of FP4 compute (excluding the 2x gain the company claims for sparsity) for the dual-GPU GB200 superchip.^[33]
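The microscaling idea behind MXFP4 can be sketched in software: a block of values shares a single power-of-two scale, and each value is stored as a 4-bit float. The Python below is a simplified illustration and not Nvidia's hardware implementation or the Transformer Engine API. It assumes the parameters of the OCP MX v1.0 specification (blocks of 32 elements, E2M1 element encoding), and the function names `quantize_block` and `dequantize_block` are hypothetical. It breaks rounding ties toward the smaller magnitude instead of round-to-nearest-even, and it does not model NaN handling or the limited range of the 8-bit shared exponent.

```python
import math

BLOCK_SIZE = 32  # elements sharing one scale (assumed from the OCP MX spec)
# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit).
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
FP4_EMAX = 2  # exponent of the largest E2M1 binade (4.0 == 2**2)


def quantize_block(values: list[float]) -> tuple[int, list[float]]:
    """Return (shared exponent, FP4-representable elements) for one block."""
    if len(values) != BLOCK_SIZE:
        raise ValueError(f"expected {BLOCK_SIZE} values, got {len(values)}")
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 0, [0.0] * BLOCK_SIZE
    # Choose the power-of-two scale so the largest value lands in the top binade.
    shared_exp = math.floor(math.log2(max_abs)) - FP4_EMAX
    scale = 2.0 ** shared_exp
    elements = []
    for v in values:
        scaled = abs(v) / scale
        # Nearest representable magnitude; anything above 6.0 saturates at 6.0.
        nearest = min(FP4_E2M1_MAGNITUDES, key=lambda m: abs(m - scaled))
        elements.append(math.copysign(nearest, v))
    return shared_exp, elements


def dequantize_block(shared_exp: int, elements: list[float]) -> list[float]:
    """Reconstruct approximate values from the shared scale and elements."""
    scale = 2.0 ** shared_exp
    return [e * scale for e in elements]


# Example block: 32 evenly spaced values from -2.0 to 1.875.
values = [i / 8 - 2 for i in range(BLOCK_SIZE)]
shared_exp, elements = quantize_block(values)
restored = dequantize_block(shared_exp, elements)
max_error = max(abs(a - b) for a, b in zip(values, restored))
print(f"shared exponent: {shared_exp}, max abs error: {max_error}")
```

Under these assumptions a block costs 32 × 4 bits for the elements plus 8 bits for the shared scale, or 4.25 bits per value, which is where the storage and bandwidth savings over 8-bit and 16-bit formats come from.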

Ray Tracing Cores

Blackwell introduces fourth-generation ray tracing cores, which include a new Triangle Cluster Intersection Engine for Mega Geometry and Linear Swept Spheres for accelerated ray tracing of fine details such as hair.^[2]

AI Management Processor

Blackwell introduces an AI Management Processor (AMP), a dedicated scheduler chip on the GPU built on RISC-V.^[2] It is designed to offload scheduling from the CPU to a greater degree than previous generations did, and it helps the GPU better control its own resources. It is utilized through Windows Hardware-Accelerated GPU Scheduling (HAGS).

Blackwell dies

Datacenter

Die		GB100	GB102	GB200
Variant(s)		Unknown	Unknown	Unknown
Release date		Dec 2024	Nov 2024	Unknown
Cores	CUDA Cores	18,432
	TMUs	576
	ROPs	24
	RT Cores	Unknown	Unknown	Unknown
	Tensor Cores	576
Streaming Multiprocessors		Unknown	Unknown	Unknown
Cache	L1	8.25 MB
Cache	L2	60 MB
Memory interface		8192-bit
Die size		Unknown	Unknown	Unknown
Transistor count		104 bn.
Transistor density		Unknown	Unknown	Unknown
Package socket		SXM6
Products		B200 SXM 192GB	B100	Unknown

Consumer

Die		GB10	GB202	GB203	GB205	GB206	GB207
Variant(s)			GB202-300-A1	GB203-200-A1 GB203-300-A1 GB203-400-A1	GB205-300-A1	GB206-250-A1 GB206-300-A1	GB207-300-A1
Release date		Oct 15, 2025	Jan 30, 2025	Jan 30, 2025	Mar 4, 2025	Apr 16, 2025	Jun 24, 2025
Cores	CUDA Cores	6,144	24,576	10,752	6,400	4,608	2,560
	TMUs	384	768	336	200	144	80
	ROPs	48	192	112	80	48	32
	RT Cores	48	192	84	50	36	20
	Tensor Cores	384	768	336	200	144	80
SMs		48	192	84	50	36	20
GPCs			12	7	5	3	2
Cache	L1	128 KB (per SM)	24 MB	10.5 MB	6.25 MB	4.5 MB	2.5 MB
Cache	L2	50 MB	128 MB	64 MB	48 MB	32 MB	32 MB
Memory interface		256-bit	512-bit	256-bit	192-bit	128-bit	128-bit
Die size		Unknown	750 mm²	378 mm²	263 mm²	181 mm²	149 mm²
Transistor count		Unknown	92.2 bn.	45.6 bn.	31.1 bn.	21.9 bn.	16.9 bn.
Transistor density		Unknown	122.6 MTr/mm²	120.6 MTr/mm²	118.3 MTr/mm²	121.0 MTr/mm²	113.4 MTr/mm²
Products
Consumer	Desktop	N/A	RTX 5090 RTX 5090 D	RTX 5070 Ti RTX 5080	RTX 5070	RTX 5060 RTX 5060 Ti	RTX 5050
Consumer	Mobile	N/A	RTX 5070 Ti Laptop	RTX 5060 Laptop RTX 5070 Laptop	RTX 5050 Laptop
Workstation	Desktop	DGX Spark GB20B	RTX PRO 5000 RTX PRO 6000	RTX PRO 4000 RTX PRO 4500	N/A	N/A	N/A
Workstation	Mobile	N/A	RTX PRO 3000 Mobile	RTX PRO 2000 Mobile	RTX PRO 500 Mobile RTX PRO 1000 Mobile
Server			RTX PRO 6000	N/A	N/A	N/A	N/A
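The consumer table lists whole-die totals, but the figures for the GB202 through GB207 dies imply fixed per-SM ratios. As a rough consistency check, the Python sketch below divides each total by the SM count; the data structure and function name are illustrative, and the values are copied from the table above.

```python
# die: (CUDA cores, Tensor Cores, RT Cores, SMs, total L1 cache in MB)
CONSUMER_DIES = {
    "GB202": (24_576, 768, 192, 192, 24.0),
    "GB203": (10_752, 336, 84, 84, 10.5),
    "GB205": (6_400, 200, 50, 50, 6.25),
    "GB206": (4_608, 144, 36, 36, 4.5),
    "GB207": (2_560, 80, 20, 20, 2.5),
}


def per_sm(die: str) -> dict[str, float]:
    """Divide the whole-die totals by the SM count for one die."""
    cuda, tensor, rt, sms, l1_mb = CONSUMER_DIES[die]
    return {
        "cuda_per_sm": cuda / sms,
        "tensor_per_sm": tensor / sms,
        "rt_per_sm": rt / sms,
        "l1_kb_per_sm": l1_mb * 1024 / sms,
    }


for die in CONSUMER_DIES:
    print(die, per_sm(die))
```

Each die works out to 128 CUDA cores, 4 Tensor Cores, 1 RT Core and 128 KB of L1 cache per SM, which matches the "128 KB (per SM)" entry given for GB10.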

References

[1] "NVIDIA Blackwell Architecture Technical Brief". NVIDIA. Retrieved February 2, 2025.
[2] "NVIDIA RTX BLACKWELL GPU ARCHITECTURE" (PDF). NVIDIA. Retrieved February 2, 2025.
[3] Deakin, Daniel R (February 26, 2025). "Strange GeForce RTX 50-series benchmark issues solved but new problem for gamers arises". Notebookcheck. Retrieved February 26, 2025.
[4] "Nvidia Corporation - Nvidia Investor Presentation October 2023". Nvidia. Retrieved March 19, 2024.
[5] "Nvidia Blackwell Platform Arrives to Power a New Era of Computing". Nvidia Newsroom. Retrieved March 19, 2024.
[6] Szewczyk, Chris (August 18, 2023). "The AI hype means Nvidia is making shiploads of cash". Tom's Hardware. Retrieved March 24, 2024.
[7] Shilov, Anton (November 28, 2023). "Nvidia sold half a million H100 AI GPUs in Q3 thanks to Meta, Facebook — lead times stretch up to 52 weeks: Report". Tom's Hardware. Retrieved March 24, 2024.
[8] King, Ian (March 19, 2024). "Nvidia Looks to Extend AI Dominance With New Blackwell Chips". Yahoo! Finance. Retrieved March 24, 2024.
[9] Lee, Jane Lanhee (March 19, 2024). "Why Nvidia's New Blackwell Chip Is Key to the Next Stage of AI". Bloomberg. Retrieved March 24, 2024.
[10] "Investor Presentation" (PDF). Nvidia. October 2023. Retrieved March 24, 2024.
[11] Garreffa, Anthony (October 10, 2023). "Nvidia's next-gen GB200 'Blackwell' GPU listed on its 2024 data center roadmap". TweakTown. Retrieved March 24, 2024.
[12] "Nvidia GB200 NVL72". Nvidia. Retrieved July 4, 2024.
[13] Leswing, Kif (March 18, 2024). "Nvidia CEO Jensen Huang announces new AI chips: 'We need bigger GPUs'". CNBC. Retrieved March 24, 2024.
[14] Caulfield, Brian (March 18, 2024). "'We Created a Processor for the Generative AI Era,' Nvidia CEO Says". Nvidia. Retrieved March 24, 2024.
[15] Gronholt-Pedersen, Jacob; Mukherjee, Supantha (October 23, 2024). "Nvidia's design flaw with Blackwell AI chips now fixed, CEO says". Reuters. Retrieved December 17, 2024.
[16] Shilov, Anton (October 23, 2024). "Nvidia's Jensen Huang admits AI chip design flaw was '100% Nvidia's fault' — TSMC not to blame, now-fixed Blackwell chips are in production". Tom's Hardware. Retrieved December 17, 2024.
[17] Kahn, Jeremy (November 12, 2024). "60 direct reports, but no 1-on-1 meetings: How an unconventional leadership style helped Jensen Huang of Nvidia become one of the most powerful people in business". Fortune. Retrieved November 16, 2024.
[18] Takahashi, Dean (January 7, 2025). "Nvidia unveils AI foundation models running on RTX AI PCs". VentureBeat. Retrieved January 19, 2025.
[19] Byrne, Joseph (March 28, 2024). "Monster Nvidia Blackwell GPU Promises 30× Speedup, but Expect 3×". XPU.pub. Retrieved July 4, 2024.
[20] Smith, Ryan (March 18, 2024). "Nvidia Blackwell Architecture and B200/B100 Accelerators Announced: Going Bigger With Smaller Data". AnandTech. Archived from the original on March 18, 2024. Retrieved March 24, 2024.
[21] Prickett Morgan, Timothy (March 18, 2024). "With Blackwell GPUs, AI Gets Cheaper and Easier, Competing with Nvidia Gets Harder". The Next Platform. Retrieved March 24, 2024.
[22] "Nvidia Blackwell Platform Arrives to Power a New Era of Computing". Nvidia Newsroom. March 18, 2024. Retrieved March 24, 2024.
[23] Garreffa, Anthony (April 14, 2024). "Jim Keller laughs at $10B R&D cost for Nvidia Blackwell, should've used ethernet for $1B". TweakTown. Retrieved April 16, 2024.
[24] Hagedoorn, Hilbert (March 18, 2024). "Nvidia B200 and GB200 AI GPUs Technical Overview: Unveiled at GTC 2024". Guru3D. Retrieved April 7, 2024.
[25] "Nvidia Blackwell "B100" to feature 2 dies and 192GB of HBM3e memory, B200 with 288GB". VideoCardz. March 17, 2024. Retrieved March 24, 2024.
[26] "Nvidia GeForce RTX 5090 GB202 GPU die reportedly measures 744 mm2, 20% larger than AD102". VideoCardz. November 22, 2024. Retrieved January 7, 2025.
[27] "CUDA C Programming Guide". Nvidia. Retrieved January 28, 2025.
[28] Edwards, Benj (March 18, 2024). "Nvidia unveils Blackwell B200, the "world's most powerful chip" designed for AI". Ars Technica. Retrieved March 24, 2024.
[29] "Blackwell Architecture". Nvidia. Retrieved February 5, 2025.
[30] Rouhani, Bita Darvish; Zhao, Ritchie; More, Ankit; Hall, Mathew; Khodamoradi, Alireza; Deng, Summer; Choudhary, Dhruv; Cornea, Marius; Dellinger, Eric; Denolf, Kristof (2023). "Microscaling Data Formats for Deep Learning". arXiv: 2310.10537 [cs.LG].
[31] "OCP Microscaling Formats (MX) v1.0 Specification". Open Compute Project. 2024. Retrieved February 5, 2025.
[32] "OpenAI Triton on NVIDIA Blackwell Boosts AI Performance and Programmability". NVIDIA Developer Blog. NVIDIA. 2024. Retrieved February 5, 2025.
[33] "Nvidia GB200 NVL72". Nvidia. Retrieved July 4, 2024.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
