Launched | October 12, 2022 |
---|---|
Designed by | Nvidia |
Manufactured by | |
Fabrication process | TSMC 4N |
Codename(s) | AD10x |
Product Series | |
Desktop | |
Professional/workstation | |
Server/datacenter | |
Specifications | |
Clock rate | 735 MHz to 2640 MHz |
L1 cache | 128 KB (per SM) |
L2 cache | 32 MB to 96 MB |
Memory support | |
Memory clock rate | 21-23 Gbit/s |
PCIe support | PCIe 4.0 |
Supported Graphics APIs | |
DirectX | DirectX 12 Ultimate (Feature Level 12_2) |
Direct3D | Direct3D 12 |
Shader Model | Shader Model 6.8 |
OpenCL | OpenCL 3.0 |
OpenGL | OpenGL 4.6 |
CUDA | Compute Capability 8.9 |
Vulkan | Vulkan 1.3 |
Supported Compute APIs | |
CUDA | CUDA Toolkit 11.6 |
DirectCompute | Yes |
Media Engine | |
Encode codecs | |
Decode codecs | |
Color bit-depth | |
Encoder(s) supported | NVENC |
Display outputs | |
History | |
Predecessor | Ampere |
Variant | Hopper (datacenter) |
Successor | Blackwell |
Support status | Supported |
Ada Lovelace, also referred to simply as Lovelace, [1] is a graphics processing unit (GPU) microarchitecture developed by Nvidia as the successor to the Ampere architecture, officially announced on September 20, 2022. It is named after the English mathematician Ada Lovelace, [2] one of the first computer programmers. Nvidia announced the architecture along with the GeForce RTX 40 series consumer GPUs [3] and the RTX 6000 Ada Generation workstation graphics card. [4] The Lovelace architecture is fabricated on TSMC's custom 4N process, which offers increased efficiency over the Samsung 8 nm and TSMC N7 processes that Nvidia used for its previous-generation Ampere architecture. [5]
The Ada Lovelace architecture follows on from the Ampere architecture, which was released in 2020. Nvidia CEO Jensen Huang announced the Ada Lovelace architecture during a GTC 2022 keynote on September 20, 2022, with the architecture powering Nvidia's GPUs for gaming, workstations and datacenters. [6]
Architectural improvements of the Ada Lovelace architecture include the following: [7]
128 CUDA cores are included in each SM.
Ada Lovelace features third-generation RT cores. The RTX 4090 has 128 RT cores, compared to 84 in the previous-generation RTX 3090 Ti. These 128 RT cores can provide up to 191 TFLOPS of compute, or 1.49 TFLOPS per RT core. [14] The Lovelace architecture adds a new stage to the ray tracing pipeline called Shader Execution Reordering (SER), which Nvidia claims provides a 2x performance improvement in ray tracing workloads. [6]
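The per-core figure follows directly from the totals quoted above. A minimal sketch of that arithmetic, using only the numbers in the text (the variable names are illustrative):

```python
# Figures quoted in the text for the RTX 4090 and RTX 3090 Ti.
rt_cores_4090 = 128
rt_cores_3090_ti = 84
total_rt_tflops_4090 = 191.0

# Throughput per RT core: total divided by core count.
per_core_tflops = total_rt_tflops_4090 / rt_cores_4090

# Relative increase in RT core count between the two cards.
core_increase_percent = (rt_cores_4090 / rt_cores_3090_ti - 1) * 100

print(f"{per_core_tflops:.2f} TFLOPS per RT core")
print(f"{core_increase_percent:.0f}% more RT cores than the RTX 3090 Ti")
```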
Lovelace's new fourth-generation Tensor cores enable the AI technology used in DLSS 3's frame generation techniques. Much like Ampere, each SM contains 4 Tensor cores but Lovelace contains a greater number of Tensor cores overall given its increased number of SMs.
Clock speeds increase significantly with the Ada Lovelace architecture: the RTX 4090's base clock speed (2235 MHz) is higher than the boost clock speed of the RTX 3090 Ti (1860 MHz).
GPU | RTX 2080 Ti | RTX 3090 Ti | RTX 4090 |
---|---|---|---|
Architecture | Turing | Ampere | Ada Lovelace |
Base clock speed (MHz) | 1350 | 1560 | 2235 |
Boost clock speed (MHz) | 1635 | 1860 | 2520 |
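The generational uplift can be read off the clock table with a short calculation. This is a sketch using only the table values; the helper name `uplift_percent` is hypothetical.

```python
# Clock speeds in MHz, copied from the table above.
base_mhz = {"RTX 2080 Ti": 1350, "RTX 3090 Ti": 1560, "RTX 4090": 2235}
boost_mhz = {"RTX 2080 Ti": 1635, "RTX 3090 Ti": 1860, "RTX 4090": 2520}


def uplift_percent(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1) * 100


base_uplift = uplift_percent(base_mhz["RTX 4090"], base_mhz["RTX 3090 Ti"])
boost_uplift = uplift_percent(boost_mhz["RTX 4090"], boost_mhz["RTX 3090 Ti"])

print(f"Base clock uplift over RTX 3090 Ti: {base_uplift:.1f}%")
print(f"Boost clock uplift over RTX 3090 Ti: {boost_uplift:.1f}%")

# The RTX 4090 base clock exceeds the RTX 3090 Ti boost clock.
print(base_mhz["RTX 4090"] > boost_mhz["RTX 3090 Ti"])
```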
GPU | RTX 2080 Ti | RTX 3090 Ti | RTX 4090 |
---|---|---|---|
Architecture | Turing | Ampere | Ada Lovelace |
L1 Data Cache | 6.375 MB (96 KB per SM) | 10.5 MB (128 KB per SM) | 16 MB (128 KB per SM) |
L2 Cache | 5.5 MB | 6 MB | 72 MB |
The fully enabled AD102 Lovelace die features 96 MB of L2 cache, a 16x increase from the 6 MB in the Ampere-based GA102 die. [15] Quick access to a large L2 cache benefits complex operations like ray tracing, because the GPU fetches data from the slower GDDR video memory less often. Because important and frequently accessed data can stay in the cache, the GPU relies less on video memory, which allows a narrower memory bus to be paired with a large L2 cache.
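The cache figures are internally consistent, which a short check makes visible. The sketch below derives the SM count implied by each L1 total in the cache table and the L2 ratios relative to GA102's 6 MB; the function name `implied_sm_count` is hypothetical, and 1 MB is taken as 1024 KB.

```python
# GPU: (total L1 in MB, L1 per SM in KB), as listed in the cache table.
L1_TABLE = {
    "RTX 2080 Ti": (6.375, 96),
    "RTX 3090 Ti": (10.5, 128),
    "RTX 4090": (16.0, 128),
}


def implied_sm_count(total_mb, per_sm_kb):
    """Number of SMs implied by a total L1 size and a per-SM L1 size."""
    return total_mb * 1024 / per_sm_kb


sm_counts = {gpu: implied_sm_count(*row) for gpu, row in L1_TABLE.items()}

# L2 cache relative to the 6 MB of the Ampere-based GA102.
l2_ratio_full_ad102 = 96 / 6  # fully enabled AD102 die
l2_ratio_rtx_4090 = 72 / 6    # RTX 4090 as shipped

print(sm_counts)
print(l2_ratio_full_ad102, l2_ratio_rtx_4090)
```

The implied counts for the RTX 3090 Ti and RTX 4090 match the RT core counts quoted earlier in the article.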
Each memory controller uses a 32-bit connection, with up to 12 controllers present for a combined memory bus width of 384 bits. The Lovelace architecture can use either GDDR6 or GDDR6X memory. GDDR6X memory features on the desktop GeForce RTX 40 series, while the more energy-efficient GDDR6 memory is used on the corresponding mobile versions and on RTX 6000 Ada Generation workstation GPUs.
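The bus width and the resulting peak bandwidth can be sketched as follows. This assumes that the 21-23 Gbit/s range listed in the infobox is a per-pin data rate and uses the usual bus width times per-pin rate formula; the function names are hypothetical.

```python
BITS_PER_CONTROLLER = 32  # each memory controller uses a 32-bit connection


def bus_width_bits(controllers):
    """Combined memory bus width for a given number of controllers."""
    return controllers * BITS_PER_CONTROLLER


def bandwidth_gb_per_s(bus_bits, pin_rate_gbit_s):
    """Peak bandwidth in GB/s, assuming every pin runs at the given rate."""
    return bus_bits * pin_rate_gbit_s / 8  # 8 bits per byte


full_bus = bus_width_bits(12)
print(full_bus)  # 384

# Assumption: 21 and 23 Gbit/s are per-pin rates (range from the infobox).
for rate in (21, 23):
    print(rate, bandwidth_gb_per_s(full_bus, rate))
```

With fewer controllers enabled, the same formula gives proportionally lower figures, which is where the large L2 cache helps.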
The Ada Lovelace architecture is able to use lower voltages than its predecessor. [6] Nvidia claims a 2x performance increase for the RTX 4090 at the same 450 W used by the previous-generation flagship RTX 3090 Ti. [16]
Increased power efficiency can be attributed in part to the smaller fabrication node used by the Lovelace architecture. The Ada Lovelace architecture is fabricated on TSMC's 4N process, a process node custom-designed for Nvidia. The previous-generation Ampere architecture used Samsung's 8 nm-based 8N process node from 2018, which was two years old by the time of Ampere's launch. [17] [18] The AD102 die, with 76.3 billion transistors on 609 mm², has a transistor density of about 125.3 million per mm², a 178% increase from GA102's 45.1 million per mm².
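The density comparison can be reproduced from the AD102 transistor count and the 609 mm² die size listed in the die table, together with the GA102 density quoted in the text. A minimal sketch, with illustrative variable names:

```python
# AD102 figures from the text and the die table.
ad102_transistors = 76.3e9
ad102_area_mm2 = 609

# GA102 density as quoted in the text, in million transistors per mm^2.
ga102_density_mtr = 45.1

# Density in million transistors per mm^2.
ad102_density_mtr = ad102_transistors / ad102_area_mm2 / 1e6

# Relative increase over GA102, in percent.
increase_percent = (ad102_density_mtr / ga102_density_mtr - 1) * 100

print(f"AD102 density: {ad102_density_mtr:.1f} MTr/mm^2")
print(f"Increase over GA102: {increase_percent:.0f}%")
```

With these inputs the density comes out at about 125.3 MTr/mm², the value shown in the die table.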
The Lovelace architecture uses the new 8th-generation Nvidia NVENC video encoder, while the 7th-generation NVDEC video decoder introduced with Ampere returns. [19]
NVENC adds AV1 hardware encoding with support for up to 8K resolution at 60 FPS in 10-bit color, enabling higher video fidelity at lower bit rates than the H.264 and H.265 codecs. [20] Nvidia claims that the NVENC AV1 encoder in the Lovelace architecture is 40% more efficient than the H.264 encoder in the Ampere architecture. [21]
The Lovelace architecture received criticism for not supporting DisplayPort 2.0, which offers higher display bandwidth, and instead using the older DisplayPort 1.4a, which is limited to a peak bandwidth of 32 Gbit/s. [22] As a result, Lovelace GPUs are limited to the refresh rates that DisplayPort 1.4a supports, even when the GPU can render higher frame rates. Intel's Arc GPUs, also released in October 2022, included DisplayPort 2.0. AMD's competing RDNA 3 architecture, released just two months after Lovelace, included DisplayPort 2.1. [23]
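A rough estimate shows why the link bandwidth caps refresh rates. The sketch below compares the uncompressed pixel data rate of a display mode against the 32 Gbit/s peak quoted above. It rests on several simplifying assumptions that are not from the article: 8b/10b line coding leaves about 80% of the peak for pixel data, blanking intervals are ignored, and Display Stream Compression is not used. The display modes and function names are hypothetical.

```python
DP14A_PEAK_GBPS = 32.0  # peak link bandwidth quoted in the text
PAYLOAD_FRACTION = 0.8  # assumption: 8b/10b coding leaves 80% for pixel data


def uncompressed_gbps(width, height, refresh_hz, bits_per_pixel):
    """Active-pixel data rate in Gbit/s, ignoring blanking intervals."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def fits_dp14a(width, height, refresh_hz, bits_per_pixel):
    """True if the uncompressed mode fits within the assumed payload budget."""
    budget = DP14A_PEAK_GBPS * PAYLOAD_FRACTION
    return uncompressed_gbps(width, height, refresh_hz, bits_per_pixel) <= budget


# Hypothetical display modes chosen for illustration:
# 4K at 120 Hz with 8-bit color, and 4K at 240 Hz with 10-bit color.
for mode in [(3840, 2160, 120, 24), (3840, 2160, 240, 30)]:
    print(mode, round(uncompressed_gbps(*mode), 2), fits_dp14a(*mode))
```

Real links can carry higher modes with compression, so this is only an order-of-magnitude illustration of the limit.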
Die [24] | AD102 [25] | AD103 [26] | AD104 [27] | AD106 [28] | AD107 [29] |
---|---|---|---|---|---|
Die size | 609 mm² | 379 mm² | 294 mm² | 188 mm² | 159 mm² |
Transistors | 76.3B | 45.9B | 35.8B | 22.9B | 18.9B |
Transistor density | 125.3 MTr/mm² | 121.1 MTr/mm² | 121.8 MTr/mm² | 121.8 MTr/mm² | 118.9 MTr/mm² |
Graphics processing clusters | 12 | 7 | 5 | 3 | 2 |
Streaming multiprocessors | 144 | 80 | 60 | 36 | 24 |
CUDA cores | 18432 | 10240 | 7680 | 4608 | 3072 |
Texture mapping units | 576 | 320 | 240 | 144 | 96 |
Render output units | 192 | 112 | 80 | 48 | 32 |
Tensor cores | 576 | 320 | 240 | 144 | 96 |
RT cores | 144 | 80 | 60 | 36 | 24 |
L1 cache (128 KB per SM) | 18 MB | 10 MB | 7.5 MB | 4.5 MB | 3 MB |
L2 cache | 96 MB | 64 MB | 48 MB | 32 MB | 32 MB |
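Most rows of the die table scale linearly with the number of streaming multiprocessors. The sketch below rebuilds those rows from the SM counts. The 128 CUDA cores, 4 Tensor cores and 128 KB of L1 per SM are stated in the article, while the one RT core and 4 texture mapping units per SM are inferred from the ratios in the table. The function and constant names are hypothetical.

```python
# Streaming multiprocessors per fully enabled die, from the table above.
SM_COUNTS = {"AD102": 144, "AD103": 80, "AD104": 60, "AD106": 36, "AD107": 24}

CUDA_PER_SM = 128    # stated in the article
TENSOR_PER_SM = 4    # stated in the article
L1_KB_PER_SM = 128   # stated in the article
RT_PER_SM = 1        # inferred from the table (RT cores equal SM count)
TMU_PER_SM = 4       # inferred from the table (e.g. 576 / 144)


def derive_units(sm_count):
    """Per-die unit counts that follow from the SM count alone."""
    return {
        "cuda_cores": sm_count * CUDA_PER_SM,
        "tensor_cores": sm_count * TENSOR_PER_SM,
        "rt_cores": sm_count * RT_PER_SM,
        "tmus": sm_count * TMU_PER_SM,
        "l1_mb": sm_count * L1_KB_PER_SM / 1024,
    }


derived = {die: derive_units(sms) for die, sms in SM_COUNTS.items()}
for die, units in derived.items():
    print(die, units)
```

Render output units and L2 cache do not follow the SM count in the same way, so they are left out of the sketch.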
Type | AD107 | AD106 | AD104 | AD103 | AD102 |
---|---|---|---|---|---|
GeForce 40 Series (Desktop) | GeForce RTX 4060 | GeForce RTX 4060 Ti | GeForce RTX 4070, GeForce RTX 4070 Super, GeForce RTX 4070 Ti | GeForce RTX 4070 Ti Super, GeForce RTX 4080, GeForce RTX 4080 Super | GeForce RTX 4090 D, GeForce RTX 4090 |
GeForce 40 Series (Mobile) | GeForce RTX 4050, GeForce RTX 4060 | GeForce RTX 4070 | GeForce RTX 4080 | GeForce RTX 4090 | — |
Nvidia Workstation GPUs (Desktop) | RTX 2000 Ada Generation | — | RTX 4000 Ada Generation, RTX 4000 SFF Ada Generation, RTX 4500 Ada Generation | — | RTX 5000 Ada Generation, RTX 5880 Ada Generation, RTX 6000 Ada Generation |
Nvidia Workstation GPUs (Mobile) | RTX 2000 Max-Q Ada Generation, RTX 2000 Ada Generation | RTX 3000 Ada Generation | RTX 3500 Ada Generation, RTX 4000 Ada Generation | RTX 5000 Ada Generation | — |
Nvidia Data Center GPUs | — | — | — | — | Nvidia L40, Nvidia L40G, Nvidia L40CNX |