Graphics Core Next

Graphics Core Next (GCN) [1] is the codename for a series of microarchitectures and an instruction set architecture that were developed by AMD for its GPUs as the successor to its TeraScale microarchitecture. The first product featuring GCN was launched on January 9, 2012. [2]

GCN is a reduced instruction set SIMD microarchitecture, in contrast to the very long instruction word SIMD architecture of TeraScale. [3] GCN requires considerably more transistors than TeraScale, but offers advantages for general-purpose GPU (GPGPU) computation due to a simpler compiler.

GCN graphics chips were fabricated with CMOS at 28 nm, and with FinFET at 14 nm (by Samsung Electronics and GlobalFoundries) and 7 nm (by TSMC). They were available on selected models in AMD's Radeon HD 7000, HD 8000, 200, 300, 400, 500 and Vega series of graphics cards, including the separately released Radeon VII. GCN was also used in the graphics portion of Accelerated Processing Units (APUs), including those in the PlayStation 4 and Xbox One.

Instruction set

The GCN instruction set is owned by AMD and was developed specifically for GPUs. It has no micro-operation for division.

Documentation is available for:

An LLVM compiler back end is available for the GCN instruction set. [5] It is used by Mesa 3D.

GNU Compiler Collection 9 has supported GCN 3 and GCN 5 since 2019 [6] for single-threaded, stand-alone programs, and GCC 10 added offloading via OpenMP and OpenACC. [7]

MIAOW is an open-source RTL implementation of the AMD Southern Islands GPGPU microarchitecture.

In November 2015, AMD announced its Boltzmann Initiative, which aims to enable the porting of CUDA-based applications to a common C++ programming model. [8]

At the Super Computing 15 event, AMD displayed a Heterogeneous Compute Compiler (HCC), a headless Linux driver and HSA runtime infrastructure for cluster-class high-performance computing, and a Heterogeneous-compute Interface for Portability (HIP) tool for porting CUDA applications to the aforementioned common C++ model.

Microarchitectures

As of July 2017, the Graphics Core Next instruction set has seen five iterations. The differences between the first four generations are rather minimal, but the fifth-generation GCN architecture features heavily modified stream processors to improve performance and support the simultaneous processing of two lower-precision numbers in place of a single higher-precision number. [9]

Command processing

GCN command processing: Each Asynchronous Compute Engine (ACE) can parse incoming commands and dispatch work to the Compute Units (CUs). Each ACE can manage up to 8 independent queues. The ACEs can operate in parallel with the graphics command processor and two DMA engines. The graphics command processor handles graphics queues, the ACEs handle compute queues, and the DMA engines handle copy queues. Each queue can dispatch work items without waiting for other tasks to complete, allowing independent command streams to be interleaved on the GPU's shader hardware.

Graphics Command Processor

The Graphics Command Processor (GCP) is a functional unit of the GCN microarchitecture. Among other tasks, it is responsible for the handling of asynchronous shaders. [10]

Asynchronous Compute Engine

The Asynchronous Compute Engine (ACE) is a distinct functional block dedicated to compute work. Its purpose is similar to that of the Graphics Command Processor.

Schedulers

Since the third iteration of GCN, the hardware contains two schedulers: one to schedule "wavefronts" during shader execution (the CU Scheduler, or Compute Unit Scheduler) and the other to schedule execution of draw and compute queues. The latter helps performance by executing compute operations when the compute units (CUs) are underutilized due to graphics commands limited by fixed function pipeline speed or bandwidth. This functionality is known as Async Compute.

For a given shader, the GPU drivers may also schedule instructions on the CPU to minimize latency.

Geometric processor

Geometry processor

The geometry processor contains a Geometry Assembler, a Tessellator, and a Vertex Assembler.

The Tessellator performs tessellation in hardware as defined by Direct3D 11 and OpenGL 4.5 (see AMD, January 21, 2017). [11] It succeeded ATI TruForm and the hardware tessellation in TeraScale as AMD's then-latest semiconductor intellectual property core.

Compute units

One compute unit (CU) combines 64 shader processors with 4 texture mapping units (TMUs). [12] [13] The compute units are separate from, but feed into, the render output units (ROPs). [13] Each compute unit consists of the following:

Four compute units are wired to share a 16 KiB L1 instruction cache and a 32 KiB L1 data cache, both of which are read-only. A SIMD vector unit (SIMD-VU) operates on 16 elements at a time (per cycle), while a scalar unit (SU) operates on one element at a time (one per cycle). In addition, the SU handles some other operations, such as branching. [15]

Every SIMD-VU has some private memory where it stores its registers. There are two types of registers: scalar registers (S0, S1, etc.), which each hold a 4-byte number, and vector registers (V0, V1, etc.), which each represent a set of 64 4-byte numbers. Every operation on the vector registers is done in parallel on the 64 numbers, which correspond to 64 inputs. For example, a SIMD-VU may work on 64 different pixels at a time; the inputs differ slightly for each pixel, so each pixel ends up with a slightly different color.

Every SIMD-VU has room for 512 scalar registers and 256 vector registers.
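The register counts above imply the size of each register file. The following is a minimal sketch of that arithmetic only; the constant and function names are chosen for illustration and are not AMD terminology.

```python
# Figures taken from the text above: 4-byte values, 64 values per vector
# register, 512 scalar and 256 vector registers per SIMD-VU.
BYTES_PER_VALUE = 4
VALUES_PER_VECTOR_REGISTER = 64
SCALAR_REGISTERS_PER_SIMD = 512
VECTOR_REGISTERS_PER_SIMD = 256


def vector_register_file_bytes() -> int:
    """Bytes needed to hold all vector registers of one SIMD-VU."""
    return VECTOR_REGISTERS_PER_SIMD * VALUES_PER_VECTOR_REGISTER * BYTES_PER_VALUE


def scalar_register_file_bytes() -> int:
    """Bytes needed to hold all scalar registers of one SIMD-VU."""
    return SCALAR_REGISTERS_PER_SIMD * BYTES_PER_VALUE


print(vector_register_file_bytes() // 1024, "KiB of vector registers")  # 64 KiB
print(scalar_register_file_bytes() // 1024, "KiB of scalar registers")  # 2 KiB
```

The 64 KiB result for the vector registers agrees with the VGPR file size listed for each SIMD Vector Unit.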

AMD has claimed that each GCN compute unit (CU) has 64 KiB Local Data Share (LDS). [16]

CU scheduler

The CU scheduler is the hardware functional block that chooses which wavefronts the SIMD-VU executes. It picks one SIMD-VU per cycle for scheduling. This is not to be confused with other hardware or software schedulers.

Wavefront

A shader is a small program written in GLSL that performs graphics processing, and a kernel is a small program written in OpenCL that performs GPGPU processing. These programs do not need many registers, but they do need to load data from system or graphics memory, and this operation comes with significant latency. AMD and Nvidia chose similar approaches to hide this unavoidable latency: the grouping of multiple threads. AMD calls such a group a "wavefront", whereas Nvidia calls it a "warp". A group of threads is the most basic unit of scheduling on GPUs that implement this approach to hiding latency. It is the minimum size of the data processed in SIMD fashion and the smallest executable unit of code: a single instruction is processed over all of the threads in the group at the same time.

In all GCN GPUs, a "wavefront" consists of 64 threads, and in all Nvidia GPUs, a "warp" consists of 32 threads.
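The effect of the group size can be illustrated with a little arithmetic. The sketch below assumes, for illustration, that a dispatch is simply rounded up to whole groups; the function names are hypothetical.

```python
GCN_WAVEFRONT_SIZE = 64  # threads per wavefront (from the text above)
NVIDIA_WARP_SIZE = 32    # threads per warp (from the text above)


def groups_needed(thread_count: int, group_size: int) -> int:
    """Number of whole groups required to cover thread_count threads."""
    if thread_count < 0 or group_size <= 0:
        raise ValueError("thread_count must be >= 0 and group_size > 0")
    return -(-thread_count // group_size)  # ceiling division


def idle_lanes(thread_count: int, group_size: int) -> int:
    """Thread slots in the last group that carry no work."""
    return groups_needed(thread_count, group_size) * group_size - thread_count


# 70 threads need 2 wavefronts (58 idle slots) or 3 warps (26 idle slots).
print(groups_needed(70, GCN_WAVEFRONT_SIZE), idle_lanes(70, GCN_WAVEFRONT_SIZE))
print(groups_needed(70, NVIDIA_WARP_SIZE), idle_lanes(70, NVIDIA_WARP_SIZE))
```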

AMD's solution is to attribute multiple wavefronts to each SIMD-VU. The hardware distributes the registers to the different wavefronts, and when one wavefront is waiting on some result, which lies in memory, the CU Scheduler assigns the SIMD-VU another wavefront. Wavefronts are attributed per SIMD-VU. SIMD-VUs do not exchange wavefronts. A maximum of 10 wavefronts can be attributed per SIMD-VU (thus 40 per CU).

AMD CodeXL shows tables relating the number of SGPRs and VGPRs to the number of wavefronts. Essentially, the number of SGPRs available to each wavefront is between 104 and 512 divided by the number of wavefronts, and the number of VGPRs available to each wavefront is 256 divided by the number of wavefronts.
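The relationship between register usage and resident wavefronts can be sketched as follows. This is a simplified model built only from the figures in this section (10 wavefronts, 256 VGPRs and 512 SGPRs per SIMD-VU, four SIMD-VUs per CU). It ignores register allocation granularity and any per-wavefront register caps, so the tables shown by AMD's tools may differ.

```python
MAX_WAVEFRONTS_PER_SIMD = 10
VGPRS_PER_SIMD = 256
SGPRS_PER_SIMD = 512
SIMDS_PER_CU = 4  # implied by "10 per SIMD-VU, thus 40 per CU"


def wavefronts_per_simd(vgprs_per_wavefront: int, sgprs_per_wavefront: int) -> int:
    """Resident wavefronts per SIMD-VU under the simplified model."""
    if vgprs_per_wavefront < 1 or sgprs_per_wavefront < 1:
        raise ValueError("register counts must be at least 1")
    limit_by_vgprs = VGPRS_PER_SIMD // vgprs_per_wavefront
    limit_by_sgprs = SGPRS_PER_SIMD // sgprs_per_wavefront
    return min(MAX_WAVEFRONTS_PER_SIMD, limit_by_vgprs, limit_by_sgprs)


def wavefronts_per_cu(vgprs_per_wavefront: int, sgprs_per_wavefront: int) -> int:
    """Resident wavefronts per compute unit under the simplified model."""
    return SIMDS_PER_CU * wavefronts_per_simd(vgprs_per_wavefront, sgprs_per_wavefront)


print(wavefronts_per_simd(24, 48))   # 10: neither register file is the limit
print(wavefronts_per_simd(64, 48))   # 4: limited by vector registers
print(wavefronts_per_cu(128, 104))   # 8: two wavefronts on each of four SIMD-VUs
```

A shader that uses fewer vector registers leaves room for more wavefronts, which gives the CU scheduler more candidates to switch to while one wavefront waits on memory.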

Note that in conjunction with the SSE instructions, this concept of the most basic level of parallelism is often called a "vector width". The vector width is characterized by the total number of bits in it.

SIMD Vector Unit

Each SIMD Vector Unit has:

  • a 16-lane integer and floating point vector Arithmetic Logic Unit (ALU)
  • 64 KiB Vector General Purpose Register (VGPR) file
  • 10× 48-bit Program Counters
  • Instruction buffer for 10 wavefronts (each wavefront is a group of 64 threads, or the size of one logical VGPR)
  • A 64-thread wavefront issues to a 16-lane SIMD Unit over four cycles

Each SIMD-VU has 10 wavefront instruction buffers, and it takes 4 cycles to execute one wavefront.
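The four-cycle figure follows from dividing the 64 threads of a wavefront across 16 lanes. The sketch below shows that division; the assumption that threads are issued in consecutive index order is made for illustration and is not stated in the text.

```python
WAVEFRONT_SIZE = 64  # threads per wavefront
SIMD_LANES = 16      # lanes in the vector ALU


def issue_schedule(wavefront_size: int = WAVEFRONT_SIZE, lanes: int = SIMD_LANES):
    """Return one list of thread indices per cycle (consecutive order assumed)."""
    return [
        list(range(start, min(start + lanes, wavefront_size)))
        for start in range(0, wavefront_size, lanes)
    ]


schedule = issue_schedule()
print(len(schedule))                    # 4 cycles per wavefront instruction
print(schedule[0][0], schedule[0][-1])  # 0 15
print(schedule[3][0], schedule[3][-1])  # 48 63
```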

Audio and video acceleration blocks

Many implementations of GCN are accompanied by several of AMD's other ASIC blocks, including but not limited to the Unified Video Decoder, Video Coding Engine, and AMD TrueAudio.

Video Coding Engine

The Video Coding Engine is a video encoding ASIC, first introduced with the Radeon HD 7000 series. [17]

The initial version of the VCE added support for encoding H.264 I and P frames in the YUV420 pixel format, along with SVC temporal encode and Display Encode Mode, while the second version added B-frame support for YUV420 and I-frame support for YUV444.

VCE 3.0 formed a part of the third generation of GCN, adding high-quality video scaling and the HEVC (H.265) codec.

VCE 4.0 was part of the Vega architecture, and was subsequently succeeded by Video Core Next.

TrueAudio

Unified virtual memory

In a preview in 2011, AnandTech wrote about the unified virtual memory, supported by Graphics Core Next. [18]

Heterogeneous System Architecture (HSA)

GCN includes special purpose function blocks to be used by HSA. Support for these function blocks is available through amdkfd since Linux kernel 3.19.

Some of the specific HSA features implemented in the hardware need support from the operating system's kernel (its subsystems) and/or from specific device drivers. For example, in July 2014, AMD published a set of 83 patches to be merged into Linux kernel mainline 3.17 for supporting their Graphics Core Next-based Radeon graphics cards. The so-called HSA kernel driver resides in the directory /drivers/gpu/hsa, while the DRM graphics device drivers reside in /drivers/gpu/drm [21] and augment the already existing DRM drivers for Radeon cards. [22] This very first implementation focuses on a single "Kaveri" APU and works alongside the existing Radeon kernel graphics driver (kgd).

Lossless Delta Color Compression

Hardware schedulers

Hardware schedulers are used to perform scheduling [23] and offload the assignment of compute queues to the ACEs from the driver to hardware, by buffering these queues until there is at least one empty queue in at least one ACE. This causes the HWS to immediately assign buffered queues to the ACEs until all queues are full or there are no more queues to safely assign. [24]

Part of the scheduling work involves prioritized queues, which allow critical tasks to run at a higher priority than other tasks without preempting the lower-priority tasks. The tasks run concurrently: the high-priority tasks are scheduled to occupy as much of the GPU as possible, while other tasks use the resources that the high-priority tasks leave idle. [23] The hardware schedulers are essentially Asynchronous Compute Engines that lack dispatch controllers. [23] They were first introduced in the fourth generation GCN microarchitecture, [23] but were present in the third generation GCN microarchitecture for internal testing purposes. [25] A driver update later enabled the hardware schedulers in third generation GCN parts for production use. [23]
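The buffering behaviour described in this section can be modelled in a few lines. This is a toy software model, not a description of the hardware. The class and method names are hypothetical, and choosing the least-loaded ACE is an assumption made so the example is deterministic. The limit of 8 queues per ACE comes from the command processing description.

```python
from collections import deque

QUEUES_PER_ACE = 8  # each ACE can manage up to 8 independent queues


class HardwareSchedulerModel:
    """Toy model: buffer compute queues until some ACE has an empty slot."""

    def __init__(self, ace_count: int):
        self.aces = [[] for _ in range(ace_count)]  # queues held by each ACE
        self.buffered = deque()                     # queues waiting for a slot

    def submit(self, queue_name: str) -> None:
        """Accept a new compute queue and try to place it on an ACE."""
        self.buffered.append(queue_name)
        self._assign()

    def retire(self, queue_name: str) -> None:
        """Free the slot of a finished queue and reuse it if work is waiting."""
        for ace in self.aces:
            if queue_name in ace:
                ace.remove(queue_name)
                break
        self._assign()

    def _assign(self) -> None:
        # Assign buffered queues until every ACE is full or nothing is waiting.
        while self.buffered:
            ace = min(self.aces, key=len)  # least-loaded ACE (modelling choice)
            if len(ace) >= QUEUES_PER_ACE:
                return
            ace.append(self.buffered.popleft())


hws = HardwareSchedulerModel(ace_count=2)
for i in range(17):
    hws.submit(f"q{i}")
print(list(hws.buffered))  # ['q16']: both ACEs hold 8 queues, one queue waits
hws.retire("q0")
print(list(hws.buffered))  # []: the freed slot was reused immediately
```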

Primitive Discard Accelerator

This unit discards degenerate triangles before they enter the vertex shader and triangles that do not cover any fragments before they enter the fragment shader. [26] This unit was introduced with the fourth generation GCN microarchitecture. [26]
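The two discard criteria can be illustrated with a software analogue. The sketch below is not AMD's hardware logic. It assumes 2D screen-space vertices and pixel centers at half-integer coordinates, and it uses a conservative bounding-box test in place of exact coverage; all function names are hypothetical.

```python
import math


def doubled_signed_area(a, b, c) -> float:
    """Twice the signed area of triangle abc (2D cross product)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def is_degenerate(a, b, c) -> bool:
    """A triangle with zero area can never produce a fragment."""
    return doubled_signed_area(a, b, c) == 0


def _no_center_in_range(lo: float, hi: float) -> bool:
    # Pixel centers sit at k + 0.5; find the first center at or after lo.
    first_center = math.ceil(lo - 0.5) + 0.5
    return first_center > hi


def bounding_box_misses_all_centers(a, b, c) -> bool:
    """Conservative check: the bounding box contains no pixel center."""
    xs = (a[0], b[0], c[0])
    ys = (a[1], b[1], c[1])
    return _no_center_in_range(min(xs), max(xs)) or _no_center_in_range(min(ys), max(ys))


def should_discard(a, b, c) -> bool:
    return is_degenerate(a, b, c) or bounding_box_misses_all_centers(a, b, c)


print(should_discard((0, 0), (1, 1), (2, 2)))              # True: zero area
print(should_discard((0.6, 0.6), (0.9, 0.6), (0.6, 0.9)))  # True: between centers
print(should_discard((0, 0), (4, 0), (0, 4)))              # False: kept
```

Because the second test looks only at the bounding box, it never discards a triangle that covers a pixel center, but it may keep some triangles that cover none.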

Generations

Graphics Core Next 1

AMD Graphics Core Next 1
Release date: January 2012
Predecessor: TeraScale 3
Successor: Graphics Core Next 2
Support status: Unsupported

The GCN 1 microarchitecture was used in several Radeon HD 7000 series graphics cards.

Die shot of the Tahiti GPU used in Radeon HD 7950 GHz Edition graphics cards

There are Asynchronous Compute Engines controlling computation and dispatching. [15] [30]

ZeroCore Power

ZeroCore Power is a long idle power saving technology, shutting off functional units of the GPU when not in use. [31] AMD ZeroCore Power technology supplements AMD PowerTune.

Chips

Discrete GPUs (Southern Islands family):

  • Hainan
  • Oland
  • Cape Verde
  • Pitcairn
  • Tahiti

Graphics Core Next 2

AMD Graphics Core Next 2
Release date: September 2013
Predecessor: Graphics Core Next 1
Successor: Graphics Core Next 3
Support status: Unsupported
AMD PowerTune "Bonaire"
Die shot of the Hawaii GPU used in Radeon R9 290 graphics cards

The 2nd generation of GCN was introduced with the Radeon HD 7790 and is also found in the Radeon HD 8770, R7 260/260X, R9 290/290X, R9 295X2, R7 360, and R9 390/390X, as well as Steamroller-based desktop "Kaveri" APUs and mobile "Kaveri" APUs and in the Puma-based "Beema" and "Mullins" APUs. It has multiple advantages over the original GCN, including FreeSync support, AMD TrueAudio and a revised version of AMD PowerTune technology.

GCN 2nd generation introduced an entity called "Shader Engine" (SE). A Shader Engine comprises one geometry processor, up to 11 CUs (the Hawaii chip has 44 CUs across four Shader Engines), rasterizers, ROPs, and L1 cache. The Graphics Command Processor, the 8 ACEs, the L2 cache and memory controllers, the audio and video accelerators, the display controllers, the 2 DMA controllers and the PCIe interface are not part of a Shader Engine.

The A10-7850K "Kaveri" contains 8 CUs (compute units) and 8 Asynchronous Compute Engines for independent scheduling and work item dispatching. [32]

At AMD Developer Summit (APU) in November 2013 Michael Mantor presented the Radeon R9 290X. [33]

Chips

Discrete GPUs (Sea Islands family):

  • Bonaire
  • Hawaii

integrated into APUs:

  • Temash
  • Kabini
  • Liverpool (i.e. the APU found in the PlayStation 4)
  • Durango (i.e. the APU found in the Xbox One and Xbox One S)
  • Kaveri
  • Godavari
  • Mullins
  • Beema
  • Carrizo-L

Graphics Core Next 3

AMD Graphics Core Next 3
Release date: June 2015
Predecessor: Graphics Core Next 2
Successor: Graphics Core Next 4
Die shot of the Fiji GPU used in Radeon R9 Nano graphics cards

GCN 3rd generation [34] was introduced in 2014 with the Radeon R9 285 and R9 M295X, which have the "Tonga" GPU. It features improved tessellation performance, lossless delta color compression to reduce memory bandwidth usage, an updated and more efficient instruction set, a new high quality scaler for video, HEVC encoding (VCE 3.0) and HEVC decoding (UVD 6.0), and a new multimedia engine (video encoder/decoder). Delta color compression is supported in Mesa. [35] However, its double-precision performance is worse than that of the previous generation. [36]

Chips

discrete GPUs:

  • Tonga (Volcanic Islands family), comes with UVD 5.0 (Unified Video Decoder)
  • Fiji (Pirate Islands family), comes with UVD 6.0 and High Bandwidth Memory (HBM 1)

integrated into APUs:

  • Carrizo, comes with UVD 6.0
  • Bristol Ridge [37]
  • Stoney Ridge [37]

Graphics Core Next 4

AMD Graphics Core Next 4
Release date: June 2016
Predecessor: Graphics Core Next 3
Successor: Graphics Core Next 5
Support status: Supported
Die shot of the Polaris 11 GPU used in Radeon RX 460 graphics cards
Die shot of the Polaris 10 GPU used in Radeon RX 470 graphics cards

GPUs of the Arctic Islands family were introduced in Q2 of 2016 with the AMD Radeon 400 series. The 3D engine (i.e. GCA (Graphics and Compute Array) or GFX) is identical to that found in the Tonga chips. [38] However, Polaris features a newer Display Controller engine and UVD version 6.3, among other changes.

All Polaris-based chips other than the Polaris 30 are produced on the 14 nm FinFET process, developed by Samsung Electronics and licensed to GlobalFoundries. [39] The slightly newer refreshed Polaris 30 is built on the 12 nm LP FinFET process node, developed by Samsung and GlobalFoundries. The fourth generation GCN instruction set architecture is compatible with the third generation. It is an optimization for 14 nm FinFET process enabling higher GPU clock speeds than with the 3rd GCN generation. [40] Architectural improvements include new hardware schedulers, a new primitive discard accelerator, a new display controller, and an updated UVD that can decode HEVC at 4K resolutions at 60 frames per second with 10 bits per color channel.

Chips

discrete GPUs: [41]

  • Polaris 10 (also codenamed Ellesmere) found on "Radeon RX 470" and "Radeon RX 480"-branded graphics cards
  • Polaris 11 (also codenamed Baffin) found on "Radeon RX 460"-branded graphics cards (also Radeon RX 560D)
  • Polaris 12 (also codenamed Lexa) found on "Radeon RX 550" and "Radeon RX 540"-branded graphics cards
  • Polaris 20, which is a refreshed (14 nm LPP Samsung/GloFo FinFET process) Polaris 10 with higher clocks, used for "Radeon RX 570" and "Radeon RX 580"-branded graphics cards [42]
  • Polaris 21, which is a refreshed (14 nm LPP Samsung/GloFo FinFET process) Polaris 11, used for "Radeon RX 560"-branded graphics cards
  • Polaris 22, found on "Radeon RX Vega M GH" and "Radeon RX Vega M GL"-branded graphics cards (as part of Kaby Lake-G)
  • Polaris 23, which is a refreshed (14 nm LPP Samsung/GloFo FinFET process) Polaris 12, used for "Radeon Pro WX 3200" and "Radeon RX 540X"-branded graphics cards (also Radeon RX 640) [43]
  • Polaris 30, which is a refreshed (12 nm LP GloFo FinFET process) Polaris 20 with higher clocks, used for "Radeon RX 590"-branded graphics cards [44]

In addition to dedicated GPUs, Polaris is utilized in the APUs of the PlayStation 4 Pro and Xbox One X, titled "Neo" and "Scorpio", respectively.

Precision performance

FP64 performance of all GCN 4th generation GPUs is 1/16 of FP32 performance.

Graphics Core Next 5

AMD Graphics Core Next 5
Release date: June 2017
Predecessor: Graphics Core Next 4
Successor: RDNA 1
Support status: Supported
Die shot of the Vega 10 GPU used in Radeon RX Vega 64 graphics cards

AMD began releasing details of its next generation of GCN architecture, termed the 'Next-Generation Compute Unit', in January 2017. [40] [45] [46] The new design was expected to bring increased instructions per clock, higher clock speeds, support for HBM2, and a larger memory address space. The discrete graphics chipsets also include a High Bandwidth Cache Controller (HBCC), which is absent when the design is integrated into APUs. [47] Additionally, the new chips were expected to include improvements in the rasterisation and render output units. The stream processors are heavily modified from the previous generations to support Rapid Packed Math technology for 8-bit, 16-bit, and 32-bit numbers. This gives a significant performance advantage when lower precision is acceptable (for example, processing two half-precision numbers at the same rate as a single single-precision number).
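The idea of processing two half-precision numbers in place of one single-precision number can be illustrated by packing two 16-bit floats into one 32-bit word. The sketch below is a software illustration only and says nothing about how the hardware implements packed math. The function names are hypothetical, and it relies on the half-precision `e` format of Python's standard `struct` module.

```python
import struct


def pack_half2(low: float, high: float) -> int:
    """Pack two half-precision values into a single 32-bit word."""
    return struct.unpack("<I", struct.pack("<ee", low, high))[0]


def unpack_half2(word: int):
    """Recover the two half-precision values stored in a 32-bit word."""
    return struct.unpack("<ee", struct.pack("<I", word))


def packed_add(a: int, b: int) -> int:
    """Add two packed words element-wise: one operation, two results."""
    a_low, a_high = unpack_half2(a)
    b_low, b_high = unpack_half2(b)
    return pack_half2(a_low + b_low, a_high + b_high)


x = pack_half2(1.5, 2.0)
y = pack_half2(0.25, 4.0)
print(unpack_half2(packed_add(x, y)))  # (1.75, 6.0)
```

Each packed word occupies the same 4 bytes as one single-precision value, which is why a register lane that holds one FP32 number can hold two FP16 numbers instead.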

Nvidia introduced tile-based rasterization and binning with Maxwell, [48] and this was a major reason for Maxwell's efficiency increase. In January 2017, AnandTech assumed that Vega would finally catch up with Nvidia regarding energy efficiency optimizations due to the new Draw Stream Binning Rasterizer (DSBR) to be introduced with Vega. [49]

Vega also added support for a new shader stage: primitive shaders. [50] [51] Primitive shaders provide more flexible geometry processing and replace the vertex and geometry shaders in a rendering pipeline. As of December 2018, primitive shaders could not be used because the required API changes had not yet been made. [52]

Vega 10 and Vega 12 use the 14 nm FinFET process, developed by Samsung Electronics and licensed to GlobalFoundries. Vega 20 uses the 7 nm FinFET process developed by TSMC.

Chips

discrete GPUs:

  • Vega 10 (14 nm Samsung/GloFo FinFET process) (also codenamed Greenland [53] ) found on "Radeon RX Vega 64", "Radeon RX Vega 56", "Radeon Vega Frontier Edition", "Radeon Pro V340", Radeon Pro WX 9100, and Radeon Pro WX 8200 graphics cards [54]
  • Vega 12 (14 nm Samsung/GloFo FinFET process) found on "Radeon Pro Vega 20" and "Radeon Pro Vega 16"-branded mobile graphics cards [55]
  • Vega 20 (7 nm TSMC FinFET process) found on "Radeon Instinct MI50" and "Radeon Instinct MI60"-branded accelerator cards, [56] "Radeon Pro Vega II", and "Radeon VII"-branded graphics cards. [57]

integrated into APUs:

  • Raven Ridge [58] came with VCN 1 which supersedes VCE and UVD and allows full fixed-function VP9 decode.

Precision performance

Double-precision floating-point (FP64) performance of all GCN 5th generation GPUs, except for Vega 20, is one-sixteenth of FP32 performance. For Vega 20 with Radeon Instinct it is half of FP32 performance. For Vega 20 with Radeon VII it is a quarter of FP32 performance. [59] All GCN 5th generation GPUs support half-precision floating-point (FP16) calculations at twice the FP32 rate.
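These ratios can be applied to any FP32 throughput figure. The sketch below uses a made-up FP32 figure of 10,000 GFLOPS purely as an example; the variant labels and function name are hypothetical, and only the ratios come from the text above.

```python
# FP64 throughput as a fraction of FP32 throughput (ratios from the text).
FP64_RATIO = {
    "default": 1 / 16,
    "vega20-instinct": 1 / 2,
    "vega20-radeon-vii": 1 / 4,
}
FP16_RATIO = 2.0  # FP16 runs at twice the FP32 rate


def peak_rates(fp32_gflops: float, variant: str = "default") -> dict:
    """Derive FP16 and FP64 peak rates from an FP32 peak rate."""
    ratio = FP64_RATIO.get(variant, FP64_RATIO["default"])
    return {
        "fp16": fp32_gflops * FP16_RATIO,
        "fp32": fp32_gflops,
        "fp64": fp32_gflops * ratio,
    }


EXAMPLE_FP32_GFLOPS = 10_000.0  # illustrative input, not a real product figure
print(peak_rates(EXAMPLE_FP32_GFLOPS)["fp64"])                       # 625.0
print(peak_rates(EXAMPLE_FP32_GFLOPS, "vega20-instinct")["fp64"])    # 5000.0
print(peak_rates(EXAMPLE_FP32_GFLOPS, "vega20-radeon-vii")["fp64"])  # 2500.0
```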

Comparison of GCN chips

Values in each row are listed from left to right in chip order. A value shared by several adjacent chips appears only once.

Microarchitecture [60]: GCN 1; GCN 2; GCN 3; GCN 4; GCN 5
Chip: Tahiti [61]; Pitcairn [62]; Cape Verde [63]; Oland [64]; Hainan [65]; Bonaire [66]; Hawaii [67]; Topaz [68]; Tonga [69]; Fiji [70]; Ellesmere [71]; Baffin [72]; Lexa [73]; Vega 10 [74]; Vega 12 [75]; Vega 20 [76]
Code name (note 1): ?; ?; ?; Tiran; ?; ?; Ibiza; Iceland; ?; ?; Polaris 10; Polaris 11; Polaris 12; Greenland; Treasure Refresh; Moonshot
Chip variant(s): New Zealand; Malta; Wimbledon; Curaçao; Neptune; Trinidad; Chelsea; Heathrow; Venus; Tropo; Mars; Opal; Litho; Sun; Jet; Exo; Banks; Saturn; Tobago; Strato; Emerald; Vesuvius; Grenada; Meso; Weston; Polaris 24; Amethyst; Antigua; Capsaicin; Polaris 20; Polaris 30; Polaris 21; Polaris 23
Fab: TSMC 28 nm; GlobalFoundries 14 nm / 12 nm (Polaris 30); TSMC 7 nm
Die size (mm²): 352 / 365 (Malta); 212; 123; 77; 56; 160; 438; 125; 366; 596; 232; 123; 103; 495; Unknown; 331
Transistors (million): 4,313; 2,800; 1,500; 950; 690; 2,080; 6,200; 1,550; 5,000; 8,900; 5,700; 3,000; 2,200; 12,500; Unknown; 13,230
Transistor density (MTr/mm²): 12.3 / 12.8 (Malta); 13.2; 12.2; 12.3; 13.0; 14.2; 12.4; 13.7; 14.9; 24.6; 24.4; 21.4; 25.3; Unknown; 40.0
Asynchronous compute engines: 2; 8; ?; 8; 4; ?; 4
Geometry engines: 2; 1; 2; ?; 4; ?; 4
Shader engines: 4; ?; 4; 2
Hardware schedulers: 2; ?; 2
Compute units: 32; 20; 10 / 8 (Chelsea); 6; 5 / 6 (Jet); 14; 44; 6; 32; 64; 36; 16; 10; 64; 20; 64
Stream processors: 2048; 1280; 640 / 512 (Chelsea); 384; 320 / 384 (Jet); 896; 2816; 384; 2048; 4096; 2304; 1024; 640; 4096; 1280; 4096
Texture mapping units: 128; 80; 40 / 32 (Chelsea); 24; 20 / 24 (Jet); 56; 176; 24; 128; 256; 144; 64; 40; 256; 80; 256
Render output units: 32; 16; 8; 16; 64; 8; 32; 64; 32; 16; 64; 32; 64
Z/Stencil OPS: 128; 64; 16; 64; 256; 16; 128; 256
L1 cache (KB): 16 per compute unit (CU)
L2 cache (KB): 768; 512; 256; 128 / 256 (Jet); 256; 1024; 256; 768; 2048; 1024; 512; 4096; 1024; 4096
Display Core Engine: 6.0; 6.4; 8.2; 8.5; 10.0; 11.2; 12.0; 12.1
Unified Video Decoder: 3.2; 4.0; 4.2; 5.0; 6.0; 6.3; 7.0; 7.2
Video Coding Engine: 1.0; 2.0; 3.0; 3.4; 4.0; 4.1
Launch (note 2): Dec 2011; Mar 2012; Feb 2012; Jan 2013; May 2015; Mar 2013; Oct 2013; 2014; Aug 2014; Jun 2015; Jun 2016; Aug 2016; Apr 2017; Jun 2017; Nov 2018; Nov 2018
Series (Family): Southern Islands; Sea Islands; Volcanic Islands; Pirate Islands; Arctic Islands; Vega; Vega II
Notes: mobile/OEM; mobile/OEM; mobile

1 Old code names such as Treasure (Lexa) or Hawaii Refresh (Ellesmere) are not listed.
2 Initial launch date. Launch dates of variant chips such as Polaris 20 (April 2017) are not listed.
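Several rows of the comparison are related by simple arithmetic: each compute unit contributes 64 stream processors and 4 texture mapping units, and transistor density is the transistor count divided by the die size. The sketch below checks this for four chips, with input values copied from the table; the function and variable names are illustrative.

```python
STREAM_PROCESSORS_PER_CU = 64
TMUS_PER_CU = 4

# chip: (compute units, transistors in millions, die size in mm^2)
CHIPS = {
    "Tahiti": (32, 4313, 352),
    "Hawaii": (44, 6200, 438),
    "Fiji": (64, 8900, 596),
    "Vega 20": (64, 13230, 331),
}


def derived_figures(compute_units: int, transistors_million: float, die_mm2: float) -> dict:
    """Derive stream processors, TMUs and transistor density (MTr/mm^2)."""
    return {
        "stream_processors": compute_units * STREAM_PROCESSORS_PER_CU,
        "tmus": compute_units * TMUS_PER_CU,
        "density": round(transistors_million / die_mm2, 1),
    }


for chip, figures in CHIPS.items():
    print(chip, derived_figures(*figures))
```

For Hawaii, for example, this gives 2816 stream processors, 176 texture mapping units and a density of 14.2 MTr/mm², matching the table.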

References

  1. AMD Developer Central (January 31, 2014). "GS-4106 The AMD GCN Architecture – A Crash Course, by Layla Mah". Slideshare.net.
  2. "AMD Launches World's Fastest Single-GPU Graphics Card – the AMD Radeon HD 7970" (Press release). AMD. December 22, 2011. Archived from the original on January 20, 2015. Retrieved January 20, 2015.
  3. Gulati, Abheek (November 11, 2019). "An Architectural Deep-Dive into AMD's TeraScale, GCN & RDNA GPU Architectures". Medium. Retrieved December 12, 2021.
  4. "AMD community forums". Community.amd.com. July 15, 2016.
  5. "LLVM back-end amdgpu". Llvm.org.
  6. "GCC 9 Release Series Changes, New Features, and Fixes". Retrieved November 13, 2019.
  7. "AMD GCN Offloading Support". Retrieved November 13, 2019.
  8. "AMD Boltzmann Initiative – Heterogeneous-compute Interface for Portability (HIP)". November 16, 2015. Archived from the original on January 26, 2016. Retrieved December 8, 2019.
  9. Smith, Ryan (January 5, 2017). "The AMD Vega GPU Architecture Preview". Anandtech.com. Retrieved July 11, 2017.
  10. Smith, Ryan. "AMD Dives Deep On Asynchronous Shading". Anandtech.com.
  11. "Conformant Products". Khronos.org. October 26, 2017.
  12. Compute Cores Whitepaper (PDF). AMD. 2014. p. 5.
  13. Smith, Ryan (December 21, 2011). "AMD's Graphics Core Next Preview". Anandtech.com. Retrieved April 18, 2017.
  14. "AMD's Graphics Core Next (GCN) Architecture" (PDF). TechPowerUp. Retrieved February 26, 2024.
  15. Mantor, Michael; Houston, Mike (June 15, 2011). "AMD Graphics Core Next" (PDF). AMD. p. 40. Retrieved July 15, 2014. Asynchronous Compute Engine (ACE)
  16. "Optimizing GPU occupancy and resource usage with large thread groups". AMD GPUOpen. Retrieved January 1, 2024.
  17. "White Paper AMD UnifiedVideoDecoder (UVD)" (PDF). June 15, 2012. Retrieved May 20, 2017.
  18. "Not Just A New Architecture, But New Features Too". AnandTech. December 21, 2011. Retrieved July 11, 2014.
  19. "Kaveri microarchitecture". SemiAccurate. January 15, 2014.
  20. Airlie, Dave (November 26, 2014). "Merge AMDKFD". freedesktop.org. Retrieved January 21, 2015.
  21. "/drivers/gpu/drm". Kernel.org.
  22. "[PATCH 00/83] AMD HSA kernel driver". LKML. July 10, 2014. Retrieved July 11, 2014.
  23. Angelini, Chris (June 29, 2016). "AMD Radeon RX 480 8GB Review". Tom's Hardware. p. 1. Retrieved August 11, 2016.
  24. "Dissecting the Polaris Architecture" (PDF). 2016. Archived from the original (PDF) on September 20, 2016. Retrieved August 12, 2016.
  25. Shrout, Ryan (June 29, 2016). "The AMD Radeon RX 480 Review – The Polaris Promise". PC Perspective. p. 2. Archived from the original on October 10, 2016. Retrieved August 12, 2016.
  26. Smith, Ryan (June 29, 2016). "The AMD Radeon RX 480 Preview: Polaris Makes Its Mainstream Mark". AnandTech. p. 3. Retrieved August 11, 2016.
  27. "AMD Radeon HD 7000 Series to be PCI-Express 3.0 Compliant". TechPowerUp. Retrieved July 21, 2011.
  28. "AMD Details Next Gen. GPU Architecture". Retrieved August 3, 2011.
  29. Tony Chen; Jason Greaves, "AMD's Graphics Core Next (GCN) Architecture" (PDF), AMD, retrieved August 13, 2016
  30. "AMD's Graphics Core Next Preview: AMD's New GPU, Architected For Compute". AnandTech. December 21, 2011. Retrieved July 15, 2014. AMD's new Asynchronous Compute Engines serve as the command processors for compute operations on GCN. The principal purpose of ACEs will be to accept work and to dispatch it off to the CUs for processing.
  31. "Managing Idle Power: Introducing ZeroCore Power". AnandTech.com. December 22, 2011. Retrieved April 29, 2015.
  32. "AMD's Kaveri A10-7850K tested". AnandTech. January 14, 2014. Retrieved July 7, 2014.
  33. "AMD Radeon R9-290X". November 21, 2013.
  34. "Carrizo Overview" (PNG). Images.anandtech.com. Retrieved July 20, 2018.
  35. "Add DCC Support". Freedesktop.org. October 11, 2015.
  36. Smith, Ryan (September 10, 2014). "AMD Radeon R9 285 Review". Anandtech.com. Retrieved March 13, 2017.
  37. Cutress, Ian (June 1, 2016). "AMD Announces 7th Generation APU". Anandtech.com. Retrieved June 1, 2016.
  38. "RadeonFeature". www.x.org.
  39. "Radeon Technologies Group – January 2016 – AMD Polaris Architecture". Guru3d.com.
  40. Smith, Ryan (January 5, 2017). "The AMD Vega Architecture Teaser: Higher IPC, Tiling, & More, coming in H1'2017". Anandtech.com. Retrieved January 10, 2017.
  41. WhyCry (March 24, 2016). "AMD confirms Polaris 10 is Ellesmere and Polaris 11 is Baffin". VideoCardz. Retrieved April 8, 2016.
  42. "Fast vollständige Hardware-Daten zu AMDs Radeon RX 500 Serie geleakt". www.3dcenter.org.
  43. "AMD Polaris 23". TechPowerUp. Retrieved May 12, 2022.
  44. Oh, Nate (November 15, 2018). "The AMD Radeon RX 590 Review, feat. XFX & PowerColor: Polaris Returns (Again)". anandtech.com. Retrieved November 24, 2018.
  45. Kampman, Jeff (January 5, 2017). "The curtain comes up on AMD's Vega architecture". TechReport.com. Retrieved January 10, 2017.
  46. Shrout, Ryan (January 5, 2017). "AMD Vega GPU Architecture Preview: Redesigned Memory Architecture". PC Perspective. Retrieved January 10, 2017.
  47. Kampman, Jeff (October 26, 2017). "AMD's Ryzen 7 2700U and Ryzen 5 2500U APUs revealed". Techreport.com. Retrieved October 26, 2017.
  48. Raevenlord (March 1, 2017). "On NVIDIA's Tile-Based Rendering". techPowerUp.
  49. "Vega Teaser: Draw Stream Binning Rasterizer". Anandtech.com.
  50. "Radeon RX Vega Revealed: AMD promises 4K gaming performance for $499 – Trusted Reviews". Trustedreviews.com. July 31, 2017. Archived from the original on July 14, 2017. Retrieved March 20, 2017.
  51. "The curtain comes up on AMD's Vega architecture". Techreport.com. Archived from the original on September 1, 2017. Retrieved March 20, 2017.
  52. Kampman, Jeff (January 23, 2018). "Radeon RX Vega primitive shaders will need API support". Techreport.com. Retrieved December 29, 2018.
  53. "ROCm-OpenCL-Runtime/libUtils.cpp at master · RadeonOpenCompute/ROCm-OpenCL-Runtime". github.com. May 3, 2017. Retrieved November 10, 2018.
  54. "The AMD Radeon RX Vega 64 & RX Vega 56 Review: Vega Burning Bright". Anandtech.com. August 14, 2017. Retrieved November 16, 2017.
  55. "AMD's Vega Mobile Lives: Vega Pro 20 & 16 in Updated MacBook Pros In November". Anandtech.com. October 30, 2018. Retrieved November 10, 2018.
  56. "AMD Announces Radeon Instinct MI60 & MI50 Accelerators: Powered By 7nm Vega". Anandtech.com. November 6, 2018. Retrieved November 10, 2018.
  57. "AMD Unveils World's First 7nm Gaming GPU – Delivering Exceptional Performance and Incredible Experiences for Gamers, Creators and Enthusiasts" (Press release). Las Vegas, Nevada: AMD. January 9, 2019. Retrieved January 12, 2019.
  58. Ferreira, Bruno (May 16, 2017). "Ryzen Mobile APUs are coming to a laptop near you". Tech Report. Retrieved May 16, 2017.
  59. "AMD Unveils World's First 7nm Datacenter GPUs – Powering the Next Era of Artificial Intelligence, Cloud Computing and High Performance Computing (HPC) | AMD". AMD.com (Press release). November 6, 2018. Retrieved November 10, 2018.
  60. "RadeonFeature". x.Org. Retrieved November 21, 2022.
  61. "AMD Tahiti GPU Specs". TechPowerUp. Retrieved November 20, 2022.
  62. "AMD Pitcairn GPU Specs". TechPowerUp. Retrieved November 20, 2022.
  63. "AMD Cape Verde GPU Specs". TechPowerUp. Retrieved November 20, 2022.
  64. "AMD Oland GPU Specs". TechPowerUp. Retrieved November 20, 2022.
  65. "AMD Hainan GPU Specs". TechPowerUp. Retrieved November 20, 2022.
  66. "AMD Bonaire GPU Specs". TechPowerUp. Retrieved November 21, 2022.
  67. "AMD Hawaii GPU Specs". TechPowerUp. Retrieved November 21, 2022.
  68. "AMD Topaz GPU Specs". TechPowerUp. Retrieved November 21, 2022.
  69. "AMD Tonga GPU Specs". TechPowerUp. Retrieved November 21, 2022.
  70. "AMD Fiji GPU Specs". TechPowerUp. Retrieved November 21, 2022.
  71. "AMD Ellesmere GPU Specs". TechPowerUp. Retrieved November 21, 2022.
  72. "AMD Baffin GPU Specs". TechPowerUp. Retrieved November 21, 2022.
  73. "AMD Lexa GPU Specs". TechPowerUp. Retrieved November 21, 2022.
  74. "AMD Vega 10 GPU Specs". TechPowerUp. Retrieved November 21, 2022.
  75. "AMD Vega 12 GPU Specs". TechPowerUp. Retrieved November 21, 2022.
  76. "AMD Vega 20 GPU Specs". TechPowerUp. Retrieved November 21, 2022.