Close to Metal

Last updated June 24, 2024

In computing, Close To Metal (CTM, originally Close-to-the-Metal) is the name of a beta version of a low-level programming interface developed by ATI, now the AMD Graphics Product Group, to enable GPGPU computing. CTM was short-lived; the first production version of AMD's GPGPU technology was named AMD Stream SDK, which was later renamed AMD APP SDK (AMD Accelerated Parallel Processing SDK).^[1] The SDK is available for 32-bit and 64-bit Windows and Linux, and it also targets Heterogeneous System Architecture.

Overview

Close To Metal, originally called THIN (Thin Hardware INterface) and Data Parallel Virtual Machine, gave developers direct access to the native instruction set and memory of the massively parallel computational elements in modern AMD video cards. CTM bypassed the graphics-centric DirectX and OpenGL APIs, exposing previously unavailable low-level functionality to the GPGPU programmer, including direct control of the stream processors/ALUs and the memory controllers. R580 (ATI X1900) and later generations of AMD's GPU microarchitecture supported the CTM interface.

CTM's commercial successor, AMD Stream SDK, was released under the AMD EULA in December 2007 after the software stack was rewritten.^[2] Stream SDK provides high-level tools in addition to low-level ones for general-purpose access to AMD graphics hardware.

Using GPUs to perform computations offers significant potential for some applications because GPU microarchitectures differ fundamentally from those of CPUs. GPUs achieve much greater throughput (calculations per second) by executing many programs in parallel and restricting flow control (the ability of one program to execute instructions independently of another). Modern GPUs also have addressable on-die memory and extremely high-performance multi-channel external memory.
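The effect of restricted flow control can be illustrated with a small simulation. The following is a minimal sketch in plain Python, not CTM or Stream SDK code; the function names `run_lockstep` and `run_independent` are hypothetical and chosen only for this illustration. It assumes a simplified model in which every parallel "lane" executes both sides of a branch and a per-lane mask selects which result is kept, which is one common way lockstep hardware handles divergent branches.

```python
from typing import Callable, List, Sequence


def run_lockstep(
    data: Sequence[float],
    condition: Callable[[float], bool],
    then_op: Callable[[float], float],
    else_op: Callable[[float], float],
) -> List[float]:
    """Simulate lanes that share one instruction stream (hypothetical model)."""
    # Step 1: every lane evaluates the branch condition.
    mask = [condition(x) for x in data]
    # Step 2: every lane executes BOTH branch bodies; no lane may skip ahead.
    then_results = [then_op(x) for x in data]
    else_results = [else_op(x) for x in data]
    # Step 3: the per-lane mask selects which result each lane keeps.
    return [t if m else e for m, t, e in zip(mask, then_results, else_results)]


def run_independent(
    data: Sequence[float],
    condition: Callable[[float], bool],
    then_op: Callable[[float], float],
    else_op: Callable[[float], float],
) -> List[float]:
    """CPU-style reference: each element follows only its own branch."""
    return [then_op(x) if condition(x) else else_op(x) for x in data]


if __name__ == "__main__":
    values = [1.0, -2.0, 3.0, -4.0]
    # Kernel: double non-negative values, negate negative ones.
    result = run_lockstep(values, lambda x: x >= 0, lambda x: x * 2, lambda x: -x)
    print(result)
```

Both functions return the same values, but the lockstep version performs twice as many branch-body evaluations. In this simplified model, that extra work is the price of sharing one instruction stream across many lanes, and it is why heavily divergent code tends to benefit less from this style of execution. The sketch also assumes both branch bodies are safe to evaluate on every element.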

AMD subsequently switched from CTM to OpenCL.^[3]

Open-source

Some components of CTM and the Stream SDK are open source, such as the Brook+ C-like language and compiler.

References

[1] "AMD APP SDK OpenCL™ Accelerated Parallel Processing". Archived from the original on 2014-07-01. Retrieved 2014-07-06.
[2] AMD Stream SDK download page. Archived December 23, 2007, at the Wayback Machine. Retrieved June 12, 2008.
[3] Valich, Theo (7 August 2008). "AMD Ditches Close-To-Metal, Focuses On DX11 And OpenCL". Tom's Hardware. Retrieved 13 September 2017.

Notes

^ AMD “Close to Metal” Technology Unleashes the Power of Stream Computing: AMD Press Release, November 14, 2006.
^ AnandTech report: ATI's Stream Processing & Folding@Home, September 30, 2006.
^ Universität Dortmund, Fachbereich Mathematik research: Accelerating Double precision on GPUs (Proceedings of ASIM 2005), Dominik Goddeke, Robert Strzodka, and Stefan Turek. 18th Symposium on Simulation Technique, 2005.
^ TGDaily report: Nvidia activates a supercomputer in your PC, February 16, 2007.

External links

ATI official site
AMD official website
"ATI DPVM SIGGRAPH 2006 sketch" (PDF). Archived from the original (PDF) on 2007-09-27. (134 KiB)
"ATI DPVM SIGGRAPH 2006 Presentation" (PDF). Archived from the original (PDF) on 2007-09-27. (671 KiB)
"CTM Guide - CTI Technical Reference Manual" (PDF). Archived from the original (PDF) on 2007-02-22. (866 KiB)
AMD Close-to-the-Metal (CTM) open source project site

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
