A graphics processing unit (GPU) is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobile phones, personal computers, workstations, and game consoles. Modern GPUs are very efficient at manipulating computer graphics and image processing. Their highly parallel structure makes them more efficient than general-purpose CPUs for algorithms that process large blocks of data in parallel. In a personal computer, a GPU can be present on a video card or embedded on the motherboard. In certain CPUs, they are embedded on the CPU die.
An electronic circuit is composed of individual electronic components, such as resistors, transistors, capacitors, inductors and diodes, connected by conductive wires or traces through which electric current can flow. To be referred to as electronic, rather than electrical, generally at least one active component must be present. The combination of components and wires allows various simple and complex operations to be performed: signals can be amplified, computations can be performed, and data can be moved from one place to another.
An image is an artifact that depicts visual perception, such as a photograph or other two-dimensional picture, that resembles a subject—usually a physical object—and thus provides a depiction of it. In the context of signal processing, an image is a distributed amplitude of color(s).
A display device is an output device for presentation of information in visual or tactile form. When the input information that is supplied has an electrical signal, the display is called an electronic display.
The term GPU has been used from at least the 1980s.It was popularized by Nvidia in 1999, who marketed the GeForce 256 as "the world's first GPU". It was presented as a "single-chip processor with integrated transform, lighting, triangle setup/clipping, and rendering engines". Rival ATI Technologies coined the term "visual processing unit" or VPU with the release of the Radeon 9700 in 2002.
Nvidia Corporation is an American technology company incorporated in Delaware and based in Santa Clara, California. It designs graphics processing units (GPUs) for the gaming and professional markets, as well as system on a chip units (SoCs) for the mobile computing and automotive market. Its primary GPU product line, labeled "GeForce", is in direct competition with Advanced Micro Devices' (AMD) "Radeon" products. Nvidia expanded its presence in the gaming industry with its handheld Shield Portable, Shield Tablet and Shield Android TV.
The GeForce 256 is the original release in Nvidia's "GeForce" product-line. Announced on August 31, 1999 and released on October 11, 1999, the GeForce 256 improves on its predecessor by increasing the number of fixed pixel pipelines, offloading host geometry calculations to a hardware transform and lighting (T&L) engine, and adding hardware motion-compensation for MPEG-2 video. It offered a notably large leap in 3D gaming performance and was the first fully Direct3D 7-compliant 3D accelerator.
Transform, clipping, and lighting is a term used in computer graphics.
Arcade system boards have been using specialized graphics chips since the 1970s. In early video game hardware, the RAM for frame buffers was expensive, so video chips composited data together as the display was being scanned out on the monitor.
An arcade system board is a dedicated computer system created for the purpose of running video arcade games. Arcade system boards typically consist of a main system board with any number of supporting boards.
Random-access memory is a form of computer data storage that stores data and machine code currently being used. A random-access memory device allows data items to be read or written in almost the same amount of time irrespective of the physical location of data inside the memory. In contrast, with other direct-access data storage media such as hard disks, CD-RWs, DVD-RWs and the older magnetic tapes and drum memory, the time required to read and write data items varies significantly depending on their physical locations on the recording medium, due to mechanical limitations such as media rotation speeds and arm movement.
Fujitsu's MB14241 video shifter was used to accelerate the drawing of sprite graphics for various 1970s arcade games from Taito and Midway, such as Gun Fight (1975), Sea Wolf (1976) and Space Invaders (1978).The Namco Galaxian arcade system in 1979 used specialized graphics hardware supporting RGB color, multi-colored sprites and tilemap backgrounds. The Galaxian hardware was widely used during the golden age of arcade video games, by game companies such as Namco, Centuri, Gremlin, Irem, Konami, Midway, Nichibutsu, Sega and Taito.
Fujitsu Ltd. is a Japanese multinational information technology equipment and services company headquartered in Tokyo, Japan. In 2015, it was the world's fourth-largest IT services provider measured by IT services revenue. Fortune named Fujitsu as one of the world's most admired companies and a Global 500 company.
A video display controller or VDC is an integrated circuit which is the main component in a video signal generator, a device responsible for the production of a TV video signal in a computing or game system. Some VDCs also generate an audio signal, but that is not their main function.
Sprite is a computer graphics term for a two-dimensional bitmap that is integrated into a larger scene.
In the home market, the Atari 2600 in 1977 used a video shifter called the Television Interface Adaptor. —the way the scan lines map to specific bitmapped or character modes and where the memory is stored (so there did not need to be a contiguous frame buffer). 6502 machine code subroutines could be triggered on scan lines by setting a bit on a display list instruction. ANTIC also supported smooth vertical and horizontal scrolling independent of the CPU.The Atari 8-bit computers (1979) had ANTIC, a video processor which interpreted instructions describing a "display list"
The Atari 2600, originally sold as the Atari Video Computer System or Atari VCS until November 1982, is a home video game console from Atari, Inc. Released on September 11, 1977, it is credited with popularizing the use of microprocessor-based hardware and games contained on ROM cartridges, a format first used with the Fairchild Channel F in 1976. This contrasts with the older model of having dedicated hardware that could play only those games that were physically built into the unit. The 2600 was bundled with two joystick controllers, a conjoined pair of paddle controllers, and a game cartridge: initially Combat, and later Pac-Man.
The Television Interface Adaptor (TIA) is the custom computer chip that is the heart of the Atari 2600 game console, generating the screen display, sound effects, and reading input controllers. Its design was widely affected by an attempt to reduce the amount of RAM needed to operate the display. The resulting design is notoriously difficult to program, which is an ongoing challenge for developers.
The Atari 8-bit family is a series of 8-bit home computers introduced by Atari, Inc. in 1979 and manufactured until 1992. All of the machines in the family are technically similar and differ primarily in packaging. They are based on the MOS Technology 6502 CPU running at 1.79 MHz, and were the first home computers designed with custom co-processor chips. This architecture enabled graphics and sound capabilities that were more advanced than contemporary machines at the time of release, and gaming on the platform was a major draw. Star Raiders is considered the platform's killer app.
The NEC µPD7220 was one of the first implementations of a graphics display controller as a single Large Scale Integration (LSI) integrated circuit chip, enabling the design of low-cost, high-performance video graphics cards such as those from Number Nine Visual Technology. It became one of the best known of what were known as graphics processing units in the 1980s.
The High-Performance Graphics Display Controller 7220 is a video interface controller capable of drawing lines, circles, arcs, and character graphics to a bit-mapped display. It was developed by NEC and used in NEC's PC-9801 and APC III computers, the NEC N5200 intelligent kanji terminal, the optional graphics module for the DEC Rainbow, the Tulip System-1, and the Epson QX-10.
An integrated circuit or monolithic integrated circuit is a set of electronic circuits on one small flat piece of semiconductor material that is normally silicon. The integration of large numbers of tiny transistors into a small chip results in circuits that are orders of magnitude smaller, cheaper, and faster than those constructed of discrete electronic components. The IC's mass production capability, reliability and building-block approach to circuit design has ensured the rapid adoption of standardized ICs in place of designs using discrete transistors. ICs are now used in virtually all electronic equipment and have revolutionized the world of electronics. Computers, mobile phones, and other digital home appliances are now inextricable parts of the structure of modern societies, made possible by the small size and low cost of ICs.
Number Nine Visual Technology Corporation was a manufacturer of video graphics chips and cards from 1982 to 1999. Number Nine developed the first 128-bit graphics processor, as well as the first 256-color (8-bit) and 16.8 million color (24-bit) cards.
The Williams Electronics arcade games Robotron 2084 , Joust , Sinistar , and Bubbles , all released in 1982, contain custom blitter chips for operating on 16-color bitmaps.
In 1985, the Commodore Amiga featured a custom graphics chip, with a blitter unit accelerating bitmap manipulation, line draw, and area fill functions. Also included is a coprocessor (commonly referred to as "The Copper") with its own primitive instruction set, capable of manipulating graphics hardware registers in sync with the video beam (e.g. for per-scanline palette switches, sprite multiplexing, and hardware windowing), or driving the blitter.
In 1986, Texas Instruments released the TMS34010, the first microprocessor with on-chip graphics capabilities. It could run general-purpose code, but it had a very graphics-oriented instruction set. In 1990-1992, this chip would become the basis of the Texas Instruments Graphics Architecture ("TIGA") Windows accelerator cards.
In 1987, the IBM 8514 graphics system was released as one of[ vague ] the first video cards for IBM PC compatibles to implement fixed-function 2D primitives in electronic hardware. The same year, Sharp released the X68000, which used a custom graphics chipset that was powerful for a home computer at the time, with a 65,536 color palette and hardware support for sprites, scrolling and multiple playfields, eventually serving as a development machine for Capcom's CP System arcade board. Fujitsu later competed with the FM Towns computer, released in 1989 with support for a full 16,777,216 color palette.
In 1988, the first dedicated polygonal 3D graphics boards were introduced in arcades with the Namco System 21and Taito Air System.
In 1991, S3 Graphics introduced the S3 86C911 , which its designers named after the Porsche 911 as an indication of the performance increase it promised.The 86C911 spawned a host of imitators: by 1995, all major PC graphics chip makers had added 2D acceleration support to their chips. By this time, fixed-function Windows accelerators had surpassed expensive general-purpose graphics coprocessors in Windows performance, and these coprocessors faded away from the PC market.
Throughout the 1990s, 2D GUI acceleration continued to evolve. As manufacturing capabilities improved, so did the level of integration of graphics chips. Additional application programming interfaces (APIs) arrived for a variety of tasks, such as Microsoft's WinG graphics library for Windows 3.x, and their later DirectDraw interface for hardware acceleration of 2D games within Windows 95 and later.
In the early- and mid-1990s, real-time 3D graphics were becoming increasingly common in arcade, computer and console games, which led to an increasing public demand for hardware-accelerated 3D graphics. Early examples of mass-market 3D graphics hardware can be found in arcade system boards such as the Sega Model 1, Namco System 22, and Sega Model 2, and the fifth-generation video game consoles such as the Saturn, PlayStation and Nintendo 64. Arcade systems such as the Sega Model 2 and Namco Magic Edge Hornet Simulator in 1993 were capable of hardware T&L (transform, clipping, and lighting) years before appearing in consumer graphics cards.Some systems used DSPs to accelerate transformations. Fujitsu, which worked on the Sega Model 2 arcade system, began working on integrating T&L into a single LSI solution for use in home computers in 1995; the Fujitsu Pinolite, the first 3D geometry processor for personal computers, released in 1997. The first hardware T&L GPU on home video game consoles was the Nintendo 64's Reality Coprocessor, released in 1996. In 1997, Mitsubishi released the 3Dpro/2MP, a fully featured GPU capable of transformation and lighting, for workstations and Windows NT desktops; ATi utilized it for their FireGL 4000 graphics card, released in 1997.
In the PC world, notable failed first tries for low-cost 3D graphics chips were the S3 ViRGE , ATI Rage, and Matrox Mystique. These chips were essentially previous-generation 2D accelerators with 3D features bolted on. Many were even pin-compatible with the earlier-generation chips for ease of implementation and minimal cost. Initially, performance 3D graphics were possible only with discrete boards dedicated to accelerating 3D functions (and lacking 2D GUI acceleration entirely) such as the PowerVR and the 3dfx Voodoo. However, as manufacturing technology continued to progress, video, 2D GUI acceleration and 3D functionality were all integrated into one chip. Rendition's Verite chipsets were among the first to do this well enough to be worthy of note. In 1997, Rendition went a step further by collaborating with Hercules and Fujitsu on a "Thriller Conspiracy" project which combined a Fujitsu FXG-1 Pinolite geometry processor with a Vérité V2200 core to create a graphics card with a full T&L engine years before Nvidia's GeForce 256. This card, designed to reduce the load placed upon the system's CPU, never made it to market.[ citation needed ]
OpenGL appeared in the early '90s as a professional graphics API, but originally suffered from performance issues which allowed the Glide API to step in and become a dominant force on the PC in the late '90s.However, these issues were quickly overcome and the Glide API fell by the wayside. Software implementations of OpenGL were common during this time, although the influence of OpenGL eventually led to widespread hardware support. Over time, a parity emerged between features offered in hardware and those offered in OpenGL. DirectX became popular among Windows game developers during the late 90s. Unlike OpenGL, Microsoft insisted on providing strict one-to-one support of hardware. The approach made DirectX less popular as a standalone graphics API initially, since many GPUs provided their own specific features, which existing OpenGL applications were already able to benefit from, leaving DirectX often one generation behind. (See: Comparison of OpenGL and Direct3D.)
Over time, Microsoft began to work more closely with hardware developers, and started to target the releases of DirectX to coincide with those of the supporting graphics hardware. Direct3D 5.0 was the first version of the burgeoning API to gain widespread adoption in the gaming market, and it competed directly with many more-hardware-specific, often proprietary graphics libraries, while OpenGL maintained a strong following. Direct3D 7.0 introduced support for hardware-accelerated transform and lighting (T&L) for Direct3D, while OpenGL had this capability already exposed from its inception. 3D accelerator cards moved beyond being just simple rasterizers to add another significant hardware stage to the 3D rendering pipeline. The Nvidia GeForce 256 (also known as NV10) was the first consumer-level card released on the market with hardware-accelerated T&L, while professional 3D cards already had this capability. Hardware transform and lighting, both already existing features of OpenGL, came to consumer-level hardware in the '90s and set the precedent for later pixel shader and vertex shader units which were far more flexible and programmable.
Nvidia was first to produce a chip capable of programmable shading, the GeForce 3 (code named NV20). Each pixel could now be processed by a short "program" that could include additional image textures as inputs, and each geometric vertex could likewise be processed by a short program before it was projected onto the screen. Used in the Xbox console, it competed with the PlayStation 2 (which used a custom vector DSP for hardware accelerated vertex processing; commonly referred to VU0/VU1). The earliest incarnations of shader execution engines used in Xbox were not general purpose and could not execute arbitrary pixel code. Vertices and pixels were processed by different units which had their own resources with pixel shaders having much tighter constraints (being as they are executed at much higher frequencies than with vertices). Pixel shading engines were actually more akin to a highly customizable function block and didn't really "run" a program. Many of these disparities between vertex and pixel shading wouldn't be addressed until much later with the Unified Shader Model.
By October 2002, with the introduction of the ATI Radeon 9700 (also known as R300), the world's first Direct3D 9.0 accelerator, pixel and vertex shaders could implement looping and lengthy floating point math, and were quickly becoming as flexible as CPUs, yet orders of magnitude faster for image-array operations. Pixel shading is often used for bump mapping, which adds texture, to make an object look shiny, dull, rough, or even round or extruded.
With the introduction of the GeForce 8 series, which was produced by Nvidia, and then new generic stream processing unit GPUs became a more generalized computing device. Today, parallel GPUs have begun making computational inroads against the CPU, and a subfield of research, dubbed GPU Computing or GPGPU for General Purpose Computing on GPU, has found its way into fields as diverse as machine learning,oil exploration, scientific image processing, linear algebra, statistics, 3D reconstruction and even stock options pricing determination. GPGPU at the time was the precursor to what we now call compute shaders (e.g. CUDA, OpenCL, DirectCompute) and actually abused the hardware to a degree by treating the data passed to algorithms as texture maps and executing algorithms by drawing a triangle or quad with an appropriate pixel shader. This obviously entails some overheads since we involve units like the Scan Converter where they aren't really needed (nor do we even care about the triangles, except to invoke the pixel shader). Over the years, the energy consumption of GPUs has increased and to manage it, several techniques have been proposed.
Nvidia's CUDA platform, first introduced in 2007, [ citation needed ]was the earliest widely adopted programming model for GPU computing. More recently OpenCL has become broadly supported. OpenCL is an open standard defined by the Khronos Group which allows for the development of code for both GPUs and CPUs with an emphasis on portability. OpenCL solutions are supported by Intel, AMD, Nvidia, and ARM, and according to a recent report by Evan's Data, OpenCL is the GPGPU development platform most widely used by developers in both the US and Asia Pacific.
In 2010, Nvidia began a partnership with Audi to power their cars' dashboards. These Tegra GPUs were powering the cars' dashboard, offering increased functionality to cars' navigation and entertainment systems. nm process.Advancements in GPU technology in cars has helped push self-driving technology. AMD's Radeon HD 6000 Series cards were released in 2010 and in 2011, AMD released their 6000M Series discrete GPUs to be used in mobile devices. The Kepler line of graphics cards by Nvidia came out in 2012 and were used in the Nvidia's 600 and 700 series cards. A new feature in this new GPU microarchitecture included GPU boost, a technology adjusts the clock-speed of a video card to increase or decrease it according to its power draw. The Kepler microarchitecture was manufactured on the 28
The PS4 and Xbox One were released in 2013, they both use GPUs based on AMD's Radeon HD 7850 and 7790. Nvidia's Kepler line of GPUs was followed by the Maxwell line, manufactured on the same process. 28 nm chips by Nvidia were manufactured by TSMC, the Taiwan Semiconductor Manufacturing Company, that was manufacturing using the 28 nm process at the time. Compared to the 40 nm technology from the past, this new manufacturing process allowed a 20 percent boost in performance while drawing less power. Virtual reality headsets like the Oculus Rift and the HTC Vive have very high system requirements. VR headset manufacturers recommended the GTX 970 and the R9 290X or better at the time of their release. Pascal is the next generation of consumer graphics cards by Nvidia released in 2016. The GeForce 10 series of cards are under this generation of graphics cards. They are made using the 16 nm manufacturing process which improves upon previous microarchitectures. Nvidia has released one non-consumer card under the new Volta architecture, the Titan V. Changes from the Titan XP, Pascal's high-end card, include an increase in the number of CUDA cores, the addition of tensor cores, and high-bandwidth memory. Tensor cores are cores specially designed for deep learning, while high-bandwidth memory is on-die, stacked, lower-clocked memory that offers an extremely wide memory bus that is useful for the Titan V's intended purpose. To emphasize that the Titan V is not a gaming card, Nvidia removed the "Geforce GTX" suffix it adds to consumer gaming cards. A new generation, the RTX Turing GPUs were unveiled on August 20, 2018 that add ray-tracing cores to GPUs, improving their performance on lighting effects.[ citation needed ] Polaris 11 and Polaris 10 GPUs from AMD are fabricated a 14-nanometer process. Their release results in a substantial increase in the performance per watt of AMD video cards. AMD has also released for its high-end market the Vega GPUs, which also feature high-bandwidth memory like the Titan V.
Many companies have produced GPUs under a number of brand names. In 2009, Intel, Nvidia and AMD/ATI were the market share leaders, with 49.4%, 27.8% and 20.6% market share respectively. However, those numbers include Intel's integrated graphics solutions as GPUs. Not counting those, Nvidia and AMD control nearly 100% of the market as of 2018. Their respective market shares are 66% and 33%.In addition, S3 Graphics and Matrox produce GPUs. Modern smartphones are also using mostly Adreno GPUs from Qualcomm, PowerVR GPUs from Imagination Technologies and Mali GPUs from ARM.
Modern GPUs use most of their transistors to do calculations related to 3D computer graphics. In addition to the 3D hardware, today's GPUs include basic 2D acceleration and framebuffer capabilities (usually with a VGA compatibility mode). Newer cards like AMD/ATI HD5000-HD7000 even lack 2D acceleration; it has to be emulated by 3D hardware. GPUs were initially used to accelerate the memory-intensive work of texture mapping and rendering polygons, later adding units to accelerate geometric calculations such as the rotation and translation of vertices into different coordinate systems. Recent developments in GPUs include support for programmable shaders which can manipulate vertices and textures with many of the same operations supported by CPUs, oversampling and interpolation techniques to reduce aliasing, and very high-precision color spaces. Because most of these computations involve matrix and vector operations, engineers and scientists have increasingly studied the use of GPUs for non-graphical calculations; they are especially suited to other embarrassingly parallel problems.
With the emergence of deep learning, the importance of GPUs has increased. In research done by Indigo, it was found that while training deep learning neural networks, GPUs can be 250 times faster than CPUs. The explosive growth of Deep Learning in recent years has been attributed to the emergence of general purpose GPUs.[ citation needed ] There has been some level of competition in this area with ASICs, most prominently the Tensor Processing Unit (TPU) made by Google. However, these can require changes to existing code and GPUs are still very popular.
Most GPUs made since 1995 support the YUV color space and hardware overlays, important for digital video playback, and many GPUs made since 2000 also support MPEG primitives such as motion compensation and iDCT. This process of hardware accelerated video decoding, where portions of the video decoding process and video post-processing are offloaded to the GPU hardware, is commonly referred to as "GPU accelerated video decoding", "GPU assisted video decoding", "GPU hardware accelerated video decoding" or "GPU hardware assisted video decoding".
More recent graphics cards even decode high-definition video on the card, offloading the central processing unit. The most common APIs for GPU accelerated video decoding are DxVA for Microsoft Windows operating system and VDPAU, VAAPI, XvMC, and XvBA for Linux-based and UNIX-like operating systems. All except XvMC are capable of decoding videos encoded with MPEG-1, MPEG-2, MPEG-4 ASP (MPEG-4 Part 2), MPEG-4 AVC (H.264 / DivX 6), VC-1, WMV3/WMV9, Xvid / OpenDivX (DivX 4), and DivX 5 codecs, while XvMC is only capable of decoding MPEG-1 and MPEG-2.
The video decoding processes that can be accelerated by today's modern GPU hardware are:
In personal computers, there are two main forms of GPUs. Each has many synonyms:
Most GPUs are designed for a specific usage, real-time 3D graphics or other mass calculations:
The GPUs of the most powerful class typically interface with the motherboard by means of an expansion slot such as PCI Express (PCIe) or Accelerated Graphics Port (AGP) and can usually be replaced or upgraded with relative ease, assuming the motherboard is capable of supporting the upgrade. A few graphics cards still use Peripheral Component Interconnect (PCI) slots, but their bandwidth is so limited that they are generally used only when a PCIe or AGP slot is not available.
A dedicated GPU is not necessarily removable, nor does it necessarily interface with the motherboard in a standard fashion. The term "dedicated" refers to the fact that dedicated graphics cards have RAM that is dedicated to the card's use, not to the fact that most dedicated GPUs are removable. Further, this RAM is usually specially selected for the expected serial workload of the graphics card (see GDDR). Sometimes, systems with dedicated, discrete GPUs were called "DIS" systems, as opposed to "UMA" systems (see next section). Dedicated GPUs for portable computers are most commonly interfaced through a non-standard and often proprietary slot due to size and weight constraints. Such ports may still be considered PCIe or AGP in terms of their logical host interface, even if they are not physically interchangeable with their counterparts.
Technologies such as SLI by Nvidia and CrossFire by AMD allow multiple GPUs to draw images simultaneously for a single screen, increasing the processing power available for graphics.
Integrated graphics processing unit (IGPU), Integrated graphics, shared graphics solutions, integrated graphics processors (IGP) or unified memory architecture (UMA) utilize a portion of a computer's system RAM rather than dedicated graphics memory. IGPs can be integrated onto the motherboard as part of the chipset, or on the same die with the CPU (like AMD APU or Intel HD Graphics). On certain motherboards [ needs update ] They are less costly to implement than dedicated graphics processing, but tend to be less capable. Historically, integrated processing was often considered unfit to play 3D games or run graphically intensive programs but could run less intensive programs such as Adobe Flash. Examples of such IGPs would be offerings from SiS and VIA circa 2004. However, modern integrated graphics processors such as AMD Accelerated Processing Unit and Intel HD Graphics are more than capable of handling 2D graphics or low stress 3D graphics.AMD's IGPs can use dedicated sideport memory. This is a separate fixed block of high performance memory that is dedicated for use by the GPU. In early 2007, computers with integrated graphics account for about 90% of all PC shipments.
As a GPU is extremely memory intensive, integrated processing may find itself competing with the CPU for the relatively slow system RAM, as it has minimal or no dedicated video memory. IGPs can have up to 29.856 GB/s of memory bandwidth from system RAM, whereas a graphics card may have up to 264 GB/s of bandwidth between its RAM and GPU core. This memory bus bandwidth can limit the performance of the GPU, though multi-channel memory can mitigate this deficiency. Older integrated graphics chipsets lacked hardware transform and lighting, but newer ones include it.
This newer class of GPUs competes with integrated graphics in the low-end desktop and notebook markets. The most common implementations of this are ATI's HyperMemory and Nvidia's TurboCache.
Hybrid graphics cards are somewhat more expensive than integrated graphics, but much less expensive than dedicated graphics cards. These share memory with the system and have a small dedicated memory cache, to make up for the high latency of the system RAM. Technologies within PCI Express can make this possible. While these solutions are sometimes advertised as having as much as 768MB of RAM, this refers to how much can be shared with the system memory.
It is becoming increasingly common to use a general purpose graphics processing unit (GPGPU) as a modified form of stream processor (or a vector processor), running compute kernels. This concept turns the massive computational power of a modern graphics accelerator's shader pipeline into general-purpose computing power, as opposed to being hard wired solely to do graphical operations. In certain applications requiring massive vector operations, this can yield several orders of magnitude higher performance than a conventional CPU. The two largest discrete (see "Dedicated graphics cards" above) GPU designers, AMD and Nvidia, are beginning to pursue this approach with an array of applications. Both Nvidia and AMD have teamed with Stanford University to create a GPU-based client for the Folding@home distributed computing project, for protein folding calculations. In certain circumstances the GPU calculates forty times faster than the conventional CPUs traditionally used by such applications.
GPGPU can be used for many types of embarrassingly parallel tasks including ray tracing. They are generally suited to high-throughput type computations that exhibit data-parallelism to exploit the wide vector width SIMD architecture of the GPU.
Furthermore, GPU-based high performance computers are starting to play a significant role in large-scale modelling. Three of the 10 most powerful supercomputers in the world take advantage of GPU acceleration.
GPU supports API extensions to the C programming language such as OpenCL and OpenMP. Furthermore, each GPU vendor introduced its own API which only works with their cards, AMD APP SDK and CUDA from AMD and Nvidia, respectively. These technologies allow specified functions called compute kernels from a normal C program to run on the GPU's stream processors. This makes it possible for C programs to take advantage of a GPU's ability to operate on large buffers in parallel, while still using the CPU when appropriate. CUDA is also the first API to allow CPU-based applications to directly access the resources of a GPU for more general purpose computing without the limitations of using a graphics API.[ citation needed ]
Since 2005 there has been interest in using the performance offered by GPUs for evolutionary computation in general, and for accelerating the fitness evaluation in genetic programming in particular. Most approaches compile linear or tree programs on the host PC and transfer the executable to the GPU to be run. Typically the performance advantage is only obtained by running the single active program simultaneously on many example problems in parallel, using the GPU's SIMD architecture.However, substantial acceleration can also be obtained by not compiling the programs, and instead transferring them to the GPU, to be interpreted there. Acceleration can then be obtained by either interpreting multiple programs simultaneously, simultaneously running multiple example problems, or combinations of both. A modern GPU can readily simultaneously interpret hundreds of thousands of very small programs.
Some modern workstation GPUs, such as the Nvidia Quadro workstation cards using the Volta and Turing architectures, feature dedicating processing cores for tensor-based deep learning applications. In Nvidia's current series of GPUs these cores are called Tensor Cores.These GPUs usually have significant FLOPS performance increases, utilizing 4x4 matrix multiplication and division, resulting in hardware performance up to 128 TFLOPS in some applications. These tensor cores are also supposed to appear in consumer cards running the Turing architecture, and possibly in the Navi series of consumer cards from AMD.
An external GPU is a graphics processor located outside of the housing of the computer. External graphics processors are sometimes used with laptop computers. Laptops might have a substantial amount of RAM and a sufficiently powerful central processing unit (CPU), but often lack a powerful graphics processor, and instead have a less powerful but more energy-efficient on-board graphics chip. On-board graphics chips are often not powerful enough for playing the latest games, or for other graphically intensive tasks, such as editing video.
Therefore, it is desirable to be able to attach a GPU to some external bus of a notebook. PCI Express is the only bus commonly used for this purpose. The port may be, for example, an ExpressCard or mPCIe port (PCIe ×1, up to 5 or 2.5 Gbit/s respectively) or a Thunderbolt 1, 2, or 3 port (PCIe ×4, up to 10, 20, or 40 Gbit/s respectively). Those ports are only available on certain notebook systems.
Official vendor support for External GPUs has gained traction recently. One notable milestone was Apple’s decision to officially support External GPUs with mac OS High Sierra 10.13.4.There are also several major hardware vendors (HP, Alienware, Razer) releasing Thunderbolt 3 eGPU enclosures. This support has continued to fuel eGPU implementations by enthusiasts.
In 2013, 438.3 million GPUs were shipped globally and the forecast for 2014 was 414.2 million.
A video card is an expansion card which generates a feed of output images to a display device. Frequently, these are advertised as discrete or dedicated graphics cards, emphasizing the distinction between these and integrated graphics. At the core of both is the graphics processing unit (GPU), which is the main part that does the actual computations, but should not be confused as the video card as a whole, although "GPU" is often used to refer to video cards.
ATI Technologies Inc. was a semiconductor technology corporation based in Markham, Ontario, Canada, that specialized in the development of graphics processing units and chipsets. Founded in 1985 as Array Technology Inc., the company listed publicly in 1993. Advanced Micro Devices (AMD) acquired ATI in 2006. As a major fabrication-less or fabless semiconductor company, ATI conducted research and development in-house and outsourced the manufacturing and assembly of its products. With the decline and eventual bankruptcy of 3dfx in 2000, ATI and its chief rival Nvidia emerged as the two dominant players in the graphics processors industry, eventually forcing other manufacturers into niche roles.
General-purpose computing on graphics processing units is the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditionally handled by the central processing unit (CPU). The use of multiple video cards in one computer, or large numbers of graphics chips, further parallelizes the already parallel nature of graphics processing. In addition, even a single GPU-CPU framework provides advantages that multiple CPUs on their own do not offer due to the specialization in each chip.
A physics processing unit (PPU) is a dedicated microprocessor designed to handle the calculations of physics, especially in the physics engine of video games. It is an example of hardware acceleration.
A free and open-source graphics device driver is a software stack which controls computer-graphics hardware and supports graphics-rendering application programming interfaces (APIs) and is released under a free and open-source software license. Graphics device drivers are written for specific hardware to work within a specific operating system kernel and to support a range of APIs used by applications to access the graphics hardware. They may also control output to the display if the display driver is part of the graphics hardware. Most free and open-source graphics device drivers are developed by the Mesa project. The driver is made up of a compiler, a rendering API, and software which manages access to the graphics hardware.
Graphics hardware is computer hardware that generates computer graphics and allows them to be shown on a display, usually using a graphics card in combination with a device driver to create the images on the screen.
X-Video Motion Compensation (XvMC), is an extension of the X video extension (Xv) for the X Window System. The XvMC API allows video programs to offload portions of the video decoding process to the GPU video-hardware. In theory this process should also reduce bus bandwidth requirements. Currently, the supported portions to be offloaded by XvMC onto the GPU are motion compensation and inverse discrete cosine transform (iDCT) for MPEG-2 video. XvMC also supports offloading decoding of mo comp, iDCT, and VLD for not only MPEG-2 but also MPEG-4 ASP video on VIA Unichrome hardware.
ATI Avivo is a set of hardware and low level software features present on the ATI Radeon R520 family of GPUs and all later ATI Radeon products. ATI Avivo was designed to offload video decoding, encoding, and post-processing from a computer's CPU to a compatible GPU. ATI Avivo compatible GPUs have lower CPU usage when a player and decoder software that support ATI Avivo is used. ATI Avivo has been long superseded by Unified Video Decoder (UVD) and Video Coding Engine (VCE).
The AMD Accelerated Processing Unit (APU), formerly known as Fusion, is the marketing term for a series of 64-bit microprocessors from Advanced Micro Devices (AMD), designed to act as a central processing unit (CPU) and graphics processing unit (GPU) on a single die.
Unified Video Decoder (UVD), previously called Universal Video Decoder, is the name given to AMD's dedicated video decoding ASIC. There are multiple versions implementing a multitude of video codecs, such as H.264 and VC-1.
AMD FireStream was AMD's brand name for their Radeon-based product line targeting stream processing and/or GPGPU in supercomputers. Originally developed by ATI Technologies around the Radeon X1900 XTX in 2006, the product line was previously branded as both ATI FireSTREAM and AMD Stream Processor. The AMD FireStream can also be used as a floating-point co-processor for offloading CPU calculations, which is part of the Torrenza initiative. The FireStream line has been discontinued since 2012, when GPGPU workloads were entirely folded into the AMD FirePro line.
Video Acceleration API is a royalty-free API as well as its implementation as free and open-source library (libVA) distributed under the MIT License.
X-Video Bitstream Acceleration (XvBA), designed by AMD Graphics for its Radeon GPU and Fusion APU, is an arbitrary extension of the X video extension (Xv) for the X Window System on Linux operating-systems. XvBA API allows video programs to offload portions of the video decoding process to the GPU video-hardware. Currently, the portions designed to be offloaded by XvBA onto the GPU are currently motion compensation (MC) and inverse discrete cosine transform (IDCT), and variable-length decoding (VLD) for MPEG-2, MPEG-4 ASP, MPEG-4 AVC (H.264), WMV3, and VC-1 encoded video.
Video Decode and Presentation API for Unix (VDPAU) is a royalty-free application programming interface (API) as well as its implementation as free and open-source library (libvdpau) distributed under the MIT License. VDPAU is also supported by Nvidia.
The Radeon HD 7000 series, codenamed "Southern Islands", is a family of GPUs developed by AMD, and manufactured on TSMC's 28 nm process. The primary competitor of Southern Islands, Nvidia's GeForce 600 Series, also shipped during Q1 2012, largely due to the immaturity of the 28 nm process.
GPU switching is a mechanism used on computers with multiple graphic controllers. This mechanism allows the user to either maximize the graphic performance or prolong battery life by switching between the graphic cards. It's mostly used on gaming laptops which usually have an integrated graphic device and a discrete video card.
Graphics Core Next (GCN) is the codename for both a series of microarchitectures as well as for an instruction set. GCN was developed by AMD for their GPUs as the successor to TeraScale microarchitecture/instruction set. The first product featuring GCN was launched in 2011.
Distributed Codec Engine (DCE) is an API and its implementation as software library ("libdce") by Texas Instruments. The library was released under the Revised BSD License and some additional terms.
The HD 8000 series is a family of computer GPUs developed by AMD. AMD was initially rumored to release the family in the second quarter of 2013, with the cards manufactured on a 28 nm process and making use of the improved Graphics Core Next architecture. However the 8000 series turned out to be an OEM rebadge of the 7000 series.
Perhaps the best known one is the NEC 7220.CS1 maint: Uses editors parameter (link)
|Wikimedia Commons has media related to Graphics processing units .|