GPU virtualization

GPU virtualization refers to technologies that allow the use of a GPU to accelerate graphics or GPGPU applications running on a virtual machine. GPU virtualization is used in various applications such as desktop virtualization, [1] cloud gaming [2] and computational science (e.g. hydrodynamics simulations). [3]

GPU virtualization implementations generally involve one or more of the following techniques: device emulation, API remoting, fixed pass-through and mediated pass-through. Each technique presents different trade-offs regarding virtual machine to GPU consolidation ratio, graphics acceleration, rendering fidelity and feature support, portability to different hardware, isolation between virtual machines, and support for suspending/resuming and live migration. [1] [4] [5] [6]

API remoting

In API remoting or API forwarding, calls to graphical APIs from guest applications are forwarded to the host by remote procedure call, and the host then executes graphical commands from multiple guests using the host's GPU as a single user. [1] It may be considered a form of paravirtualization when combined with device emulation. [7] This technique allows sharing GPU resources between multiple guests and the host when the GPU does not support hardware-assisted virtualization. It is conceptually simple to implement, but it has several disadvantages: [1]

Hypervisors usually use shared memory between guest and host to maximize performance and minimize latency. Using a network interface instead (a common approach in distributed rendering), third-party software can add support for specific APIs (e.g. rCUDA [8] for CUDA) or for typical APIs (e.g. VMGL [9] for OpenGL) when the hypervisor's software package does not support them, although network delay and serialization overhead may outweigh the benefits.
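The forwarding mechanism described above can be illustrated with a minimal sketch. It is not modeled on any particular hypervisor: the class names, the two example calls (`clear`, `draw_triangles`) and the JSON message format are all hypothetical, and a plain function call stands in for the shared-memory or network transport. The host object plays the role of the single GPU user that executes commands on behalf of several guests.

```python
import json


class HostRenderer:
    """Hypothetical host side: the only component that would touch the GPU."""

    def __init__(self):
        self.log = []  # commands executed on behalf of all guests, in order

    def handle(self, message: bytes) -> bytes:
        call = json.loads(message)  # deserialize the forwarded API call
        handler = getattr(self, "api_" + call["name"], None)
        if handler is None:
            reply = {"ok": False, "error": "unsupported call"}
        else:
            result = handler(call["guest"], *call["args"])
            reply = {"ok": True, "result": result}
        return json.dumps(reply).encode()

    # Stand-ins for real graphics commands; they only record what was requested.
    def api_clear(self, guest, r, g, b):
        self.log.append((guest, "clear", (r, g, b)))
        return None

    def api_draw_triangles(self, guest, count):
        self.log.append((guest, "draw_triangles", count))
        return count


class GuestStub:
    """Hypothetical guest-side library: same API surface, but it only forwards."""

    def __init__(self, guest_id, transport):
        self.guest_id = guest_id
        self.transport = transport  # callable taking and returning bytes

    def _call(self, name, *args):
        # Serialization happens on every call; this is the overhead the text mentions.
        message = json.dumps(
            {"guest": self.guest_id, "name": name, "args": args}
        ).encode()
        reply = json.loads(self.transport(message))
        if not reply["ok"]:
            raise NotImplementedError(reply["error"])
        return reply["result"]

    def clear(self, r, g, b):
        return self._call("clear", r, g, b)

    def draw_triangles(self, count):
        return self._call("draw_triangles", count)


# Two guests share one host renderer.
host = HostRenderer()
vm_a = GuestStub("vm-a", host.handle)
vm_b = GuestStub("vm-b", host.handle)
vm_a.clear(0, 0, 0)
vm_b.draw_triangles(12)
```

In this sketch every guest call pays for one serialization and one round trip, which is why the choice of transport matters so much for this technique.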

Application support from API remoting virtualization technologies
Technology | Direct3D | OpenGL | Vulkan | OpenCL | DXVA
VMware Virtual Shared Graphics Acceleration (vSGA) [10] [11] | 11 | 4.1 | Yes | No | No
Parallels Desktop for Mac 3D acceleration [12] | 11 [upper-alpha 1] | 3.3 [upper-alpha 2] | No | No | No
Hyper-V RemoteFX vGPU [14] [15] | 12 | 4.4 | No | 1.1 | No
VirtualBox Guest Additions 3D driver [16] [17] [18] | 8/9 [upper-alpha 3] | 2.1 [upper-alpha 4] | No | No | No
Thincast Workstation - Virtual 3D [20] | 12.1 | No | Yes | No | No
QEMU/KVM with Virgil 3D [21] [22] [23] [24] | No | 4.3 | Planned | No | No
  1. Wrapped to OpenGL using WineD3D. [13]
  2. Compatibility profile.
  3. Experimental. Wrapped to OpenGL using WineD3D. [19]
  4. Experimental.

Fixed pass-through

In fixed pass-through or GPU pass-through (a special case of PCI pass-through), a GPU is accessed directly by a single virtual machine exclusively and permanently. This technique achieves 96–100% of native performance [3] and high fidelity, [1] but the acceleration provided by the GPU cannot be shared between multiple virtual machines. As such, it has the lowest consolidation ratio and the highest cost, as each graphics-accelerated virtual machine requires an additional physical GPU. [1]

The following software technologies implement fixed pass-through:

VirtualBox removed support for PCI pass-through in version 6.1.0. [33]

QEMU/KVM

For certain GPU models, Nvidia and AMD video card drivers attempt to detect whether the GPU is being accessed by a virtual machine and, if so, disable some or all GPU features. [34] NVIDIA changed its virtualization rules for consumer GPUs by disabling this check in GeForce Game Ready driver 465.xx and later. [35]

Consumer NVIDIA GPUs of various architectures, both desktop and laptop, can be passed through in different ways. Desktop graphics cards can be passed through under KVM with either a legacy BIOS or a UEFI configuration, using SeaBIOS or OVMF, respectively.

NVIDIA

Desktops

For desktops, most graphics cards can be passed through. For cards with the Pascal architecture or older, however, the card's VBIOS must also be passed to the virtual machine if the GPU is used to boot the host. [36]
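As a rough illustration of the desktop configurations described above, the following sketch assembles a QEMU command line for a passed-through GPU. The function name and its parameters are hypothetical, and the PCI address and file paths in the usage note are placeholders. The sketch assumes the GPU is already bound to the host's vfio-pci driver, and it relies on SeaBIOS being QEMU's default firmware on x86, so only the OVMF case adds a firmware argument.

```python
def qemu_passthrough_args(pci_address, firmware="seabios", ovmf_code=None, vbios_rom=None):
    """Build an argument list for QEMU/KVM with one GPU passed through.

    pci_address: host PCI address of the GPU, e.g. "01:00.0" (placeholder).
    firmware:    "seabios" (legacy BIOS) or "ovmf" (UEFI).
    ovmf_code:   path to the OVMF firmware image; required when firmware="ovmf".
    vbios_rom:   optional path to a VBIOS dump, for cards that need one.
    """
    device = f"vfio-pci,host={pci_address}"
    if vbios_rom:
        # romfile= hands the dumped VBIOS to the guest instead of the card's own ROM.
        device += f",romfile={vbios_rom}"

    args = ["qemu-system-x86_64", "-enable-kvm", "-device", device]

    if firmware == "ovmf":
        if not ovmf_code:
            raise ValueError("ovmf_code path is required for UEFI guests")
        # OVMF is loaded as a read-only flash image.
        args += ["-drive", f"if=pflash,format=raw,readonly=on,file={ovmf_code}"]
    elif firmware != "seabios":
        raise ValueError(f"unknown firmware: {firmware}")
    # SeaBIOS needs no extra argument because it is the default firmware.

    return args
```

For example, `qemu_passthrough_args("01:00.0", firmware="ovmf", ovmf_code="/path/to/OVMF_CODE.fd", vbios_rom="/path/to/vbios.rom")` yields a UEFI configuration with an explicit VBIOS. A real virtual machine needs further arguments (memory, disks, CPU model) that are omitted here.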

Laptops

For laptops, the NVIDIA driver checks for the presence of a battery via ACPI and returns an error if none is found. To avoid this, an ACPI table, created from Base64-encoded text, must be supplied to the virtual machine to spoof a battery and bypass the check. [36]
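The battery workaround amounts to decoding a Base64 text into a binary ACPI table file and handing that file to the virtual machine. The sketch below shows only that mechanical step. The helper name is hypothetical, and the payload is a placeholder that is NOT a valid ACPI table; the real table text has to come from a source such as the one cited. QEMU's `-acpitable file=...` option is used to load the resulting file.

```python
import base64
import tempfile
from pathlib import Path


def write_acpi_table(base64_text, output_path):
    """Decode Base64 text into a binary table file and return the QEMU arguments.

    Whitespace and line breaks in the pasted text are ignored.
    """
    compact = "".join(base64_text.split())
    raw = base64.b64decode(compact, validate=True)  # raises on malformed input
    Path(output_path).write_bytes(raw)
    return ["-acpitable", f"file={output_path}"]


# Placeholder payload for demonstration only; not a real battery table.
placeholder = base64.b64encode(b"SSDT placeholder").decode()

with tempfile.TemporaryDirectory() as tmp:
    table_path = Path(tmp) / "battery.aml"
    extra_args = write_acpi_table(placeholder, table_path)
    written = table_path.read_bytes()
```

The returned pair of arguments would be appended to the rest of the virtual machine's QEMU command line.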

Pascal and earlier

For laptop graphics cards of the Pascal generation and older, passthrough depends heavily on the configuration of the graphics card. Laptops without NVIDIA Optimus, such as those with MXM variants, can use the traditional passthrough methods. Laptops in which NVIDIA Optimus is enabled and the GPU renders through the framebuffer of the CPU's integrated graphics rather than its own are more complicated: they require a remote rendering display or service, the use of Intel GVT-g, and integration of the VBIOS into the boot configuration, because the VBIOS resides in the laptop's system BIOS rather than on the GPU itself. For laptops whose GPU has both NVIDIA Optimus and a dedicated framebuffer, configurations vary. If NVIDIA Optimus can be switched off, passthrough is possible through traditional means. If Optimus is the only configuration, however, the VBIOS most likely resides in the laptop's system BIOS, which requires the same steps as for laptops that render only through the integrated graphics framebuffer, although an external monitor can also be used. [37]

Mediated pass-through

In mediated device pass-through or full GPU virtualization, the GPU hardware provides contexts with virtual memory ranges for each guest through IOMMU and the hypervisor sends graphical commands from guests directly to the GPU. This technique is a form of hardware-assisted virtualization and achieves near-native [lower-alpha 2] performance and high fidelity. If the hardware exposes contexts as full logical devices, then guests can use any API. Otherwise, APIs and drivers must manage the additional complexity of GPU contexts. As a disadvantage, there may be little isolation between virtual machines when accessing GPU resources. [1]
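On Linux hosts, one concrete form of this technique is the kernel's mediated device (mdev) framework, in which a supporting GPU driver lists the virtual GPU types it offers under sysfs and a new instance is created by writing a UUID to the type's `create` node. The sketch below assumes that interface. The helper names are hypothetical, the `sysfs_root` parameter exists only so the code can be exercised against a scratch directory, and the available type names depend entirely on the hardware and driver.

```python
import uuid
from pathlib import Path


def _types_dir(pci_address, sysfs_root):
    return Path(sysfs_root) / "bus/pci/devices" / pci_address / "mdev_supported_types"


def list_mdev_types(pci_address, sysfs_root="/sys"):
    """Return {type_name: available_instances} for a GPU, or {} if unsupported."""
    types_dir = _types_dir(pci_address, sysfs_root)
    if not types_dir.is_dir():
        return {}  # device absent or its driver offers no mediated types
    result = {}
    for type_dir in sorted(types_dir.iterdir()):
        counter = type_dir / "available_instances"
        result[type_dir.name] = int(counter.read_text().strip()) if counter.is_file() else 0
    return result


def create_mdev(pci_address, type_name, sysfs_root="/sys", device_uuid=None):
    """Request a new virtual GPU instance by writing a UUID to the 'create' node."""
    device_uuid = device_uuid or str(uuid.uuid4())
    create_node = _types_dir(pci_address, sysfs_root) / type_name / "create"
    create_node.write_text(device_uuid)  # on a real host this needs root privileges
    return device_uuid
```

On a real host the returned UUID identifies the new virtual device, which the hypervisor can then assign to a guest; how that assignment is expressed depends on the hypervisor in use.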

The following software and hardware technologies implement mediated pass-through:

While API remoting is generally available for current and older GPUs, mediated pass-through requires hardware support available only on specific devices.

Hardware support for mediated pass-through virtualization
Vendor | Technology | Dedicated graphics card families: Server | Dedicated graphics card families: Professional | Dedicated graphics card families: Consumer | Integrated GPU families
Nvidia | vGPU [46] | GRID, Tesla | Quadro | No |
AMD | MxGPU [42] [47] | FirePro Server, Radeon Instinct | Radeon Pro | No | No
Intel | GVT-g | | | | Broadwell and newer

Device emulation

GPU architectures are very complex and change quickly, and their internal details are often kept secret. It is generally not feasible to fully virtualize new generations of GPUs, only older and simpler generations. For example, PCem, a specialized emulator of the IBM PC architecture, can emulate a S3 ViRGE/DX graphics device, which supports Direct3D 3, and a 3dfx Voodoo2, which supports Glide, among others. [48]

When using a VGA or an SVGA virtual display adapter, [49] [50] [51] the guest may not have 3D graphics acceleration, providing only minimal functionality to allow access to the machine via a graphics terminal. The emulated device may expose only basic 2D graphics modes to guests. The virtual machine manager may also provide common API implementations using software rendering to enable 3D graphics applications on the guest, albeit at speeds that may be as low as 3% of hardware-accelerated native performance. [1] The following software technologies implement graphics APIs using software rendering:

Notes

  1. Not available on VMware Workstation.
  2. Intel GVT-g achieves 80–90% of native performance. [38] [39] Nvidia vGPU achieves 88–96% of native performance considering the overhead on a VMware hypervisor. [40]

References

  1. Dowty, Micah; Sugerman, Jeremy (July 2009). Written at San Diego. "GPU Virtualization on VMware's Hosted I/O Architecture" (PDF). ACM SIGOPS Operating Systems Review. 43 (3). New York City: Association for Computing Machinery: 73–82. doi:10.1145/1618525.1618534. ISSN 0163-5980. S2CID 228328. Retrieved 10 September 2020.
  2. Hong, Hua-Jun; Fan-Chiang, Tao-Ya; Lee, Che-Rung; Chen, Kuan-Ta; Huang, Chun-Ying; Hsu, Cheng-Hsin (2014). GPU Consolidation for Cloud Games: Are We There Yet?. 13th Annual Workshop on Network and Systems Support for Games. Nagoya: Institute of Electrical and Electronics Engineers. pp. 1–6. doi:10.1109/NetGames.2014.7008969. ISBN   978-1-4799-6882-4. ISSN   2156-8138. S2CID   664129 . Retrieved 14 September 2020.
  3. Walters, John; Younge, Andrew; Kang, Dong-In; Yao, Ke-Thia; Kang, Mikyung; Crago, Stephen; Fox, Geoffrey (2014). "GPU Passthrough Performance: A Comparison of KVM, Xen, VMware ESXi, and LXC for CUDA and OpenCL Applications". IEEE 7th International Conference on Cloud Computing. Anchorage: IEEE Computer Society. pp. 636–643. doi:10.1109/CLOUD.2014.90. ISBN 978-1-4799-5063-8. ISSN 2159-6190. Retrieved 13 September 2020.
  4. Yu, Hangchen; Rossbach, Christopher (25 June 2017). Full Virtualization for GPUs Reconsidered (PDF). ISCA-44 14th Annual Workshop on Duplicating, Deconstructing and Debunking. Toronto . Retrieved 12 September 2020.
  5. Tian, Kun; Dong, Yaozu; Cowperthwaite, David (June 2014). "A Full GPU Virtualization Solution with Mediated Pass-Through" (PDF). Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference (USENIX ATC'14). USENIX Annual Technical Conference. Philadelphia: USENIX. pp. 121–132. ISBN   978-1-931971-10-2.
  6. Gottschlag, Mathias; Hillenbrand, Marius; Kehne, Jens; Stoess, Jan; Bellosa, Frank (November 2013). LoGV: Low-Overhead GPGPU Virtualization (PDF). 10th International Conference on High Performance Computing. Zhangjiajie: IEEE Computer Society. pp. 1721–1726. doi:10.1109/HPCC.and.EUC.2013.245. ISBN   978-0-7695-5088-6 . Retrieved 16 September 2020.
  7. Suzuki, Yusuke; Kato, Shinpei; Yamada, Hiroshi; Kono, Kenji (June 2014). "GPUvm: Why Not Virtualizing GPUs at the Hypervisor?" (PDF). Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference (USENIX ATC'14). USENIX Annual Technical Conference. Philadelphia: USENIX. pp. 109–120. ISBN   978-1-931971-10-2 . Retrieved 14 September 2020.
  8. Duato, José; Peña, Antonio; Silla, Federico; Fernández, Juan; Mayo, Rafael; Quintana-Ortí, Enrique (December 2011). Enabling CUDA acceleration within virtual machines using rCUDA (PDF). 18th International Conference on High Performance Computing. International Conference on High Performance Computing . Bangalore: IEEE Computer Society. pp. 1–10. doi:10.1109/HiPC.2011.6152718. hdl: 2117/168226 . ISBN   978-1-4577-1951-6. ISSN   1094-7256 . Retrieved 13 September 2020.
  9. Lagar-Cavilla, Horacio; Tolia, Niraj; Satyanarayanan, Mahadev; Lara, Eyal (June 2007). "VMM-Independent Graphics Acceleration" (PDF). Written at San Antonio. Proceedings of the 3rd International Conference on Virtual Execution Environments. VEE '07. New York City: Association for Computing Machinery. pp. 33–43. doi:10.1145/1254810.1254816. ISBN   978-1-59593-630-1 . Retrieved 12 September 2020.
  10. Lantinga, Hilko. Deploying Hardware-Accelerated Graphics with VMware Horizon (Guide). VMware. Retrieved 12 September 2020.
  11. visaac. "VMware Workstation 16 Pro Release Notes". docs.vmware.com. Retrieved 2021-03-24.
  12. "Graphics Settings". Parallels Desktop - User's Guide (Guide). Parallels.
  13. Bright, Peter (11 March 2014). "Valve releases open source Direct3D to OpenGL translator". Ars Technica . Retrieved 15 September 2020.
  14. "Deploy graphics devices using RemoteFX vGPU". Hyper-V on Windows Server (Manual). Microsoft . Retrieved 13 September 2020.
  15. "Plan for GPU acceleration in Windows Server". Hyper-V on Windows Server (Manual). Microsoft . Retrieved 15 September 2020.
  16. "Hardware-Accelerated Graphics". Oracle VM VirtualBox User Manual (Manual). Oracle Corporation . Retrieved 12 September 2012.
  17. "Guest Additions". Oracle VM VirtualBox User Manual (Manual). Oracle Corporation . Retrieved 12 September 2020.
  18. Larabel, Michael (19 December 2018). "VirtualBox 6.0 3D/OpenGL Performance With VMSVGA Adapter". Phoronix . Retrieved 15 September 2020.
  19. Larabel, Michael (29 January 2009). "VirtualBox Gets Accelerated Direct3D Support". Phoronix . Retrieved 15 September 2020.
  20. Hi! - The Thincast Workstation FreeRDP Blog
  21. "Virgil 3D GPU project". GitHub (Project). freedesktop.org . Retrieved 13 September 2020.
  22. Edge, Jake (10 September 2014). Virgil 3D: A virtual GPU (Article). LWN.net . Retrieved 13 September 2020.
  23. Wollny, Gert (28 August 2019). "Virglrenderer and the state of virtualized virtual worlds". Collabora News & Blog. Retrieved 15 September 2020.
  24. Hoffmann, Gerd (28 November 2019). "virtio gpu status and plans" . Retrieved 15 September 2020.
  25. GPU Development with Parallels Workstation Extreme (PDF) (White paper). Parallels. 2010. Retrieved 13 September 2020.
  26. "Deploy graphics devices using Discrete Device Assignment". Hyper-V on Windows Server (Manual). Microsoft . Retrieved 13 September 2020.
  27. "HDX 3D Pro". XenApp and XenDesktop 7.15 LTSR (Manual). Citrix Systems. Retrieved 15 September 2020.
  28. "Graphics overview". Citrix Hypervisor 8.2 (Manual). Citrix Systems. Retrieved 15 September 2020.
  29. GVT-d Setup Guide. GitHub (Guide). Retrieved 13 September 2020.
  30. Larabel, Michael (4 May 2014). "Intel Pushes Their Graphics Virtualization Capabilities". Phoronix. Retrieved 13 September 2020.
  31. "Bringing New Use Cases and Workloads to the Cloud with Intel Graphics Virtualization Technology (Intel GVT-g)" (PDF). Intel Open Source Technology Center (Flyer). Intel. 2016. Retrieved 14 August 2020.
  32. Jain, Sunil (4 May 2014). "Intel Graphics Virtualization Update" (Article). Intel. Retrieved 13 September 2020.
  33. "Changelog for VirtualBox 6.1". VirtualBox . Oracle Corporation. 10 December 2019. Retrieved 12 September 2020.
  34. "PCI passthrough via OVMF - Video card driver virtualization detection". Arch Linux Wiki (Wiki). Retrieved 13 September 2020.
  35. "GeForce GPU Passthrough for Windows Virtual Machine (Beta)". NVIDIA Support. 2021-03-30.
  36. "PCI passthrough via OVMF - ArchWiki". wiki.archlinux.org. Retrieved 2021-05-20.
  37. Tian, Lan (2020-06-25). "Intel and NVIDIA GPU Passthrough on a Optimus MUXless Laptop".
  38. Zheng, Xiao (August 2015). Media Cloud Based on Intel Graphics Virtualization Technology (Intel GVT-g) and OpenStack (PDF). Intel Developer Forum (Presentation slide). San Francisco: Intel . Retrieved 14 September 2020.
  39. Wang, Zhenyu (September 2017). Full GPU virtualization in mediated pass-through way (PDF). XDC2017 (Presentation slide). Mountain View, California: X.Org Foundation . Retrieved 14 September 2020.
  40. Kurkure, Uday (12 October 2017). Performance Comparison of Native GPU to Virtualized GPU and Scalability of Virtualized GPUs for Machine Learning. VMware VROOM! Performance Blog (Article). VMware. Episode 3. Retrieved 14 September 2020.
  41. Virtual GPU Software User Guide (Guide). Nvidia . Retrieved 13 September 2020.
  42. Wong, Tonny (28 January 2016). AMD multiuser GPU: hardware-enabled GPU virtualization for a true workstation experience (PDF) (White paper). AMD. Retrieved 12 September 2020.
  43. Wang, Hongbo (18 October 2018). "2018-Q3 release of XenGT (Intel GVT-g for Xen)" (Press release). Intel Open Source Technology Center . Retrieved 14 August 2020.
  44. GVT-g Setup Guide. GitHub (Guide). Retrieved 13 September 2020.
  45. Wang, Hongbo (18 October 2018). "2018-Q3 release of KVMGT (Intel GVT-g for KVM)" (Press release). Intel Open Source Technology Center . Retrieved 14 August 2020.
  46. "NVIDIA Virtual GPU Software Supported GPUs". Nvidia . Retrieved 9 September 2020.
  47. AMD FirePro S-Series for Virtualization (PDF) (Datasheet). AMD. 2016. Retrieved 13 September 2020.
  48. "Systems/motherboards emulated". PCem (Project). Retrieved 26 October 2020.
  49. "VMware Tools Device Drivers". VMware Tools Documentation (Manual). VMware . Retrieved 12 September 2020.
  50. "Configuring Virtual Machines". Oracle VM VirtualBox User Manual (Manual). Oracle Corporation. Retrieved 12 September 2020.
  51. "Display options". QEMU User Documentation. QEMU (Manual). Retrieved 12 September 2020.
  52. Long, Simon (2013). Virtual Machine Graphics Acceleration Deployment Guide (PDF) (White paper). VMware . Retrieved 14 September 2020.
  53. "OpenGL Software Accelerator". XenApp and XenDesktop 7.15 LTSR (Manual). Citrix Systems . Retrieved 15 September 2020.