SYCL

Original author(s) Khronos Group
Developer(s) Khronos Group
Initial release March 2014
Stable release 2020 revision 8 (1.2.1) / 19 October 2023 [1]
Operating system Cross-platform
Platform Cross-platform
Type High-level programming language
Website www.khronos.org/sycl/ sycl.tech

SYCL (pronounced "sickle") is a higher-level programming model intended to improve programming productivity on various hardware accelerators. It is a single-source embedded domain-specific language (eDSL) based on pure C++17, developed as a standard by the Khronos Group and announced in March 2014.

Origin of the name

SYCL (pronounced ‘sickle’) originally stood for SYstem-wide Compute Language, [2] but since 2020 its developers have stated that SYCL is simply a name: it is no longer an acronym and contains no reference to OpenCL. [3]

Purpose

SYCL is a royalty-free, cross-platform abstraction layer, inspired by the underlying concepts, portability and efficiency of OpenCL, that enables code for heterogeneous processors to be written in a “single-source” style using completely standard C++. In single-source development, C++ template functions can contain both host and device code, so complex algorithms that use hardware accelerators can be written once and then reused throughout the source code on different types of data.

The SYCL standard started as the higher-level programming model subgroup of the OpenCL working group and was originally developed for use with OpenCL and SPIR. Since September 20, 2019, SYCL has been a Khronos Group working group independent of the OpenCL working group, and starting with SYCL 2020 it has been generalized into a heterogeneous framework able to target other systems. This is made possible by the concept of a generic backend, which can target any acceleration API while enabling full interoperability with that API, for example by using existing native libraries to reach maximum performance while simplifying the programming effort. For example, the Open SYCL implementation targets ROCm and CUDA via AMD's cross-vendor HIP.

Versions

SYCL was introduced at GDC in March 2014 with provisional version 1.2, [4] and the final SYCL 1.2 version followed at IWOCL 2015 in May 2015. [5]

The last release in the earlier SYCL 1.2.1 series is SYCL 1.2.1 revision 7, published on April 27, 2020; the first version of that series was published on December 6, 2017. [6]

A provisional SYCL 2.2 was introduced at IWOCL 2016 in May 2016, [7] targeting C++14 and OpenCL 2.2. However, the SYCL committee preferred not to finalize this version and instead moved towards a more flexible SYCL specification that addresses the increasing diversity of hardware accelerators, including artificial intelligence engines. This led to SYCL 2020.

The current series is SYCL 2020. Its first release, revision 2, was published on February 9, 2021, [8] taking into account feedback from users and implementors on the SYCL 2020 Provisional Specification revision 1 published on June 30, 2020. [9] Revision 6 followed on November 13, 2022, and revision 8 on October 19, 2023. [1] Support for C++17 and OpenCL 3.0 are the main targets of this release. Unified shared memory (USM) is a major feature for GPUs with OpenCL and CUDA support.

A roadmap was presented at IWOCL 2021. DPC++, ComputeCpp, Open SYCL, triSYCL and neoSYCL are the main implementations of SYCL. The next target in development is support for C++20 in a future SYCL 202x. [10]

Implementations

Software

Resources

Khronos maintains a list of SYCL resources. [27] Codeplay Software also provides tutorials on the website sycl.tech, along with other information and news on the SYCL ecosystem.

License

The source files for building the specification (such as Makefiles and some scripts), the SYCL headers and the SYCL code samples are under the Apache 2.0 license. [28] Details of the license are at https://www.apache.org/licenses/LICENSE-2.0.html

Comparison with other APIs

The open standards SYCL and OpenCL are similar to the programming models of the proprietary stack CUDA from Nvidia, and HIP from the open-source stack ROCm, supported by AMD.

In the Khronos Group realm, OpenCL and Vulkan are the low-level, non-single-source APIs, while SYCL is the high-level, single-source C++ embedded domain-specific language (eDSL).

CUDA

By comparison, the single-source C++ embedded domain-specific language version of CUDA, which is actually named the "CUDA Runtime API", is somewhat similar to SYCL. There is also a lesser-known non-single-source version of CUDA, the "CUDA Driver API", which is similar to OpenCL and is used, for example, by the implementation of the CUDA Runtime API itself.

SYCL extends the C++ AMP features that relieve the programmer from explicitly transferring data between the host and devices, by using buffers and accessors, in contrast to CUDA (before the introduction of Unified Memory in CUDA 6). Starting with SYCL 2020, it is also possible to use USM instead of buffers and accessors, which gives a lower-level programming model similar to Unified Memory in CUDA.

SYCL is higher-level than C++ AMP and CUDA because the programmer does not need to build an explicit dependency graph between the kernels: SYCL provides automatic asynchronous scheduling of the kernels, with overlap of communication and computation. This is all done through the concept of accessors, without requiring any compiler support.
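
The way declared accesses can stand in for an explicit dependency graph can be pictured with a toy model in plain C++17. This is only a sketch under stated assumptions, not the SYCL API: `TaskGraph`, `Requirement` and `Access` are hypothetical names invented for illustration, and a real SYCL runtime also handles scheduling and data movement. Each task declares which buffers it reads or writes, and an edge is recorded whenever two tasks touch the same buffer and at least one of them writes it.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical access modes, loosely modeled on read/write accessors.
enum class Access { Read, Write };

// One declared access: which buffer (by id) a task touches, and how.
struct Requirement {
    int buffer;
    Access mode;
};

// Toy task graph (an illustration, not a SYCL class): tasks are submitted
// in program order and dependencies are derived from conflicting accesses.
class TaskGraph {
public:
    // Registers a task and returns its index.
    std::size_t submit(std::vector<Requirement> reqs) {
        const std::size_t id = tasks_.size();
        for (std::size_t prev = 0; prev < id; ++prev) {
            if (conflicts(tasks_[prev], reqs)) edges_.emplace(prev, id);
        }
        tasks_.push_back(std::move(reqs));
        return id;
    }

    // True if task `later` must wait for task `earlier`.
    bool depends(std::size_t later, std::size_t earlier) const {
        assert(later < tasks_.size() && earlier < tasks_.size());
        return edges_.count(std::make_pair(earlier, later)) > 0;
    }

private:
    // Two tasks conflict if they share a buffer and at least one writes it
    // (read-after-write, write-after-read or write-after-write).
    static bool conflicts(const std::vector<Requirement>& a,
                          const std::vector<Requirement>& b) {
        for (const Requirement& x : a)
            for (const Requirement& y : b)
                if (x.buffer == y.buffer &&
                    (x.mode == Access::Write || y.mode == Access::Write))
                    return true;
        return false;
    }

    std::vector<std::vector<Requirement>> tasks_;
    std::set<std::pair<std::size_t, std::size_t>> edges_;
};
```

In this model, a task that writes buffer 0 followed by two tasks that only read buffer 0 (and write distinct outputs) yields two edges from the producer, while the two readers stay independent of each other and could in principle run concurrently.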

Unlike C++ AMP and CUDA, SYCL is a pure C++ eDSL without any C++ extension, allowing a basic CPU implementation relying on pure runtime without any specific compiler. This is very useful for debugging an application or for prototyping for a new architecture without having the architecture and compiler available yet.
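
The idea of a CPU implementation that relies on the runtime alone can be pictured with a toy host-only `parallel_for` in standard C++17. This is a minimal sketch under stated assumptions, not the SYCL API: `Range1`, `host_parallel_for` and `vector_add` are hypothetical names, the loop runs sequentially on the host, and the lambda captures by reference for brevity, which a real device kernel could not do. It also shows a template function being reused for several element types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a one-dimensional index range.
struct Range1 {
    std::size_t size;
};

// Toy host "runtime": invokes the kernel once per index, sequentially.
// Any standard C++ compiler can build this; no language extension is used.
template <typename Kernel>
void host_parallel_for(Range1 range, Kernel kernel) {
    for (std::size_t i = 0; i < range.size; ++i) kernel(i);
}

// A generic algorithm written once and reusable for any element type T
// that supports operator+.
template <typename T>
std::vector<T> vector_add(const std::vector<T>& a, const std::vector<T>& b) {
    assert(a.size() == b.size());
    std::vector<T> out(a.size());
    host_parallel_for(Range1{a.size()},
                      [&](std::size_t i) { out[i] = a[i] + b[i]; });
    return out;
}
```

Calling `vector_add<int>` or `vector_add<double>` reuses the same kernel body for different data types, which is the kind of reuse a library-only host implementation makes easy to debug with ordinary tools.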

There are at least 3 known SYCL implementations targeting the CUDA backend.

ROCm HIP

ROCm HIP can be seen as a clone of CUDA that targets Nvidia GPUs, AMD GPUs and x86 CPUs. ROCm HIP is thus a lower-level API than SYCL, and most of the comments made in the comparison with CUDA also apply.

ROCm HIP has some similarities to SYCL in that it can target various vendors (AMD and Nvidia) and accelerator types (GPU and CPU). However, depending on the implementation, SYCL can target any type of accelerator from any vendor, potentially at the same time in a single application, through the concept of backends. SYCL is also pure C++, while HIP uses some extensions inherited from CUDA, which prevents using a normal C++ compiler to target any CPU.

There are at least 2 known implementations of SYCL targeting the HIP backend, oneAPI DPC++ and Open SYCL. The Open SYCL implementation, over HIP, adds SYCL programming to CUDA and HIP.

Other programming models

SYCL has many similarities to the Kokkos programming model, [29] including the use of opaque multi-dimensional array objects (SYCL buffers and Kokkos arrays), multi-dimensional ranges for parallel execution, and reductions (added in SYCL 2020). Numerous features in SYCL 2020 were added in response to feedback from the Kokkos community.

See also

References

  1. "Khronos SYCL Registry - the Khronos Group Inc".
  2. Keryell, Ronan (17 November 2019). "SYCL: A Single-Source C++ Standard for Heterogeneous Computing" (PDF). Khronos.org. Retrieved 26 September 2023.
  3. Keryell, Ronan. "Meaning of SYCL". GitHub. Retrieved 5 February 2021.
  4. Khronos Group (19 March 2014). "Khronos Releases SYCL 1.2 Provisional Specification". Khronos. Retrieved 20 August 2017.
  5. Khronos Group (11 May 2015). "Khronos Releases SYCL 1.2 Final Specification". Khronos. Retrieved 20 August 2017.
  6. Khronos Group (6 December 2017). "The Khronos Group Releases Finalized SYCL 1.2.1". Khronos. Retrieved 12 December 2017.
  7. Khronos Group (18 April 2016). "Khronos Releases OpenCL 2.2 Provisional Specification with OpenCL C++ Kernel Language". Khronos. Retrieved 18 September 2017.
  8. Khronos Group (9 February 2021). "Khronos Releases SYCL 2020 Specification". Khronos. Retrieved 22 February 2021.
  9. Khronos Group (30 June 2020). "Khronos Steps Towards Widespread Deployment of SYCL with Release of SYCL 2020 Provisional Specification". Khronos. Retrieved 4 December 2020.
  10. https://www.iwocl.org/wp-content/uploads/k04-iwocl-syclcon-2021-wong-slides.pdf
  11. https://www.iwocl.org/wp-content/uploads/k01-iwocl-syclcon-2021-reinders-slides.pdf
  12. "Compile Cross-Architecture: Intel® oneAPI DPC++/C++ Compiler".
  13. "Home - ComputeCpp CE - Products - Codeplay Developer".
  14. "Guides - ComputeCpp CE - Products - Codeplay Developer".
  15. "The Future of ComputeCpp". www.codeplay.com. Retrieved 2023-12-09.
  16. "AdaptiveCpp (formerly known as hipSYCL / Open SYCL)". GitHub. 4 July 2023.
  17. "hipSYCL feature support". GitHub. 4 July 2023.
  18. "triSYCL". GitHub. 6 January 2022.
  19. Ke, Yinan; Agung, Mulya; Takizawa, Hiroyuki (2021). "NeoSYCL: A SYCL implementation for SX-Aurora TSUBASA". The International Conference on High Performance Computing in Asia-Pacific Region. pp. 50–57. doi:10.1145/3432261.3432268. ISBN 9781450388429. S2CID 231597238.
  20. Ke, Yinan; Agung, Mulya; Takizawa, Hiroyuki (2021). "NeoSYCL: A SYCL implementation for SX-Aurora TSUBASA". The International Conference on High Performance Computing in Asia-Pacific Region. pp. 50–57. doi:10.1145/3432261.3432268. ISBN 9781450388429. S2CID 231597238.
  21. "Sycl-GTX". GitHub. 10 April 2021.
  22. https://www.iwocl.org/wp-content/uploads/14-iwocl-syclcon-2021-thoman-slides.pdf
  23. "Polygeist". GitHub. 25 February 2022.
  24. "Inteon". 25 February 2022.
  25. https://www.iwocl.org/wp-content/uploads/k03-iwocl-syclcon-2021-trevett-updated.mp4.pdf
  26. https://www.iwocl.org/wp-content/uploads/20-iwocl-syclcon-2021-rudkin-slides.pdf
  27. "SYCL Resources". khronos.org. Khronos group. 20 January 2014.
  28. "SYCL Open Source Specification". GitHub. 10 January 2022.
  29. Hammond, Jeff R.; Kinsner, Michael; Brodman, James (2019). "A comparative analysis of Kokkos and SYCL as heterogeneous, parallel programming models for C++ applications". Proceedings of the International Workshop on OpenCL. pp. 1–2. doi:10.1145/3318170.3318193. ISBN 9781450362306. S2CID 195777149.