Virtual thread

In computer programming, a virtual thread is a thread that is managed by a runtime library or virtual machine (VM) and made to resemble a "real" operating system thread to the code executing on it, while requiring substantially fewer resources than the latter.

Virtual threads allow for tens of millions of preemptive tasks and events on a 2021 consumer-grade computer, [1] compared to the low thousands of operating system threads. [2] Preemptive execution [3] is important for achieving performance gains through parallelism and for fast response times to tens of millions of events.
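
The following minimal Java sketch (JDK 21 or later; the class name and the one-million count are illustrative) starts a large number of virtual threads, a workload for which an equivalent number of operating system threads would exhaust typical system limits:

    import java.util.ArrayList;
    import java.util.List;

    public class ManyVirtualThreads {
        public static void main(String[] args) throws InterruptedException {
            List<Thread> threads = new ArrayList<>();
            // Each virtual thread starts with a small, growable stack, so a
            // very large number of them can coexist in main memory.
            for (int i = 0; i < 1_000_000; i++) {
                threads.add(Thread.ofVirtual().start(() -> {
                    // Trivial task; a server would handle one event here.
                }));
            }
            for (Thread t : threads) {
                t.join();
            }
            System.out.println("Joined " + threads.size() + " virtual threads");
        }
    }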

Earlier constructs that are not, or not always, preemptive, such as coroutines, green threads, and the largely single-threaded Node.js, introduce delays in responding to asynchronous events, such as each incoming request in a server application. [4]

Definition

Virtual threads were commercialized with Google's Chrome browser in 2008, [5] where a virtual thread may hop between physical threads as it executes. Virtual threads are truly virtual: they are created and scheduled in user-space software rather than by the operating system kernel.

Underlying reasons

Even without virtual threads, Java servers have featured extensive and memory-consuming software constructs that allow dozens of pooled operating system threads to preemptively execute thousands of requests per second. The key to performance here is to reduce the initial latency of request processing and to minimize the time operating system threads spend blocked. [7]
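
A minimal sketch of this traditional pattern in Java (the class name, pool size, and handler are illustrative placeholders for application logic):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PooledServer {
        // Placeholder for application-specific request handling; time spent
        // blocked in here ties up one of the pooled OS threads.
        static void handle(int requestId) {
        }

        public static void main(String[] args) {
            // A small, fixed pool of operating system threads is shared by a
            // much larger number of requests.
            ExecutorService pool = Executors.newFixedThreadPool(32);
            for (int i = 0; i < 10_000; i++) {
                final int requestId = i;
                pool.submit(() -> handle(requestId));
            }
            pool.shutdown();
        }
    }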

Virtual threads increase the possible concurrency by many orders of magnitude, while the actual parallelism achieved is limited by the execution units and pipelining offered by the available processors and processor cores. In 2021, consumer-grade computers typically offered a parallelism of tens of concurrent execution units. [8] For increased performance through parallelism, the language runtime needs to use all present hardware [9] and must not be single-threaded or feature global synchronization such as a global interpreter lock.
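
The degree of hardware parallelism available to a Java runtime can be queried as follows (a minimal sketch; the class name is illustrative):

    public class Parallelism {
        public static void main(String[] args) {
            // The number of execution units (hardware threads) available to
            // the runtime; this bounds actual parallelism no matter how many
            // virtual threads are created.
            int units = Runtime.getRuntime().availableProcessors();
            System.out.println("Available execution units: " + units);
        }
    }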

The increase of many orders of magnitude in the possible number of preemptive tasks is achieved by the language runtime managing resizable thread stacks. [10] Those stacks are smaller than those of operating system threads, and the maximum number of virtual threads possible without swapping is proportional to the amount of main memory. [11]

In order to support virtual threads efficiently, the language runtime has to be largely rewritten, both to prevent blocking calls from holding up the operating system thread assigned to execute a virtual thread [12] and to manage thread stacks. [13] An example of a retrofit of virtual threads is Java's Project Loom; [14] an example of a new language designed for virtual threads is Go. [15]
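
The following minimal Java sketch (JDK 21 or later; the class name is illustrative) shows the behavior such a rewrite enables: a blocking call made from a virtual thread parks only the virtual thread, freeing the underlying operating system thread for other work:

    import java.time.Duration;

    public class BlockingCall {
        public static void main(String[] args) throws InterruptedException {
            Thread vt = Thread.ofVirtual().start(() -> {
                try {
                    // Blocking call: the runtime parks the virtual thread and
                    // releases its carrier OS thread instead of blocking it.
                    Thread.sleep(Duration.ofSeconds(1));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            vt.join();
        }
    }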

Complexity

Because virtual threads offer parallelism, the programmer needs to be skilled in multi-threaded programming and synchronization.

Because a blocked virtual thread would block the OS thread it occupies at that moment, the runtime must take much care to handle blocking system calls. Typically, a thread from a pool of spare OS threads executes the blocking call on behalf of the virtual thread, so that the OS thread that was initially executing it is not blocked.

Management of the virtual thread stack requires care in the linker and short-range prediction of additional stack space requirements.

Implementations

Google Chrome Browser

In Chrome, which first appeared in 2008, virtual threads are used to serialize singleton input/output activities. While executing, a virtual thread can hop between different OS threads. Chrome's virtual threads are available to developers extending the browser.

Go

Go's goroutines are a prominent application of virtual threads; they became fully preemptive with Go 1.14 in 2020. [3]

Java

Java introduced virtual threads in 2023 with JDK 21, with the limitation that any code running on a virtual thread that uses synchronized blocks or native calls becomes pinned to its carrier OS thread. [16]
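
A minimal sketch of the JDK 21 API (the class name and counter workload are illustrative; a ReentrantLock is used instead of a synchronized block to avoid pinning virtual threads to their carriers):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.locks.ReentrantLock;

    public class VirtualThreadDemo {
        // Guarding shared state with a ReentrantLock rather than a
        // synchronized block avoids carrier pinning on JDK 21.
        private static final ReentrantLock lock = new ReentrantLock();
        private static long counter = 0;

        public static void main(String[] args) {
            // One virtual thread per submitted task, created on demand.
            try (ExecutorService executor =
                    Executors.newVirtualThreadPerTaskExecutor()) {
                for (int i = 0; i < 100_000; i++) {
                    executor.submit(() -> {
                        lock.lock();
                        try {
                            counter++;
                        } finally {
                            lock.unlock();
                        }
                    });
                }
            } // close() waits for all submitted tasks to finish
            lock.lock();
            try {
                System.out.println("counter = " + counter);
            } finally {
                lock.unlock();
            }
        }
    }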

Other uses of the term

In 2007, Intel [17] used the term virtual threads for an optimization technique specific to the Intel compiler.

See also

Garbage collection (computer science)
Computer multitasking
Java virtual machine
Process (computing)
Thread (computing)
Computing platform
Parallel computing
Coroutine
OpenMP
Finalizer
Runtime system
Concurrent computing
Binary Modular Dataflow Machine
Fiber (computer science)
Cooperative multitasking
Green thread
Thread Level Speculation
Task parallelism
Parallel Extensions
Work stealing

References

  1. Rudell, Harald (2022-03-19). "massivevirtualparallelism".
  2. baeldung (2022-01-02). "Maximum Number of Threads Per Process in Linux | Baeldung on Linux". www.baeldung.com. Retrieved 2022-03-30.
  3. "Go 1.14 Release Notes - The Go Programming Language". go.dev. Retrieved 2022-03-30.
  4. Node.js. "Don't Block the Event Loop (or the Worker Pool)". Node.js. Retrieved 2022-03-30.
  5. "Threading and Tasks in Chrome". chromium.googlesource.com. Retrieved 2022-04-05.
  6. Lu, Genchi (2021-07-22). "Java's Thread Model and Golang Goroutine". Medium. Retrieved 2022-04-05.
  7. "Principles to Handle Thousands of Connections in Java Using Netty - DZone Performance". dzone.com. Retrieved 2022-03-30.
  8. "MacBook Pro 14-inch and MacBook Pro 16-inch". Apple. Retrieved 2022-03-30.
  9. "Frequently Asked Questions (FAQ) - The Go Programming Language". go.dev. Retrieved 2022-03-30.
  10. "JEP draft: Virtual Threads (Preview)". openjdk.java.net. Retrieved 2022-03-30.
  11. Rudell, Harald (2022-03-22). "Maximum number of virtual threads in Go".
  12. Szczukocki, Denis (2020-03-18). "Difference Between Thread and Virtual Thread in Java | Baeldung". www.baeldung.com. Retrieved 2022-03-30.
  13. "Why you can have millions of Goroutines but only thousands of Java Threads". rcoh.me. 2018-04-12. Retrieved 2022-03-30.
  14. "Main - Main - OpenJDK Wiki". wiki.openjdk.java.net. Retrieved 2022-03-30.
  15. "The Go Programming Language". go.dev. 2022-03-22. Retrieved 2022-03-30.
  16. "Virtual Threads". Oracle Help Center. Retrieved 2024-09-10.
  17. "Intel Technology Journal" (PDF).