Global interpreter lock

Last updated September 28, 2024

A global interpreter lock (GIL) is a mechanism used in computer-language interpreters to synchronize the execution of threads so that only one native thread (per process) can execute basic operations (such as memory allocation and reference counting) at a time.^[1] As a general rule, an interpreter that uses GIL will see only one thread to execute at a time, even if it runs on a multi-core processor, although some implementations provide for CPU intensive code to release the GIL, allowing multiple threads to use multiple cores. Some popular interpreters that have GIL are CPython and Ruby MRI.

Technical background concepts

A global interpreter lock (GIL) is a mutual-exclusion lock held by a programming language interpreter thread to avoid sharing code that is not thread-safe with other threads. In implementations with a GIL, there is always one GIL for each interpreter process.

Applications running on implementations with a GIL can be designed to use separate processes to achieve full parallelism, as each process has its own interpreter and in turn has its own GIL. Otherwise, the GIL can be a significant barrier to parallelism.

Advantages

Reasons for employing a global interpreter lock include:

increased speed of single-threaded programs (no necessity to acquire or release locks on all data structures separately),
easy integration of C libraries that usually are not thread-safe,
ease of implementation (having a single GIL is much simpler to implement than a lock-free interpreter or one using fine-grained locks).

A way to get around a GIL is creating a separate interpreter per thread, which is too expensive with most languages.^{[ citation needed ]}

Drawbacks

Use of a global interpreter lock in a language effectively limits the amount of parallelism reachable through concurrency of a single interpreter process with multiple threads. If the process is almost purely made up of interpreted code and does not make calls outside of the interpreter which block for long periods of time (allowing the GIL to be released by that thread while they process), there is likely to be very little increase in speed when running the process on a multiprocessor machine. Due to signaling with a CPU-bound thread, it can cause a significant slowdown, even on single processors.^[2] More seriously, when the single native thread calls a blocking OS process (such as disk access), the entire process is blocked, even though other application threads may be waiting.

Examples

Some language implementations that implement a global interpreter lock are CPython, the most widely-used implementation of Python,^[3]^[4] and Ruby MRI, the reference implementation of Ruby (where it is called Global VM Lock).

JVM-based equivalents of these languages (Jython and JRuby) do not use global interpreter locks. IronPython and IronRuby are implemented on top of Microsoft's Dynamic Language Runtime and also avoid using a GIL.^[5]

An example of an interpreted language without a GIL is Tcl, which is used in the benchmarking tool HammerDB.^[6]

Related Research Articles

Python is a high-level, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation.

Ruby is an interpreted, high-level, general-purpose programming language. It was designed with an emphasis on programming productivity and simplicity. In Ruby, everything is an object, including primitive data types. It was developed in the mid-1990s by Yukihiro "Matz" Matsumoto in Japan.

In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. In many cases, a thread is a component of a process.

Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism. Parallelism has long been employed in high-performance computing, but has gained broader interest due to the physical constraints preventing frequency scaling. As power consumption by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture, mainly in the form of multi-core processors.

In computer science, a lock or mutex is a synchronization primitive that prevents state from being modified or accessed by multiple threads of execution at once. Locks enforce mutual exclusion concurrency control policies, and with a variety of possible methods there exist multiple unique implementations for different applications.

In software engineering, a spinlock is a lock that causes a thread trying to acquire it to simply wait in a loop ("spin") while repeatedly checking whether the lock is available. Since the thread remains active but is not performing a useful task, the use of such a lock is a kind of busy waiting. Once acquired, spinlocks will usually be held until they are explicitly released, although in some implementations they may be automatically released if the thread being waited on blocks or "goes to sleep".

Jython is an implementation of the Python programming language designed to run on the Java platform. It was known as JPython until 1999.

Resource acquisition is initialization (RAII) is a programming idiom used in several object-oriented, statically typed programming languages to describe a particular language behavior. In RAII, holding a resource is a class invariant, and is tied to object lifetime. Resource allocation is done during object creation, by the constructor, while resource deallocation (release) is done during object destruction, by the destructor. In other words, resource acquisition must succeed for initialization to succeed. Thus the resource is guaranteed to be held between when initialization finishes and finalization starts, and to be held only when the object is alive. Thus if there are no object leaks, there are no resource leaks.

IronPython is an implementation of the Python programming language targeting the .NET and Mono frameworks. The project is currently maintained by a group of volunteers at GitHub. It is free and open-source software, and can be implemented with Python Tools for Visual Studio, which is a free and open-source extension for Microsoft's Visual Studio IDE.

JRuby is an implementation of the Ruby programming language atop the Java Virtual Machine, written largely in Java. It is free software released under a three-way EPL/GPL/LGPL license. JRuby is tightly integrated with Java to allow the embedding of the interpreter into any Java application with full two-way access between the Java and the Ruby code.

CPython is the reference implementation of the Python programming language. Written in C and Python, CPython is the default and most widely used implementation of the Python language.

Stackless Python, or Stackless, is a Python programming language interpreter, so named because it avoids depending on the C call stack for its own stack. In practice, Stackless Python uses the C stack, but the stack is cleared between function calls. The most prominent feature of Stackless is microthreads, which avoid much of the overhead associated with usual operating system threads. In addition to Python features, Stackless also adds support for coroutines, communication channels, and task serialization.

<span class="mw-page-title-main">Binary Modular Dataflow Machine</span>

Binary Modular Dataflow Machine (BMDFM) is a software package that enables running an application in parallel on shared memory symmetric multiprocessing (SMP) computers using the multiple processors to speed up the execution of single applications. BMDFM automatically identifies and exploits parallelism due to the static and mainly dynamic scheduling of the dataflow instruction sequences derived from the formerly sequential program.

In computer science and engineering, transactional memory attempts to simplify concurrent programming by allowing a group of load and store instructions to execute in an atomic way. It is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. Transactional memory systems provide high-level abstraction as an alternative to low-level thread synchronization. This abstraction allows for coordination between concurrent reads and writes of shared data in parallel systems.

In computer programming, a green thread is a thread that is scheduled by a runtime library or virtual machine (VM) instead of natively by the underlying operating system (OS). Green threads emulate multithreaded environments without relying on any native OS abilities, and they are managed in user space instead of kernel space, enabling them to work in environments that do not have native thread support.

Task parallelism is a form of parallelization of computer code across multiple processors in parallel computing environments. Task parallelism focuses on distributing tasks—concurrently performed by processes or threads—across different processors. In contrast to data parallelism which involves running the same task on different components of data, task parallelism is distinguished by running many different tasks at the same time on the same data. A common type of task parallelism is pipelining, which consists of moving a single set of data through a series of separate tasks where each task can execute independently of the others.

Matz's Ruby Interpreter or Ruby MRI is an implementation of the Ruby programming language named after Ruby creator Yukihiro Matsumoto ("Matz"). Until the specification of the Ruby language in 2012, the MRI implementation was considered the de facto reference, especially since an independent attempt to create the specification (RubySpec) had failed. Starting with Ruby 1.9, and continuing with Ruby 2.x and above, the official Ruby interpreter has been YARV.

In computer programming, a virtual thread is a thread that is managed by a runtime library or virtual machine (VM) and made to resemble "real" operating system thread to code executing on it, while requiring substantially fewer resources than the latter.

HammerDB is an open source database benchmarking application developed by Steve Shaw. HammerDB supports databases such as Oracle, SQL Server, Db2, MySQL and MariaDB. HammerDB is written in TCL and C, and is licensed under the GPL v3.

References

↑ "GlobalInterpreterLock" . Retrieved 30 November 2015.
↑ David Beazley (2009-06-11). "Inside the Python GIL" (PDF). Chicago: Chicago Python User Group. Retrieved 2009-10-07.
↑ Shannon -jj Behrens (2008-02-03). "Concurrency and Python". Dr. Dobb's Journal. p. 2. Retrieved 2008-07-12. The GIL is a lock that is used to protect all the critical sections in Python. Hence, even if you have multiple CPUs, only one thread may be doing "pythony" things at a time.
↑ "Python/C API Reference Manual: Thread State and the Global Interpreter Lock". Archived from the original on 2008-09-14. Retrieved 2014-08-15.
↑ "IronPython at python.org". python.org. Retrieved 2011-04-04. IronPython has no GIL and multi-threaded code can use multi core processors.
↑ "HammerDB Concepts and Architecture". HammerDB. 2018-11-30. Retrieved 2020-05-10. It is important to understand at the outset that HammerDB is written in TCL because of the unique threading capabilities that TCL brings.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] "GlobalInterpreterLock" . Retrieved 30 November 2015.

[2] David Beazley (2009-06-11). "Inside the Python GIL" (PDF). Chicago: Chicago Python User Group. Retrieved 2009-10-07.

[3] Shannon -jj Behrens (2008-02-03). "Concurrency and Python". Dr. Dobb's Journal. p. 2. Retrieved 2008-07-12. The GIL is a lock that is used to protect all the critical sections in Python. Hence, even if you have multiple CPUs, only one thread may be doing "pythony" things at a time.

[4] "Python/C API Reference Manual: Thread State and the Global Interpreter Lock". Archived from the original on 2008-09-14. Retrieved 2014-08-15.

[5] "IronPython at python.org". python.org. Retrieved 2011-04-04. IronPython has no GIL and multi-threaded code can use multi core processors.

[6] "HammerDB Concepts and Architecture". HammerDB. 2018-11-30. Retrieved 2020-05-10. It is important to understand at the outset that HammerDB is written in TCL because of the unique threading capabilities that TCL brings.

[1]

[2]

[3]

[4]

[5]

[6]