Thread-local storage

Last updated

Thread-local storage (TLS) is a computer programming method that uses static or global memory local to a thread.


While the use of global variables is generally discouraged in modern programming, legacy operating systems such as UNIX are designed for uniprocessor hardware and require some additional mechanism to retain the semantics of pre-reentrant APIs. An example of such situations is where functions use a global variable to set an error condition (for example the global variable errno used by many functions of the C library). If errno were a global variable, a call of a system function on one thread may overwrite the value previously set by a call of a system function on a different thread, possibly before following code on that different thread could check for the error condition. The solution is to have errno be a variable that looks like it is global, but in fact exists once per thread—i.e., it lives in thread-local storage. A second use case would be multiple threads accumulating information into a global variable. To avoid a race condition, every access to this global variable would have to be protected by a mutex. Alternatively, each thread might accumulate into a thread-local variable (that, by definition, cannot be read from or written to from other threads, implying that there can be no race conditions). Threads then only have to synchronise a final accumulation from their own thread-local variable into a single, truly global variable.

Many systems impose restrictions on the size of the thread-local memory block, in fact often rather tight limits. On the other hand, if a system can provide at least a memory address (pointer) sized variable thread-local, then this allows the use of arbitrarily sized memory blocks in a thread-local manner, by allocating such a memory block dynamically and storing the memory address of that block in the thread-local variable. On RISC machines, the calling convention often reserves a thread pointer register for this use.

Windows implementation

The application programming interface (API) function TlsAlloc can be used to obtain an unused TLS slot index; the TLS slot index will then be considered 'used'.

The TlsGetValue and TlsSetValue functions are then used to read and write a memory address to a thread-local variable identified by the TLS slot index. TlsSetValue only affects the variable for the current thread. The TlsFree function can be called to release the TLS slot index.

There is a Win32 Thread Information Block for each thread. One of the entries in this block is the thread-local storage table for that thread. [1] TlsAlloc returns an index to this table, unique per address space, for each call. Each thread has its own copy of the thread-local storage table. Hence, each thread can independently use TlsSetValue(index) and obtain the specified value via TlsGetValue(index), because these set and look up an entry in the thread's own table.

Apart from TlsXxx function family, Windows executables can define a section which is mapped to a different page for each thread of the executing process. Unlike TlsXxx values, these pages can contain arbitrary and valid addresses. These addresses, however, are different for each executing thread and therefore should not be passed to asynchronous functions (which may execute in a different thread) or otherwise passed to code which assume that a virtual address is unique within the whole process. TLS sections are managed using memory paging and its size is quantized to a page size (4kB on x86 machines). Such sections may only be defined inside a main executable of a program - DLLs should not contain such sections, because they are not correctly initialized when loading with LoadLibrary.

Pthreads implementation

In the Pthreads API, memory local to a thread is designated with the term Thread-specific data.

The functions pthread_key_create and pthread_key_delete are used respectively to create and delete a key for thread-specific data. The type of the key is explicitly left opaque and is referred to as pthread_key_t. This key can be seen by all threads. In each thread, the key can be associated with thread-specific data via pthread_setspecific. The data can later be retrieved using pthread_getspecific.

In addition pthread_key_create can optionally accept a destructor function that will automatically be called at thread exit, if the thread-specific data is not NULL. The destructor receives the value associated with the key as parameter so it can perform cleanup actions (close connections, free memory, etc.). Even when a destructor is specified, the program must still call pthread_key_delete to free the thread-specific data at process level (the destructor only frees the data local to the thread).

Language-specific implementation

Apart from relying on programmers to call the appropriate API functions, it is also possible to extend the programming language to support thread local storage (TLS).

C and C++

In C11, the keyword _Thread_local is used to define thread-local variables. The header <threads.h>, if supported, defines thread_local as a synonym for that keyword. Example usage:


C++11 introduces the thread_local [2] keyword which can be used in the following cases

Aside from that, various compiler implementations provide specific ways to declare thread-local variables:

On Windows versions before Vista and Server 2008, __declspec(thread) works in DLLs only when those DLLs are bound to the executable, and will not work for those loaded with LoadLibrary() (a protection fault or data corruption may occur). [7]

Common Lisp (and maybe other dialects)

Common Lisp provides a feature called dynamically scoped variables.

Dynamic variables have a binding which is private to the invocation of a function and all of the children called by that function.

This abstraction naturally maps to thread-specific storage, and Lisp implementations that provide threads do this. Common Lisp has numerous standard dynamic variables, and so threads cannot be sensibly added to an implementation of the language without these variables having thread-local semantics in dynamic binding.

For instance the standard variable *print-base* determines the default radix in which integers are printed. If this variable is overridden, then all enclosing code will print integers in an alternate radix:

;;; function foo and its children will print;; in hexadecimal:(let((*print-base*16))(foo))

If functions can execute concurrently on different threads, this binding has to be properly thread-local, otherwise each thread will fight over who controls a global printing radix.


In D version 2, all static and global variables are thread-local by default and are declared with syntax similar to "normal" global and static variables in other languages. Global variables must be explicitly requested using the shared keyword:

intthreadLocal;// This is a thread-local variable.sharedintglobal;// This is a global variable shared with all threads.

The shared keyword works both as the storage class, and as a type qualifiershared variables are subject to some restrictions which statically enforce data integrity. [9] To declare a "classic" global variable without these restrictions, the unsafe __gshared keyword must be used: [10]

__gsharedintglobal;// This is a plain old global variable.


In Java, thread-local variables are implemented by the ThreadLocal class object. ThreadLocal holds variable of type T, which is accessible via get/set methods. For example, ThreadLocal variable holding Integer value looks like this:


At least for Oracle/OpenJDK, this does not use native thread-local storage in spite of OS threads being used for other aspects of Java threading. Instead, each Thread object stores a (non-thread-safe) map of ThreadLocal objects to their values (as opposed to each ThreadLocal having a map of Thread objects to values and incurring a performance overhead). [11]

.NET languages: C# and others

In .NET Framework languages such as C#, static fields can be marked with the ThreadStatic attribute:

classFooBar{    [ThreadStatic]privatestaticint_foo;}

In .NET Framework 4.0 the System.Threading.ThreadLocal<T> class is available for allocating and lazily loading thread-local variables.


Also an API is available for dynamically allocating thread-local variables.

Object Pascal

In Object Pascal (Delphi) or Free Pascal the threadvar reserved keyword can be used instead of 'var' to declare variables using the thread-local storage.



In Cocoa, GNUstep, and OpenStep, each NSThread object has a thread-local dictionary that can be accessed through the thread's threadDictionary method.

NSMutableDictionary*dict=[[NSThreadcurrentThread]threadDictionary];dict[@"A key"]=@"Some data";


In Perl threads were added late in the evolution of the language, after a large body of extant code was already present on the Comprehensive Perl Archive Network (CPAN). Thus, threads in Perl by default take their own local storage for all variables, to minimise the impact of threads on extant non-thread-aware code. In Perl, a thread-shared variable can be created using an attribute:



In PureBasic thread variables are declared with the keyword Threaded.



In Python version 2.4 or later, local class in threading module can be used to create thread-local storage.



Ruby can create/access thread-local variables using []=/[] methods:



Thread-local variables can be created in Rust using the thread_local! macro provided by the Rust standard library:

usestd::cell::RefCell;usestd::thread;thread_local!(staticFOO: RefCell<u32>=RefCell::new(1));FOO.with(|f|{assert_eq!(*f.borrow(),1);*f.borrow_mut()=2;});// each thread starts out with the initial value of 1, even though this thread already changed its copy of the thread local value to 2lett=thread::spawn(move||{FOO.with(|f|{assert_eq!(*f.borrow(),1);*f.borrow_mut()=3;});});// wait for the thread to complete and bail out on panict.join().unwrap();// original thread retains the original value of 2 despite the child thread changing the value to 3 for that threadFOO.with(|f|{assert_eq!(*f.borrow(),2);});

See also

Related Research Articles

Thread safety is a computer programming concept applicable to multi-threaded code. Thread-safe code only manipulates shared data structures in a manner that ensures that all threads behave properly and fulfill their design specifications without unintended interaction. There are various strategies for making thread-safe data structures.

In programming languages, a closure, also lexical closure or function closure, is a technique for implementing lexically scoped name binding in a language with first-class functions. Operationally, a closure is a record storing a function together with an environment. The environment is a mapping associating each free variable of the function with the value or reference to which the name was bound when the closure was created. Unlike a plain function, a closure allows the function to access those captured variables through the closure's copies of their values or references, even when the function is invoked outside their scope.

In computing, a computer program or subroutine is called reentrant if multiple invocations can safely run concurrently on multiple processors, or on a single processor system, where a reentrant procedure can be interrupted in the middle of its execution and then safely be called again ("re-entered") before its previous invocations complete execution. The interruption could be caused by an internal action such as a jump or call, or by an external action such as an interrupt or signal, unlike recursion, where new invocations can only be caused by internal call.

In object-oriented programming (OOP), the object lifetime of an object is the time between an object's creation and its destruction. Rules for object lifetime vary significantly between languages, in some cases between implementations of a given language, and lifetime of a particular object may vary from one run of the program to another.

In computer programming, a global variable is a variable with global scope, meaning that it is visible throughout the program, unless shadowed. The set of all global variables is known as the global environment or global state. In compiled languages, global variables are generally static variables, whose extent (lifetime) is the entire runtime of the program, though in interpreted languages, global variables are generally dynamically allocated when declared, since they are not known ahead of time.

errno.h is a header file in the standard library of the C programming language. It defines macros for reporting and retrieving error conditions using the symbol errno.

The syntax of the C programming language is the set of rules governing writing of software in the C language. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

In computer science, a union is a value that may have any of several representations or formats within the same position in memory; that consists of a variable that may hold such a data structure. Some programming languages support special data types, called union types, to describe such values and variables. In other words, a union type definition will specify which of a number of permitted primitive types may be stored in its instances, e.g., "float or long integer". In contrast with a record, which could be defined to contain both a float and an integer; in a union, there is only one value at any given time.

In computer science and software engineering, busy-waiting, busy-looping or spinning is a technique in which a process repeatedly checks to see if a condition is true, such as whether keyboard input or a lock is available. Spinning can also be used to generate an arbitrary time delay, a technique that was necessary on systems that lacked a method of waiting a specific length of time. Processor speeds vary greatly from computer to computer, especially as some processors are designed to dynamically adjust speed based on current workload. Consequently, spinning as a time-delay technique can produce unpredictable or even inconsistent results on different systems unless code is included to determine the time a processor takes to execute a "do nothing" loop, or the looping code explicitly checks a real-time clock.

In class-based, object-oriented programming, a constructor is a special type of subroutine called to create an object. It prepares the new object for use, often accepting arguments that the constructor uses to set required member variables.

In computer programming, particularly in the C, C++, C#, and Java programming languages, the volatile keyword indicates that a value may change between different accesses, even if it does not appear to be modified. This keyword prevents an optimizing compiler from optimizing away subsequent reads or writes and thus incorrectly reusing a stale value or omitting writes. Volatile values primarily arise in hardware access, where reading from or writing to memory is used to communicate with peripheral devices, and in threading, where a different thread may have modified a value.

<span class="mw-page-title-main">Java syntax</span> Set of rules defining correctly structured programs

The syntax of Java refers to the set of rules defining how a Java program is written and interpreted.

In computer programming, a static variable is a variable that has been allocated "statically", meaning that its lifetime is the entire run of the program. This is in contrast to shorter-lived automatic variables, whose storage is stack allocated and deallocated on the call stack; and in contrast to objects, whose storage is dynamically allocated and deallocated in heap memory.

In some programming languages, const is a type qualifier that indicates that the data is read-only. While this can be used to declare constants, const in the C family of languages differs from similar constructs in other languages in being part of the type, and thus has complicated behavior when combined with pointers, references, composite data types, and type-checking. In other languages, the data is not in a single memory location, but copied at compile time on each use. Languages which utilize it include C, C++, D, JavaScript, Julia, and Rust.

typedef is a reserved keyword in the programming languages C and C++. It is used to create an additional name (alias) for another data type, but does not create a new type, except in the obscure case of a qualified typedef of an array type where the typedef qualifiers are transferred to the array element type. As such, it is often used to simplify the syntax of declaring complex data structures consisting of struct and union types, although it is also commonly used to provide specific descriptive type names for integer data types of varying sizes.

A class in C++ is a user-defined type or data structure declared with keyword class that has data and functions as its members whose access is governed by the three access specifiers private, protected or public. By default access to members of a C++ class is private. The private members are not accessible outside the class; they can be accessed only through methods of the class. The public members form an interface to the class and are accessible outside the class.

C++11 is a version of the ISO/IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by the publication year of the specification, though it was formerly named C++0x because it was expected to be published before 2010.

Protel stands for "Procedure Oriented Type Enforcing Language". It is a programming language created by Nortel Networks and used on telecommunications switching systems such as the DMS-100. Protel-2 is the object-oriented version of Protel.

This article describes the syntax of the C# programming language. The features described are compatible with .NET Framework and Mono.

<span class="mw-page-title-main">TScript</span>

TScript is an object-oriented embeddable scripting language for C++ that supports hierarchical transient typed variables (TVariable). Its main design criterion is to create a scripting language that can interface with C++, transforming data and returning the result. This enables C++ applications to change their functionality after installation.


  1. Pietrek, Matt (May 2006). "Under the Hood". MSDN . Retrieved 6 April 2010.
  2. Section 3.7.2 in C++11 standard
  3. IBM XL C/C++: Thread-local storage
  4. GCC 3.3.1: Thread-Local Storage
  5. Clang 2.0: release notes
  6. Intel C++ Compiler 8.1 (linux) release notes: Thread-local Storage
  7. 1 2 Visual Studio 2003: "Thread Local Storage (TLS)". Microsoft Docs .
  8. Intel C++ Compiler 10.0 (windows): Thread-local storage
  9. Alexandrescu, Andrei (6 July 2010). Chapter 13 - Concurrency. The D Programming Language. InformIT. p. 3. Retrieved 3 January 2014.
  10. Bright, Walter (12 May 2009). "Migrating to Shared". Retrieved 3 January 2014.
  11. "How is Java's ThreadLocal implemented under the hood?". Stack Overflow. Stack Exchange. Retrieved 27 December 2015.