Affinity mask

Last updated

An affinity mask is a bit mask indicating what processor(s) a thread or process should be run on by the scheduler of an operating system. [1] Setting the affinity mask for certain processes running under Windows can be useful as there are several system processes (especially on domain controllers) that are restricted to the first CPU / Core. So, excluding the first CPU might lead to better application performance.

Contents

Windows API

Thread affinity in Microsoft Windows can be specified with the SetThreadAffinityMask function. [1] Forcing of each OpenMP thread to distinctive cores in Windows can be accomplished by means of the following C code:

#include<windows.h>#include<omp.h>// Set OpenMP thread affinityvoidset_thread_affinity(){#pragma omp parallel default(shared){DWORD_PTRmask=(DWORD_PTR)1<<omp_get_thread_num();SetThreadAffinityMask(GetCurrentThread(),mask);}}

See also

Related Research Articles

<span class="mw-page-title-main">Thread (computing)</span> Smallest sequence of programmed instructions that can be managed independently by a scheduler

In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. The implementation of threads and processes differs between operating systems. In Modern Operating Systems, Tanenbaum shows that many distinct models of process organization are possible. In many cases, a thread is a component of a process. The multiple threads of a given process may be executed concurrently, sharing resources such as memory, while different processes do not share these resources. In particular, the threads of a process share its executable code and the values of its dynamically allocated variables and non-thread-local global variables at any given time.

<span class="mw-page-title-main">Hyper-threading</span> Proprietary simultaneous multithreading implementation by Intel

Hyper-threading is Intel's proprietary simultaneous multithreading (SMT) implementation used to improve parallelization of computations performed on x86 microprocessors. It was introduced on Xeon server processors in February 2002 and on Pentium 4 desktop processors in November 2002. Since then, Intel has included this technology in Itanium, Atom, and Core 'i' Series CPUs, among others.

Processor affinity, or CPU pinning or "cache affinity", enables the binding and unbinding of a process or a thread to a central processing unit (CPU) or a range of CPUs, so that the process or thread will execute only on the designated CPU or CPUs rather than any CPU. This can be viewed as a modification of the native central queue scheduling algorithm in a symmetric multiprocessing operating system. Each item in the queue has a tag indicating its kin processor. At the time of resource allocation, each task is allocated to its kin processor in preference to others.

In computing, scheduling is the action of assigning resources to perform tasks. The resources may be processors, network links or expansion cards. The tasks may be threads, processes or data flows.

<span class="mw-page-title-main">F Sharp (programming language)</span> Microsoft programming language

F# is a functional-first, general-purpose, strongly typed, multi-paradigm programming language that encompasses functional, imperative, and object-oriented programming methods. It is most often used as a cross-platform Common Language Infrastructure (CLI) language on .NET, but can also generate JavaScript and graphics processing unit (GPU) code.

<span class="mw-page-title-main">OpenMP</span> Open standard for parallelizing

OpenMP is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating systems, including Solaris, AIX, FreeBSD, HP-UX, Linux, macOS, and Windows. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.

<span class="mw-page-title-main">System Idle Process</span> Kernel process in Windows NT operating systems

In Windows NT operating systems, the System Idle Process contains one or more kernel threads which run when no other runnable thread can be scheduled on a CPU. In a multiprocessor system, there is one idle thread associated with each CPU core. For a system with hyperthreading enabled, there is an idle thread for each logical processor.

In computer programming, the term hooking covers a range of techniques used to alter or augment the behaviour of an operating system, of applications, or of other software components by intercepting function calls or messages or events passed between software components. Code that handles such intercepted function calls, events or messages is called a hook.

<span class="mw-page-title-main">Multi-core processor</span> Microprocessor with more than one processing unit

A multi-core processor is a microprocessor on a single integrated circuit with two or more separate processing units, called cores, each of which reads and executes program instructions. The instructions are ordinary CPU instructions but the single processor can run instructions on separate cores at the same time, increasing overall speed for programs that support multithreading or other parallel computing techniques. Manufacturers typically integrate the cores onto a single integrated circuit die or onto multiple dies in a single chip package. The microprocessors currently used in almost all personal computers are multi-core.

In the x86 architecture, the CPUID instruction is a processor supplementary instruction allowing software to discover details of the processor. It was introduced by Intel in 1993 with the launch of the Pentium and SL-enhanced 486 processors.

The Time Stamp Counter (TSC) is a 64-bit register present on all x86 processors since the Pentium. It counts the number of CPU cycles since its reset. The instruction RDTSC returns the TSC in EDX:EAX. In x86-64 mode, RDTSC also clears the upper 32 bits of RAX and RDX. Its opcode is 0F 31. Pentium competitors such as the Cyrix 6x86 did not always have a TSC and may consider RDTSC an illegal instruction. Cyrix included a Time Stamp Counter in their MII.

In computer programming, DLL injection is a technique used for running code within the address space of another process by forcing it to load a dynamic-link library. DLL injection is often used by external programs to influence the behavior of another program in a way its authors did not anticipate or intend. For example, the injected code could hook system function calls, or read the contents of password textboxes, which cannot be done the usual way. A program used to inject arbitrary code into arbitrary processes is called a DLL injector.

<span class="mw-page-title-main">Task Manager (Windows)</span> Task manager application included with the Windows NT family of operating systems

Task Manager, previously known as Windows Task Manager, is a task manager, system monitor, and startup manager included with Microsoft Windows systems. It provides information about computer performance and running software, including name of running processes, CPU and GPU load, commit charge, I/O details, logged-in users, and Windows services. Task Manager can also be used to set process priorities, processor affinity, start and stop services, and forcibly terminate processes.

<span class="mw-page-title-main">Process Explorer</span> Freeware system monitor for Windows

Process Explorer is a freeware task manager and system monitor for Microsoft Windows created by SysInternals, which has been acquired by Microsoft and re-branded as Windows Sysinternals. It provides the functionality of Windows Task Manager along with a rich set of features for collecting information about processes running on the user's system. It can be used as the first step in debugging software or system problems.

Intel oneAPI DPC++/C++ Compiler and Intel C++ Compiler Classic are Intel’s C, C++, SYCL, and Data Parallel C++ (DPC++) compilers for Intel processor-based systems, available for Windows, Linux, and macOS operating systems.

Multimedia Class Scheduler Service (MMCSS) is a Windows service that allows multimedia applications to get prioritized access to CPU for time-sensitive processing as well as prioritized disc access to ensure that the process is not starved of data to process. The MMCSS service monitors the CPU load and dynamically adjusts priority so that the application can use as much CPU time as possible without denying CPU to lower priority applications. MMCSS uses heuristics to determine the relative priority required for the task the thread is performing and dynamically adjusts priority based on that. A thread must invoke MMCSS explicitly to use its services by calling the AvSetMmMaxThreadCharacteristics or AvSetMmThreadCharacteristics APIs.

In computing on Microsoft platforms, WoW64 is a subsystem of the Windows operating system capable of running 32-bit applications on 64-bit Windows. It is included in all 64-bit versions of Windows—including Windows XP Professional x64 Edition, IA-64 and x64 versions of Windows Server 2003, as well as x64 versions of Windows Vista, Windows Server 2008, Windows 7, Windows 8, Windows Server 2012, Windows 8.1, Windows 10, Windows Server 2016, Windows Server 2019, and Windows 11, as well as ARM64 versions of Windows 10, Windows 11 and Windows Server 2022, except in Windows Server Server Core where it is an optional component, and Windows Nano Server where it is not included. WoW64 aims to take care of many of the differences between 32-bit Windows and 64-bit Windows, particularly involving structural changes to Windows itself.

For several years parallel hardware was only available for distributed computing but recently it is becoming available for the low end computers as well. Hence it has become inevitable for software programmers to start writing parallel applications. It is quite natural for programmers to think sequentially and hence they are less acquainted with writing multi-threaded or parallel processing applications. Parallel programming requires handling various issues such as synchronization and deadlock avoidance. Programmers require added expertise for writing such applications apart from their expertise in the application domain. Hence programmers prefer to write sequential code and most of the popular programming languages support it. This allows them to concentrate more on the application. Therefore, there is a need to convert such sequential applications to parallel applications with the help of automated tools. The need is also non-trivial because large amount of legacy code written over the past few decades needs to be reused and parallelized.

<span class="mw-page-title-main">Shared memory</span> Computer memory that can be accessed by multiple processes

In computer science, shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between programs. Depending on context, programs may run on a single processor or on multiple separate processors.

<span class="mw-page-title-main">Kaby Lake</span> Intel microprocessor, released in 2016

Kaby Lake is Intel's codename for its seventh generation Core microprocessor family announced on August 30, 2016. Like the preceding Skylake, Kaby Lake is produced using a 14 nanometer manufacturing process technology. Breaking with Intel's previous "tick–tock" manufacturing and design model, Kaby Lake represents the optimized step of the newer process–architecture–optimization model. Kaby Lake began shipping to manufacturers and OEMs in the second quarter of 2016, with its desktop chips officially launched in January 2017.

References

  1. 1 2 "SetThreadAffinityMask function (winbase.h) - Win32 apps". learn.microsoft.com. January 27, 2022. Retrieved April 7, 2023.