Dynamic linker

Last updated

In computing, a dynamic linker is the part of an operating system that loads and links the shared libraries needed by an executable when it is executed (at "run time"), by copying the content of libraries from persistent storage to RAM, filling jump tables and relocating pointers. The specific operating system and executable format determine how the dynamic linker functions and how it is implemented.

Contents

Linking is often referred to as a process that is performed when the executable is compiled, while a dynamic linker is a special part of an operating system that loads external shared libraries into a running process and then binds those shared libraries dynamically to the running process. This approach is also called dynamic linking or late linking.

Implementations

Microsoft Windows

Dynamic-link library, or DLL, is Microsoft's implementation of the shared library concept in the Microsoft Windows and OS/2 operating systems. These libraries usually have the file extension DLL, OCX (for libraries containing ActiveX controls), or DRV (for legacy system drivers). The file formats for DLLs are the same as for Windows EXE files  that is, Portable Executable (PE) for 32-bit and 64-bit Windows, and New Executable (NE) for 16-bit Windows. As with EXEs, DLLs can contain code, data, and resources, in any combination.

Data files with the same file format as a DLL, but with different file extensions and possibly containing only resource sections, can be called resource DLLs. Examples of such DLLs include multi-language user interface libraries with extension MUI, icon libraries, sometimes having the extension ICL, and font files, having the extensions FON and FOT. [1]

Unix-like systems using ELF, and Darwin-based systems

In most Unix-like systems, most of the machine code that makes up the dynamic linker is actually an external executable that the operating system kernel loads and executes first in a process address space newly constructed as a result of calling exec or posix_spawn functions. At link time, the path of the dynamic linker that should be used is embedded into the executable image.

When an executable file is loaded, the operating system kernel reads the path of the dynamic linker from it and then attempts to load and execute this other executable binary; if that attempt fails because, for example, there is no file with that path, the attempt to execute the original executable fails. The dynamic linker then loads the initial executable image and all the dynamically-linked libraries on which it depends and starts the executable. As a result, the pathname of the dynamic linker is part of the operating system's application binary interface.

Systems using ELF

In Unix-like systems that use ELF for executable images and dynamic libraries, such as Solaris, 64-bit versions of HP-UX, Linux, FreeBSD, NetBSD, OpenBSD, and DragonFly BSD, the path of the dynamic linker that should be used is embedded at link time into the .interp section of the executable's PT_INTERP segment. In those systems, dynamically loaded shared libraries can be identified by the filename suffix .so (shared object).

The dynamic linker can be influenced into modifying its behavior during either the program's execution or the program's linking, and the examples of this can be seen in the run-time linker manual pages for various Unix-like systems. [2] [3] [4] [5] [6] A typical modification of this behavior is the use of LD_LIBRARY_PATH and LD_PRELOAD environment variables, which adjust the runtime linking process by searching for shared libraries at alternate locations and by forcibly loading and linking libraries that would otherwise not be, respectively. An example is zlibc, [7] also known as uncompress.so, [lower-alpha 1] which facilitates transparent decompression when used through the LD_PRELOAD hack; consequently, it is possible to read pre-compressed (gzipped) file data on BSD and Linux systems as if the files were not compressed, essentially allowing a user to add transparent compression to the underlying filesystem, although with some caveats. The mechanism is flexible, allowing trivial adaptation of the same code to perform additional or alternate processing of data during the file read, prior to the provision of said data to the user process that has requested it. [8] [9]

macOS and iOS

In the Apple Darwin operating system, and in the macOS and iOS operating systems built on top of it, the path of the dynamic linker that should be used is embedded at link time into one of the Mach-O load commands in the executable image. In those systems, dynamically loaded shared libraries can be identified either by the filename suffix .dylib or by their placement inside the bundle for a framework.

The dynamic linker not only links the target executable to the shared libraries but also places machine code functions at specific address points in memory that the target executable knows about at link time. When an executable wishes to interact with the dynamic linker, it simply executes the machine-specific call or jump instruction to one of those well-known address points. The executables on the macOS and iOS platforms often interact with the dynamic linker during the execution of the process; it is even known that an executable might interact with the dynamic linker, causing it to load more libraries and resolve more symbols, hours after it initially launches. The reason that a macOS or iOS program interacts with the dynamic linker so often is due both to Apple's Cocoa and Cocoa Touch APIs and Objective-C, the language in which they are implemented (see their main articles for more information).

The dynamic linker can be coerced into modifying some of its behavior; however, unlike other Unix-like operating systems, these modifications are hints that can be (and sometimes are) ignored by the dynamic linker. Examples of this can be seen in dyld's manual page. [10] A typical modification of this behavior is the use of the DYLD_FRAMEWORK_PATH and DYLD_PRINT_LIBRARIES environment variables. The former of the previously-mentioned variables adjusts the executables' search path for the shared libraries, while the latter displays the names of the libraries as they are loaded and linked.

Apple's macOS dynamic linker is an open-source project released as part of Darwin and can be found in the Apple's open-source dyld project. [11]

XCOFF-based Unix-like systems

In Unix-like operating systems using XCOFF, such as AIX, dynamically-loaded shared libraries use the filename suffix .a.

The dynamic linker can be influenced into modifying its behavior during either the program's execution or the program's linking. A typical modification of this behavior is the use of the LIBPATH environment variable. This variable adjusts the runtime linking process by searching for shared libraries at alternate locations and by forcibly loading and linking libraries that would otherwise not be, respectively.

OS/360 and successors

Dynamic linking from Assembler language programs in IBM OS/360 and its successors is done typically using a LINK macro instruction containing a Supervisor Call instruction that activates the operating system routines that makes the library module to be linked available to the program. Library modules may reside in a "STEPLIB" or "JOBLIB" specified in control cards and only available to a specific execution of the program, in a library included in the LINKLIST in the PARMLIB (specified at system startup time), or in the "link pack area" where specific reentrant modules are loaded at system startup time.

Multics

In the Multics operating system all files, including executables, are segments. A call to a routine not part of the current segment will cause the system to find the referenced segment, in memory or on disk, and add it to the address space of the running process. Dynamic linking is the normal method of operation, and static linking (using the binder) is the exception.

Efficiency

Dynamic linking is generally slower (requires more CPU cycles) than linking during compilation time, [12] as is the case for most processes executed at runtime. However, dynamic linking is often more space-efficient (on disk and in memory at runtime). When a library is linked statically, every process being run is linked with its own copy of the library functions being called upon. Therefore, if a library is called upon many times by different programs, the same functions in that library are duplicated in several places in the system's memory. Using shared, dynamic libraries means that, instead of linking each file to its own copy of a library at compilation time and potentially wasting memory space, only one copy of the library is ever stored in memory at a time, freeing up memory space to be used elsewhere. [13] Additionally, in dynamic linking, a library is only loaded if it is actually being used. [14]

See also

Notes

  1. Not to be confused with the zlib compression library.

Related Research Articles

<span class="mw-page-title-main">Executable and Linkable Format</span> Standard file format for executables, object code, shared libraries, and core dumps.

In computing, the Executable and Linkable Format, is a common standard file format for executable files, object code, shared libraries, and core dumps. First published in the specification for the application binary interface (ABI) of the Unix operating system version named System V Release 4 (SVR4), and later in the Tool Interface Standard, it was quickly accepted among different vendors of Unix systems. In 1999, it was chosen as the standard binary file format for Unix and Unix-like systems on x86 processors by the 86open project.

<span class="mw-page-title-main">Linker (computing)</span> Computer program which combines multiple object files into a single file

In computing, a linker or link editor is a computer system program that takes one or more object files and combines them into a single executable file, library file, or another "object" file.

A monolithic kernel is an operating system architecture where the entire operating system is working in kernel space. The monolithic model differs from other operating system architectures in that it alone defines a high-level virtual interface over computer hardware. A set of primitives or system calls implement all operating system services such as process management, concurrency, and memory management. Device drivers can be added to the kernel as modules.

A shared library or shared object is a computer file that contains executable code designed to be used by multiple computer programs or other libraries at runtime.

<span class="mw-page-title-main">System call</span> Way for programs to access kernel services

In computing, a system call is the programmatic way in which a computer program requests a service from the operating system on which it is executed. This may include hardware-related services, creation and execution of new processes, and communication with integral kernel services such as process scheduling. System calls provide an essential interface between a process and the operating system.

<span class="mw-page-title-main">Library (computing)</span> Collection of resources used to develop a computer program

In computer science, a library is a collection of read-only resources that is leveraged during software development to implement a computer program.

In computing, a loadable kernel module (LKM) is an object file that contains code to extend the running kernel, or so-called base kernel, of an operating system. LKMs are typically used to add support for new hardware and/or filesystems, or for adding system calls. When the functionality provided by an LKM is no longer required, it can be unloaded in order to free memory and other resources.

In computer systems a loader is the part of an operating system that is responsible for loading programs and libraries. It is one of the essential stages in the process of starting a program, as it places programs into memory and prepares them for execution. Loading a program involves either memory-mapping or copying the contents of the executable file containing the program instructions into memory, and then carrying out other required preparatory tasks to prepare the executable for running. Once loading is complete, the operating system starts the program by passing control to the loaded program code.

In computing, position-independent code (PIC) or position-independent executable (PIE) is a body of machine code that, being placed somewhere in the primary memory, executes properly regardless of its absolute address. PIC is commonly used for shared libraries, so that the same library code can be loaded at a location in each program's address space where it does not overlap with other memory in use by, for example, other shared libraries. PIC was also used on older computer systems that lacked an MMU, so that the operating system could keep applications away from each other even within the single address space of an MMU-less system.

Mach-O, short for Mach object file format, is a file format for executables, object code, shared libraries, dynamically loaded code, and core dumps. It was developed to replace the a.out format.

The proc filesystem (procfs) is a special filesystem in Unix-like operating systems that presents information about processes and other system information in a hierarchical file-like structure, providing a more convenient and standardized method for dynamically accessing process data held in the kernel than traditional tracing methods or direct access to kernel memory. Typically, it is mapped to a mount point named /proc at boot time. The proc file system acts as an interface to internal data structures about running processes in the kernel. In Linux, it can also be used to obtain information about the kernel and to change certain kernel parameters at runtime (sysctl).

In computer science, a static library or statically linked library is a set of routines, external functions and variables which are resolved in a caller at compile-time and copied into a target application by a compiler, linker, or binder, producing an object file and a stand-alone executable. This executable and the process of compiling it are both known as a static build of the program. Historically, libraries could only be static. Static libraries are either merged with other static libraries and object files during building/linking to form a single executable or loaded at run-time into the address space of their corresponding executable at a static memory offset determined at compile-time/link-time.

A dynamic-link library (DLL) is a shared library in the Microsoft Windows or OS/2 operating system.

<span class="mw-page-title-main">Binary Modular Dataflow Machine</span>

Binary Modular Dataflow Machine (BMDFM) is a software package that enables running an application in parallel on shared memory symmetric multiprocessing (SMP) computers using the multiple processors to speed up the execution of single applications. BMDFM automatically identifies and exploits parallelism due to the static and mainly dynamic scheduling of the dataflow instruction sequences derived from the formerly sequential program.

In computing, rpath designates the run-time search path hard-coded in an executable file or library. Dynamic linking loaders use the rpath to find required libraries.

In computing, prebinding, also called prelinking, is a method for optimizing application load times by resolving library symbols prior to launch.

In computer programming, DLL injection is a technique used for running code within the address space of another process by forcing it to load a dynamic-link library. DLL injection is often used by external programs to influence the behavior of another program in a way its authors did not anticipate or intend. For example, the injected code could hook system function calls, or read the contents of password textboxes, which cannot be done the usual way. A program used to inject arbitrary code into arbitrary processes is called a DLL injector.

A memory-mapped file is a segment of virtual memory that has been assigned a direct byte-for-byte correlation with some portion of a file or file-like resource. This resource is typically a file that is physically present on disk, but can also be a device, shared memory object, or other resource that an operating system can reference through a file descriptor. Once present, this correlation between the file and the memory space permits applications to treat the mapped portion as if it were primary memory.

Dynamic loading is a mechanism by which a computer program can, at run time, load a library into memory, retrieve the addresses of functions and variables contained in the library, execute those functions or access those variables, and unload the library from memory. It is one of the 3 mechanisms by which a computer program can use some other software; the other two are static linking and dynamic linking. Unlike static linking and dynamic linking, dynamic loading allows a computer program to start up in the absence of these libraries, to discover available libraries, and to potentially gain additional functionality.

ptrace is a system call found in Unix and several Unix-like operating systems. By using ptrace one process can control another, enabling the controller to inspect and manipulate the internal state of its target. ptrace is used by debuggers and other code-analysis tools, mostly as aids to software development.

References

  1. Microsoft Corporation. "Creating a Resource-Only DLL". Microsoft Developer Network Library.
  2. ld.so.1(1) : Solaris dynamic linker/loader   Solaris 11.4 User Commands Reference Manual
  3. ld-linux.so(8)    Linux Programmer's Manual – Administration and Privileged Commands
  4. rtld(1) : FreeBSD dynamic linker/loader   FreeBSD General Commands Manual
  5. ld.elf_so(1) : NetBSD dynamic linker/loader   NetBSD General Commands Manual
  6. ld.so(1) : OpenBSD dynamic linker/loader   OpenBSD General Commands Manual
  7. "ZLIBC - Transparent access to compressed files". Archived from the original on 2000-06-04.
  8. "uncompress.so". delorie.com. Retrieved 2014-07-04.
  9. "zlibc.conf". delorie.com. Retrieved 2014-07-04.
  10. dyld(1) : Darwin/Mac OS X dynamic linker/loader   Darwin and macOS General Commands Manual
  11. Apple Inc. "Open Source - Releases". apple.com. Retrieved 2014-07-04.
  12. Xuxian, Jiang (2009). "Operating Systems Principles: Linking and Loading" (PDF). North Carolina State University . Retrieved 2020-09-24.
  13. Jones, M. (2008-08-28). "Anatomy of Linux dynamic libraries". IBM . Retrieved 2020-09-24.
  14. Sivilotti, Paul (August 2012). "Dynamic Linking and Loading" (PDF). Ohio State University . Retrieved 2020-09-24.

Further reading