Dynamic loading

Dynamic loading is a mechanism by which a computer program can, at run time, load a library (or other binary) into memory, retrieve the addresses of functions and variables contained in the library, execute those functions or access those variables, and unload the library from memory. It is one of the three mechanisms by which a computer program can use some other software; the other two are static linking and dynamic linking. Unlike static linking and dynamic linking, dynamic loading allows a computer program to start up in the absence of these libraries, to discover available libraries, and to potentially gain additional functionality. [1] [2]

History

Dynamic loading was a common technique for IBM's operating systems for System/360 such as OS/360, particularly for I/O subroutines, and for COBOL and PL/I runtime libraries, and continues to be used in IBM's operating systems for z/Architecture, such as z/OS. As far as the application programmer is concerned, the loading is largely transparent, since it is mostly handled by the operating system (or its I/O subsystem). The main advantages are:

IBM's strategic transaction processing system, CICS (1970s onwards) uses dynamic loading extensively both for its kernel and for normal application program loading. Corrections to application programs could be made offline and new copies of changed programs loaded dynamically without needing to restart CICS [3] [4] (which can, and frequently does, run 24/7).

Shared libraries were added to Unix in the 1980s, but initially without the ability to let a program load additional libraries after startup. [5]

Uses

Dynamic loading is most frequently used in implementing software plugins. [1] For example, the Apache Web Server's *.dso "dynamic shared object" plugin files are libraries which are loaded at runtime with dynamic loading. [6] Dynamic loading is also used in implementing computer programs where multiple different libraries may supply the requisite functionality and where the user has the option to select which library or libraries to provide.
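The case where several libraries may supply the same functionality can be sketched in C with the POSIX dlopen interface described later in this article. This is only a minimal sketch: the helper name open_first_available and any candidate file names passed to it are hypothetical and not part of any real API.

```c
#include <dlfcn.h>   /* dlopen, RTLD_NOW */
#include <stddef.h>  /* size_t, NULL */

/*
 * Hypothetical helper: try each candidate library name in order and return
 * the handle of the first one that loads, or NULL if none could be loaded.
 * If chosen is non-NULL, it receives the index of the candidate that loaded.
 */
void *open_first_available(const char *const *candidates, size_t count,
                           size_t *chosen)
{
    for (size_t i = 0; i < count; i++) {
        void *handle = dlopen(candidates[i], RTLD_NOW);
        if (handle != NULL) {
            if (chosen != NULL)
                *chosen = i;
            return handle;
        }
    }
    return NULL; /* the caller can continue without the optional feature */
}
```

A caller might list a user-configured library first and a default second, and disable the corresponding feature when the result is NULL.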

In C/C++

Not all systems support dynamic loading. UNIX-like operating systems such as macOS, Linux, and Solaris provide dynamic loading with the C programming language "dl" library. The Windows operating system provides dynamic loading through the Windows API.

Summary

Name                     Standard POSIX/UNIX API                                 Microsoft Windows API
Header file inclusion    #include <dlfcn.h>                                      #include <windows.h>
Definitions for header   dl (libdl.so, libdl.dylib, etc. depending on the OS)    kernel32.dll
Loading the library      dlopen                                                  LoadLibrary, LoadLibraryEx
Extracting contents      dlsym                                                   GetProcAddress
Unloading the library    dlclose                                                 FreeLibrary

Loading the library

Loading the library is accomplished with LoadLibrary or LoadLibraryEx on Windows and with dlopen on UNIX-like operating systems. Examples follow:

Most UNIX-like operating systems (Solaris, Linux, *BSD, etc.)

    void *sdl_library = dlopen("libSDL.so", RTLD_LAZY);
    if (sdl_library == NULL) {
        // report error ...
    } else {
        // use the result in a call to dlsym
    }
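The "report error" step is left open above. On POSIX systems, dlerror() returns a human-readable description of the most recent failure, so one possible way to fill it in is sketched below. The helper name load_library and its buffer-based interface are assumptions made for illustration.

```c
#include <dlfcn.h>   /* dlopen, dlerror, RTLD_LAZY */
#include <stddef.h>  /* size_t, NULL */
#include <stdio.h>   /* snprintf */

/*
 * Hypothetical helper: load a library and, on failure, copy the dynamic
 * loader's diagnostic message into errbuf. Returns the handle or NULL.
 */
void *load_library(const char *path, char *errbuf, size_t errlen)
{
    void *handle = dlopen(path, RTLD_LAZY);
    if (handle == NULL && errbuf != NULL && errlen > 0) {
        const char *msg = dlerror();  /* describes the most recent failure */
        snprintf(errbuf, errlen, "%s", msg != NULL ? msg : "unknown error");
    }
    return handle;
}
```

The message is copied immediately because a later call into the dl interface may overwrite or clear the string returned by dlerror().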

macOS

As a UNIX library:

    void *sdl_library = dlopen("libSDL.dylib", RTLD_LAZY);
    if (sdl_library == NULL) {
        // report error ...
    } else {
        // use the result in a call to dlsym
    }

As a macOS Framework:

    void *sdl_library = dlopen("/Library/Frameworks/SDL.framework/SDL", RTLD_LAZY);
    if (sdl_library == NULL) {
        // report error ...
    } else {
        // use the result in a call to dlsym
    }

Or if the framework or bundle contains Objective-C code:

    NSBundle *bundle = [NSBundle bundleWithPath:@"/Library/Plugins/Plugin.bundle"];
    NSError *err = nil;
    if ([bundle loadAndReturnError:&err]) {
        // Use the classes and functions in the bundle.
    } else {
        // Handle error.
    }

Windows

    HMODULE sdl_library = LoadLibrary(TEXT("SDL.dll"));
    if (sdl_library == NULL) {
        // report error ...
    } else {
        // use the result in a call to GetProcAddress
    }

Extracting library contents

Extracting the contents of a dynamically loaded library is achieved with GetProcAddress on Windows and with dlsym on UNIX-like operating systems.

UNIX-like operating systems (Solaris, Linux, *BSD, macOS, etc.)

    void *initializer = dlsym(sdl_library, "SDL_Init");
    if (initializer == NULL) {
        // report error ...
    } else {
        // cast initializer to its proper type and use
    }
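Comparing the result of dlsym() with NULL is sufficient for functions, but a symbol's value can itself legitimately be NULL. The pattern usually recommended for POSIX systems is to clear the error state with dlerror(), call dlsym(), and then check dlerror() again. The sketch below wraps that pattern in a helper whose name, find_symbol, is hypothetical.

```c
#include <dlfcn.h>   /* dlsym, dlerror */
#include <stddef.h>  /* NULL */

/*
 * Hypothetical helper: look up name in the library behind handle.
 * Returns 0 and stores the address in *out on success, -1 on failure
 * (in which case *out is left untouched).
 */
int find_symbol(void *handle, const char *name, void **out)
{
    dlerror();                          /* clear any stale error state */
    void *address = dlsym(handle, name);
    if (dlerror() != NULL)              /* non-NULL means the lookup failed */
        return -1;
    *out = address;
    return 0;
}
```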

On macOS, when using Objective-C bundles, one can also:

    Class rootClass = [bundle principalClass];
    // Alternatively, NSClassFromString() can be used to obtain a class by name.
    if (rootClass) {
        id object = [[rootClass alloc] init];
        // Use the object.
    } else {
        // Report error.
    }

Windows

    FARPROC initializer = GetProcAddress(sdl_library, "SDL_Init");
    if (initializer == NULL) {
        // report error ...
    } else {
        // cast initializer to its proper type and use
    }

Converting a library function pointer

The result of dlsym() or GetProcAddress() has to be converted to a pointer of the appropriate type before it can be used.

Windows

In Windows, the conversion is straightforward, since FARPROC is essentially already a function pointer:

    typedef INT_PTR (*FARPROC)(void);

This can be problematic when the address of an object is to be retrieved rather than a function. However, usually one wants to extract functions anyway, so this is normally not a problem.

    typedef void (*sdl_init_function_type)(void);
    sdl_init_function_type init_func = (sdl_init_function_type)initializer;

UNIX (POSIX)

According to the POSIX specification, the result of dlsym() is a void pointer. However, a function pointer is not required to even have the same size as a data object pointer, and therefore a valid conversion between type void* and a pointer to a function may not be easy to implement on all platforms.

On most systems in use today, function and object pointers are de facto convertible. The following code snippet demonstrates one workaround that performs the conversion on many systems by writing the result through a void** view of the function pointer:

    typedef void (*sdl_init_function_type)(void);
    sdl_init_function_type init_func;
    *(void **)(&init_func) = initializer;

The above snippet will give a warning on some compilers: warning: dereferencing type-punned pointer will break strict-aliasing rules. Another workaround is:

    typedef void (*sdl_init_function_type)(void);
    union {
        sdl_init_function_type func;
        void *obj;
    } alias;
    alias.obj = initializer;
    sdl_init_function_type init_func = alias.func;

which disables the warning even if strict aliasing is in effect. This makes use of the fact that reading from a different union member than the one most recently written to (called "type punning") is common, and explicitly allowed even if strict aliasing is in force, provided the memory is accessed through the union type directly. [7] However, this is not strictly the case here, since the function pointer is copied to be used outside the union. Note that this trick may not work on platforms where data pointers and function pointers differ in size.

Solving the function pointer problem on POSIX systems

The fact remains that any conversion between function and data object pointers has to be regarded as an (inherently non-portable) implementation extension, and that no "correct" way for a direct conversion exists, since in this regard the POSIX and ISO standards contradict each other.

Because of this problem, the POSIX documentation on dlsym() for the outdated issue 6 stated that "a future version may either add a new function to return function pointers, or the current interface may be deprecated in favor of two new functions: one that returns data pointers and the other that returns function pointers". [8]

For the subsequent version of the standard (issue 7, 2008), the problem has been discussed and the conclusion was that function pointers have to be convertible to void* for POSIX compliance. [8] This requires compiler makers to implement a working cast for this case.
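Under that requirement, a plain cast of the dlsym() result is enough. The sketch below illustrates this with strlen, looked up in the symbols of the running process so that it does not depend on a particular library file. It assumes a dynamically linked program on a POSIX system whose C library exports strlen; the helper name dynamic_strlen is hypothetical, and compilers in strictly pedantic ISO C modes may still warn about the cast.

```c
#include <dlfcn.h>   /* dlopen, dlsym, dlclose, RTLD_LAZY */
#include <stddef.h>  /* size_t, NULL */

typedef size_t (*strlen_function_type)(const char *);

/*
 * Sketch: obtain strlen from the symbols visible to the running process
 * and call it through a converted pointer.
 * Returns 0 on success and -1 if the lookup fails.
 */
int dynamic_strlen(const char *text, size_t *length)
{
    void *self = dlopen(NULL, RTLD_LAZY);  /* handle for the running process */
    if (self == NULL)
        return -1;

    void *symbol = dlsym(self, "strlen");
    if (symbol == NULL) {
        dlclose(self);
        return -1;
    }

    /* Direct cast from void * to a function pointer: not guaranteed by
       ISO C, but required to work on POSIX-conforming systems. */
    strlen_function_type length_func = (strlen_function_type)symbol;
    *length = length_func(text);

    dlclose(self);
    return 0;
}
```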

If the contents of the library can be changed (i.e. in the case of a custom library), a pointer to the function can be exported in addition to the function itself. Since a pointer to a function pointer is itself an object pointer, this pointer can always be legally retrieved by a call to dlsym() and subsequent conversion. However, this approach requires maintaining separate pointers to all functions that are to be used externally, and the benefits are usually small.

Unloading the library

Loading a library causes memory to be allocated; the library must be deallocated in order to avoid a memory leak. Additionally, failure to unload a library can prevent filesystem operations on the file which contains the library. Unloading the library is accomplished with FreeLibrary on Windows and with dlclose on UNIX-like operating systems. However, unloading a DLL can lead to program crashes if objects in the main application refer to memory allocated within the DLL. For example, if a DLL introduces a new class and the DLL is closed, further operations on instances of that class from the main application will likely cause a memory access violation. Likewise, if the DLL introduces a factory function for instantiating dynamically loaded classes, calling or dereferencing that function after the DLL is closed leads to undefined behaviour.

UNIX-like operating systems (Solaris, Linux, *BSD, macOS, etc.)

    dlclose(sdl_library);
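One way to reduce the risk of calling into an unloaded library is to keep the handle and every pointer obtained from it in a single record, and to clear them all when the library is closed. The following is a minimal sketch of that idea. The struct plugin type and the plugin_close function are hypothetical names, and the approach does not protect copies of the pointers held elsewhere in the program.

```c
#include <dlfcn.h>   /* dlclose */
#include <stddef.h>  /* size_t, NULL */

typedef size_t (*length_function_type)(const char *);

/* Hypothetical bookkeeping record for one loaded library. */
struct plugin {
    void *handle;                 /* result of dlopen, NULL when unloaded */
    length_function_type length;  /* function obtained from the library */
};

/* Unload the library and clear every pointer that referred into it.
   Returns the result of dlclose (0 on success), or 0 if nothing was loaded. */
int plugin_close(struct plugin *p)
{
    int status = 0;
    if (p->handle != NULL)
        status = dlclose(p->handle);
    p->length = NULL;             /* must not be called after unloading */
    p->handle = NULL;
    return status;
}
```

Callers then test the function pointer against NULL before each use instead of assuming the library is still loaded.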

Windows

    FreeLibrary(sdl_library);

Special library

The implementations of dynamic loading on UNIX-like operating systems and Windows allow programmers to extract symbols from the currently executing process.

UNIX-like operating systems allow programmers to access the global symbol table, which includes both the main executable and subsequently loaded dynamic libraries.

Windows allows programmers to access symbols exported by the main executable. Windows does not use a global symbol table and has no API to search across multiple modules to find a symbol by name.

UNIX-like operating systems (Solaris, Linux, *BSD, macOS, etc.)

    // POSIX requires the mode to include RTLD_LAZY or RTLD_NOW.
    void *this_process = dlopen(NULL, RTLD_LAZY);
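A handle obtained this way can be used to probe whether optional functionality is present in the running process before relying on it. The sketch below assumes a dynamically linked program on a POSIX system. The helper name symbol_is_available is hypothetical, and a symbol whose value is NULL would be reported as unavailable by this simple check.

```c
#include <dlfcn.h>   /* dlopen, dlsym, dlclose, RTLD_LAZY */
#include <stddef.h>  /* NULL */

/*
 * Hypothetical helper: report whether name can be found among the symbols
 * of the main executable and the libraries loaded with it.
 * Returns 1 if the symbol was found and 0 otherwise.
 */
int symbol_is_available(const char *name)
{
    void *self = dlopen(NULL, RTLD_LAZY);
    if (self == NULL)
        return 0;
    void *address = dlsym(self, name);
    dlclose(self);                /* drop the extra reference to the process */
    return address != NULL;
}
```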

Windows

    HMODULE this_process = GetModuleHandle(NULL);
    HMODULE this_process_again;
    GetModuleHandleEx(0, 0, &this_process_again);

In Java

In the Java programming language, classes can be dynamically loaded using the ClassLoader object. For example:

    Class type = ClassLoader.getSystemClassLoader().loadClass(name);
    Object obj = type.newInstance();

The Reflection mechanism also provides a means to load a class if it isn't already loaded. It uses the classloader of the current class:

    Class type = Class.forName(name);
    Object obj = type.newInstance();

However, there is no simple way to unload a class in a controlled way. Loaded classes can only be unloaded in a controlled way, i.e. when the programmer wants this to happen, if the classloader used to load the class is not the system class loader, and is itself unloaded. When doing so, various details need to be observed to ensure the class is really unloaded. This makes unloading of classes tedious.

Implicit unloading of classes, i.e. in an uncontrolled way by the garbage collector, has changed a few times in Java. Until Java 1.2, the garbage collector could unload a class whenever it felt it needed the space, independent of which class loader was used to load the class. Starting with Java 1.2, classes loaded via the system classloader were never unloaded, and classes loaded via other classloaders were unloaded only when that classloader was unloaded. Starting with Java 6, classes can contain an internal marker indicating to the garbage collector that they can be unloaded if the garbage collector desires to do so, independent of the classloader used to load the class. The garbage collector is free to ignore this hint.

Similarly, libraries implementing native methods are dynamically loaded using the System.loadLibrary method. There is no System.unloadLibrary method.

Platforms without dynamic loading

Despite its promulgation in the 1980s through UNIX and Windows, some systems still chose not to add dynamic loading, or even to remove it. For example, Plan 9 from Bell Labs and its successor 9front intentionally avoid dynamic linking, as they consider it to be "harmful". [9] The Go programming language, by some of the same developers as Plan 9, also did not support dynamic linking, but plugin loading has been available since Go 1.8 (February 2017). The Go runtime and any library functions are statically linked into the compiled binary. [10]


References

  1. Autoconf, Automake, and Libtool: Dynamic Loading
  2. "Linux4U: ELF Dynamic Loading". Archived from the original on 2011-03-11. Retrieved 2007-12-31.
  3. "Using the CICS-supplied procedures to install application programs".
  4. "IBM CEMT NEWCOPY or PHASEIN request fails with NOT FOR HOLD PROG - United States". 2013-03-15.
  5. Ho, W. Wilson; Olsson, Ronald A. (1991). "An approach to genuine dynamic linking". Software: Practice and Experience. 21 (4): 375–390. CiteSeerX 10.1.1.37.933. doi:10.1002/spe.4380210404. S2CID 9422227.
  6. "Apache 1.3 Dynamic Shared Object (DSO) Support". Archived from the original on 2011-04-22. Retrieved 2007-12-31.
  7. GCC 4.3.2 Optimize Options: -fstrict-aliasing
  8. POSIX documentation on dlopen() (issues 6 and 7).
  9. "Dynamic Linking". cat-v.org. 9front. Retrieved 2014-12-22.
  10. "Go FAQ".