Rpath

Last updated

In computing, rpath designates the run-time search path hard-coded in an executable file or library. Dynamic linking loaders use the rpath to find required libraries.

Contents

Specifically, it encodes a path to shared libraries into the header of an executable (or another shared library). This RPATH header value (so named in the Executable and Linkable Format header standards) may either override or supplement the system default dynamic linking search paths.

The rpath of an executable or shared library is an optional entry in the .dynamic section of the ELF executable or shared libraries, with the type DT_RPATH, called the DT_RPATH attribute. It can be stored there at link time by the linker. Tools such as chrpath and patchelf can create or modify the entry later.

Use of the DT_RPATH entry by the dynamic linker

The different dynamic linkers for ELF implement the use of the DT_RPATH attribute in different ways.

GNU ld.so

The dynamic linker of the GNU C Library searches for shared libraries in the following locations in order: [1]

  1. The (colon-separated) paths in the DT_RPATH dynamic section attribute of the binary if present and the DT_RUNPATH attribute does not exist.
  2. The (colon-separated) paths in the environment variable LD_LIBRARY_PATH, unless the executable is a setuid/setgid binary, in which case it is ignored. LD_LIBRARY_PATH can be overridden by calling the dynamic linker with the option --library-path (e.g. /lib/ld-linux.so.2 --library-path $HOME/mylibs myprogram).
  3. The (colon-separated) paths in the DT_RUNPATH dynamic section attribute of the binary if present.
  4. Lookup based on the ldconfig cache file (often located at /etc/ld.so.cache) which contains a compiled list of candidate libraries previously found in the augmented library path (set by /etc/ld.so.conf). If, however, the binary was linked with the -z nodefaultlib linker option, libraries in the default library paths are skipped.
  5. In the trusted default path /lib, and then /usr/lib. If the binary was linked with the -z nodefaultlib linker option, this step is skipped.

Failing to find the shared library in all these locations will raise the "cannot open shared object file: No such file or directory" error.

Notes:

The role of GNU ld

The GNU Linker (GNU ld) implements a feature which it calls "new-dtags", which can be used to insert an rpath that has lower precedence than the LD_LIBRARY_PATH environment variable. [2]

If the new-dtags feature is enabled in the linker (--enable-new-dtags), GNU ld, besides setting the DT_RPATH attribute, also sets the DT_RUNPATH attribute to the same string. At run time, if the dynamic linker finds a DT_RUNPATH attribute, it ignores the value of the DT_RPATH attribute, with the effect that LD_LIBRARY_PATH is checked first and the paths in the DT_RUNPATH attribute are only searched afterwards.

The ld dynamic linker does not search DT_RUNPATH locations for transitive dependencies, unlike DT_RPATH. [3]

Instead of specifying the -rpath to the linker, the environment variable LD_RUN_PATH can be set to the same effect.

Solaris ld.so

The dynamic linker of Solaris, specifically /lib/ld.so of SunOS 5.8 and similar systems looks for libraries in the directories specified in the LD_LIBRARY_PATH variable before looking at the DT_RPATH attribute. Sun Microsystems was the first[ citation needed ] to introduce dynamic library loading. Sun later added the rpath option to ld and used it in essential libraries as an added security feature. GNU ld did the same to support Sun-style dynamic libraries.

Example

$ cc-shared-Wl,-soname,termcap.so.4,-rpath,/lib/termcap.so.4-otermcap.so.4  $ objdump-xtermcap.so.4   NEEDED               libc.so.6  SONAME               termcap.so.4  RPATH                /lib/termcap.so.4

In this example, GNU or Sun ld (ld.so) will REFUSE to load termcap for a program needing it unless the file termcap.so is in /lib/ and named termcap.so.4. LD_LIBRARY_PATH is ignored. If /lib/termcap.so.4 is removed to remediate, the shell dies (one cannot load an alternate termcap.so and a rescue disk is needed, but also if a new termcap.so.4 has RPATH /lib, ld.so will refuse to use to load it unless it clobbered /lib/termcap.so.4). But there's another issue: it isn't safe to copy over some libs in /lib as they are "in use," further restricting the would-be lib tester. Furthermore, SONAME termcap.so.4 vs. SONAME termcap.so means programs needing basic termcap.so are denied because the library above deleted the ABI access to basic support.

$ cc-shared-Wl,-soname,libtermcap.so.2-olibtermcap.so.2  $ objdump-xtermcap.so.2   NEEDED               libc.so.6  SONAME               termcap.so.2

Old Linux/Sun used the above, which allows a user to direct any program to use any termcap.so they specify in LD_LIBRARY_PATH, or what is found in /usr/local/lib(n) using the search rules such as ld.so.conf. However, GNU ld always uses /lib or /usr/lib regardless before LD_LIBRARY_PATH, so first /lib/termcap.so is moved to /usr/local/lib and that mentioned in ld.so.conf, which enables use of moving libs and ld.so.conf or use of LD_LIBRARY_PATH to use. A preferred practice is to use "SONAME termcap.so" and have programs check version (all libs do support that) to use features available, but that was often skipped in old releases due to slow computing speed and lack of time to code correctly.

That being said, test this kind of thing thoroughly on a given platform before deciding to rely on it. Release administrators today are not guaranteed to respect past guidelines or documentation. Some UNIX varieties link and load in a completely different way. rpath is specific to ld shipped with a particular distribution.

Lastly, as said, rpath is a security feature however "mandatory access control" (MAC) and other techniques can be as effective or more effective than rpath to control lib reading and writing.

Control over rpath using today's compilers is often nearly impossible given lengthy and convoluted make(1) scripting. Worse, some build scripts ignore --disable-rpath even though they present it as an option. It would be time-consuming and frustrating, and probably unfeasible, to fix build scripting in every odd program to compile.

A simple sh(1) "wrapper" can call the real ld, named ld.bin. The wrapper can filter in/out -rpath option before invoking ld.

#!/bin/sh# - filter ld options here -ld.bin$opts

However, note that some builds incorrectly use rpath instead of rpath-link or LD_LIBRARY_PATH or $(TOP)/dir/foo.so to locate intermediate products that stay in the build directory - thus backwardly demand rpath in the final product, which is a new issue concerning "what is rpath".

Security considerations

The use of rpath and also runpath can present security risks where the value applied includes directories under an attacker's control. This can include cases where the value defined explicitly references an attacker writable location but also instances where a relative path is used, either through the presence of . or .., via $ORIGIN etc or where a directory statement is left unpopulated. In particular, this can allow for setUID binaries to be exploited, where an insecure path is used. This can be leveraged to trick the binary into loading malicious libraries from one or other of the directories under an attacker's control.

Related Research Articles

<span class="mw-page-title-main">Executable and Linkable Format</span> Standard file format for executables, object code, shared libraries, and core dumps.

In computing, the Executable and Linkable Format, is a common standard file format for executable files, object code, shared libraries, and core dumps. First published in the specification for the application binary interface (ABI) of the Unix operating system version named System V Release 4 (SVR4), and later in the Tool Interface Standard, it was quickly accepted among different vendors of Unix systems. In 1999, it was chosen as the standard binary file format for Unix and Unix-like systems on x86 processors by the 86open project.

<span class="mw-page-title-main">Emacs Lisp</span> Dialect of Lisp used as the primary implementation and extension language for Emacs

Emacs Lisp is a dialect of the Lisp programming language used as a scripting language by Emacs. It is used for implementing most of the editing functionality built into Emacs, the remainder being written in C, as is the Lisp interpreter. Emacs Lisp is also termed Elisp, although there are also older, unrelated Lisp dialects with that name.

<span class="mw-page-title-main">Linker (computing)</span> Computer program which combines multiple object files into a single file

In computing, a linker or link editor is a computer system program that takes one or more object files and combines them into a single executable file, library file, or another "object" file.

<span class="mw-page-title-main">Library (computing)</span> Collection of resources used to develop a computer program

In computer science, a library is a collection of read-only resources that is leveraged during software development to implement a computer program.

In computing, a loadable kernel module (LKM) is an object file that contains code to extend the running kernel, or so-called base kernel, of an operating system. LKMs are typically used to add support for new hardware and/or filesystems, or for adding system calls. When the functionality provided by an LKM is no longer required, it can be unloaded in order to free memory and other resources.

Mach-O, short for Mach object file format, is a file format for executables, object code, shared libraries, dynamically loaded code, and core dumps. It was developed to replace the a.out format.

file (command) Standard Unix program

The file command is a standard program of Unix and Unix-like operating systems for recognizing the type of data contained in a computer file.

In computer science, a static library or statically linked library is a set of routines, external functions and variables which are resolved in a caller at compile-time and copied into a target application by a compiler, linker, or binder, producing an object file and a stand-alone executable. This executable and the process of compiling it are both known as a static build of the program. Historically, libraries could only be static. Static libraries are either merged with other static libraries and object files during building/linking to form a single executable or loaded at run-time into the address space of their corresponding executable at a static memory offset determined at compile-time/link-time.

A dynamic-link library (DLL) is a shared library in the Microsoft Windows or OS/2 operating system.

<span class="mw-page-title-main">Portable application</span> Type of computer program

A portable application, sometimes also called standalone, is a program designed to operate without changing other files or requiring other software to be installed. In this way, it can be easily added to, run, and removed from any compatible computer without setup or side-effects.

In computing, a dynamic linker is the part of an operating system that loads and links the shared libraries needed by an executable when it is executed, by copying the content of libraries from persistent storage to RAM, filling jump tables and relocating pointers. The specific operating system and executable format determine how the dynamic linker functions and how it is implemented.

In computing, a shebang is the character sequence consisting of the characters number sign and exclamation mark at the beginning of a script. It is also called sharp-exclamation, sha-bang, hashbang, pound-bang, or hash-pling.

In computer programming, DLL injection is a technique used for running code within the address space of another process by forcing it to load a dynamic-link library. DLL injection is often used by external programs to influence the behavior of another program in a way its authors did not anticipate or intend. For example, the injected code could hook system function calls, or read the contents of password textboxes, which cannot be done the usual way. A program used to inject arbitrary code into arbitrary processes is called a DLL injector.

In Unix and Unix-like operating systems, a soname is a field of data in a shared object file. The soname is a string, which is used as a "logical name" describing the functionality of the object. Typically, that name is equal to the filename of the library, or to a prefix thereof, e.g. libc.so.6.

A weak symbol denotes a specially annotated symbol during linking of Executable and Linkable Format (ELF) object files. By default, without any annotation, a symbol in an object file is strong. During linking, a strong symbol can override a weak symbol of the same name. In contrast, in the presence of two strong symbols by the same name, the linker resolves the symbol in favor of the first one found. This behavior allows an executable to override standard library functions, such as malloc(3). When linking a binary executable, a weakly declared symbol does not need a definition. In comparison, a declared strong symbol without a definition triggers an undefined symbol link error.

The tsort program is a command line utility on Unix and Unix-like platforms, that performs a topological sort on its input. As of 2017, it is part of the POSIX.1 standard.

authbind is an open-source system utility written by Ian Jackson and is distributed under the GNU General Public License. The authbind software allows a program that would normally require superuser privileges to access privileged network services to run as a non-privileged user. authbind allows the system administrator to permit specific users and groups access to bind to TCP and UDP ports below 1024. Ports 0 - 1023 are normally privileged and reserved for programs that are run as the root user. Allowing regular users limited access to privileged ports helps prevent possible privilege escalation and system compromise if the software happens to contain software bugs or is found to be vulnerable to unknown exploits.

gold (linker)

In software engineering, gold is a linker for ELF files. It became an official GNU package and was added to binutils in March 2008 and first released in binutils version 2.19. gold was developed by Ian Lance Taylor and a small team at Google. The motivation for writing gold was to make a linker that is faster than the GNU linker, especially for large applications coded in C++.

ldd is a *nix utility that prints the shared libraries required by each program or shared library specified on the command line. It was developed by Roland McGrath and Ulrich Drepper. If some shared library is missing for any program, that program won't come up.

The Global Offset Table, or GOT, is a section of a computer program's memory used to enable computer program code compiled as an ELF file to run correctly, independent of the memory address where the program's code or data is loaded at runtime.

References

  1. "Linux / Unix Command: ld.so". man7.org. Retrieved 19 February 2018.
  2. "Shared Libraries: distribution and build-system issues". Official website of the Haskell Compiler . Retrieved 4 April 2019.
  3. "Bug #1253638 "dynamic linker does not use DT_RUNPATH for transit... : Bugs : Eglibc package : Ubuntu". 21 November 2013.