Library (computing)

Illustration of an application which uses libvorbisfile to play an Ogg Vorbis file

In computing, a library is a collection of resources that is leveraged during software development to implement a computer program. Commonly, a library consists of executable code such as compiled functions and classes, or a library can be a collection of source code. A resource library may contain data such as images and text.

A library can be used by multiple, independent consumers (programs and other libraries). This differs from resources defined in a program, which can usually be used only by that program. When a consumer uses a library resource, it gains the value of the library without having to implement it itself. Libraries encourage software reuse in a modular fashion. Libraries can use other libraries, resulting in a hierarchy of libraries in a program.

When writing code that uses a library, a programmer only needs to know how to use it, not its internal details. For example, a program could use a library that abstracts a complicated system call so that the programmer can use the system feature without spending time to learn the intricacies of the system function.
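As a small illustration of this kind of abstraction, the sketch below uses Python's standard os module, whose getpid function wraps the platform's mechanism for querying the current process identifier. The helper name describe_process is hypothetical and exists only for this example.

```python
import os


def describe_process() -> str:
    """Describe the current process using only the library's public interface."""
    # os.getpid() hides the platform-specific system call; the caller only
    # needs to know that it returns the current process identifier.
    pid = os.getpid()
    return f"running as process {pid}"


print(describe_process())
```

The caller depends only on the documented behavior of os.getpid, so the same code can run on platforms whose underlying system interfaces differ.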

History

The idea of a computer library dates back to the first computers created by Charles Babbage. An 1888 paper on his Analytical Engine suggested that computer operations could be punched on separate cards from numerical input. If these operation punch cards were saved for reuse then "by degrees the engine would have a library of its own." [1]

A woman working next to a filing cabinet containing the subroutine library on reels of punched tape for the EDSAC computer.

In 1947 Goldstine and von Neumann speculated that it would be useful to create a "library" of subroutines for their work on the IAS machine, an early computer that was not yet operational at that time. [2] They envisioned a physical library of magnetic wire recordings, with each wire storing reusable computer code. [3]

Inspired by von Neumann, Wilkes and his team constructed EDSAC. A filing cabinet of punched tape held the subroutine library for this computer. [4] Programs for EDSAC consisted of a main program and a sequence of subroutines copied from the subroutine library. [5] In 1951 the team published the first textbook on programming, The Preparation of Programs for an Electronic Digital Computer , which detailed the creation and the purpose of the library. [6]

COBOL included "primitive capabilities for a library system" in 1959, [7] but Jean Sammet described them as "inadequate library facilities" in retrospect. [8]

JOVIAL has a Communication Pool (COMPOOL), roughly a library of header files.

Another major contributor to the modern library concept came in the form of the subprogram innovation of FORTRAN. FORTRAN subprograms can be compiled independently of each other, but the compiler lacked a linker. Consequently, before modules were introduced in Fortran 90, type checking between FORTRAN [NB 1] subprograms was impossible. [9]

By the mid-1960s, copy and macro libraries for assemblers were common. Starting with the popularity of the IBM System/360, libraries containing other types of text elements, e.g., system parameters, also became common.

In IBM's OS/360 and its successors, such a library is called a partitioned data set.

The first object-oriented programming language, Simula, developed in 1965, supported adding classes to libraries via its compiler. [10] [11]

Linking

The linking (or binding) process resolves references known as symbols (or links) by searching for them in various locations including configured libraries. If a linker (or binder) does not find a symbol, then it fails, but multiple matches may or may not cause failure.
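The resolution step can be sketched as a toy model. This is not how any particular linker is implemented: the function name resolve, the library contents, and the "first match wins" policy for duplicate symbols are all assumptions chosen for illustration.

```python
def resolve(undefined, libraries):
    """Map each undefined symbol to the first library that defines it.

    `libraries` is an ordered list of (name, exported_symbols) pairs,
    standing in for the configured library search order.
    """
    resolved = {}
    for symbol in undefined:
        for lib_name, exports in libraries:
            if symbol in exports:
                resolved[symbol] = lib_name  # assumed policy: first match wins
                break
        else:
            # No library defines the symbol, so linking fails.
            raise LookupError(f"undefined symbol: {symbol}")
    return resolved


# Hypothetical libraries; 'sqrt' is deliberately defined in both.
libraries = [
    ("libm", {"sin", "cos", "sqrt"}),
    ("libc", {"printf", "malloc", "sqrt"}),
]

print(resolve(["printf", "sqrt"], libraries))
```

Here a duplicate definition is tolerated by taking the first match in search order; a stricter model could instead raise an error on multiple matches, which mirrors the "may or may not cause failure" behavior described above.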

Static linking is linking at build time, such that the library executable code is included in the program. Dynamic linking is linking at run time; it involves building the program with information that supports run-time linking to a dynamic link library (DLL). For dynamic linking, a compatible DLL file must be available to the program at run time, but for static linking, the program is standalone.

Smart linking is performed by a build tool that excludes unused code in the linking process. For example, a program that only uses integers for arithmetic, or does no arithmetic operations at all, can exclude floating-point library routines. This can lead to smaller program file size and reduced memory usage.
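One way to picture this is as a reachability computation over a call graph: only routines reachable from the entry point are kept. The sketch below assumes a simplified model in which each routine lists the routines it calls; the names reachable, calls, print_float, and so on are hypothetical.

```python
def reachable(entry, calls):
    """Return the set of routines reachable from `entry` in the call graph."""
    keep, stack = set(), [entry]
    while stack:
        name = stack.pop()
        if name in keep:
            continue
        keep.add(name)
        # Follow every call made by this routine.
        stack.extend(calls.get(name, ()))
    return keep


# Hypothetical program plus library routines: main only prints integers.
calls = {
    "main": ["print_int"],
    "print_int": ["write"],
    "print_float": ["float_to_str", "write"],
    "float_to_str": [],
    "write": [],
}

kept = reachable("main", calls)
dropped = set(calls) - kept
print(sorted(kept), sorted(dropped))
```

In this model the floating-point routines are never reached from main, so they would be left out of the linked program.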

Relocation

Some references in a program or library module are stored in a relative or symbolic form which cannot be resolved until all code and libraries are assigned final static addresses. Relocation is the process of adjusting these references, and is done either by the linker or the loader. In general, relocation cannot be done to individual libraries themselves because the addresses in memory may vary depending on the program using them and other libraries they are combined with. Position-independent code avoids references to absolute addresses and therefore does not require relocation.
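A minimal sketch of relocation follows, assuming a toy module format in which the code is a list of integer words and a relocation table lists the positions of words that hold module-relative addresses. Real object formats are considerably richer; the names relocate, module, and relocs are illustrative only.

```python
def relocate(words, relocation_offsets, base):
    """Return a copy of `words` with each listed word adjusted by `base`."""
    patched = list(words)
    for offset in relocation_offsets:
        # The word at this position holds an address relative to the start
        # of the module; add the address at which the module was placed.
        patched[offset] += base
    return patched


module = [10, 4, 20, 0]  # words 1 and 3 hold module-relative addresses
relocs = [1, 3]          # relocation table: positions to patch

print(relocate(module, relocs, 0x1000))
```

Loading the same module at a different base address only changes the patched words, which is why the relocation table must travel with the module until final addresses are known.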

Categories

Executable

An executable library consists of code that has been converted from source code into machine code or an intermediate form such as bytecode. A linker allows for using library objects by associating each reference with an address at which the object is located. For example, in C, a library function is invoked via C's normal function call syntax and semantics. [12]

A variant is a library containing compiled code (object code in IBM's nomenclature) in a form that cannot be loaded by the OS but that can be read by the linker.

Static

A static library is an executable library that is linked into a program at build time by a linker (or by whichever build tool performs linking). [13] [14] This process, and the resulting stand-alone file, is known as a static build of the program. A static build may not need any further relocation if virtual memory is used and no address space layout randomization is desired. [15]

A static library is sometimes called an archive on Unix-like systems.

Dynamic

A dynamic library is linked when the program is run, either at load time or at runtime. The dynamic library was introduced after the static library to support more flexible software deployment.
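Linking at runtime can be demonstrated with Python's standard ctypes module, which loads a dynamic library into a running process. The sketch below assumes a system where the C math library can be located under the name "m" (typical on Unix-like systems, not on Windows); the helper name load_cos is hypothetical, and the function returns None when the library is unavailable.

```python
import ctypes
import ctypes.util


def load_cos():
    """Return the C math library's cos as a callable, or None if unavailable."""
    path = ctypes.util.find_library("m")  # may be None on some platforms
    if path is None:
        return None
    try:
        libm = ctypes.CDLL(path)  # load the dynamic library at runtime
    except OSError:
        return None
    cos = libm.cos
    # Declare the C signature: double cos(double).
    cos.argtypes = [ctypes.c_double]
    cos.restype = ctypes.c_double
    return cos


cos = load_cos()
if cos is not None:
    print(cos(0.0))
```

Because the library is located and loaded only when load_cos runs, the program can still start on a system where the library is missing and handle that case itself.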

Source

A source library consists of source code rather than compiled code.

Shared

A shared library is a library that contains executable code designed to be used by multiple computer programs or other libraries at runtime, with only one copy of that code in memory, shared by all programs using the code. [16] [17] [18]

Object

Although generally an obsolete technology today, an object library exposes resources for object-oriented programming (OOP) and a distributed object is a remote object library. Examples include: COM/DCOM, SOM/DSOM, DOE, PDO and various CORBA-based systems.

The object library technology was developed because, as OOP became popular, it became apparent that OOP runtime binding required information that contemporary libraries did not provide. In addition to the names and entry points of the code located within, due to inheritance, OOP binding also requires a list of dependencies since the full definition of a method may be in different places. Further, this requires more than listing that one library requires the services of another. In OOP, the libraries themselves may not be known at compile time, and vary from system to system.

The remote object technology was developed in parallel to support multi-tier programs with a user interface application running on a personal computer (PC) using services of a mainframe or minicomputer such as data storage and processing. For instance, a program on a PC would send messages to a minicomputer via remote procedure call (RPC) to retrieve relatively small samples from a relatively large dataset. In response, distributed object technology was developed.

Class

A class library contains classes that can be used to create objects. In Java, for example, classes are contained in JAR files and objects are created at runtime from the classes. However, in Smalltalk, a class library is the starting point for a system image that includes the entire state of the environment, classes and all instantiated objects. Most class libraries are stored in a package repository (such as Maven Central for Java). Client code explicitly specifies dependencies on external libraries in build configuration files (such as a Maven POM in Java).

Remote

A remote library runs on another computer and its assets are accessed via remote procedure call (RPC) over a network. This distributed architecture allows for minimizing installation of the library and support for it on each consuming system and ensuring consistent versioning. A significant downside is that each library call entails significantly more overhead than for a local library.

Runtime

A runtime library, tailored to the host platform, provides a program with access to the runtime environment.

Language standard

Many modern programming languages specify a standard library that provides a base level of functionality for the language environment.

Code generation

A code generation library provides a high-level API for generating or transforming Java bytecode. Such libraries are used by aspect-oriented programming, some data access frameworks, and testing tools to generate dynamic proxy objects. They are also used to intercept field access. [19]

File naming

Unix-like

On most modern Unix-like systems, library files are stored in directories such as /lib, /usr/lib and /usr/local/lib. A filename typically starts with lib, and ends with .a for a static library (archive) or .so for a shared object (dynamically linked library). For example, libfoo.a and libfoo.so.
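The naming convention can be expressed as a small parser. This is a sketch of the convention as described here, not a complete rule set used by any tool; the function name parse_library_filename and the optional numeric version suffix it accepts are assumptions for illustration.

```python
import re

# lib<name>.a or lib<name>.so, optionally followed by a numeric version.
_PATTERN = re.compile(
    r"^lib(?P<name>.+?)\.(?P<kind>a|so)(?:\.(?P<version>[\d.]+))?$"
)


def parse_library_filename(filename):
    """Return (name, kind, version) for a conventional name, else None."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    kind = "static" if match.group("kind") == "a" else "shared"
    return match.group("name"), kind, match.group("version")


for candidate in ("libfoo.a", "libfoo.so", "libfoo.so.2", "foo.dll"):
    print(candidate, parse_library_filename(candidate))
```

Names that do not follow the lib prefix and .a or .so suffix convention, such as foo.dll, are reported as None rather than guessed at.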

Often, symbolic link files are used to manage versioning of a library by providing a link file named without a version that links to a file named with a version. For example, libfoo.so.2 might be version 2 of library foo, and a link file named libfoo.so provides a version-independent name for that file that programs link to. The link file could be changed to refer to version 3 (libfoo.so.3) such that consuming programs will then use version 3 without having to change the program.
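The link-switching scheme can be tried out in a temporary directory. The sketch below assumes a Unix-like system where creating symbolic links is permitted; the files are empty placeholders rather than real libraries, and the helper name point_link is hypothetical.

```python
import os
import tempfile


def point_link(directory, link_name, target_name):
    """Create or replace the symbolic link directory/link_name -> target_name."""
    link_path = os.path.join(directory, link_name)
    tmp_path = link_path + ".tmp"
    os.symlink(target_name, tmp_path)  # target is relative to the directory
    os.replace(tmp_path, link_path)    # swap the new link into place


with tempfile.TemporaryDirectory() as libdir:
    # Placeholder files standing in for two versions of library foo.
    for name in ("libfoo.so.2", "libfoo.so.3"):
        with open(os.path.join(libdir, name), "w") as handle:
            handle.write(name)

    point_link(libdir, "libfoo.so", "libfoo.so.2")
    before = os.readlink(os.path.join(libdir, "libfoo.so"))

    point_link(libdir, "libfoo.so", "libfoo.so.3")
    after = os.readlink(os.path.join(libdir, "libfoo.so"))

print(before, after)
```

Consumers that open libfoo.so never see the versioned names, so repointing the link is enough to move them from version 2 to version 3.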

Files with the extension .la are libtool archives, which are not usable by the system.

macOS

The macOS system inherits static library conventions from BSD, with the library stored in a .a file. It uses either .so or .dylib for dynamic libraries. Most libraries in macOS, however, consist of "frameworks", placed inside special directories called "bundles" which wrap the library's required files and metadata. For example, a framework called Abc would be implemented in a bundle called Abc.framework, with Abc.framework/Abc being either the dynamically linked library file or a symlink to the dynamically linked library file in Abc.framework/Versions/Current/Abc.

Windows

Often, a Windows dynamic-link library (DLL) has the file extension .dll, [20] although sometimes different extensions are used to indicate general content, e.g. .ocx for an OLE library.

A .lib file can be either a static library or contain the information needed to build an application that consumes the associated DLL. In the latter case, the associated DLL file must be present at runtime.

Notes

  1. It was possible earlier between, e.g., Ada subprograms.

References

  1. Babbage, H. P. (1888-09-12). "The Analytical Engine". Proceedings of the British Association. Bath.
  2. Goldstine, Herman H. (2008-12-31). The Computer from Pascal to von Neumann. Princeton: Princeton University Press. doi:10.1515/9781400820139. ISBN 978-1-4008-2013-9.
  3. Goldstine, Herman; von Neumann, John (1947). Planning and coding of problems for an electronic computing instrument (Report). Institute for Advanced Study. pp. 3, 21–22. OCLC 26239859. it will probably be very important to develop an extensive "library" of subroutines
  4. Wilkes, M. V. (1951). "The EDSAC Computer". 1951 International Workshop on Managing Requirements Knowledge. IEEE. p. 79. doi:10.1109/afips.1951.13.
  5. Campbell-Kelly, Martin (September 2011). "In Praise of 'Wilkes, Wheeler, and Gill'". Communications of the ACM. 54 (9): 25–27. doi:10.1145/1995376.1995386. S2CID 20261972.
  6. Wilkes, Maurice; Wheeler, David; Gill, Stanley (1951). The Preparation of Programs for an Electronic Digital Computer. Addison-Wesley. pp. 45, 80–91, 100. OCLC 641145988.
  7. Wexelblat, Richard (1981). History of Programming Languages. ACM Monograph Series. New York, NY: Academic Press (A subsidiary of Harcourt Brace). p. 274. ISBN 0-12-745040-8.
  8. Wexelblat, op. cit., p. 258
  9. Wilson, Leslie B.; Clark, Robert G. (1988). Comparative Programming Languages. Wokingham, England: Addison-Wesley. p. 126. ISBN   0-201-18483-4.
  10. Wilson and Clark, op. cit., p. 52
  11. Wexelblat, op. cit., p. 716
  12. Deshpande, Prasad (2013). Metamorphic Detection Using Function Call Graph Analysis (Thesis). San Jose State University Library. doi:10.31979/etd.t9xm-ahsc.
  13. "Static Libraries". TLDP. Archived from the original on 2013-07-03. Retrieved 2013-10-03.
  14. Kaminsky, Dan (2008). "Chapter 3 - Portable Executable and Executable and Linking Formats". Reverse Engineering Code with IDA Pro. Elsevier. pp. 37–66. doi:10.1016/b978-1-59749-237-9.00003-x. ISBN 978-1-59749-237-9. Retrieved 2021-05-27.
  15. Collberg, Christian; Hartman, John H.; Babu, Sridivya; Udupa, Sharath K. (2003). SLINKY: Static Linking Reloaded. USENIX '05. Department of Computer Science, University of Arizona. Archived from the original on 2016-03-23. Retrieved 2016-03-17.
  16. Levine, John R. (2000). "9. Shared Libraries". Linkers and Loaders. ISBN 1-55860-496-0.
  17. UNIX System V/386 Release 3.2 Programmers Guide, Vol. 1 (PDF). 1989. p. 8-2. ISBN 0-13-944877-2.
  18. "Shared Libraries in SunOS" (PDF). pp. 1, 3.
  19. "Code Generation Library". Source Forge. Archived from the original on 2010-01-12. Retrieved 2010-03-03. Byte Code Generation Library is high level API to generate and transform JAVA byte code. It is used by AOP, testing, data access frameworks to generate dynamic proxy objects and intercept field access.
  20. Bresnahan, Christine; Blum, Richard (2015-04-27). LPIC-1 Linux Professional Institute Certification Study Guide: Exam 101-400 and Exam 102-400. John Wiley & Sons (published 2015). p. 82. ISBN 9781119021186. Archived from the original on 2015-09-24. Retrieved 2015-09-03. Linux shared libraries are similar to the dynamic link libraries (DLLs) of Windows. Windows DLLs are usually identified by .dll filename extensions.
