Debian build toolchain

A typical input of the Debian build tools: three files constituting the source package (the bottom) and the unpacked source tree with a debian subdirectory added there by the package maintainer.

The Debian build toolchain is a collection of software utilities used to create Debian source packages (.dsc) and Debian binary packages (.deb files) from upstream source tarballs.

These tools are used in the Debian project and also in Debian-based distributions such as Ubuntu.

Overview

Source code for free software is typically distributed in compressed tar archives called tarballs. Debian is a binary-oriented distribution, meaning that its deb packages include precompiled binaries and data files arranged into a file system hierarchy that the software expects. The Debian build toolchain thus needs instructions on how to use the upstream build system to build correct deb packages.

These instructions are stored in the debian subdirectory, which is added to the source tree for the software being packaged by the package maintainer. While it is possible to build the package directly from the modified source tree, it is standard practice to create source packages, which contain the changes the maintainer made to the upstream sources in redistributable form.

Source packages

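A typical Debian source package consists of three files: the original upstream tarball (orig.tar), the debian.tar tarball containing the debian directory with the maintainer's changes, and the dsc file, a text file describing the source package.

For example, a source package named foo with upstream version 1.2.3 and Debian revision 4 can consist of the files foo_1.2.3.orig.tar.gz, foo_1.2.3-4.debian.tar.gz and foo_1.2.3-4.dsc.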
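The naming convention in this example can be sketched in Python. The helper name source_package_files and the gz default are assumptions made for illustration; real packages may use other compression suffixes, such as xz.

```python
def source_package_files(name, upstream, revision, compression="gz"):
    """Return the three file names of a non-native source package.

    Hypothetical helper: the orig tarball carries only the upstream
    version, while debian.tar and dsc also carry the Debian revision.
    """
    return [
        f"{name}_{upstream}.orig.tar.{compression}",
        f"{name}_{upstream}-{revision}.debian.tar.{compression}",
        f"{name}_{upstream}-{revision}.dsc",
    ]


for filename in source_package_files("foo", "1.2.3", "4"):
    print(filename)
```

Because the orig tarball name omits the revision, several Debian revisions of one upstream release can share a single orig tarball.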

A source package is created using the dpkg-buildpackage tool or its wrapper debuild. When invoked to create a source package, dpkg-buildpackage calls the maintainer's rules to clean the source tree of any intermediate files, does various sanity checks, and finally, signs the dsc file with the packager's key using the debsign utility.

The reverse process producing the unpacked source tree from a source package is accomplished using the dpkg-source utility, which extracts the original tarball to a subdirectory, extracts the debian.tar tarball inside it, and applies any quilt patches present. This is the first step that a build system does when building binary packages from a source package.

Older source packages (using Source Format 1) have a .diff.gz file instead of the debian.tar. This is a unified diff that contains the debian directory and any changes to the upstream source that aren't managed by a patch system.

The debian directory

The debian directory contains files used by dpkg-buildpackage to create both binary and source packages. Unlike RPM, which uses a single spec file for instructions, the Debian tools use an entire subdirectory with multiple files. Three files are required at minimum to correctly build a package: changelog, control and rules. A fourth file, copyright, is mandated by the Debian policy, but is a legal requirement rather than a technical one.

By design, all files in the debian directory are text files, most of which are human-readable and edited with a simple text editor.

debian/changelog

This file contains information about all versions of the package since it was created. The build tools only process the top entry, which is used to determine the package version, urgency (which is only of relevance to Debian itself), and bugs in the distribution that this release fixes.

For example, for a package named foo, an example debian/changelog entry can read like this:

foo (1.2.3-1) unstable; urgency=low

  * New upstream release.
  * Dropped 02_manpage_hyphens.dpatch, fixed upstream.
  * Added 04_edit_button_crash.dpatch: fix a crash after pressing the edit button. (Closes: #654321)
  * debian/control: foo should conflict with libbar. (Closes: #987654)

 -- John Doe <jdoe@example.com>  Fri, 30 Nov 2007 15:29:42 +0100
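How the top entry yields the version, urgency and closed bugs can be illustrated with a minimal Python sketch. The function parse_top_entry and its regular expressions are assumptions for illustration only; they are much simpler than the parsing done by the real Debian tools, and the entry is abridged from the example above.

```python
import re

# Simplified header pattern: "package (version) distributions; urgency=value"
HEADER = re.compile(
    r"^(?P<package>\S+) \((?P<version>[^)]+)\) (?P<dists>[^;]+); "
    r"urgency=(?P<urgency>\w+)"
)
BUG = re.compile(r"#(\d+)")

ENTRY = """\
foo (1.2.3-1) unstable; urgency=low

  * New upstream release.
  * Added 04_edit_button_crash.dpatch: fix a crash after pressing the edit
    button. (Closes: #654321)
  * debian/control: foo should conflict with libbar. (Closes: #987654)

 -- John Doe <jdoe@example.com>  Fri, 30 Nov 2007 15:29:42 +0100
"""


def parse_top_entry(changelog_text):
    """Extract header fields and closed bug numbers from the first entry."""
    lines = changelog_text.splitlines()
    header = HEADER.match(lines[0])
    if header is None:
        raise ValueError("not a changelog header: " + lines[0])
    info = header.groupdict()
    info["closes"] = []
    for line in lines[1:]:
        if line.startswith(" -- "):
            break  # the trailer line ends the top entry
        if "Closes:" in line:
            after = line.split("Closes:", 1)[1]
            info["closes"].extend(int(n) for n in BUG.findall(after))
    return info


print(parse_top_entry(ENTRY))
```

The loop stops at the trailer line, so older entries further down the file are never inspected, mirroring the fact that only the top entry matters to the build.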

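Debian provides two main utilities for manipulating the debian/changelog file: dch (debchange), which adds new entries or modifies existing ones, and dpkg-parsechangelog, which parses the most recent entry and prints its fields in Key: value form.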

debian/control

This file contains information about the source package and all binary packages it builds (there can be more than one; for example, the source package libbar can serve as the source for binary packages libbar0, which contains just the shared library, and libbar-dev, which contains a static version of the library and header files).

It lists (among others) such things as the package name, maintainer, target architectures (for binary packages), build dependencies (packages that must be installed for the package to successfully build) and dependencies (packages that must be installed for the package to function properly when installed).
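The layout of debian/control, one stanza for the source package followed by one stanza per binary package, can be sketched as follows. The parser parse_control and the sample contents are hypothetical and deliberately simplified; they are not the parser used by the Debian tools.

```python
SAMPLE_CONTROL = """\
Source: libbar
Maintainer: John Doe <jdoe@example.com>
Build-Depends: debhelper (>= 9)

Package: libbar0
Architecture: any
Description: example shared library
 Hypothetical long description line.

Package: libbar-dev
Architecture: any
Depends: libbar0
Description: example development files
"""


def parse_control(text):
    """Split a control file into stanzas mapping field name to value."""
    stanzas, current, last_field = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line terminates the current stanza.
            if current:
                stanzas.append(current)
            current, last_field = {}, None
        elif line.startswith("#"):
            continue  # comment line
        elif line[0] in " \t" and last_field:
            # Continuation lines start with whitespace.
            current[last_field] += "\n" + line.strip()
        else:
            name, _, value = line.partition(":")
            last_field = name.strip()
            current[last_field] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


source, *binaries = parse_control(SAMPLE_CONTROL)
print(source["Source"], [b["Package"] for b in binaries])
```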

debian/rules

This file is a script that is invoked by dpkg-buildpackage with a single argument that specifies the action to take (clean, build, install, binary). Although any executable script could technically fill this role, Debian Policy requires it to be a makefile.

Apart from invoking the upstream build system, most instructions in debian/rules are highly repetitive and ubiquitous, and thus, virtually all debian/rules files wrap this functionality in debhelper scripts. For example, automatically determining the dependencies based on shared libraries used is a very common action, and thus, instead of including the code necessary to do it, the debian/rules file simply calls dh_shlibdeps. Other examples of debhelper scripts include dh_installdocs, which installs stock documentation files such as debian/copyright into their appropriate locations, or dh_fixperms, which ensures that files in the package have correct access rights (for example, executables in /usr/bin have the "executable" bit set, but are only writable by the superuser).
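To make the kind of repetitive work concrete, here is a rough Python sketch of a permission-normalizing pass over a staging directory. It is an assumption-laden illustration, not a reimplementation of dh_fixperms: the function fix_permissions and its rules (0755 for directories and for files in usr/bin, 0644 elsewhere) are simplified, and file ownership is not touched at all.

```python
import os
import stat
import tempfile


def fix_permissions(staging_dir):
    """Simplified rule set: 0755 for directories and usr/bin files, 0644 otherwise."""
    bin_dir = os.path.join("usr", "bin")
    for root, _dirs, files in os.walk(staging_dir):
        os.chmod(root, 0o755)
        in_bin = os.path.relpath(root, staging_dir) == bin_dir
        for name in files:
            os.chmod(os.path.join(root, name), 0o755 if in_bin else 0o644)


def demo():
    """Build a throwaway tree with wrong modes, fix it, and report the result."""
    paths = ["usr/bin/foo", "usr/share/doc/foo/copyright"]
    with tempfile.TemporaryDirectory() as staging:
        for rel in paths:
            full = os.path.join(staging, rel)
            os.makedirs(os.path.dirname(full), exist_ok=True)
            with open(full, "w") as handle:
                handle.write("placeholder\n")
            os.chmod(full, 0o600)  # deliberately wrong starting mode
        fix_permissions(staging)
        return {
            rel: stat.S_IMODE(os.stat(os.path.join(staging, rel)).st_mode)
            for rel in paths
        }


for path, mode in demo().items():
    print(path, oct(mode))
```

Writing such logic once in a shared helper, rather than in every debian/rules file, is the motivation for debhelper described above.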

Since sequences of debhelper scripts are themselves repetitive, some packages simplify debian/rules further by using dh or CDBS instead of running each debhelper command individually.

Patch systems

Sometimes, a maintainer needs to modify the original source. While, in the past, this was often done simply by editing the files in place and including the changes in the diff.gz, this could make maintenance difficult when new upstream versions were released, because all the changes had to be examined and merged when necessary.

The newer source format, 3.0 (quilt), uses the quilt patch system so that modifications can be broken into logically separate patches, each of which deals with one change and can be sent upstream as is. These patches live in debian/patches.
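quilt records the order in which patches are applied in a series file, which for this source format sits in debian/patches. A minimal sketch of reading such a file is shown below; the function read_series and the patch names are hypothetical, and only whole-line comments and the first token of each line are handled.

```python
SERIES = """\
# patches forwarded upstream
01_fix_typo.patch
04_edit_button_crash.patch -p1

05_build_flags.patch
"""


def read_series(series_text):
    """Return patch file names in application order."""
    patches = []
    for line in series_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # The first token is the patch name; anything after it is an option.
        patches.append(line.split()[0])
    return patches


for patch in read_series(SERIES):
    print("debian/patches/" + patch)
```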

There are also packages using other patch systems, such as dpatch. It generates and executes shell scripts that are non-standard unified diff files with a header, which nevertheless are compatible with the standard diff utility. The debian/rules file is modified to call dpatch apply-all before building the binary package and dpatch deapply-all before building the source package (and cleaning up any build byproducts). quilt and certain other patch systems eliminate the need for special headers and use standard diff files.

Tracking changes in source packages: debdiff and interdiff

Sometimes a user may want to look at the differences between two source packages, for example to generate a proposed patch against the version currently in the repository for inclusion in the distribution's bug tracking system. If both packages use the same upstream version, this can be done using the debdiff tool, which produces differences between two source trees with packaging changes included.

If the upstream tarballs for the two versions are different, such a naive comparison cannot be used. Instead, the interdiff utility can be used to produce a diff between two diff files (in this case, between two diff.gz files). A drawback is that an interdiff output requires more effort to apply, and the one applying the changes must also find and download the newer upstream tarball, which is typically done using the get-orig-source rule in debian/rules. [1]
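Whether two packages share an upstream version can be read off their version strings, which have the form [epoch:]upstream[-revision]. The sketch below splits a version accordingly; split_version and same_upstream are hypothetical helper names, and comparing version strings is only a proxy for comparing the actual tarballs.

```python
def split_version(version):
    """Split a Debian version into (epoch, upstream, revision)."""
    epoch, colon, rest = version.partition(":")
    if not colon:
        epoch, rest = "0", version  # no epoch given
    upstream, dash, revision = rest.rpartition("-")
    if not dash:
        upstream, revision = rest, ""  # native package, no revision
    return epoch, upstream, revision


def same_upstream(old, new):
    """True if both versions name the same upstream release."""
    return split_version(old)[1] == split_version(new)[1]


for old, new in [("1.2.3-3", "1.2.3-4"), ("1.2.3-4", "1.2.4-1")]:
    tool = "debdiff" if same_upstream(old, new) else "interdiff"
    print(old, "->", new, ":", tool)
```

The revision is taken after the last hyphen because the upstream part may itself contain hyphens.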

Sanity checks with lintian

This tool provides automated checks for common packaging mistakes in both binary and source packages, including Debian policy violations and potential compatibility problems.

While a maintainer typically aims to correct all issues pointed out by lintian, different distributions can have different policies regarding them. For example, Ubuntu requires all packages originating in Ubuntu to be clean, but for a package merged into Ubuntu from Debian, there is no such requirement: new changes should simply not introduce any warnings in addition to existing ones. This is done to minimize the divergence between Debian and Ubuntu packages.

Here are example lintian outputs:

Isolated build environments

Source packages are intended to be buildable on any installation of the target distribution version, provided that build dependencies are met. In practice, however, builds can be affected by packages already present in the system.

To verify that a package builds on any system, and to exclude any external factors, tools to create isolated build environments are used. These are pbuilder (Personal Builder) and sbuild.

These tools maintain minimal working systems in chroot, install only the necessary build dependencies listed in debian/control, and remove them when the build is finished. Therefore, using pbuilder, a package maintainer can detect if some build dependencies were not specified in debian/control. Also, pbuilder makes it possible to test-build for distributions other than the one the maintainer is running: for example, for the development version, while actually running the stable version.
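The idea behind detecting unspecified build dependencies can be sketched as a set comparison. This is not how pbuilder works internally (it simply lets the build fail in a minimal chroot); the functions below, and the assumption that the set of packages a build really needs is known, are purely illustrative. Version constraints, architecture qualifiers and alternatives after "|" are ignored.

```python
import re


def build_dependency_names(field):
    """Return package names from a Build-Depends value (first alternative only)."""
    names = []
    for clause in field.split(","):
        first = clause.split("|")[0].strip()
        if first:
            # Cut at whitespace or the start of a version or arch qualifier.
            names.append(re.split(r"[\s(\[<:]", first, maxsplit=1)[0])
    return names


def undeclared(field, needed):
    """Packages the build needs that Build-Depends does not declare."""
    declared = set(build_dependency_names(field))
    return sorted(set(needed) - declared)


BUILD_DEPENDS = "debhelper (>= 9), pkg-config"
NEEDED = {"debhelper", "pkg-config", "libbar-dev"}  # hypothetical

print(undeclared(BUILD_DEPENDS, NEEDED))
```

On a maintainer's everyday system, a package such as the hypothetical libbar-dev may already be installed, which is why the omission only surfaces in a clean environment.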

sbuild is designed for integration with automated build daemons (buildd). It is used by Debian build servers, which automatically build binary packages for every supported architecture. The Launchpad service provides similar build daemons for Ubuntu, both the official distribution and personal package archives (PPAs).

References

  1. "Chapter 4 - Source packages". Debian Policy Manual. Retrieved 1 October 2014.