Abbreviation | SPDX |
---|---|
Status | Published |
First published | August 2011 |
Latest version | 3.0 (April 2024) |
Organization | Linux Foundation |
Committee | SPDX Project |
Domain | Software bill of materials |
License | CC-BY-3.0 |
Website | spdx |
System Package Data Exchange (SPDX, formerly Software Package Data Exchange) is an open standard capable of representing systems with digital components as bills of materials (BOMs). [1] First designed to describe software components, SPDX can now describe the components of software systems, AI models, software builds, security data, and other data packages. SPDX allows the expression of components, licenses, copyrights, security references, and other metadata relating to systems. [2]
The original purpose of SPDX was to improve license compliance, [3] and it has since been expanded to facilitate additional use cases such as supply-chain transparency and security. [4] SPDX is authored by the community-driven SPDX Project involving key industry experts, organizations, and open-source enthusiasts under the auspices of the Linux Foundation.
The SPDX specification is recognized as the international open standard for security, license compliance, and other software supply chain artifacts under the designation ISO/IEC 5962:2021. The current version of the specification is 3.0. [5]
The SPDX 2.x standard defines an SBOM document, which contains SPDX metadata about software. The document itself can be expressed in multiple formats, including JSON, YAML, RDF/XML, tag–value, and spreadsheet. Each SPDX document describes one or more elements, which can be a software package, a specific file, or a snippet from a file. Each element is given a unique identifier, and metadata for an element can refer to other elements. [6]
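As a rough illustration of the document structure described above, the following Python sketch builds a minimal SPDX 2.x-style document as a dictionary and prints it as JSON. It describes one package element and links it to the document through a relationship. The field names follow the SPDX 2.3 JSON serialization as commonly documented, but the output has not been validated against the official schema; the package name, namespace URL, and tool name are hypothetical placeholders.

```python
import json
from datetime import datetime, timezone


def build_minimal_document(package_name: str, namespace: str) -> dict:
    """Return a minimal SPDX 2.x-style document describing one package.

    This is a sketch: real documents usually carry more fields
    (versions, checksums, license information, and so on).
    """
    return {
        "spdxVersion": "SPDX-2.3",
        "dataLicense": "CC0-1.0",
        # Every element, including the document itself, has an identifier.
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": f"{package_name}-sbom",
        "documentNamespace": namespace,
        "creationInfo": {
            "created": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "creators": ["Tool: example-sketch"],  # hypothetical tool name
        },
        "packages": [
            {
                "name": package_name,
                "SPDXID": "SPDXRef-Package",
                "downloadLocation": "NOASSERTION",
            }
        ],
        # Metadata for one element refers to another element by identifier.
        "relationships": [
            {
                "spdxElementId": "SPDXRef-DOCUMENT",
                "relationshipType": "DESCRIBES",
                "relatedSpdxElement": "SPDXRef-Package",
            }
        ],
    }


doc = build_minimal_document(
    "example-lib", "https://example.org/spdxdocs/example-lib-1.0"
)
print(json.dumps(doc, indent=2))
```

The same dictionary could be written out as YAML or another supported format; JSON is used here only because it needs no third-party library.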
SPDX 3.0 allows users to communicate information at a much more granular level without having to package it as “envelope” data. A key design principle in SPDX 3.0 is that every element may be expressed and referenced independently of any other element. This independence is required to support a variety of content exchange and analysis use cases, and it makes single elements of interest easier to communicate. The relationship structure has also been updated to be both more expressive and easier to understand than in older versions of the specification.
The SPDX 3.0 data model is based on the Resource Description Framework (RDF). Data may be serialized in a variety of formats for storage and transmission, including formats defined in RDF 1.1 such as JSON-LD, Turtle (Terse RDF Triple Language), N-Triples, and RDF/XML.
The 3.0 specification introduced profiles to support the expansion of use cases beyond software without increasing overall complexity. Profiles allow users to define data for the use cases they need, while also increasing the amount of information that can be gathered directly from the SPDX data. SPDX 3.0 defines eight profiles.
Version number | Publication date | Notes | References |
---|---|---|---|
3.0 | April 2024 | Introduced a comprehensive set of updates encompassing the model, specification, and license list, with the new addition of SPDX profiles to handle modern system use cases like security and AI. | [7] |
2.3 | November 2022 | Added new fields to improve the ability to capture security related information and interoperability with other SBOM formats. | [8] |
2.2.2 | April 2022 | Functionally equivalent to SPDX 2.2.1 but with spelling, grammar and other editorial improvements. | [9] |
2.2.1 | October 2020 | Functionally equivalent to SPDX 2.2 but with typesetting for publication as an ISO standard. | [10] |
2.2 | May 2020 | Added 'SPDX-lite' profile for minimal software bill of materials and improved support for external references. | [11] |
2.1 | November 2016 | Added support for describing 'snippets' of code and the ability to reference non-SPDX data (such as CVEs). | [12] [13] |
2.0 | May 2015 | Added the ability to describe multiple packages and the relationships between different packages and files. | [14] |
1.2 | October 2013 | Improved interaction with the SPDX License List, and added new fields for documenting extra information about software projects. | [15] |
1.1 | August 2012 | Fixed a flaw in the SPDX Package Verification Code (a cryptographic hash function) and added support for free-form comments. | [16] |
1.0 | August 2011 | The first release of the SPDX specification; handles packages. | [3] |
The first version of the SPDX specification was intended to make compliance with software licenses easier, [3] but subsequent versions added capabilities intended for other use cases, such as the ability to reference known software vulnerabilities. [13] Recent versions of SPDX fulfill the NTIA's 'Minimum Elements For a Software Bill of Materials'. [17]
SPDX 2.2.1 was submitted to the International Organization for Standardization (ISO) in October, 2020, and was published as ISO/IEC 5962:2021 Information technology — SPDX® Specification V2.2.1 in August, 2021. [10] [18]
Each license is identified by a full name, such as "Mozilla Public License 2.0", and a short identifier, here "MPL-2.0". Licenses can be combined with the operators `AND` and `OR`, and grouped with parentheses `(` and `)`.
For example, `(Apache-2.0 OR MIT)` means that one can choose between `Apache-2.0` (Apache License) and `MIT` (MIT license). On the other hand, `(Apache-2.0 AND MIT)` means that both licenses apply.
There is also a `+` operator which, when applied to a license, means that later versions of the license apply as well. For example, `Apache-1.1+` means that `Apache-1.1` and `Apache-2.0` may apply (as well as any future versions).
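To make the semantics of the operators concrete, here is a minimal Python sketch of a parser for the subset of license expressions described above (identifiers, an optional trailing "+", AND, OR, and parentheses). It returns the alternatives a licensee may choose from, where each alternative is the set of licenses that apply together. The function names are hypothetical, the sketch assumes that AND binds more tightly than OR, and it ignores other parts of the expression syntax; it is not a substitute for a full SPDX expression library.

```python
import re

# Parentheses, or an identifier optionally followed by "+".
TOKEN = re.compile(r"\(|\)|[A-Za-z0-9.\-]+\+?")


def tokenize(expression: str) -> list[str]:
    tokens = TOKEN.findall(expression)
    # Reject input containing characters the token pattern skipped over.
    if "".join(tokens) != "".join(expression.split()):
        raise ValueError(f"unsupported characters in {expression!r}")
    return tokens


def parse_choices(expression: str) -> list[frozenset[str]]:
    """Return the alternatives; each one is a set of licenses that all apply."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        # OR offers a choice, so alternatives are concatenated.
        choices = parse_and()
        while peek() == "OR":
            take()
            choices = choices + parse_and()
        return choices

    def parse_and():
        # AND means both sides apply, so alternatives are merged pairwise.
        choices = parse_atom()
        while peek() == "AND":
            take()
            right = parse_atom()
            choices = [a | b for a in choices for b in right]
        return choices

    def parse_atom():
        token = take()
        if token == "(":
            inner = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return inner
        if token in (")", "AND", "OR"):
            raise ValueError(f"unexpected token {token!r}")
        return [frozenset([token])]

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


for expr in ("(Apache-2.0 OR MIT)", "(Apache-2.0 AND MIT)", "Apache-1.1+"):
    print(expr, "->", [sorted(choice) for choice in parse_choices(expr)])
```

An identifier with a trailing "+" is kept as a single token here; deciding which later versions it covers would require knowledge of the license list and is left out of the sketch.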
SPDX describes the exact terms under which a piece of software is licensed. It does not attempt to categorize licenses by type, for instance by describing licenses with similar terms to the BSD License as "BSD-like". [19]
In 2020, the European Commission published its Joinup Licensing Assistant, [20] which allows users to select and compare more than 50 licenses and gives access to each license's SPDX identifier and full text.
The GNU family of licenses (e.g., the GNU General Public License version 2) has a built-in option that lets licensors permit use under later versions of the license. As a result, it was sometimes unclear whether the SPDX expression `GPL-2.0` meant "exactly GPL version 2.0" or "GPL version 2.0 or any later version". [21] For this reason, the GNU family of licenses was given new identifiers in version 3.0 of the SPDX License List. [22] `GPL-2.0-only` means "exactly version 2.0" and `GPL-2.0-or-later` means "version 2.0 or any later version".
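The renaming can be sketched as a small Python helper that rewrites old-style GNU identifiers into the newer form. This is an illustrative assumption rather than an official mapping: it treats a bare old identifier such as GPL-2.0 as "only" and a trailing "+" as "or later", and it recognizes only GPL, LGPL, and AGPL identifiers of the form NAME-major.minor. The function name is hypothetical.

```python
import re

# Old-style GNU identifiers such as "GPL-2.0", "LGPL-2.1+" or "AGPL-3.0".
_GNU_OLD_STYLE = re.compile(r"^((?:A|L)?GPL-\d+\.\d+)(\+?)$")


def modernize_gnu_identifier(identifier: str) -> str:
    """Rewrite an old-style GNU identifier to its -only / -or-later form.

    Identifiers that do not match the old-style pattern are returned as-is.
    """
    match = _GNU_OLD_STYLE.match(identifier)
    if match is None:
        return identifier
    base, plus = match.groups()
    return f"{base}-or-later" if plus else f"{base}-only"


for old in ("GPL-2.0", "GPL-2.0+", "LGPL-2.1", "MIT", "GPL-2.0-only"):
    print(old, "->", modernize_gnu_identifier(old))
```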
The SPDX license identifier can be added to the top of source code files as a short string unambiguously declaring the license used. The `SPDX-License-Identifier` syntax, pioneered by Das U-Boot in 2013, became part of SPDX in version 2.1. In 2017, the FSFE launched REUSE, which provides tools to validate the comment and to efficiently extract copyright information. [23]
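As an illustration of how such a header comment might be read by a tool, the following Python sketch scans the first lines of a source file for the tag and returns the license expression that follows it. The function name, the 20-line limit, and the handling of trailing comment closers are assumptions made for this example; it does not reproduce the behavior of REUSE or of any other existing tool.

```python
import re
from typing import Optional

SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def find_license_identifier(source_text: str, max_lines: int = 20) -> Optional[str]:
    """Return the license expression declared near the top of a file, if any."""
    for line in source_text.splitlines()[:max_lines]:
        match = SPDX_TAG.search(line)
        if match:
            expression = match.group(1).strip()
            # Drop a trailing block-comment closer such as "*/" or "-->".
            for closer in ("*/", "-->"):
                if expression.endswith(closer):
                    expression = expression[: -len(closer)].strip()
            return expression
    return None


c_source = "// SPDX-License-Identifier: GPL-2.0-or-later\nint main(void) { return 0; }\n"
css_source = "/* SPDX-License-Identifier: MIT OR Apache-2.0 */\nbody { margin: 0; }\n"

print(find_license_identifier(c_source))
print(find_license_identifier(css_source))
print(find_license_identifier("int x = 1;\n"))
```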
SPDX license identifiers are also used by a number of package managers and packaging ecosystems, such as npm, [24] Python, [25] and Rust's Cargo. [26] SPDX license expressions are used in RPM package metadata in Fedora Linux, replacing the earlier use of the Callaway system. [27] Debian uses a slightly different license specification. [28]