Software composition analysis (SCA) is a practice in the fields of Information technology and software engineering for analyzing custom-built software applications to detect embedded open-source software and detect if they are up-to-date, contain security flaws, or have licensing requirements. [1]
It is a common software engineering practice to develop software by using different components. [2] Using software components segments the complexity of larger elements into smaller pieces of code and increases flexibility by enabling easier reuse of components to address new requirements. [3] The practice has widely expanded since the late 1990s with the popularization of open-source software (OSS) to help speed up the software development process and reduce time to market. [4]
However, using open-source software introduces many risks for the software applications being developed. These risks can be organized into 5 categories: [5]
Shortly after the foundation of the Open Source Initiative in February 1998, [6] the risks associated with OSS were raised [7] and organizations tried to manage this using spreadsheets and documents to track all the open source components used by their developers. [8]
For organizations using open-source components extensively, there was a need to help automate the analysis and management of open source risk. This resulted in a new category of software products called Software Composition Analysis (SCA) which helps organizations manage open source risk. SCA strives to detect all the 3rd party components in use within a software application to help reduce risks associated with security vulnerabilities, IP licensing requirements, and obsolescence of components being used.
SCA products typically work as follows: [9]
As SCA impacts different functions in organizations, different teams may use the data depending on the organization's corporation size and structure. The IT department will often use SCA for implementing and operationalizing the technology with common stakeholders including the chief information officer (CIO), the Chief Technology Officer (CTO), and the Chief Enterprise Architects (EA). [12] Security and license data are often used by roles such as Chief Information Security Officers (CISO) for security risks, and Chief IP / Compliance officer for Intellectual Property risk management. [13]
Depending on the SCA product capabilities, it can be implemented directly within a developer's Integrated Development Environment (IDE) who uses and integrates OSS components, or it can be implemented as a dedicated step in the software quality control process. [14] [15]
SCA products, and particularly their capacity to generate an SBOM is required in some countries such as the United States to enforce the security of software delivered to one of their agencies by a vendor. [16]
Another common use case for SCA is for Technology Due diligence. Prior to a Merger & Acquisition (M&A) transaction, Advisory firms review the risks associated with the software of the target firm. [17]
The automatic nature of SCA products is their primary strength. Developers don't have to manually do an extra work when using and integrating OSS components. [18] The automation also applies to indirect references to other OSS components within code and artifacts [19]
Conversely, some key weaknesses of current SCA products may include:
Software engineering is an engineering approach to software development. A practitioner, called a software engineer, applies the engineering design process to develop software.
In computer science, static program analysis is the analysis of computer programs performed without executing them, in contrast with dynamic program analysis, which is performed on programs during their execution in the integrated environment.
Open-source software (OSS) is computer software that is released under a license in which the copyright holder grants users the rights to use, study, change, and distribute the software and its source code to anyone and for any purpose. Open-source software may be developed in a collaborative, public manner. Open-source software is a prominent example of open collaboration, meaning any capable user is able to participate online in development, making the number of possible contributors indefinite. The ability to examine the code facilitates public trust in the software.
Code review is a software quality assurance activity in which one or more people check a program, mainly by viewing and reading parts of its source code, either after implementation or as an interruption of implementation. At least one of the persons must not have authored the code. The persons performing the checking, excluding the author, are called "reviewers".
In the context of software engineering, software quality refers to two related but distinct notions:
In programming and software development, fuzzing or fuzz testing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program. The program is then monitored for exceptions such as crashes, failing built-in code assertions, or potential memory leaks. Typically, fuzzers are used to test programs that take structured inputs. This structure is specified, e.g., in a file format or protocol and distinguishes valid from invalid input. An effective fuzzer generates semi-valid inputs that are "valid enough" in that they are not directly rejected by the parser, but do create unexpected behaviors deeper in the program and are "invalid enough" to expose corner cases that have not been properly dealt with.
Architecture description languages (ADLs) are used in several disciplines: system engineering, software engineering, and enterprise modelling and engineering.
Application security includes all tasks that introduce a secure software development life cycle to development teams. Its final goal is to improve security practices and, through that, to find, fix and preferably prevent security issues within applications. It encompasses the whole application life cycle from requirements analysis, design, implementation, verification as well as maintenance.
Software visualization or software visualisation refers to the visualization of information of and related to software systems—either the architecture of its source code or metrics of their runtime behavior—and their development process by means of static, interactive or animated 2-D or 3-D visual representations of their structure, execution, behavior, and evolution.
Software assurance (SwA) is a critical process in software development that ensures the reliability, safety, and security of software products. It involves a variety of activities, including requirements analysis, design reviews, code inspections, testing, and formal verification. One crucial component of software assurance is secure coding practices, which follow industry-accepted standards and best practices, such as those outlined by the Software Engineering Institute (SEI) in their CERT Secure Coding Standards (SCS).
Experimental software engineering involves running experiments on the processes and procedures involved in the creation of software systems, with the intent that the data be used as the basis of theories about the processes involved in software engineering. A number of research groups primarily use empirical and experimental techniques.
Search-based software engineering (SBSE) applies metaheuristic search techniques such as genetic algorithms, simulated annealing and tabu search to software engineering problems. Many activities in software engineering can be stated as optimization problems. Optimization techniques of operations research such as linear programming or dynamic programming are often impractical for large scale software engineering problems because of their computational complexity or their assumptions on the problem structure. Researchers and practitioners use metaheuristic search techniques, which impose little assumptions on the problem structure, to find near-optimal or "good-enough" solutions.
Continuous delivery (CD) is a software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time. It aims at building, testing, and releasing software with greater speed and frequency. The approach helps reduce the cost, time, and risk of delivering changes by allowing for more incremental updates to applications in production. A straightforward and repeatable deployment process is important for continuous delivery.
Software intelligence is insight into the inner workings and structural condition of software assets produced by software designed to analyze database structure, software framework and source code to better understand and control complex software systems in information technology environments. Similarly to business intelligence (BI), software intelligence is produced by a set of software tools and techniques for the mining of data and the software's inner-structure. Results are automatically produced and feed a knowledge base containing technical documentation and blueprints of the innerworking of applications, and make it available to all to be used by business and software stakeholders to make informed decisions, measure the efficiency of software development organizations, communicate about the software health, prevent software catastrophes.
American Fuzzy Lop (AFL), stylized in all lowercase as american fuzzy lop, is a free software fuzzer that employs genetic algorithms in order to efficiently increase code coverage of the test cases. So far it has detected dozens of significant software bugs in major free software projects, including X.Org Server, PHP, OpenSSL, pngcrush, bash, Firefox, BIND, Qt, and SQLite.
EvoSuite is a tool that automatically generates unit tests for Java software. EvoSuite uses an evolutionary algorithm to generate JUnit tests. EvoSuite can be run from the command line, and it also has plugins to integrate it in Maven, IntelliJ and Eclipse. EvoSuite has been used on more than a hundred open-source software and several industrial systems, finding thousands of potential bugs.
Automatic bug-fixing is the automatic repair of software bugs without the intervention of a human programmer. It is also commonly referred to as automatic patch generation, automatic bug repair, or automatic program repair. The typical goal of such techniques is to automatically generate correct patches to eliminate bugs in software programs without causing software regression.
A software bot is a type of software agent in the service of software project management and software engineering. A software bot has an identity and potentially personified aspects in order to serve their stakeholders. Software bots often compose software services and provide an alternative user interface, which is sometimes, but not necessarily conversational.
Static application security testing (SAST) is used to secure software by reviewing the source code of the software to identify sources of vulnerabilities. Although the process of statically analyzing the source code has existed as long as computers have existed, the technique spread to security in the late 90s and the first public discussion of SQL injection in 1998 when Web applications integrated new technologies like JavaScript and Flash.
In computer science, a code property graph (CPG) is a computer program representation that captures syntactic structure, control flow, and data dependencies in a property graph. The concept was originally introduced to identify security vulnerabilities in C and C++ system code, but has since been employed to analyze web applications, cloud deployments, and smart contracts. Beyond vulnerability discovery, code property graphs find applications in code clone detection, attack-surface detection, exploit generation, measuring code testability, and backporting of security patches.