Software composition analysis (SCA) is a practice in the fields of Information technology and software engineering for analyzing custom-built software applications to detect embedded open-source software and detect if they are up-to-date, contain security flaws, or have licensing requirements. [1]
It is a common software engineering practice to develop software by using different components. [2] Using software components segments the complexity of larger elements into smaller pieces of code and increases flexibility by enabling easier reuse of components to address new requirements. [3] The practice has widely expanded since the late 1990s with the popularization of open-source software (OSS) to help speed up the software development process and reduce time to market. [4]
However, using open-source software introduces many risks for the software applications being developed. These risks can be organized into 5 categories: [5]
Shortly after the foundation of the Open Source Initiative in February 1998, [6] the risks associated with OSS were raised [7] and organizations tried to manage this using spreadsheets and documents to track all the open source components used by their developers. [8]
For organizations using open-source components extensively, there was a need to help automate the analysis and management of open source risk. This resulted in a new category of software products called Software Composition Analysis (SCA) which helps organizations manage open source risk. SCA strives to detect all the 3rd party components in use within a software application to help reduce risks associated with security vulnerabilities, IP licensing requirements, and obsolescence of components being used.
SCA products typically work as follows: [9]
As SCA impacts different functions in organizations, different teams may use the data depending on the organization's corporation size and structure. The IT department will often use SCA for implementing and operationalizing the technology with common stakeholders including the chief information officer (CIO), the Chief Technology Officer (CTO), and the Chief Enterprise Architects (EA). [12] Security and license data are often used by roles such as Chief Information Security Officers (CISO) for security risks, and Chief IP / Compliance officer for Intellectual Property risk management. [13]
Depending on the SCA product capabilities, it can be implemented directly within a developer's Integrated Development Environment (IDE) who uses and integrates OSS components, or it can be implemented as a dedicated step in the software quality control process. [14] [15]
SCA products, and particularly their capacity to generate an SBOM is required in some countries such as the United States to enforce the security of software delivered to one of their agencies by a vendor. [16]
Another common use case for SCA is for Technology Due diligence. Prior to a Merger & Acquisition (M&A) transaction, Advisory firms review the risks associated with the software of the target firm. [17]
The automatic nature of SCA products is their primary strength. Developers don't have to manually do an extra work when using and integrating OSS components. [18] The automation also applies to indirect references to other OSS components within code and artifacts. [19]
Conversely, some key weaknesses of current SCA products may include:
Software engineering is a field within computer science focused on designing, developing, testing, and maintaining of software applications. It involves applying engineering principles and computer programming expertise to develop software systems that meet user needs.
In computer science, static program analysis is the analysis of computer programs performed without executing them, in contrast with dynamic program analysis, which is performed on programs during their execution in the integrated environment.
Open-source software (OSS) is computer software that is released under a license in which the copyright holder grants users the rights to use, study, change, and distribute the software and its source code to anyone and for any purpose. Open-source software may be developed in a collaborative, public manner. Open-source software is a prominent example of open collaboration, meaning any capable user is able to participate online in development, making the number of possible contributors indefinite. The ability to examine the code facilitates public trust in the software.
Code review is a software quality assurance activity in which one or more people examine the source code of a computer program, either after implementation or during the development process. The persons performing the checking, excluding the author, are called "reviewers". At least one reviewer must not be the code's author.
Secure by design, in software engineering, means that software products and capabilities have been designed to be foundationally secure.
In programming and software development, fuzzing or fuzz testing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program. The program is then monitored for exceptions such as crashes, failing built-in code assertions, or potential memory leaks. Typically, fuzzers are used to test programs that take structured inputs. This structure is specified, such as in a file format or protocol and distinguishes valid from invalid input. An effective fuzzer generates semi-valid inputs that are "valid enough" in that they are not directly rejected by the parser, but do create unexpected behaviors deeper in the program and are "invalid enough" to expose corner cases that have not been properly dealt with.
Architecture description languages (ADLs) are used in several disciplines: system engineering, software engineering, and enterprise modelling and engineering.
Software visualization or software visualisation refers to the visualization of information of and related to software systems—either the architecture of its source code or metrics of their runtime behavior—and their development process by means of static, interactive or animated 2-D or 3-D visual representations of their structure, execution, behavior, and evolution.
Software assurance (SwA) is a critical process in software development that ensures the reliability, safety, and security of software products. It involves a variety of activities, including requirements analysis, design reviews, code inspections, testing, and formal verification. One crucial component of software assurance is secure coding practices, which follow industry-accepted standards and best practices, such as those outlined by the Software Engineering Institute (SEI) in their CERT Secure Coding Standards (SCS).
Experimental software engineering involves running experiments on the processes and procedures involved in the creation of software systems, with the intent that the data be used as the basis of theories about the processes involved in software engineering. A number of research groups primarily use empirical and experimental techniques.
Security testing is a process intended to detect flaws in the security mechanisms of an information system and as such help enable it to protect data and maintain functionality as intended. Due to the logical limitations of security testing, passing the security testing process is not an indication that no flaws exist or that the system adequately satisfies the security requirements.
End-user development (EUD) or end-user programming (EUP) refers to activities and tools that allow end-users – people who are not professional software developers – to program computers. People who are not professional developers can use EUD tools to create or modify software artifacts and complex data objects without significant knowledge of a programming language. In 2005 it was estimated that by 2012 there would be more than 55 million end-user developers in the United States, compared with fewer than 3 million professional programmers. Various EUD approaches exist, and it is an active research topic within the field of computer science and human-computer interaction. Examples include natural language programming, spreadsheets, scripting languages, visual programming, trigger-action programming and programming by example.
Continuous delivery (CD) is a software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time. It aims at building, testing, and releasing software with greater speed and frequency. The approach helps reduce the cost, time, and risk of delivering changes by allowing for more incremental updates to applications in production. A straightforward and repeatable deployment process is important for continuous delivery.
Software intelligence is insight into the inner workings and structural condition of software assets produced by software designed to analyze database structure, software framework and source code to better understand and control complex software systems in information technology environments. Similarly to business intelligence (BI), software intelligence is produced by a set of software tools and techniques for the mining of data and the software's inner-structure. Results are automatically produced and feed a knowledge base containing technical documentation and blueprints of the innerworking of applications, and make it available to all to be used by business and software stakeholders to make informed decisions, measure the efficiency of software development organizations, communicate about the software health, prevent software catastrophes.
American Fuzzy Lop (AFL), stylized in all lowercase as american fuzzy lop, is a free software fuzzer that employs genetic algorithms in order to efficiently increase code coverage of the test cases. So far it has detected dozens of significant software bugs in major free software projects, including X.Org Server, PHP, OpenSSL, pngcrush, bash, Firefox, BIND, Qt, and SQLite.
EvoSuite is a tool that automatically generates unit tests for Java software. EvoSuite uses an evolutionary algorithm to generate JUnit tests. EvoSuite can be run from the command line, and it also has plugins to integrate it in Maven, IntelliJ and Eclipse. EvoSuite has been used on more than a hundred open-source software and several industrial systems, finding thousands of potential bugs.
Automatic bug-fixing is the automatic repair of software bugs without the intervention of a human programmer. It is also commonly referred to as automatic patch generation, automatic bug repair, or automatic program repair. The typical goal of such techniques is to automatically generate correct patches to eliminate bugs in software programs without causing software regression.
A software bot is a type of software agent in the service of software project management and software engineering. A software bot has an identity and potentially personified aspects in order to serve their stakeholders. Software bots often compose software services and provide an alternative user interface, which is sometimes, but not necessarily conversational.
Static application security testing (SAST) is used to secure software by reviewing the source code of the software to identify sources of vulnerabilities. Although the process of statically analyzing the source code has existed as long as computers have existed, the technique spread to security in the late 90s and the first public discussion of SQL injection in 1998 when Web applications integrated new technologies like JavaScript and Flash.
In computer science, a code property graph (CPG) is a computer program representation that captures syntactic structure, control flow, and data dependencies in a property graph. The concept was originally introduced to identify security vulnerabilities in C and C++ system code, but has since been employed to analyze web applications, cloud deployments, and smart contracts. Beyond vulnerability discovery, code property graphs find applications in code clone detection, attack-surface detection, exploit generation, measuring code testability, and backporting of security patches.