Software safety

Software safety (sometimes called software system safety) is an engineering discipline that aims to ensure that software used in safety-related systems (i.e. safety-related software) does not contribute to any hazards such a system might pose. Numerous standards govern how safety-related software should be developed and assured in various domains. Most of them classify software according to its criticality and propose techniques and measures that should be employed during development and assurance.

Terminology

System safety is the overarching discipline that aims to achieve safety by reducing risks in technical systems to an acceptable level. According to the widely adopted system safety standard IEC 61508, [1] safety is “freedom from unacceptable risk of harm”. As software alone – which can be considered pure information – cannot cause any harm by itself, the term software safety is sometimes dismissed and replaced by “software system safety” (e.g. the Joint Software Systems Safety Engineering Handbook [8] and MIL-STD-882E [9] use this terminology). This stresses that software can only cause harm in the context of a technical system that has some effect on its environment (see NASA Software Safety Guidebook, [10] chapter 2.1.2).

The goal of software safety is to make sure that software neither causes nor contributes to any hazards in the system where it is used, and that this can be assured and demonstrated. This is typically achieved by assigning a "safety level" to the software and selecting processes for its development and assurance that are appropriate for that level.

Assignment of safety levels

One of the first steps in creating safety-related software is to classify the software according to its safety criticality. Various standards suggest different levels, e.g. Software Levels A-E in DO-178C, [4] SIL (Safety Integrity Level) 1-4 in IEC 61508, [1] and ASIL (Automotive Safety Integrity Level) A-D in ISO 26262. [2] The assignment is typically done in the context of an overarching system, where the worst-case consequences of software failures are investigated. For example, the automotive standard ISO 26262 requires a hazard analysis and risk assessment ("HARA") at vehicle level to derive the ASIL of the software executed on a component.
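
In ISO 26262, the HARA rates each hazardous event by severity (S1-S3), exposure (E1-E4) and controllability (C1-C3) and combines these into QM (quality managed, no ASIL applies) or ASIL A-D. The following minimal sketch reproduces the standard's lookup table via an additive shortcut; it is an illustration, not a normative implementation:

    def asil(severity: int, exposure: int, controllability: int) -> str:
        """Combine S (1-3), E (1-4) and C (1-3) classes into an ASIL.

        The sum of the class indices maps onto the ISO 26262-3 table:
        7 -> A, 8 -> B, 9 -> C, 10 -> D, anything lower -> QM.
        """
        total = severity + exposure + controllability
        return {7: "ASIL A", 8: "ASIL B", 9: "ASIL C", 10: "ASIL D"}.get(total, "QM")

    # Life-threatening injuries (S3), high exposure (E4), hard to control (C3):
    assert asil(3, 4, 3) == "ASIL D"
    assert asil(1, 1, 1) == "QM"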

Process adherence and assurance

It is essential to use an adequate development and assurance process, with appropriate methods and techniques, commensurate with the safety criticality of the software. Software safety standards recommend – and sometimes forbid – the use of particular methods and techniques, depending on the safety level. Most standards suggest a lifecycle model (e.g. EN 50716 [3] and IEC 61508 [1] suggest – among others – a V-model) and prescribe activities to be executed during the various phases of the software lifecycle. For example, IEC 61508 requires that software is specified adequately (e.g. by using formal or semi-formal methods), that the software design is modular and testable, that adequate programming languages are used, that documented code reviews are performed, and that testing is performed on several levels to achieve an adequately high test coverage. The focus on the software development and assurance process stems from the fact that software quality (and hence safety) is heavily influenced by the software process, as suggested by ISO 25010. [11] It is claimed that the process influences internal software quality attributes (e.g. code quality), which in turn influence external software quality attributes (e.g. functionality and reliability).

The following activities and topics addressed in the development process contribute to safe software.

Documentation

Comprehensive documentation of the complete development and assurance process is required by virtually all software safety standards. This documentation is typically reviewed and endorsed by third parties and is therefore a prerequisite for the approval of safety-related software. It ranges from planning documents, requirements specifications, software architecture and design documentation and test cases on various abstraction levels to tool qualification reports, review evidence and verification and validation results. Fig. C.2 in EN 50716 [3] lists 32 documents that need to be created along the development lifecycle.

Traceability

Traceability is the practice of establishing relationships between different types of requirements, and between requirements and design, implementation and testing artefacts. According to EN 50716, [3] the objective “is to ensure that all requirements can be shown to have been properly met and that no untraceable material has been introduced”. By documenting and maintaining traceability, it becomes possible to follow, for example, a safety requirement into the design of a system (to verify that it is considered adequately), further into the software source code (to verify that the code fulfils the requirement), and on to an appropriate test case and test execution (to verify that the safety requirement has been tested adequately).
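
The mechanical core of such a traceability check can be automated. A minimal sketch, with invented artefact names, that flags requirements lacking a link in any lifecycle phase:

    # Hypothetical trace links: requirement id -> linked artefacts per phase.
    trace = {
        "SR-12": {"design": ["D-3"], "code": ["brake_ctrl.c"], "tests": ["TC-40"]},
        "SR-13": {"design": ["D-4"], "code": [], "tests": []},  # not yet implemented
    }

    def untraced(trace):
        """Return requirements that lack design, code or test links."""
        return [req for req, links in trace.items()
                if not all(links[phase] for phase in ("design", "code", "tests"))]

    print(untraced(trace))  # ['SR-13']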

Software implementation

Safety standards can have requirements directly affecting the implementation of the software in source code, such as the selection of an appropriate programming language, limits on the size and complexity of functions, the use (or prohibition) of certain programming constructs and the need for coding standards. Part 3 of IEC 61508, for example, contains corresponding requirements and recommendations.
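
Such rules are typically enforced by static analysis. A minimal sketch using Python's standard ast module; the statement budget is an assumed project-specific limit, not a figure taken from any standard:

    import ast

    MAX_STATEMENTS = 25  # assumed project limit; standards leave the figure open

    def oversized_functions(source: str):
        """Return (name, statement count) for functions exceeding the budget."""
        findings = []
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.FunctionDef):
                # Count all statements in the body, excluding the def itself.
                count = sum(isinstance(n, ast.stmt) for n in ast.walk(node)) - 1
                if count > MAX_STATEMENTS:
                    findings.append((node.name, count))
        return findings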

Test coverage

Appropriate test coverage needs to be demonstrated, i.e. depending on the safety level, more rigorous testing schemes have to be applied. A well-known requirement regarding test coverage depending on the software level is given in DO-178C: [4] structural coverage ranges from statement coverage for Level C over decision coverage for Level B up to modified condition/decision coverage (MC/DC) for Level A.
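
MC/DC, the strictest of these criteria, requires showing that each atomic condition independently affects the decision outcome. The sketch below brute-forces, for the hypothetical decision a and (b or c), input pairs that differ only in one condition yet flip the outcome:

    from itertools import product

    def decision(a, b, c):
        return a and (b or c)

    # For each condition, find an input pair differing only in that condition
    # whose outcomes differ -- the "independence pair" that MC/DC asks for.
    for i, name in enumerate("abc"):
        for v in product([False, True], repeat=3):
            w = tuple(not x if j == i else x for j, x in enumerate(v))
            if decision(*v) != decision(*w):
                print(f"{name}: {v} vs {w}")
                break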

Independence

Software safety standards typically require some activities to be executed with independence, i.e. by a different person, by a person with a different reporting line, or even by an independent organization. This avoids conflicts of interest and increases the chances that faults (e.g. in the software design) are identified. For example, EN 50716 [3] Figure 2 requires the roles “implementer”, “tester” and “verifier” to be held by different people, the role “validator” to be held by a person with a different reporting line, and the role “assessor” to be held by a person from a different organizational unit. DO-178C [4] and DO-278A [5] require several activities (e.g. test coverage verification, assurance activities) to be executed “with independence”, with independence being defined as “separation of responsibilities which ensures the accomplishment of objective evaluation”.
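
The person-level part of such rules is easy to check mechanically. A minimal sketch, with invented staff assignments, of the EN 50716-style rule that implementer, tester and verifier must be different people:

    # Hypothetical role assignments for one software component.
    roles = {"implementer": "alice", "tester": "alice", "verifier": "bob"}

    def violates_independence(roles):
        """True if implementer, tester and verifier are not all distinct."""
        staff = [roles[r] for r in ("implementer", "tester", "verifier")]
        return len(set(staff)) < len(staff)

    print(violates_independence(roles))  # True: alice both implements and tests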

Open questions and issues

Software failure rates

In system safety engineering, it is common to allocate upper bounds for failure rates of subsystems or components. It must then be shown that these subsystems or components do not exceed their allocated failure rates, or otherwise redundancy or other fault tolerance mechanisms must be employed. This approach is not practicable for software, as software failure rates cannot be predicted with any confidence. Although significant research in the field of software reliability has been conducted (see for example Lyu (1996) [12]), current software safety standards do not require any of these methods to be used, and some even discourage their usage. For example, DO-178C [4] (p. 73) states: “Many methods for predicting software reliability based on developmental metrics have been published, for example, software structure, defect detection rate, etc. This document does not provide guidance for those types of methods, because at the time of writing, currently available methods did not provide results in which confidence can be placed.” ARP 4761 [13] clause 4.1.2 states that software design errors “are not the same as hardware failures. Unlike hardware failures, probabilities of such errors cannot be quantified.”

Safety and security

Software safety and security may have conflicting interests in some cases. On the one hand, safety-related software that is not secure can pose a safety risk; on the other hand, some security practices (e.g. frequent and timely patching) contradict established safety practices (rigorous testing and verification before anything is changed in an operational system).

Artificial intelligence

Software that employs artificial intelligence techniques such as machine learning follows a radically different lifecycle, and its behavior is harder to predict than that of a traditionally developed system. Hence, the question of whether and how these technologies can be used is under current investigation. Currently, standards generally do not endorse their use. For example, EN 50716 (Table A.3) states that artificial intelligence and machine learning are not recommended for any safety integrity level.

Agile development methods

Agile software development, which typically features many iterations, is sometimes still stigmatized as too chaotic for safety-related software development. This may be partially caused by statements such as "working software over comprehensive documentation" in the Manifesto for Agile Software Development. [14] Although most software safety standards present the software lifecycle in the traditional waterfall-like sequence, some do contain statements that allow for more flexible lifecycles. DO-178C states that "The processes of a software life cycle may be iterative, that is, entered and reentered." Annex C of EN 50716 shows how iterative development lifecycles can be used in line with the requirements of the standard.

References

1. IEC (2010). IEC 61508 - Functional Safety of Electrical/Electronic/Programmable Electronic Safety-related Systems. International Electrotechnical Commission.
2. ISO (2018). ISO 26262 - Road vehicles - Functional safety. International Organization for Standardization.
3. CENELEC (2023). EN 50716 - Railway Applications - Requirements for software development. CENELEC.
4. RTCA (2012). DO-178C - Software Considerations in Airborne Systems and Equipment Certification. RTCA (also published as ED-12C by EUROCAE).
5. RTCA (2011). DO-278A - Software Integrity Assurance Considerations for Communication, Navigation, Surveillance and Air Traffic Management (CNS/ATM) Systems. RTCA (also published as ED-109A by EUROCAE).
6. IEC (2006). IEC 62304 - Medical device software - Software life cycle processes. International Electrotechnical Commission.
7. IEC (2006). IEC 60880 - Nuclear power plants - Instrumentation and control systems important to safety - Software aspects for computer-based systems performing category A functions. International Electrotechnical Commission.
8. US DoD (2010). Joint Software Systems Safety Engineering Handbook. US Department of Defense.
9. US DoD (2012). MIL-STD-882E - System Safety. US Department of Defense.
10. NASA (2004). NASA Software Safety Guidebook. NASA.
11. ISO (2011). ISO 25010 - Systems and software engineering - Systems and software Quality Requirements and Evaluation (SQuaRE) - System and software quality models. International Organization for Standardization.
12. Michael R. Lyu (1996). Handbook of Software Reliability Engineering. IEEE Computer Society Press and McGraw-Hill.
13. SAE (2023). ARP 4761 - Guidelines for Conducting the Safety Assessment Process on Civil Aircraft, Systems, and Equipment. SAE International (also published as ED-135 by EUROCAE).
14. Manifesto for Agile Software Development (2001). https://agilemanifesto.org/

This article incorporates public domain material from Software handbook. United States Army.