Code coverage

Last updated

In software engineering, code coverage, also called test coverage, is a percentage measure of the degree to which the source code of a program is executed when a particular test suite is run. A program with high test coverage has more of its source code executed during testing, which suggests it has a lower chance of containing undetected software bugs compared to a program with low test coverage. [1] [2] Many different metrics can be used to calculate test coverage. Some of the most basic are the percentage of program subroutines and the percentage of program statements called during execution of the test suite.

Contents

Code coverage was among the first methods invented for systematic software testing. The first published reference was by Miller and Maloney in Communications of the ACM , in 1963. [3]

Coverage criteria

To measure what percentage of code has been executed by a test suite, one or more coverage criteria are used. These are usually defined as rules or requirements, which a test suite must satisfy. [4]

Basic coverage criteria

There are a number of coverage criteria, but the main ones are: [5]

For example, consider the following C function:

intfoo(intx,inty){intz=0;if((x>0)&&(y>0)){z=x;}returnz;}

Assume this function is a part of some bigger program and this program was run with some test suite.

In programming languages that do not perform short-circuit evaluation, condition coverage does not necessarily imply branch coverage. For example, consider the following Pascal code fragment:

ifaandbthen

Condition coverage can be satisfied by two tests:

However, this set of tests does not satisfy branch coverage since neither case will meet the if condition.

Fault injection may be necessary to ensure that all conditions and branches of exception-handling code have adequate coverage during testing.

Modified condition/decision coverage

A combination of function coverage and branch coverage is sometimes also called decision coverage. This criterion requires that every point of entry and exit in the program has been invoked at least once, and every decision in the program has taken on all possible outcomes at least once. In this context, the decision is a boolean expression comprising conditions and zero or more boolean operators. This definition is not the same as branch coverage, [6] however, the term decision coverage is sometimes used as a synonym for it. [7]

Condition/decision coverage requires that both decision and condition coverage be satisfied. However, for safety-critical applications (such as avionics software) it is often required that modified condition/decision coverage (MC/DC) be satisfied. This criterion extends condition/decision criteria with requirements that each condition should affect the decision outcome independently.

For example, consider the following code:

if(aorb)andcthen

The condition/decision criteria will be satisfied by the following set of tests:

abc
truetruetrue
falsefalsefalse

However, the above tests set will not satisfy modified condition/decision coverage, since in the first test, the value of 'b' and in the second test the value of 'c' would not influence the output. So, the following test set is needed to satisfy MC/DC:

abc
falsetruefalse
falsetruetrue
falsefalsetrue
truefalsetrue

Multiple condition coverage

This criterion requires that all combinations of conditions inside each decision are tested. For example, the code fragment from the previous section will require eight tests:

abc
falsefalsefalse
falsefalsetrue
falsetruefalse
falsetruetrue
truefalsefalse
truefalsetrue
truetruefalse
truetruetrue

Parameter value coverage

Parameter value coverage (PVC) requires that in a method taking parameters, all the common values for such parameters be considered. The idea is that all common possible values for a parameter are tested. [8] For example, common values for a string are: 1) null, 2) empty, 3) whitespace (space, tabs, newline), 4) valid string, 5) invalid string, 6) single-byte string, 7) double-byte string. It may also be appropriate to use very long strings. Failure to test each possible parameter value may result in a bug. Testing only one of these could result in 100% code coverage as each line is covered, but as only one of seven options are tested, there is only 14.2% PVC.

Other coverage criteria

There are further coverage criteria, which are used less often:

Safety-critical or dependable applications are often required to demonstrate 100% of some form of test coverage. For example, the ECSS-E-ST-40C standard demands 100% statement and decision coverage for two out of four different criticality levels; for the other ones, target coverage values are up to negotiation between supplier and customer. [11] However, setting specific target values - and, in particular, 100% - has been criticized by practitioners for various reasons (cf. [12] ) Martin Fowler writes: "I would be suspicious of anything like 100% - it would smell of someone writing tests to make the coverage numbers happy, but not thinking about what they are doing". [13]

Some of the coverage criteria above are connected. For instance, path coverage implies decision, statement and entry/exit coverage. Decision coverage implies statement coverage, because every statement is part of a branch.

Full path coverage, of the type described above, is usually impractical or impossible. Any module with a succession of decisions in it can have up to paths within it; loop constructs can result in an infinite number of paths. Many paths may also be infeasible, in that there is no input to the program under test that can cause that particular path to be executed. However, a general-purpose algorithm for identifying infeasible paths has been proven to be impossible (such an algorithm could be used to solve the halting problem). [14] Basis path testing is for instance a method of achieving complete branch coverage without achieving complete path coverage. [15]

Methods for practical path coverage testing instead attempt to identify classes of code paths that differ only in the number of loop executions, and to achieve "basis path" coverage the tester must cover all the path classes.[ citation needed ][ clarification needed ]

In practice

The target software is built with special options or libraries and run under a controlled environment, to map every executed function to the function points in the source code. This allows testing parts of the target software that are rarely or never accessed under normal conditions, and helps reassure that the most important conditions (function points) have been tested. The resulting output is then analyzed to see what areas of code have not been exercised and the tests are updated to include these areas as necessary. Combined with other test coverage methods, the aim is to develop a rigorous, yet manageable, set of regression tests.

In implementing test coverage policies within a software development environment, one must consider the following:

Software authors can look at test coverage results to devise additional tests and input or configuration sets to increase the coverage over vital functions. Two common forms of test coverage are statement (or line) coverage and branch (or edge) coverage. Line coverage reports on the execution footprint of testing in terms of which lines of code were executed to complete the test. Edge coverage reports which branches or code decision points were executed to complete the test. They both report a coverage metric, measured as a percentage. The meaning of this depends on what form(s) of coverage have been used, as 67% branch coverage is more comprehensive than 67% statement coverage.

Generally, test coverage tools incur computation and logging in addition to the actual program thereby slowing down the application, so typically this analysis is not done in production. As one might expect, there are classes of software that cannot be feasibly subjected to these coverage tests, though a degree of coverage mapping can be approximated through analysis rather than direct testing.

There are also some sorts of defects which are affected by such tools. In particular, some race conditions or similar real time sensitive operations can be masked when run under test environments; though conversely, some of these defects may become easier to find as a result of the additional overhead of the testing code.

Most professional software developers use C1 and C2 coverage. C1 stands for statement coverage and C2 for branch or condition coverage. With a combination of C1 and C2, it is possible to cover most statements in a code base. Statement coverage would also cover function coverage with entry and exit, loop, path, state flow, control flow and data flow coverage. With these methods, it is possible to achieve nearly 100% code coverage in most software projects. [17]

Usage in industry

Test coverage is one consideration in the safety certification of avionics equipment. The guidelines by which avionics gear is certified by the Federal Aviation Administration (FAA) is documented in DO-178B [16] and DO-178C. [18]

Test coverage is also a requirement in part 6 of the automotive safety standard ISO 26262 Road Vehicles - Functional Safety. [19]

See also

Related Research Articles

In computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory footprint, storage size, and power consumption.

In computer programming, unit testing is a software testing method by which isolated source code is tested to validate expected behavior.

In computer programming, specifically when using the imperative programming paradigm, an assertion is a predicate connected to a point in the program, that always should evaluate to true at that point in code execution. Assertions can help a programmer read the code, help a compiler compile it, or help the program detect its own defects.

<span class="mw-page-title-main">Conditional (computer programming)</span> Control flow statement that executes code according to some condition(s)

In computer science, conditionals are programming language commands for handling decisions. Specifically, conditionals perform different computations or actions depending on whether a programmer-defined Boolean condition evaluates to true or false. In terms of control flow, the decision is always achieved by selectively altering the control flow based on some condition . Although dynamic dispatch is not usually classified as a conditional construct, it is another way to select between alternatives at runtime. Conditional statements are the checkpoints in the programme that determines behaviour according to situation.

In computer programming, the ternary conditional operator is a ternary operator that is part of the syntax for basic conditional expressions in several programming languages. It is commonly referred to as the conditional operator, ternary if, or inline if. An expression a ? b : c evaluates to b if the value of a is true, and otherwise to c. One can read it aloud as "if a then b otherwise c". The form a ? b : c is by far and large the most common, but alternative syntaxes do exist; for example, Raku uses the syntax a ?? b !! c to avoid confusion with the infix operators ? and !, whereas in Visual Basic .NET, it instead takes the form If(a, b, c).

White-box testing is a method of software testing that tests internal structures or workings of an application, as opposed to its functionality. In white-box testing, an internal perspective of the system is used to design test cases. The tester chooses inputs to exercise paths through the code and determine the expected outputs. This is analogous to testing nodes in a circuit, e.g. in-circuit testing (ICT). White-box testing can be applied at the unit, integration and system levels of the software testing process. Although traditional testers tended to think of white-box testing as being done at the unit level, it is used for integration and system testing more frequently today. It can test paths within a unit, paths between units during integration, and between subsystems during a system–level test. Though this method of test design can uncover many errors or problems, it has the potential to miss unimplemented parts of the specification or missing requirements. Where white-box testing is design-driven, that is, driven exclusively by agreed specifications of how each component of software is required to behave, white-box test techniques can accomplish assessment for unimplemented or missing requirements.

A branch is an instruction in a computer program that can cause a computer to begin executing a different instruction sequence and thus deviate from its default behavior of executing instructions in order. Branch may also refer to the act of switching execution to a different instruction sequence as a result of executing a branch instruction. Branch instructions are used to implement control flow in program loops and conditionals.

Cyclomatic complexity is a software metric used to indicate the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a program's source code. It was developed by Thomas J. McCabe, Sr. in 1976.

<span class="mw-page-title-main">Java syntax</span> Set of rules defining correctly structured program

The syntax of Java is the set of rules defining how a Java program is written and interpreted.

In computer programming languages, a switch statement is a type of selection control mechanism used to allow the value of a variable or expression to change the control flow of program execution via search and map.

Mutation testing is used to design new software tests and evaluate the quality of existing software tests. Mutation testing involves modifying a program in small ways. Each mutated version is called a mutant and tests detect and reject mutants by causing the behaviour of the original version to differ from the mutant. This is called killing the mutant. Test suites are measured by the percentage of mutants that they kill. New tests can be designed to kill additional mutants. Mutants are based on well-defined mutation operators that either mimic typical programming errors or force the creation of valuable tests. The purpose is to help the tester develop effective tests or locate weaknesses in the test data used for the program or in sections of the code that are seldom or never accessed during execution. Mutation testing is a form of white-box testing.

DO-178B, Software Considerations in Airborne Systems and Equipment Certification is a guideline dealing with the safety of safety-critical software used in certain airborne systems. It was jointly developed by the safety-critical working group RTCA SC-167 of the Radio Technical Commission for Aeronautics (RTCA) and WG-12 of the European Organisation for Civil Aviation Equipment (EUROCAE). RTCA published the document as RTCA/DO-178B, while EUROCAE published the document as ED-12B. Although technically a guideline, it was a de facto standard for developing avionics software systems until it was replaced in 2012 by DO-178C.

<span class="mw-page-title-main">JavaScript syntax</span> Set of rules defining correctly structured programs

The syntax of JavaScript is the set of rules that define a correctly structured JavaScript program.

<span class="mw-page-title-main">Decision-to-decision path</span>

A decision-to-decision path, or DD-path, is a path of execution between two decisions. More recent versions of the concept also include the decisions themselves in their own DD-paths.

Modified condition/decision coverage (MC/DC) is a code coverage criterion used in software testing.

<span class="mw-page-title-main">Control table</span> Data structures that control the execution order of computer commands

Control tables are tables that control the control flow or play a major part in program control. There are no rigid rules about the structure or content of a control table—its qualifying attribute is its ability to direct control flow in some way through "execution" by a processor or interpreter. The design of such tables is sometimes referred to as table-driven design. In some cases, control tables can be specific implementations of finite-state-machine-based automata-based programming. If there are several hierarchical levels of control table they may behave in a manner equivalent to UML state machines

Linear code sequence and jump (LCSAJ), in the broad sense, is a software analysis method used to identify structural units in code under test. Its primary use is with dynamic software analysis to help answer the question "How much testing is enough?". Dynamic software analysis is used to measure the quality and efficacy of software test data, where the quantification is performed in terms of structural units of the code under test. When used to quantify the structural units exercised by a given set of test data, dynamic analysis is also referred to as structural coverage analysis.

DO-178C, Software Considerations in Airborne Systems and Equipment Certification is the primary document by which the certification authorities such as FAA, EASA and Transport Canada approve all commercial software-based aerospace systems. The document is published by RTCA, Incorporated, in a joint effort with EUROC and replaces DO-178B. The new document is called DO-178C/ED-12C and was completed in November 2011 and approved by the RTCA in December 2011. It became available for sale and use in January 2012.

<span class="mw-page-title-main">Parasoft C/C++test</span> Integrated set of tools

Parasoft C/C++test is an integrated set of tools for testing C and C++ source code that software developers use to analyze, test, find defects, and measure the quality and security of their applications. It supports software development practices that are part of development testing, including static code analysis, dynamic code analysis, unit test case generation and execution, code coverage analysis, regression testing, runtime error detection, requirements traceability, and code review. It's a commercial tool that supports operation on Linux, Windows, and Solaris platforms as well as support for on-target embedded testing and cross compilers.

DO-248C, Supporting Information for DO-178C and DO-278A, published by RTCA, Incorporated, is a collection of Frequently Asked Questions and Discussion Papers addressing applications of DO-178C and DO-278A in the safety assurance of software for aircraft and software for CNS/ATM systems, respectively. Like DO-178C and DO-278A, it is a joint RTCA undertaking with EUROCAE and the document is also published as ED-94C, Supporting Information for ED-12C and ED-109A. The publication does not provide any guidance additional to DO-178C or DO-278A; rather, it only provides clarification for the guidance established in those standards. The present revision is also expanded to include the "Rationale for DO-178C/DO-278A" section to document items that were considered when developing DO-178B and then DO-178C, DO-278A, and DO-330, as well as the supplements that accompany those publications.

References

  1. Brader, Larry; Hilliker, Howie; Wills, Alan (March 2, 2013). "Chapter 2 Unit Testing: Testing the Inside". Testing for Continuous Delivery with Visual Studio 2012. Microsoft. p. 30. ISBN   978-1621140184 . Retrieved 16 June 2016.
  2. Williams, Laurie; Smith, Ben; Heckman, Sarah. "Test Coverage with EclEmma". Open Seminar Software Engineering. North Carolina State University. Archived from the original on 14 March 2016. Retrieved 16 June 2016.
  3. Joan C. Miller, Clifford J. Maloney (February 1963). "Systematic mistake analysis of digital computer programs". Communications of the ACM . 6 (2). New York, NY, USA: ACM: 58–63. doi: 10.1145/366246.366248 . ISSN   0001-0782.
  4. Paul Ammann, Jeff Offutt (2013). Introduction to Software Testing. Cambridge University Press.
  5. Glenford J. Myers (2004). The Art of Software Testing, 2nd edition. Wiley. ISBN   0-471-46912-2.
  6. Position Paper CAST-10 (June 2002). What is a "Decision" in Application of Modified Condition/Decision Coverage (MC/DC) and Decision Coverage (DC)?
  7. MathWorks. Types of Model Coverage.
  8. "Unit Testing with Parameter Value Coverage (PVC)".
  9. M. R. Woodward, M. A. Hennell, "On the relationship between two control-flow coverage criteria: all JJ-paths and MCDC", Information and Software Technology 48 (2006) pp. 433-440
  10. Ting Su, Ke Wu, Weikai Miao, Geguang Pu, Jifeng He, Yuting Chen, and Zhendong Su. "A Survey on Data-Flow Testing". ACM Comput. Surv. 50, 1, Article 5 (March 2017), 35 pages.
  11. ECSS-E-ST-40C: Space engineering - Software. ECSS Secretariat, ESA-ESTEC. March, 2009
  12. C. Prause, J. Werner, K. Hornig, S. Bosecker, M. Kuhrmann (2017): Is 100% Test Coverage a Reasonable Requirement? Lessons Learned from a Space Software Project . In: PROFES 2017. Springer. Last accessed: 2017-11-17
  13. Martin Fowler's blog: TestCoverage. Last accessed: 2017-11-17
  14. Dorf, Richard C.: Computers, Software Engineering, and Digital Devices, Chapter 12, pg. 15. CRC Press, 2006. ISBN   0-8493-7340-9, ISBN   978-0-8493-7340-4; via Google Book Search
  15. Y.N. Srikant; Priti Shankar (2002). The Compiler Design Handbook: Optimizations and Machine Code Generation. CRC Press. p. 249. ISBN   978-1-4200-4057-9.
  16. 1 2 RTCA/DO-178B, Software Considerations in Airborne Systems and Equipment Certification, Radio Technical Commission for Aeronautics, December 1, 1992
  17. Boris beizer (2009). Software testing techniques, 2nd edition. Dreamtech press. ISBN   978-81-7722-260-9.
  18. RTCA/DO-178C, Software Considerations in Airborne Systems and Equipment Certification, Radio Technical Commission for Aeronautics, January, 2012.
  19. ISO 26262-6:2011(en) Road vehicles -- Functional safety -- Part 6: Product development at the software level. International Standardization Organization.