Equivalence partitioning


Equivalence partitioning or equivalence class partitioning (ECP) [1] is a software testing technique that divides the input data of a software unit into partitions of equivalent data from which test cases can be derived. In principle, test cases are designed to cover each partition at least once. This technique tries to define test cases that uncover classes of errors, thereby reducing the total number of test cases that must be developed. An advantage of this approach is a reduction in the time required for testing the software, owing to the smaller number of test cases.


Equivalence partitioning is typically applied to the inputs of a tested component, but may be applied to the outputs in rare cases. The equivalence partitions are usually derived from the requirements specification for input attributes that influence the processing of the test object.

The fundamental concept of ECP comes from the equivalence class, which in turn comes from the equivalence relation. A software system is in effect a computable function implemented as an algorithm in some implementation programming language. Given an input test vector, some instructions of that algorithm get covered (see code coverage for details) and others do not. This gives an interesting relationship between input test vectors: two test vectors a and b are related by the coverage relation C if and only if their coverage footprints are exactly the same, that is, they cover the same instructions at the same steps. Since C is an equivalence relation, it partitions the domain of test vectors into equivalence classes. This partitioning is called equivalence class partitioning of the test input. If there are N equivalence classes, only N vectors are sufficient to fully cover the system.

The demonstration can be done using a function written in C:

    #include <stdio.h>

    /* Adds two integers and reports when the result has wrapped around. */
    int safe_add(int a, int b) {
        int c = a + b;
        if (a > 0 && b > 0 && c <= 0) {
            fprintf(stderr, "Overflow (positive)!\n");
        }
        if (a < 0 && b < 0 && c >= 0) {
            fprintf(stderr, "Overflow (negative)!\n");
        }
        return c;
    }

On the basis of the code, the input vectors [a, b] are partitioned. The blocks we need to cover are the overflow in the positive direction, the overflow in the negative direction, and neither of these two. That gives rise to three equivalence classes, identified from the code review itself.

[Figure: Demonstrating Equivalence Class Partitioning]
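
As a sketch of how the three classes might be exercised (the test driver and the specific vectors are illustrative choices, not part of the original example), one representative test vector per class is enough:

    #include <limits.h>

    int safe_add(int a, int b);   /* the function shown above */

    int main(void) {
        safe_add(2, 3);           /* class 1: no overflow                        */
        safe_add(INT_MAX, 1);     /* class 2: overflow in the positive direction */
        safe_add(INT_MIN, -1);    /* class 3: overflow in the negative direction */
        return 0;
    }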

To derive the partitions from the input domain, we rely on the inequality

    INT_MIN ≤ z ≤ INT_MAX

Because the integer type has a fixed size (see Integer (computer science)), z can be replaced with x + y:

    INT_MIN ≤ x + y ≤ INT_MAX

with x ∈ {INT_MIN, ..., INT_MAX} and y ∈ {INT_MIN, ..., INT_MAX}.

The test vectors that satisfy the equalities exactly, that is INT_MIN = x + y and INT_MAX = x + y, are called the boundary values; boundary-value analysis covers them in detail. Note that the graph only covers the overflow case in the first quadrant, i.e. for positive values of x and y.
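
A sketch of boundary test vectors for safe_add (the specific vectors are illustrative assumptions): vectors satisfying the equalities sit exactly on the boundary, and vectors one step beyond fall into the overflow classes.

    #include <limits.h>

    int safe_add(int a, int b);     /* the function shown above */

    /* Illustrative boundary test vectors around x + y = INT_MAX and
       x + y = INT_MIN: one vector on the boundary, one just past it. */
    void boundary_tests(void) {
        safe_add(INT_MAX - 1, 1);   /* on the boundary: sum == INT_MAX        */
        safe_add(INT_MAX, 1);       /* just past it: positive overflow class  */
        safe_add(INT_MIN + 1, -1);  /* on the boundary: sum == INT_MIN        */
        safe_add(INT_MIN, -1);      /* just past it: negative overflow class  */
    }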

In general an input has certain ranges which are valid and other ranges which are invalid. Invalid data here does not mean that the data is incorrect; it means that the data lies outside of a specific partition. This may be best explained by the example of a function which takes a parameter "month". The valid range for the month is 1 to 12, representing January to December. This valid range is called a partition. In this example there are two further partitions of invalid ranges: the first invalid partition would be ≤ 0 and the second invalid partition would be ≥ 13.

        ... -2 -1  0 1 .............. 12 13  14  15 .....
      --------------|-------------------|---------------------
 invalid partition 1    valid partition     invalid partition 2

The testing theory related to equivalence partitioning says that only one test case of each partition is needed to evaluate the behaviour of the program for the related partition. In other words, it is sufficient to select one test case out of each partition to check the behaviour of the program. Using more or even all test cases of a partition will not find new faults in the program. The values within one partition are considered to be "equivalent". Thus the number of test cases can be reduced considerably.
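
Applied to the month example, a minimal sketch in C (the validate_month function and the chosen representatives 0, 6 and 13 are assumptions for illustration) selects exactly one value from each partition:

    #include <stdio.h>

    /* Hypothetical function under test: accepts months 1..12. */
    int validate_month(int month) {
        return month >= 1 && month <= 12;
    }

    int main(void) {
        /* One representative value per partition. */
        printf("%d\n", validate_month(0));   /* invalid partition 1 -> 0 */
        printf("%d\n", validate_month(6));   /* valid partition     -> 1 */
        printf("%d\n", validate_month(13));  /* invalid partition 2 -> 0 */
        return 0;
    }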

An additional effect of applying this technique is that you also find the so-called "dirty" test cases. An inexperienced tester may be tempted to use as test cases the input data 1 to 12 for the month and forget to select some out of the invalid partitions. This would lead to a huge number of unnecessary test cases on the one hand, and a lack of test cases for the dirty ranges on the other hand.

Equivalence partitioning is typically associated with so-called black-box testing, which strictly checks a software component at its interface without considering its internal structure. On closer examination, however, there are cases where it applies to grey-box testing as well. Imagine an interface to a component which has a valid range between 1 and 12, as in the example above, but which internally treats the values 1 to 6 differently from the values 7 to 12. Depending on the input value, the software will run through different internal paths and perform slightly different actions. At the input and output interfaces of the component this difference is not visible, but grey-box testing should make sure that both paths are examined. To achieve this, additional equivalence partitions are introduced which would not be needed for black-box testing. For this example these would be:

        ... -2 -1  0 1 ..... 6 7 ..... 12 13  14  15 .....
      --------------|---------|----------|---------------------
 invalid partition 1    P1         P2      invalid partition 2
                       valid partitions

To check for the expected results you would need to evaluate some internal intermediate values rather than the output interface. It is not necessary to use multiple values from each partition: in the above scenario we can take -2 from invalid partition 1, 6 from valid partition P1, 7 from valid partition P2 and 15 from invalid partition 2.
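
A hedged sketch of such a component (the internal split at 6/7 and the name process_month are hypothetical): the four representatives -2, 6, 7 and 15 exercise every partition, including both internal paths.

    /* Hypothetical component: the valid range 1..12 is split internally. */
    int process_month(int month) {
        if (month < 1 || month > 12)
            return -1;        /* invalid partitions           */
        if (month <= 6)
            return 1;         /* internal path for P1 (1..6)  */
        return 2;             /* internal path for P2 (7..12) */
    }

    void grey_box_tests(void) {
        process_month(-2);    /* invalid partition 1 */
        process_month(6);     /* valid partition P1  */
        process_month(7);     /* valid partition P2  */
        process_month(15);    /* invalid partition 2 */
    }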

Equivalence partitioning is not a stand-alone method to determine test cases. It has to be supplemented by boundary value analysis. Having determined the partitions of possible inputs, the method of boundary value analysis has to be applied to select the most effective test cases out of these partitions.
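
For the month example this could look like the following sketch (reusing the hypothetical validate_month from above; the chosen values are the partition edges 0, 1, 12 and 13):

    int validate_month(int month);   /* hypothetical function from the sketch above */

    /* Boundary values of the partitions ...0 | 1..12 | 13...:
       one value on each side of each boundary. */
    void month_boundary_tests(void) {
        validate_month(0);    /* just below the valid range  */
        validate_month(1);    /* lower boundary of the range */
        validate_month(12);   /* upper boundary of the range */
        validate_month(13);   /* just above the valid range  */
    }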

Limitations

In cases where the data ranges or sets involved are simple (for example, 0-10, 11-20, 21-30) and testing all values would be practical, blanket test coverage using all values within and bordering the ranges should be considered. Blanket test coverage can reveal bugs that would not be caught using the equivalence partitioning method if the software includes sub-partitions that are unknown to the tester. [2] Also, in simple cases the benefit of reducing the number of test values by using equivalence partitioning is diminished, in comparison to cases involving larger ranges (for example, 0-1000, 1001-2000, 2001-3000).
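
As a sketch of blanket coverage for such small ranges (the classify function and the range bounds are illustrative assumptions), every value inside and just outside the ranges is exercised:

    /* Hypothetical function under test with three adjacent ranges. */
    int classify(int x) {
        if (x >= 0 && x <= 10)  return 1;
        if (x >= 11 && x <= 20) return 2;
        if (x >= 21 && x <= 30) return 3;
        return 0;   /* out of range */
    }

    /* Blanket coverage: every value in and bordering the ranges,
       which can expose sub-partitions unknown to the tester. */
    void blanket_tests(void) {
        for (int x = -1; x <= 31; x++) {
            classify(x);
        }
    }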



References

  1. Burnstein, Ilene (2003), Practical Software Testing, Springer-Verlag, p. 623, ISBN 0-387-95131-8
  2. Mathur, Aditya (2007), Foundations of Software Testing: Fundamental Algorithms and Techniques, Pearson India, p. 96, ISBN 978-81-317-0795-1