Random testing

Random testing is a black-box software testing technique in which programs are tested by generating random, independent inputs. The resulting outputs are compared against the software specification to determine whether each test passes or fails. [1] When no specification is available, the language's exceptions are used as an implicit oracle: if an exception arises during test execution, there is a fault in the program. Random testing is also used as a way to avoid biased test selection.
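As a rough illustration of the exception-based oracle, the sketch below (in the same C++ style as the examples later in this article) repeatedly calls a hypothetical function under test with random inputs and treats an unexpected exception as evidence of a fault; parse, getRandomString, and the driver are illustrative assumptions, not part of any particular tool.

#include <cstdlib>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical function under test: assumed to throw on input it cannot handle.
int parse(const std::string& s) {
    if (s.empty()) {
        throw std::invalid_argument("empty input"); // stand-in behavior
    }
    return static_cast<int>(s.size());
}

// Hypothetical random input generator producing strings of random length.
std::string getRandomString() {
    return std::string(std::rand() % 20, 'a' + std::rand() % 26);
}

// With no specification available, an exception raised during execution is
// treated as evidence of a fault (or at least of behavior worth inspecting).
void randomTestParse(int n) {
    for (int i = 0; i < n; i++) {
        std::string input = getRandomString();
        try {
            parse(input);
        } catch (const std::exception& e) {
            std::cerr << "Possible fault for input of length " << input.size()
                      << ": " << e.what() << std::endl;
        }
    }
}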

History of random testing

Random testing for hardware was first examined by Melvin Breuer in 1971, and an initial effort to evaluate its effectiveness was made by Pratima and Vishwani Agrawal in 1975. [2]

In software, Duran and Ntafos examined random testing in 1984. [3]

The use of hypothesis testing as a theoretical basis for random testing was described by Howden in Functional Testing and Analysis. The book also develops a simple formula for estimating the number of tests needed to have confidence at least 1 − 1/n in a failure rate of no larger than 1/n. The formula is the lower bound n log n, which indicates the large number of failure-free tests needed to have even modest confidence in a modest failure-rate bound. [4]
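A short sketch of the reasoning behind a bound of this form (this derivation is an illustration, not taken verbatim from Howden): if the true failure rate is at least 1/n, the probability of observing N consecutive failure-free tests is at most (1 − 1/n)^N, so requiring this probability to be at most 1/n gives

\[
\left(1 - \frac{1}{n}\right)^{N} \le \frac{1}{n}
\quad\Longrightarrow\quad
N \ge \frac{\ln n}{-\ln\!\left(1 - \frac{1}{n}\right)} \approx n \ln n,
\]

using −ln(1 − 1/n) ≈ 1/n for large n.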

Overview

Consider the following C++ function:

int myAbs(int x) {
    if (x > 0) {
        return x;
    } else {
        return x; // bug: should be '-x'
    }
}

Random tests for this function could be, for example, {123, 36, -35, 48, 0}. Only the value '-35' triggers the bug. If there is no reference implementation to check the result, the bug could still go unnoticed. However, an assertion can be added to check the results, for example:

void testAbs(int n) {
    for (int i = 0; i < n; i++) {
        int x = getRandomInput();
        int result = myAbs(x);
        assert(result >= 0);
    }
}

A reference implementation is sometimes available, e.g. when a simple algorithm is re-implemented in a much more complex way for better performance. For example, to test an implementation of the Schönhage–Strassen algorithm, the standard "*" operation on integers can be used:

int getRandomInput() {
    // …
}

void testFastMultiplication(int n) {
    for (int i = 0; i < n; i++) {
        long x = getRandomInput();
        long y = getRandomInput();
        long result = fastMultiplication(x, y);
        assert(x * y == result);
    }
}

While this example is limited to simple types (for which a simple random generator can be used), tools targeting object-oriented languages typically explore the program under test to find generators (constructors or methods returning objects of the required type) and call them with random inputs, themselves either generated the same way or produced by a pseudo-random generator where possible. Such approaches then maintain a pool of randomly generated objects and use a probability for either reusing a generated object or creating a new one. [5]
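A minimal sketch of the object-pool idea, in the same C++ style as the examples above; the Account class, the reuse probability, and the helper names are illustrative assumptions rather than the interface of any particular tool:

#include <cstdlib>
#include <vector>

// Hypothetical class under test; any type whose constructors and methods
// can be called with randomly generated arguments would do.
class Account {
public:
    explicit Account(int initialBalance) : balance(initialBalance) {}
    void deposit(int amount) { balance += amount; }
    int getBalance() const { return balance; }
private:
    int balance;
};

// Either reuse an object from the pool or create a new one with random
// constructor arguments, according to a fixed reuse probability.
Account* getRandomAccount(std::vector<Account*>& pool, double reuseProbability) {
    bool reuse = !pool.empty() &&
                 (static_cast<double>(std::rand()) / RAND_MAX) < reuseProbability;
    if (reuse) {
        return pool[std::rand() % pool.size()];
    }
    Account* fresh = new Account(std::rand() % 1000); // random constructor input
    pool.push_back(fresh);
    return fresh;
}

A test driver would then repeatedly obtain an object this way, call a randomly chosen method on it with random arguments, and check assertions or contracts after each call.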

On randomness

According to the seminal paper on random testing by D. Hamlet

[..] the technical, mathematical meaning of "random testing" refers to an explicit lack of "system" in the choice of test data, so that there is no correlation among different tests. [1]

Strengths and weaknesses

Random testing is praised for the following strengths:

The following weaknesses have been described:

Types of random testing

With respect to the input

Guided vs. unguided

Implementations

Some tools implementing random testing:

Critique

Random testing has only a specialized niche in practice, mostly because an effective oracle is seldom available, but also because of difficulties with the operational profile and with generation of pseudorandom input values. [1]

A test oracle is an instrument for verifying whether the outcomes match the program specification or not. An operational profile is knowledge about the usage patterns of the program, and thus which parts are more important.
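As an illustration of how an operational profile might influence input generation, the following sketch draws random inputs from usage categories in proportion to assumed usage frequencies; the categories, weights, and helper functions are hypothetical:

#include <cstdlib>
#include <string>

// Hypothetical operational profile: assume 70% of real usage is short
// queries, 25% long queries, and 5% empty input.
enum class InputKind { ShortQuery, LongQuery, Empty };

InputKind pickKindFromProfile() {
    int r = std::rand() % 100;               // uniform in [0, 100)
    if (r < 70) return InputKind::ShortQuery;
    if (r < 95) return InputKind::LongQuery;
    return InputKind::Empty;
}

std::string getProfileWeightedInput() {
    switch (pickKindFromProfile()) {
        case InputKind::ShortQuery:
            return std::string(1 + std::rand() % 10, 'a');     // 1-10 characters
        case InputKind::LongQuery:
            return std::string(100 + std::rand() % 900, 'a');  // 100-999 characters
        default:
            return "";
    }
}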

For programming languages and platforms that have contracts (e.g. Eiffel, .NET, or various extensions of Java such as JML and CoFoJa), the contracts act as natural oracles, and the approach has been applied successfully. [5] In particular, random testing finds more bugs than manual inspections or user reports (albeit different ones). [9]
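A rough sketch of how a contract can serve as an oracle during random testing, expressed with plain C++ assertions rather than any specific contract framework; the withdraw function and its pre- and postconditions are illustrative assumptions:

#include <cassert>
#include <cstdlib>

// Illustrative function with a contract expressed as assertions:
// precondition: 0 <= amount <= balance; postcondition: the result is the
// balance reduced by amount and remains non-negative.
int withdraw(int balance, int amount) {
    assert(amount >= 0 && amount <= balance);                  // precondition
    int newBalance = balance - amount;
    assert(newBalance == balance - amount && newBalance >= 0); // postcondition
    return newBalance;
}

// Random test driver: only inputs satisfying the precondition are passed in;
// a postcondition violation would then indicate a fault in withdraw.
void testWithdraw(int n) {
    for (int i = 0; i < n; i++) {
        int balance = std::rand() % 1000;
        int amount = std::rand() % (balance + 1); // respects the precondition
        withdraw(balance, amount);
    }
}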

See also

References

  1. Richard Hamlet (1994). "Random Testing". In John J. Marciniak (ed.). Encyclopedia of Software Engineering (1st ed.). John Wiley and Sons. ISBN 978-0471540021.
  2. Agrawal, P.; Agrawal, V. D. (1 July 1975). "Probabilistic Analysis of Random Test Generation Method for Irredundant Combinational Logic Networks". IEEE Transactions on Computers. C-24 (7): 691–695. doi:10.1109/T-C.1975.224289.
  3. Duran, J. W.; Ntafos, S. C. (1 July 1984). "An Evaluation of Random Testing". IEEE Transactions on Software Engineering. SE-10 (4): 438–444. doi:10.1109/TSE.1984.5010257.
  4. Howden, William (1987). Functional Program Testing and Analysis. New York: McGraw Hill. pp. 51–53. ISBN 0-07-030550-1.
  5. "AutoTest - Chair of Software Engineering". se.inf.ethz.ch. Retrieved 15 November 2017.
  6. "Is it a bad practice to randomly-generate test data?". stackoverflow.com. Retrieved 15 November 2017.
  7. Pacheco, Carlos; Shuvendu K. Lahiri; Michael D. Ernst; Thomas Ball (May 2007). "Feedback-directed random test generation" (PDF). ICSE '07: Proceedings of the 29th International Conference on Software Engineering: 75–84. ISSN 0270-5257.
  8. T.Y. Chen; F.-C. Kuo; R.G. Merkel; T.H. Tse (2010). "Adaptive random testing: The ART of test case diversity". Journal of Systems and Software. 83 (1): 60–66. doi:10.1016/j.jss.2009.02.022. hdl:10722/89054.
  9. Ilinca Ciupa; Alexander Pretschner; Manuel Oriol; Andreas Leitner; Bertrand Meyer (2009). "On the number and nature of faults found by random testing". Software Testing, Verification and Reliability. 21: 3–28. doi:10.1002/stvr.415.