Concurrent testing

Research [1] and literature [2] on concurrency testing and concurrent testing typically focus on testing software and systems that use concurrent computing. As with most software testing, the purpose is to understand the behaviour and performance of such a system, particularly its stability during normal activity.

Research and study of program concurrency started in the 1950s, [3] with research and study of testing program concurrency appearing in the 1960s. [4] Examples of problems that concurrency testing might expose are incorrect shared memory access and an unexpected order of message or thread execution. [5] :2 [1] Resource contention resolution, scheduling, deadlock avoidance, priority inversion and race conditions are also highlighted. [6] :745
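As an illustration of incorrect shared memory access, the following minimal Python sketch increments a shared counter from several threads, once with a lock and once without. The function name `run_increments` and its parameters are hypothetical and chosen only for this example. The unprotected variant may or may not lose updates on a given run, which is the kind of non-determinism that makes such defects hard to test for.

```python
import threading


def run_increments(use_lock: bool, workers: int = 4, per_worker: int = 10_000) -> int:
    """Increment a shared counter from several threads and return its final value."""
    state = {"counter": 0}
    lock = threading.Lock()

    def work() -> None:
        for _ in range(per_worker):
            if use_lock:
                # The lock makes the read-modify-write a single atomic step.
                with lock:
                    state["counter"] += 1
            else:
                # Unprotected read-modify-write: another thread may run between
                # the read and the write, so an update can be lost.
                current = state["counter"]
                state["counter"] = current + 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]


expected = 4 * 10_000
protected = run_increments(use_lock=True)     # always equals `expected`
unprotected = run_increments(use_lock=False)  # may be lower than `expected`
print(protected, unprotected, expected)
```

Only the protected result is guaranteed. The unprotected result can never exceed the expected total, but whether it falls short depends on how the threads happen to be scheduled.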

Selected history & approaches of testing concurrency

Approaches to concurrency testing range from a limited unit test level up to system test level. [7]

Some approaches to research and application of testing program/software concurrency have been:

This was considered ineffective for testing concurrency in a non-deterministic system, and equivalent to testing a sequential, non-concurrent program on a system.
Considered likely to find some issues in non-deterministic software execution.
This later became known as non-deterministic testing. [9]
This is an approach to set the system into a particular state so that code can be executed in a known order.
An attempt to test synchronisation sequence combinations for a specified input (checking that shared variable access is not corrupted, effectively testing for race conditions on variables). The sequence is typically derived from non-deterministic test execution.
Analysis of code structure and static analysis tools.
An example was a heuristic approach. [11]
This led to code checker development, for example jlint. [12] Static analysis tools and code checkers for concurrency bugs have also been researched and compared. [13]
See also List of tools for static code analysis
This is an approach to testing program concurrency by looking at multiple user access, either serving different users or tasks simultaneously. [2] [6] :745
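The idea of setting the system into a particular state so that code executes in a known order can be sketched as follows. This is a minimal illustration, not a method taken from the cited literature: the function `forced_lost_update` and the event names are hypothetical, and `threading.Event` objects are used here simply as one way to pin the interleaving to "A reads, B reads, A writes, B writes".

```python
import threading


def forced_lost_update() -> int:
    """Force the order: A reads, B reads, A writes, B writes."""
    shared = {"value": 0}
    a_has_read = threading.Event()
    b_has_read = threading.Event()
    a_has_written = threading.Event()

    def thread_a() -> None:
        local = shared["value"]      # step 1: A reads 0
        a_has_read.set()
        b_has_read.wait()            # hold A until B has also read
        shared["value"] = local + 1  # step 3: A writes 1
        a_has_written.set()

    def thread_b() -> None:
        a_has_read.wait()            # B starts only after A's read
        local = shared["value"]      # step 2: B reads 0
        b_has_read.set()
        a_has_written.wait()         # hold B until A has written
        shared["value"] = local + 1  # step 4: B writes 1, overwriting A's update

    threads = [threading.Thread(target=thread_a), threading.Thread(target=thread_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["value"]


result = forced_lost_update()
print(result)  # prints 1: two increments ran, but one update was lost
```

Because the order is fixed by the events, the lost update is reproduced on every run, whereas repeated uncontrolled execution would only expose it occasionally.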

Testing software and system concurrency should not be confused with stress testing, which is usually associated with loading a system beyond its defined limits. Concurrency testing can reveal problems while a system is performing within its defined limits. Most of the approaches above do not rely on overloading a system. Some literature [6] :745 states that testing of concurrency is a prerequisite to stress testing.

Lessons learned from concurrency bug characteristics study

A study in 2008 [11] analysed bug databases in a selection of open source software. It was thought to be the first real-world study of concurrency bugs. The study classified and analysed 105 concurrency bugs: 31 deadlock bugs and 74 non-deadlock bugs. It had several findings for potential follow-up and investigation:

I.e. focusing on atomicity (protected use of shared data) or sequence will potentially find most non-deadlock type bugs.
I.e. heavy simultaneous usage is not the trigger for these bugs. There is a suggestion that pairwise testing may be effective in catching these types of bugs.
An implication that pairwise testing from a resource usage perspective could be applied to reveal deadlocks.
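One way to picture a pairwise, resource-oriented view of deadlocks is a lock-order check over pairs of resources. The sketch below is not the study's method and is not an executed pairwise test. It is a simple static check over declared acquisition orders, assuming each task acquires its resources in the listed order and holds all of them until it finishes. The function `conflicting_lock_pairs` and the task and resource names are hypothetical.

```python
from itertools import combinations


def ordered_pairs(order: list[str]) -> set[tuple[str, str]]:
    """All (earlier, later) pairs: `earlier` is held while `later` is requested."""
    return set(combinations(order, 2))


def conflicting_lock_pairs(orders: dict[str, list[str]]) -> set[frozenset[str]]:
    """Return resource pairs that two different tasks acquire in opposite orders."""
    conflicts: set[frozenset[str]] = set()
    for (_, order_a), (_, order_b) in combinations(orders.items(), 2):
        pairs_b = ordered_pairs(order_b)
        for first, second in ordered_pairs(order_a):
            # Task A takes first -> second, task B takes second -> first:
            # each can end up holding one resource while waiting for the other.
            if (second, first) in pairs_b:
                conflicts.add(frozenset((first, second)))
    return conflicts


# Hypothetical acquisition orders for three tasks.
orders = {
    "transfer": ["accounts", "audit_log"],
    "report": ["audit_log", "accounts"],
    "backup": ["accounts", "audit_log", "disk"],
}
conflicts = conflicting_lock_pairs(orders)
print(conflicts)
```

Here only the pair `accounts` and `audit_log` is flagged, because `report` acquires the two in the opposite order to `transfer` and `backup`. Each flagged pair would be a candidate for a targeted two-thread test.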

References

  1. Wang, Chao; Said, Mahmoud; Gupta, Aarti (21–28 May 2011). Coverage guided systematic concurrency testing. ICSE '11 Proceedings of the 33rd International Conference on Software Engineering. Waikiki. pp. 221–230.
  2. Dustin, Elfriede (28 December 2002). Effective Software Testing: 50 Ways to Improve Your Software Testing. Addison-Wesley Longman. p. 186. ISBN 0201794292.
  3. Leiner, A.L.; Notz, W.A.; Smith, J.L.; Weinberger, A. (July 1959). "PILOT—A New Multiple Computer System". Journal of the ACM. 6 (3): 313–335. doi:10.1145/320986.320987. S2CID 19867617.
  4. Dijkstra, Edsger W. (May 1968). "The structure of the "THE"-multiprogramming system". Communications of the ACM. 11 (5): 341–346. doi:10.1145/363095.363143. S2CID 2021311.
  5. "Concurrent Software Testing: A Systematic Review" (PDF). Archived from the original on 24 September 2015. Retrieved 4 March 2014.
  6. Binder, Robert V. (1999). Testing object-oriented systems: models, patterns, and tools. Addison-Wesley Longman. ISBN 0-201-80938-9.
  7. Melo, Silvana Morita; Souza, Simone do Rocio Senger de; Souza, Paulo Sérgio Lopes de; Carver, Jeffrey C. (2017). How to test your concurrent software: an approach for the selection of testing techniques. Conference on Systems, Programming, Languages, and Applications: Software for Humanity - SPLASH.
  8. Tai, K.C. (20–22 September 1989). Testing of concurrent software. Proceedings of the Thirteenth Annual International Computer Software & Applications Conference. Orlando, FL, USA. pp. 62–64.
  9. Hwang, Gwan-Hwan; Tai, Kuo-Chung; Huang, Ting-Lu (1995). "Reachability Testing: An Approach To Testing Concurrent Software". International Journal of Software Engineering and Knowledge Engineering. 5 (4): 493–510. doi:10.1142/S0218194095000241.
  10. Qi, Xiaofang; Li, Yueran (23–24 November 2018). Parallel Reachability Testing Based on Hadoop MapReduce. th International Conference, SATE 2018. Shenzhen, Guangdong, China. pp. 173–184. doi:10.1007/978-3-030-04272-1_11.
  11. Lu, Shan; Park, Soyeon; Seo, Eunsoo; Zhou, Yuanyuan (1–5 March 2008). Learning from mistakes: a comprehensive study on real world concurrency bug characteristics. ASPLOS XIII Proceedings of the 13th international conference on Architectural support for programming languages and operating systems. Seattle, WA, USA. pp. 329–339.
  12. Artho, Cyrille; Biere, Armin (27–28 August 2001). Applying static analysis to large-scale, multi-threaded Java programs. Proceedings 2001 Australian Software Engineering Conference. Canberra, ACT, Australia. pp. 68–75.
  13. Manzoor, Numan; Munir, Hussan; Moayyed, Misagh (27–30 November 2012). Comparison of Static Analysis Tools for Finding Concurrency Bugs. 2012 IEEE 23rd International Symposium on Software Reliability Engineering Workshops. Dallas, TX, USA. pp. 129–133.