Graphical user interface testing

In software engineering, graphical user interface testing is the process of testing a product's graphical user interface (GUI) to ensure it meets its specifications. This is normally done with a variety of test cases.

Test case generation

To generate a set of test cases, test designers attempt to cover all the functionality of the system and fully exercise the GUI itself. Accomplishing this is difficult for two reasons: the size of the input domain and the sequencing of events. Regression testing poses an additional difficulty.

Unlike a CLI (command-line interface) system, a GUI may have additional operations that need to be tested. A relatively small program such as Microsoft WordPad has 325 possible GUI operations.[1] In a large program, the number of operations can easily be an order of magnitude larger.

The second problem is sequencing. Some functionality of the system may be accomplished only through a sequence of GUI events. For example, to open a file a user may first have to click on the File menu, then select the Open operation, use a dialog box to specify the file name, and focus the application on the newly opened window. The number of candidate event sequences grows exponentially with sequence length (with k operations there are k^n sequences of length n), which becomes a serious issue when the tester is creating test cases manually.
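
A small sketch of this combinatorial growth, using a hypothetical four-operation GUI for illustration:

```python
# Illustration of the sequencing explosion: with k distinct GUI
# operations there are k**n candidate event sequences of length n.
from itertools import product

# A tiny, hypothetical operation set; enumerating sequences is only
# feasible for very small k and n.
operations = ["open_menu", "select_item", "type_text", "click_ok"]
three_step_sequences = list(product(operations, repeat=3))
print(len(three_step_sequences))   # 4**3 = 64

# At WordPad's reported scale the space is already enormous:
print(325 ** 3)                    # 34,328,125 length-3 sequences
```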

Regression testing is often a challenge with GUIs as well. A GUI may change significantly, even though the underlying application does not. A test designed to follow a certain path through the GUI may then fail since a button, menu item, or dialog may have changed location or appearance.

These issues have driven the GUI testing problem domain towards automation. Many different techniques have been proposed to automatically generate test suites that are complete and that simulate user behavior.

Most of the testing techniques attempt to build on those previously used to test CLI programs, but these can have scaling problems when applied to GUIs. For example, finite-state-machine-based modeling[2][3] — where a system is modeled as a finite state machine and a program is used to generate test cases that exercise all states — can work well on a system that has a limited number of states but may become overly complex and unwieldy for a GUI (see also model-based testing).
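
A minimal sketch of the FSM approach, assuming a hand-written transition table (the states and events below are hypothetical); a breadth-first search yields one event sequence reaching each state:

```python
# Minimal FSM-based test generation: breadth-first search over a
# hand-written transition table yields one event sequence per state.
from collections import deque

# state -> {event: next_state}; a toy model of a file-open workflow
fsm = {
    "main":        {"click_file_menu": "file_menu"},
    "file_menu":   {"click_open": "open_dialog", "press_escape": "main"},
    "open_dialog": {"type_name_and_ok": "file_open", "click_cancel": "main"},
    "file_open":   {},
}

def cover_all_states(fsm, start):
    """Return the shortest event sequence reaching each state."""
    paths = {start: []}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for event, nxt in fsm[state].items():
            if nxt not in paths:
                paths[nxt] = paths[state] + [event]
                queue.append(nxt)
    return paths

for state, events in cover_all_states(fsm, "main").items():
    print(state, "<-", events)
```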

Planning and artificial intelligence

A novel approach to test suite generation, adapted from a CLI technique,[4] involves using a planning system.[5] Planning is a well-studied technique from the artificial intelligence (AI) domain that attempts to solve problems that involve four parameters:

  1. an initial state,
  2. a goal state (or goal test),
  3. a set of operators, and
  4. a set of objects to operate on.

Planning systems determine a path from the initial state to the goal state by applying the operators. As a simple example of a planning problem, given two words and a single operator that replaces one letter of a word with another, the goal might be to transform the first word into the second.
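
This toy problem can be solved with a plain breadth-first search; the sketch below assumes a small hypothetical dictionary of legal intermediate words:

```python
# Toy planning problem from the text: transform one word into another,
# where the only operator replaces a single letter. A breadth-first
# search over a small dictionary (an assumption here) finds the plan.
from collections import deque
import string

DICTIONARY = {"cat", "cot", "cog", "dog", "dot"}  # hypothetical word list

def neighbors(word):
    for i in range(len(word)):
        for c in string.ascii_lowercase:
            candidate = word[:i] + c + word[i + 1:]
            if candidate != word and candidate in DICTIONARY:
                yield candidate

def plan(initial, goal):
    """Return the sequence of intermediate words, or None if no plan."""
    frontier, seen = deque([[initial]]), {initial}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None

print(plan("cat", "dog"))   # ['cat', 'cot', 'cog', 'dog']
```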

In [1] the authors used the planner IPP[6] to demonstrate this technique. The system's UI is first analyzed to determine the possible operations, which become the operators used in the planning problem. Next, an initial system state is determined, and a goal state is specified that the tester believes will exercise the system. The planning system determines a path from the initial state to the goal state, which becomes the test plan.
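
The following is a minimal STRIPS-style sketch of this idea, not the IPP planner itself; the operator names, preconditions, and state facts are hypothetical:

```python
# A minimal forward-search planner, sketching how GUI operations become
# planning operators. Each operator: (name, preconditions, adds, removes).
from collections import deque

OPERATORS = [
    ("click_file_menu", {"main_window"}, {"file_menu_open"}, set()),
    ("click_open",      {"file_menu_open"}, {"open_dialog"}, {"file_menu_open"}),
    ("enter_filename",  {"open_dialog"}, {"filename_set"}, set()),
    ("click_ok",        {"open_dialog", "filename_set"}, {"file_loaded"}, {"open_dialog"}),
]

def find_plan(initial, goal):
    """Breadth-first search from the initial state to one containing the goal."""
    frontier = deque([(frozenset(initial), [])])
    seen = {frozenset(initial)}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan                      # the plan is the test case
        for name, pre, add, rem in OPERATORS:
            if pre <= state:
                nxt = frozenset((state - rem) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None

print(find_plan({"main_window"}, {"file_loaded"}))
# ['click_file_menu', 'click_open', 'enter_filename', 'click_ok']
```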

Using a planner to generate the test cases has some specific advantages over manual generation. A planning system, by its very nature, generates solutions to planning problems in a way that is very beneficial to the tester:

  1. The plans are always valid. The output of the system is either a valid and correct plan that uses the operators to attain the goal state, or no plan at all. This is beneficial because much time can be wasted when manually creating a test suite on invalid test cases that the tester thought would work but did not.
  2. A planning system pays attention to order. Often, to test a certain function, the test case must be complex and follow a path through the GUI in which the operations are performed in a specific order. Doing this manually is error-prone, difficult, and time-consuming.
  3. Finally, and most importantly, a planning system is goal-oriented: the tester focuses test suite generation on what is most important, testing the functionality of the system.

When manually creating a test suite, the tester is focused more on how to test a function (i.e., the specific path through the GUI). By using a planning system, the path is taken care of and the tester can focus on what function to test. An additional benefit is that a planning system is not restricted in any way when generating the path and may often find a path that was never anticipated by the tester; exercising such unanticipated paths is especially valuable.[7]

Another method of generating GUI test cases simulates a novice user. An expert user of a system tends to follow a direct and predictable path through a GUI, whereas a novice user would follow a more random path. A novice user is therefore likely to explore more possible states of the GUI than an expert.

The difficulty lies in generating test suites that simulate 'novice' system usage. The use of genetic algorithms has been proposed to solve this problem.[7] Novice paths through the system are not random paths: first, a novice user will learn over time and generally won't make the same mistakes repeatedly; second, a novice user follows a plan and probably has some domain or system knowledge.

Genetic algorithms work as follows: a set of 'genes' is created randomly and then subjected to some task. The genes that complete the task best are kept, and the ones that don't are discarded. The process is repeated, with the surviving genes replicated and the rest of the set filled in with new random genes. Eventually one gene (or a small set of genes, if some threshold is set) remains as the only gene in the set, and it is naturally the best fit for the given problem.
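
A skeleton of that loop might look as follows; the population sizes and the placeholder fitness function are assumptions for illustration:

```python
# Skeleton of the genetic-algorithm loop described above: score a
# random population, keep the fittest half, refill with fresh genes.
import random

GENE_LENGTH, POPULATION, GENERATIONS = 8, 20, 50

def random_gene():
    return [random.randint(0, 9) for _ in range(GENE_LENGTH)]

def fitness(gene):
    # Placeholder task: in GUI testing this would score how 'novice-like'
    # the path encoded by the gene is (see the decoding sketch below).
    return -sum(gene)

population = [random_gene() for _ in range(POPULATION)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    survivors = population[: POPULATION // 2]          # keep the best half
    population = survivors + [random_gene() for _ in range(POPULATION - len(survivors))]

print(max(population, key=fitness))   # the best gene found
```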

In the case of GUI testing, the method works as follows. Each gene is essentially a list of random integer values of some fixed length, and each gene represents a path through the GUI. For example, for a given tree of widgets, the first value in the gene (each value is called an allele) would select the widget to operate on, and the following alleles would then fill in input to the widget, depending on the number of possible inputs the widget has (for example, a pull-down list box would have one input: the selected list value). The success of the genes is scored by a criterion that rewards the best 'novice' behavior.
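
A sketch of how such a gene could be decoded into GUI actions; the widget set and its input counts here are hypothetical:

```python
# Decoding a gene (fixed-length list of integers) into a GUI path:
# each allele picks a widget, the following alleles supply its inputs.
WIDGETS = [
    {"name": "file_menu",  "inputs": 1},   # one input: the selected item
    {"name": "text_field", "inputs": 2},   # cursor position, text choice
    {"name": "ok_button",  "inputs": 0},
]

def decode(gene):
    """Turn alleles into a sequence of (widget, input values) actions."""
    actions, i = [], 0
    while i < len(gene):
        widget = WIDGETS[gene[i] % len(WIDGETS)]
        i += 1
        values = gene[i : i + widget["inputs"]]
        i += widget["inputs"]
        actions.append((widget["name"], values))
    return actions

print(decode([0, 3, 2, 1, 7, 4]))
# [('file_menu', [3]), ('ok_button', []), ('text_field', [7, 4])]
```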

A system that performs this testing for the X Window System, but is extensible to any windowing system, is described in [7]. The X Window System provides functionality (via the X server and the X protocol) to dynamically send GUI input to, and get GUI output from, a program without directly using the GUI. For example, one can call XSendEvent() to simulate a click on a pull-down menu. This system allows researchers to automate gene creation and testing, so that for any given application under test, a set of novice-user test cases can be created.
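
The cited work drove X via XSendEvent directly; a similar effect can be achieved today with the XTEST extension through python-xlib, as in this sketch (it assumes a running X display, and the click coordinates are arbitrary):

```python
# Injecting synthetic input into a running X session. This sketch uses
# the XTEST extension via python-xlib rather than raw XSendEvent.
from Xlib import X, display
from Xlib.ext import xtest

d = display.Display()

def click_at(x, y):
    """Move the pointer to (x, y) and simulate a left-button click."""
    d.screen().root.warp_pointer(x, y)
    xtest.fake_input(d, X.ButtonPress, 1)
    xtest.fake_input(d, X.ButtonRelease, 1)
    d.sync()

click_at(100, 40)   # e.g. click where a pull-down menu is expected
```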

Running the test cases

The first test-execution strategies were migrated and adapted from CLI testing strategies.

Mouse position capture

A popular method in the CLI environment is capture/playback, a technique in which the system screen is "captured" as a bitmapped graphic at various times during system testing. These captures allow the tester to "play back" the testing process and compare the screens at the output phase of the test with the expected screens. This validation can be automated, since the screens will be identical if the case passes and different if it fails.
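
In its simplest form, the automated comparison is an exact bitmap match, as in this sketch using the Pillow imaging library (the file names are hypothetical):

```python
# Capture/playback validation in its simplest form: the captured screen
# matches the expected ("golden") bitmap exactly, or the test fails.
from PIL import Image, ImageChops

def screens_match(captured_path, golden_path):
    captured = Image.open(captured_path).convert("RGB")
    golden = Image.open(golden_path).convert("RGB")
    if captured.size != golden.size:
        return False
    # difference() is black everywhere iff the images are identical
    return ImageChops.difference(captured, golden).getbbox() is None

print(screens_match("after_step_3.png", "expected_step_3.png"))
```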

Capture/playback worked quite well in the CLI world, but there are significant problems when one tries to implement it for a GUI-based system.[8] The most obvious problem is that the screen in a GUI system may look different while the state of the underlying system is the same, making automated validation extremely difficult. This is because a GUI allows graphical objects to vary in appearance and placement on the screen: fonts may differ and window colors or sizes may vary, yet the system output is basically the same. This would be obvious to a user, but not to an automated validation system.

Event capture

To combat this and other problems, testers have gone 'under the hood' and collected GUI interaction data from the underlying windowing system.[9] By capturing the window 'events' into logs, the interactions with the system are recorded in a format that is decoupled from the appearance of the GUI; only the event streams are captured. Some filtering of the event streams is necessary, since the streams are usually very detailed and most events are not directly relevant to the problem. This approach can be made easier by, for example, using an MVC architecture and keeping the view (i.e., the GUI) as simple as possible, while the model and the controller hold all the logic. Another approach is to use the software's built-in assistive technology, an HTML interface, or a three-tier architecture, all of which make it possible to better separate the user interface from the rest of the application.
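
A minimal sketch of the filtering step; the event types and log format are hypothetical:

```python
# Filtering a captured event stream: raw window-system logs are noisy,
# so keep only the event types relevant to the test.
RELEVANT = {"button_press", "key_press", "menu_select"}

raw_events = [
    {"type": "pointer_motion", "x": 10, "y": 20},
    {"type": "button_press", "widget": "file_menu"},
    {"type": "expose", "window": 0x3A},
    {"type": "menu_select", "item": "Open"},
    {"type": "key_press", "key": "Return"},
]

test_script = [e for e in raw_events if e["type"] in RELEVANT]
print(test_script)   # the replayable, appearance-independent interactions
```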

Another way to run tests on a GUI is to build a driver into the GUI so that commands or events can be sent to the software from another program.[7] This method of directly sending events to, and receiving events from, a system is highly desirable for testing, since the input and output testing can be fully automated and user error is eliminated.
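
One plausible shape for such a driver is a small socket listener embedded in the application that maps text commands from the test harness onto GUI actions; the command names and handlers below are hypothetical:

```python
# Sketch of a test driver built into the application: a background
# socket listener maps text commands onto GUI actions.
import socket
import threading

def handle_command(line):
    cmd, _, arg = line.partition(" ")
    if cmd == "click":
        print(f"simulating click on {arg}")        # would dispatch a GUI event
    elif cmd == "type":
        print(f"typing {arg!r} into focused widget")

def driver(port=9999):
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", port))
    server.listen(1)
    conn, _ = server.accept()
    with conn, conn.makefile() as stream:
        for line in stream:
            handle_command(line.strip())

threading.Thread(target=driver, daemon=True).start()
# A test harness can now connect and send e.g. "click ok_button\n".
```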

See also

Graphical user interface
Command-line interface
Software testing
Manual testing
Test automation
Test case
Cross-platform software
Widget toolkit
Standard Widget Toolkit
FOX toolkit
Graphical user interface builder
Expect
Shell (computing)
Magic User Interface
ARINC 661
QF-Test
Micro Focus Unified Functional Testing
Katalon Studio
V-Model (software development)

References

  1. Atif M. Memon, Martha E. Pollack, and Mary Lou Soffa. Using a Goal-driven Approach to Generate Test Cases for GUIs. In ICSE '99: Proceedings of the 21st International Conference on Software Engineering.
  2. J.M. Clarke. Automated test generation from a Behavioral Model. In Proceedings of Pacific Northwest Software Quality Conference. IEEE Press, May 1998.
  3. S. Esmelioglu and L. Apfelbaum. Automated Test generation, execution and reporting. In Proceedings of Pacific Northwest Software Quality Conference. IEEE Press, October 1997.
  4. A. Howe, A. von Mayrhauser and R. T. Mraz. Test case generation as an AI planning problem. Automated Software Engineering, 4:77-106, 1997.
  5. Atif M. Memon, Martha E. Pollack, and Mary Lou Soffa. Hierarchical GUI Test Case Generation Using Automated Planning. IEEE Trans. Softw. Eng., vol. 27, no. 2, 2001, pp. 144-155, IEEE Press.
  6. J. Koehler, B. Nebel, J. Hoffman and Y. Dimopoulos. Extending planning graphs to an ADL subset. Lecture Notes in Computer Science, 1348:273, 1997.
  7. D. J. Kasik and H. G. George. Toward automatic generation of novice user test scripts. In M. J. Tauber, V. Bellotti, R. Jeffries, J. D. Mackinlay, and J. Nielsen, editors, Proceedings of the Conference on Human Factors in Computing Systems: Common Ground, pages 244-251, New York, 13–18 April 1996. ACM Press.
  8. L.R. Kepple. The black art of GUI testing. Dr. Dobb’s Journal of Software Tools, 19(2):40, Feb. 1994.
  9. M. L. Hammontree, J. J. Hendrickson and B. W. Hensley. Integrated data capture and analysis tools for research and testing on graphical user interfaces. In P. Bauersfeld, J. Bennett and G. Lynch, editors, Proceedings of the Conference on Human Factors in Computing System, pages 431-432, New York, NY, USA, May 1992. ACM Press.