Software reliability testing

Software reliability testing is a field of software testing that concerns testing software's ability to function, under given environmental conditions, for a particular amount of time. Software reliability testing helps discover many problems in the software's design and functionality.

Overview

Software reliability is the probability that software will work properly in a specified environment and for a given amount of time. The probability of failure is calculated by testing a sample of all available input states, using the following formula:

Probability = Number of failing cases / Total number of cases under consideration

The set of all possible input states is called the input space. To find the reliability of software, we need to find the output space from the given input space and the software. [1]
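The formula above can be illustrated with a short Python sketch. Everything below is hypothetical: it assumes the software under test is a callable that signals a failing input state by raising an exception.

    import random

    def estimate_reliability(software_under_test, input_space, sample_size=1000):
        """Estimate reliability as 1 - (failing cases / cases tested)."""
        sample = random.sample(input_space, min(sample_size, len(input_space)))
        failures = 0
        for state in sample:
            try:
                software_under_test(state)  # a failure is signalled by an exception
            except Exception:
                failures += 1
        probability_of_failure = failures / len(sample)
        return 1.0 - probability_of_failure

    # Toy example: fails only for the input state x == -1
    reliability = estimate_reliability(lambda x: 1 / (x + 1), list(range(-5, 95)))
    print(f"Estimated reliability: {reliability:.2f}")  # 0.99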

For reliability testing, data is gathered from various stages of development, such as the design and operating stages. Because of restrictions such as cost and time, the tests are limited, and statistical samples are obtained from the software products to test the reliability of the software. Once sufficient data is gathered, statistical studies are performed. Time constraints are handled by applying fixed dates or deadlines for the tests to be performed. After this phase, the design of the software is frozen and the actual implementation phase starts. As there are restrictions on cost and time, the data is gathered carefully so that each data point serves some purpose and gets its expected precision. [2] To achieve satisfactory results from reliability testing, one must take care of certain reliability characteristics. For example, mean time to failure (MTTF) [3] is measured in terms of three factors:

  1. operating time,
  2. number of on/off cycles, and
  3. calendar time.

If the restriction is on operating time, or if the focus is on the first factor for improvement, then compressed-time acceleration can be applied to reduce the testing time. If the focus is on calendar time (i.e. if there are predefined deadlines), then intensified stress testing is used. [2] [4]

Measurement

Software availability is measured in terms of mean time between failures (MTBF). [5]

MTBF is the sum of mean time to failure (MTTF) and mean time to repair (MTTR): MTBF = MTTF + MTTR. MTTF is the time between two consecutive failures and MTTR is the time required to fix the failure. [6]

Steady state availability represents the percentage of time the software is operational:

Availability = MTTF / (MTTF + MTTR)

For example, if MTTF = 1000 hours for a software, then the software should work for 1000 hours of continuous operation.

For the same software, if MTTR = 2 hours, then MTBF = 1000 + 2 = 1002 hours and the steady state availability = 1000 / 1002 ≈ 99.8%.

Accordingly, software reliability is measured in terms of the failure rate (λ):

λ = 1 / MTTF
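Following the definitions above, a minimal Python sketch with the example figures:

    MTTF = 1000.0  # mean time to failure, hours
    MTTR = 2.0     # mean time to repair, hours

    MTBF = MTTF + MTTR                   # 1002 hours
    availability = MTTF / (MTTF + MTTR)  # steady state availability
    failure_rate = 1.0 / MTTF            # lambda, failures per hour

    print(f"MTBF = {MTBF} h")
    print(f"availability = {availability:.4f}")  # ~0.9980
    print(f"failure rate = {failure_rate} per hour")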

Reliability for software is a number between 0 and 1. Reliability increases when errors or bugs in the program are removed. [7] There are many software reliability growth models (SRGMs; see List of software reliability models), including logarithmic, polynomial, exponential, power, and S-shaped models.
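As one concrete example, the exponential (Goel-Okumoto) growth model puts the expected cumulative number of failures by time t at mu(t) = a(1 - e^(-b t)). The sketch below uses illustrative parameter values; in practice a and b would be fitted to observed failure data.

    import math

    def expected_failures(t, a, b):
        """Exponential (Goel-Okumoto) SRGM: expected cumulative failures by time t.
        a = expected total number of failures, b = per-fault detection rate."""
        return a * (1.0 - math.exp(-b * t))

    def reliability(t, x, a, b):
        """Probability of failure-free operation in (t, t + x] after testing to time t."""
        return math.exp(-(expected_failures(t + x, a, b) - expected_failures(t, a, b)))

    # Illustrative parameters: 120 expected total failures, detection rate 0.02/hour
    print(reliability(t=100.0, x=10.0, a=120.0, b=0.02))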

Objectives of reliability testing

The main objective of reliability testing is to test software performance under given conditions, using known fixed procedures and without any type of corrective measure, while taking the software's specifications into account.

Secondary objectives

The secondary objectives of reliability testing are:

  1. To find the perceptual structure of repeating failures.
  2. To find the number of failures occurring in a specified amount of time.
  3. To find the mean life of the software.
  4. To discover the main causes of failure.
  5. To check the performance of different units of software after preventive actions have been taken.

Points for defining objectives

Some restrictions on creating objectives include:

  1. The behaviour of the software should be defined under the given conditions.
  2. The objective should be feasible.
  3. Time constraints should be provided. [8]

Importance of reliability testing

The application of computer software has crossed into many different fields, with software being an essential part of industrial, commercial and military systems. Because of its many applications in safety-critical systems, software reliability is now an important research area. Although software engineering has become one of the fastest developing technologies of the last century, there is no complete, scientific, quantitative measure with which to assess it. Software reliability testing is being used as a tool to help assess these software engineering technologies. [9]

To improve the performance of a software product and the software development process, a thorough assessment of reliability is required. Testing software reliability is important because it is of great use to software managers and practitioners. [10]

To verify the reliability of the software via testing:

  1. A sufficient number of test cases should be executed for a sufficient amount of time to get a reasonable estimate of how long the software will execute without failure. Long duration tests are needed to identify defects (such as memory leakage and buffer overflows) that take time to cause a fault or failure to occur.
  2. The distribution of test cases should match the actual or planned operational profile of the software. The more often a function or subset of the software is executed, the greater the percentage of test cases that should be allocated to that function or subset, as the sketch below illustrates.
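A minimal sketch of the second point, allocating a fixed test budget in proportion to a hypothetical operational profile:

    def allocate_test_cases(operational_profile, total_cases):
        """Allocate test cases in proportion to each operation's usage probability."""
        return {op: round(p * total_cases) for op, p in operational_profile.items()}

    # Hypothetical profile: probability of each operation in field use (sums to 1)
    profile = {"login": 0.50, "search": 0.30, "checkout": 0.15, "admin": 0.05}
    print(allocate_test_cases(profile, total_cases=200))
    # {'login': 100, 'search': 60, 'checkout': 30, 'admin': 10}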

Types of reliability testing

Software reliability testing includes feature testing, load testing, and regression testing. [11]

Feature test

Feature testing checks the features provided by the software and is conducted in the following steps:

  1. Each operation in the software is executed once.
  2. Interaction between operations is reduced.
  3. Each operation is checked for its proper execution.

The feature test is followed by the load test. [11]

Load test

This test is conducted to check the performance of the software under maximum workload. Software performs well up to a certain amount of workload, after which the response time of the software starts to degrade. For example, a web site can be tested to see how many simultaneous users it can support without performance degradation. This testing is especially helpful for databases and application servers. Load testing also requires software performance testing, which checks how well the software performs under workload. [11]
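As a rough illustration, the following Python sketch drives increasing numbers of simultaneous requests against a web site and reports the mean response time; the endpoint URL and user counts are placeholders, not part of any real system.

    import time
    from concurrent.futures import ThreadPoolExecutor
    from urllib.request import urlopen

    URL = "http://localhost:8080/"  # placeholder endpoint

    def one_request(_):
        start = time.perf_counter()
        with urlopen(URL, timeout=10) as response:
            response.read()
        return time.perf_counter() - start

    def mean_response_time(simultaneous_users):
        """Fire that many concurrent requests and return the mean latency in seconds."""
        with ThreadPoolExecutor(max_workers=simultaneous_users) as pool:
            latencies = list(pool.map(one_request, range(simultaneous_users)))
        return sum(latencies) / len(latencies)

    # Increase the load until the response time starts degrading
    for users in (1, 10, 50, 100):
        print(users, "users:", round(mean_response_time(users), 3), "s")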

Regression test

Regression testing is used to check whether any new bugs have been introduced through previous bug fixes. It is conducted after every change or update to the software features, and is repeated periodically, depending on the length and features of the software. [11]

Test planning

Reliability testing is more costly than other types of testing, so proper management and planning are required. The test plan includes the testing process to be implemented, data about the test environment, the test schedule, test points, etc.

Problems in designing test cases

Some common problems that occur when designing test cases include:

Reliability enhancement through testing

Studies during the development and design of software help improve the reliability of a product. Reliability testing is essentially performed to eliminate the failure modes of the software. Life testing of the product should always be done after the design part is finished, or at least after the complete design is finalized. [12] Failure analysis and design improvement are achieved through testing.

Reliability growth testing

This testing is used to check new prototypes of the software, which are initially expected to fail frequently. [12] The causes of failure are detected and actions are taken to reduce defects. Suppose T is the total accumulated test time for the prototype and n(T) is the number of failures from the start up to time T. If n(T)/T is plotted against T on log-log paper, the graph is a straight line. This graph is called a Duane plot, and from it one can read how much reliability can be gained after further cycles of test and fix. On the plot,

ln(n(T)/T) = b − α ln(T) ... (eq. 1)

Solving eq. 1 for n(T),

n(T) = K T^(1−α) ... (eq. 2)

where K = e^b. If the value of α in the equation is zero, reliability cannot be improved as expected for a given number of failures: the cumulative number of failures simply grows in proportion to the test time. For α greater than zero, reliability grows as the cumulative time T increases; in the limiting case α = 1, n(T) = K is constant, so the number of failures no longer depends on the test length.
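Under the model above, α and K can be estimated with a least-squares fit of ln(n(T)/T) against ln(T); the failure times below are illustrative, not measured data.

    import math

    # Cumulative test time (hours) at each observed failure (illustrative)
    failure_times = [10, 25, 45, 80, 150, 300, 600, 1200]

    # Duane plot points: x = ln(T), y = ln(n(T)/T)
    xs = [math.log(T) for T in failure_times]
    ys = [math.log(n / T) for n, T in enumerate(failure_times, start=1)]

    # Least-squares line y = b - alpha*x (eq. 1)
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    alpha, b = -slope, mean_y - slope * mean_x
    K = math.exp(b)

    print(f"alpha = {alpha:.2f}, K = {K:.2f}")
    print("expected failures by T = 2000 h:", K * 2000 ** (1 - alpha))  # eq. 2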

Designing test cases for current release

If new features are being added to the current version of the software, then test cases for those operations are written differently.

There is a predefined rule to calculate the count of new test cases for the software: if N is the probability of occurrence of new operations in the new release of the software, R is the probability of occurrence of used operations in the current release, and T is the number of all previously used test cases, then

Number of new test cases = (N / R) × T
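For example, with illustrative figures N = 0.2, R = 0.8 and T = 400 previously used test cases, (0.2 / 0.8) × 400 = 100 new test cases would be written for the new release. This mirrors the operational profile idea above: test cases are allocated in proportion to how often operations occur.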

Reliability evaluation based on operational testing

The method of operational testing is used to test the reliability of software by checking how the software works in its relevant operational environment. The main problem with this type of evaluation is constructing such an operational environment. This type of simulation is practiced in some industries, such as the nuclear industry and aviation. Predicting future reliability is a part of reliability evaluation.

There are two techniques used for operational testing to test the reliability of software:

Steady state reliability estimation
In this case, we use feedback from delivered software products. Depending on those results, we can predict the future reliability of the next version of the product. This is similar to sample testing for physical products.
Reliability growth based prediction
This method uses documentation of the testing procedure. For example, consider a developed software for which we are creating different new versions. We consider data from the testing of each version and, based on the observed trend, predict the reliability of the new version of the software. [13]

Reliability growth assessment and prediction

In the assessment and prediction of software reliability, we use a reliability growth model. During operation of the software, any data about its failures is stored in statistical form and given as input to the reliability growth model, which can then evaluate the reliability of the software.

Many reliability growth models are available, with probability models claiming to represent the failure process, but no single model is best suited for all conditions. Therefore, a model must be chosen based on the prevailing conditions.

Reliability estimation based on failure-free working

In this case, the reliability of the software is estimated with assumptions like the following:

References

  1. Hoang Pham. Software Reliability.
  2. E. E. Lewis. Introduction to Reliability Engineering.
  3. "MTTF".
  4. IEEE Recommended Practice on Software Reliability, IEEE, doi:10.1109/ieeestd.2017.7827907, ISBN 978-1-5044-3648-9.
  5. Roger Pressman (1982). Software Engineering: A Practitioner's Approach. McGraw-Hill.
  6. "Approaches to Reliability Testing & Setting of Reliability Test Objectives".
  7. Aditya P. Mathur. Foundations of Software Testing. Pearson.
  8. Dimitri Kececioglu. Reliability and Life Testing Handbook.
  9. M. Xie. A Statistical Basis for Software Reliability Assessment.
  10. M. Xie. Software Reliability Modelling.
  11. John D. Musa (2004). Software Reliability Engineering: More Reliable Software, Faster and Cheaper. McGraw-Hill. ISBN 0-07-060319-7.
  12. E. E. Lewis (1995-11-15). Introduction to Reliability Engineering. ISBN 0-471-01833-3.
  13. "Problem of Assessing Reliability". CiteSeerX 10.1.1.104.9831.