Stress testing (sometimes called torture testing) is a form of deliberately intense or thorough testing used to determine the stability of a given system or entity. It involves testing beyond normal operational capacity, often to a breaking point, in order to observe the results. Reasons can include: to determine breaking points and safe usage limits; to confirm that the intended specifications are being met; to determine modes of failure; and to test stable operation of a part or system outside standard usage.
Failure causes are defects in design, process, quality, or part application that are the underlying cause of a failure or that initiate a process leading to failure. Where failure depends on the user of the product or process, human error must be considered.
Reliability engineers often test items under expected stress or even under accelerated stress in order to determine the operating life of the item or to determine modes of failure.
Reliability engineering is a sub-discipline of systems engineering that emphasizes dependability in the lifecycle management of a product. Dependability, or reliability, describes the ability of a system or component to function under stated conditions for a specified period of time. Reliability is closely related to availability, which is typically described as the ability of a component or system to function at a specified moment or interval of time.
The term "stress" may have a more specific meaning in certain industries, such as material sciences, and therefore stress testing may sometimes have a technical meaning – one example is in fatigue testing for materials.
In continuum mechanics, stress is a physical quantity that expresses the internal forces that neighbouring particles of a continuous material exert on each other, while strain is the measure of the resulting deformation of the material, a dimensionless quantity. For example, when a solid vertical bar is supporting an overhead weight, each particle in the bar pushes on the particles immediately below it. When a liquid is in a closed container under pressure, each particle gets pushed against by all the surrounding particles, and the container walls and the pressure-inducing surface push against them in (Newtonian) reaction. These macroscopic forces are actually the net result of a very large number of intermolecular forces and collisions between the particles. Stress is frequently represented by a lowercase Greek letter sigma (σ).
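As a brief illustration (an addition for clarity, not part of the original text): in the bar example, the average normal stress is the supported force divided by the bar's cross-sectional area, and the accompanying strain is a ratio of lengths, hence dimensionless:

```latex
% Average normal stress in a bar of cross-sectional area A carrying a force F
\sigma = \frac{F}{A}
% Engineering strain: elongation relative to the original length (dimensionless)
\varepsilon = \frac{\Delta L}{L_0}
```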
Fatigue testing is a specialised form of mechanical testing performed by applying cyclic loading to a coupon or structure. These tests are used to generate fatigue data, to identify critical locations, or to demonstrate the safety of a structure that may be susceptible to fatigue. Fatigue tests are used on a range of components, from coupons through to full-size test articles such as automobiles and aircraft.
Stress testing, in general, should put computer hardware under exaggerated levels of stress in order to ensure stability when used in a normal environment. These can include extremes of workload, type of task, memory use, thermal load (heat), clock speed, or voltages. Memory and CPU are two components that are commonly stress tested in this way.
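A minimal sketch of this pattern (an illustrative addition; the helper name burn and all parameters are hypothetical): start one busy worker per logical CPU so that every core is held at full load for a fixed duration.

```python
import multiprocessing as mp
import time

def burn(seconds: float) -> None:
    """Keep one CPU core busy with arithmetic for the given duration."""
    deadline = time.monotonic() + seconds
    x = 1.0001
    while time.monotonic() < deadline:
        x = x * x % 1e9 + 1.0001  # meaningless math, just sustained load

if __name__ == "__main__":
    duration = 60.0  # seconds; real stress runs last far longer
    workers = [mp.Process(target=burn, args=(duration,))
               for _ in range(mp.cpu_count())]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("Stress run finished without a crash or hang.")
```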
There is considerable overlap between stress testing software and benchmarking software, since both seek to assess and measure maximum performance. Of the two, stress testing software aims to test stability by trying to force a system to fail; benchmarking software aims to measure and assess the maximum performance possible at a given task or function.
When modifying the operating parameters of a CPU, such as temperature, overclocking, underclocking, overvolting, and undervolting, it may be necessary to verify if the new parameters (usually CPU core voltage and frequency) are suitable for heavy CPU loads. This is done by running a CPU-intensive program for extended periods of time, to test whether the computer hangs or crashes. CPU stress testing is also referred to as torture testing. Software that is suitable for torture testing should typically run instructions that utilise the entire chip rather than only a few of its units. Stress testing a CPU over the course of 24 hours at 100% load is, in most cases, sufficient to determine that the CPU will function correctly in normal usage scenarios such as in a desktop computer, where CPU usage typically fluctuates at low levels (50% and under).
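The distinguishing feature of torture-test software is that it verifies its own results, so silent computation errors from an unstable overclock surface as mismatches rather than only as crashes or hangs. A hedged sketch of that pattern (illustrative only; this is not how Prime95 itself is implemented):

```python
import math

def torture_iteration(n: int = 200_000) -> float:
    """A deterministic floating-point workload whose result is repeatable."""
    acc = 0.0
    for i in range(1, n):
        acc += math.sqrt(i) * math.sin(i)
    return acc

if __name__ == "__main__":
    reference = torture_iteration()  # result from the first pass
    for round_no in range(100):      # real torture runs last hours, not seconds
        if torture_iteration() != reference:
            raise SystemExit(f"Mismatch in round {round_no}: unstable system?")
    print("All rounds matched the reference result.")
```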
A central processing unit (CPU), also called a central processor or main processor, is the electronic circuitry within a computer that carries out the instructions of a computer program by performing the basic arithmetic, logic, controlling, and input/output (I/O) operations specified by the instructions. The computer industry has used the term "central processing unit" at least since the early 1960s. Traditionally, the term "CPU" refers to a processor, more specifically to its processing unit and control unit (CU), distinguishing these core elements of a computer from external components such as main memory and I/O circuitry.
Temperature is a physical quantity expressing hot and cold. It is measured with a thermometer calibrated in one or more temperature scales. The most commonly used scales are the Celsius scale, Fahrenheit scale, and Kelvin scale. The kelvin is the unit of temperature in the International System of Units (SI). The Kelvin scale is widely used in science and technology.
In computing, overclocking is the practice of increasing the clock rate of a computer to exceed that certified by the manufacturer. Commonly operating voltage is also increased to maintain a component's operational stability at accelerated speeds. Semiconductor devices operated at higher frequencies and voltages increase power consumption and heat. An overclocked device may be unreliable or fail completely if the additional heat load is not removed or power delivery components cannot meet increased power demands. Many device warranties state that overclocking and/or over-specification voids any warranty.
Hardware stress testing and stability are subjective and may vary according to how the system will be used. A stress test for a system that runs 24/7 or performs error-sensitive tasks, such as distributed computing or "folding" projects, may differ from one for a system that only needs to run a single game with reasonable reliability. For example, a comprehensive guide on overclocking Sandy Bridge found that:
Distributed computing is a field of computer science that studies distributed systems. A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another. The components interact with one another in order to achieve a common goal. Three significant characteristics of distributed systems are: concurrency of components, lack of a global clock, and independent failure of components. Examples of distributed systems vary from SOA-based systems to massively multiplayer online games to peer-to-peer applications.
Folding@home is a distributed computing project for disease research that simulates protein folding, computational drug design, and other types of molecular dynamics. The project uses the idle processing resources of thousands of personal computers owned by volunteers who have installed the software on their systems. Its main purpose is to determine the mechanisms of protein folding, which is the process by which proteins reach their final three-dimensional structure, and to examine the causes of protein misfolding. This is of significant academic interest with major implications for medical research into Alzheimer's disease, Huntington's disease, and many forms of cancer, among other diseases. To a lesser extent, Folding@home also tries to predict a protein's final structure and determine how other molecules may interact with it, which has applications in drug design. Folding@home is developed and operated by the Pande Laboratory at Stanford University, under the direction of Prof. Vijay Pande, and is shared by various scientific institutions and research laboratories across the world.
Sandy Bridge is the codename for the microarchitecture used in the "second generation" of the Intel Core processors; the Sandy Bridge microarchitecture is the successor to the Nehalem microarchitecture. Intel demonstrated a Sandy Bridge processor in 2009, and released the first products based on the architecture in January 2011 under the Core brand.
Even though in the past IntelBurnTest was just as good, it seems that something in the SB uArch [Sandy Bridge microarchitecture] is more heavily stressed with Prime95 ... IBT really does pull more power [make greater thermal demands]. But ... Prime95 failed first every time, and it failed when IBT would pass. So same as Sandy Bridge, Prime95 is a better stability tester for Sandy Bridge-E than IBT/LinX.
Stability is subjective; some might call stability enough to run their game, other like folders [folding projects] might need something that is just as stable as it was at stock, and ... would need to run Prime95 for at least 12 hours to a day or two to deem that stable ... There are [bench testers] who really don't care for stability like that and will just say if it can [complete] a benchmark it is stable enough. No one is wrong and no one is right. Stability is subjective. [But] 24/7 stability is not subjective.
An engineer at ASUS advised, in a 2012 article on overclocking an Intel X79 system, that it is important to choose testing software carefully in order to obtain useful results:
Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, in the Silicon Valley. It is the world's second largest and second highest valued semiconductor chip manufacturer based on revenue after being overtaken by Samsung Electronics, and is the inventor of the x86 series of microprocessors, the processors found in most personal computers (PCs). Intel ranked No. 46 in the 2018 Fortune 500 list of the largest United States corporations by total revenue.
Unvalidated stress tests are not advised (such as Prime95 or LinX or other comparable applications). For high grade CPU/IMC and System Bus testing Aida64 is recommended along with general applications usage like PC Mark 7. Aida has an advantage as its stability test has been designed for the Sandy Bridge E architecture and test specific functions like AES, AVX and other instruction sets that prime and like synthetics do not touch. As such not only does it load the CPU 100% but will also test other parts of CPU not used under applications like Prime 95. Other applications to consider are SiSoft 2012 or Passmark BurnIn. Be advised validation has not been completed using Prime 95 version 26 and LinX (10.3.7.012) and OCCT 4.1.0 beta 1 but once we have internally tested to ensure at least limited support and operation.
In software testing, a system stress test refers to tests that put a greater emphasis on robustness, availability, and error handling under a heavy load, rather than on what would be considered correct behavior under normal circumstances. In particular, the goals of such tests may be to ensure the software does not crash in conditions of insufficient computational resources (such as memory or disk space), unusually high concurrency, or denial of service attacks.
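As a hedged illustration of the high-concurrency case (an added sketch, with all names invented for the example): hammer a shared data structure from an unusually large number of threads and assert that correctness still holds afterwards.

```python
import queue
import threading

def worker(q: "queue.Queue[int]", results: list, lock: threading.Lock) -> None:
    """Drain items from a shared queue, recording each one exactly once."""
    while True:
        try:
            item = q.get_nowait()
        except queue.Empty:
            return
        with lock:
            results.append(item)

if __name__ == "__main__":
    n_items, n_threads = 100_000, 64   # deliberately high concurrency
    q: "queue.Queue[int]" = queue.Queue()
    for i in range(n_items):
        q.put(i)
    results: list = []
    lock = threading.Lock()
    threads = [threading.Thread(target=worker, args=(q, results, lock))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Under stress, correctness must still hold: nothing lost, nothing duplicated.
    assert sorted(results) == list(range(n_items))
    print("Queue survived the concurrency stress test.")
```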
Stress testing may be contrasted with load testing: load testing examines how a system behaves under expected, realistic levels of demand, whereas stress testing deliberately pushes the system beyond those levels to find its breaking point.
In software quality assurance, performance testing is in general a testing practice performed to determine how a system performs in terms of responsiveness and stability under a particular workload. It can also serve to investigate, measure, validate or verify other quality attributes of the system, such as scalability, reliability and resource usage.
The K6-III, code-named "Sharptooth", is an x86 microprocessor manufactured by AMD, released on February 22, 1999, with 400 and 450 MHz models. It was the last Socket 7 desktop processor. For an extremely short time after its release, the fastest available desktop processor from Intel was the Pentium II 450 MHz. However, the K6-III also competed against the Pentium III "Katmai" line, released just days later on February 26. "Katmai" CPUs reached speeds of 500 MHz, slightly faster than the K6-III 450 MHz. K6-III performance was significantly improved over the K6-2 thanks to the addition of an on-die L2 cache running at full clock speed. When equipped with a 1 MB L3 cache on the motherboard, the 400 and 450 MHz K6-IIIs were claimed by Ars Technica to often outperform the more expensive Pentium III "Katmai" 450 and 500 MHz models, respectively.
Prime95, also distributed as the command-line utility mprime for FreeBSD and Linux, is a freeware application written by George Woltman. It is used by the Great Internet Mersenne Prime Search (GIMPS), a distributed computing project dedicated to Mersenne prime hunting. In overclocking circles, it is also commonly used for stability testing.
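GIMPS tests Mersenne candidates 2ᵖ − 1 with the Lucas–Lehmer test; a minimal sketch using plain Python integers (Prime95's production code relies on highly optimised FFT-based big-number arithmetic):

```python
def lucas_lehmer(p: int) -> bool:
    """Lucas-Lehmer primality test for the Mersenne number 2^p - 1 (p an odd prime)."""
    m = (1 << p) - 1          # the Mersenne number under test
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m   # the Lucas-Lehmer recurrence
    return s == 0             # 2^p - 1 is prime iff the residue is zero

# 2^7 - 1 = 127 is prime; 2^11 - 1 = 2047 = 23 * 89 is not
assert lucas_lehmer(7) and not lucas_lehmer(11)
```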
A northbridge or host bridge is one of the two chips in the core logic chipset architecture on a PC motherboard, the other being the southbridge. Unlike the southbridge, the northbridge is connected directly to the CPU via the front-side bus (FSB) and is thus responsible for tasks that require the highest performance. The northbridge, also known as the Memory Controller Hub, is usually paired with a southbridge. In systems where they are included, these two chips manage communications between the CPU and other parts of the motherboard, and constitute the core logic chipset of the PC motherboard.
A power virus is a computer program that executes specific machine code to reach the maximum CPU power dissipation. Computer cooling apparatus are designed to dissipate power up to the thermal design power, rather than maximum power, and a power virus could cause the system to overheat if it does not have logic to stop the processor. This may cause permanent physical damage. Power viruses can be malicious, but are often suites of test software used for integration testing and thermal testing of computer components during the design phase of a product, or for product benchmarking.
Super PI is a computer program that calculates pi to a specified number of digits after the decimal point—up to a maximum of 32 million. It uses the Gauss–Legendre algorithm and is a Windows port of the program used by Yasumasa Kanada in 1995 to compute pi to 2³² digits.
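The Gauss–Legendre iteration converges quadratically, roughly doubling the number of correct digits with each pass. A minimal sketch in ordinary double-precision floating point (Super PI itself uses arbitrary-precision arithmetic):

```python
import math

def gauss_legendre_pi(iterations: int = 4) -> float:
    """Approximate pi with the Gauss-Legendre (AGM) iteration."""
    a, b = 1.0, 1.0 / math.sqrt(2.0)
    t, p = 0.25, 1.0
    for _ in range(iterations):
        a_next = (a + b) / 2.0
        b = math.sqrt(a * b)
        t -= p * (a - a_next) ** 2
        a = a_next
        p *= 2.0
    # After each step, pi is approximated by (a + b)^2 / (4t)
    return (a + b) ** 2 / (4.0 * t)

if __name__ == "__main__":
    print(gauss_legendre_pi())  # agrees with math.pi to double precision
```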
SuperPrime is a computer program used for calculating the primality of a large set of positive natural numbers. Because of its multi-threaded nature and dynamic load scheduling, it scales excellently when using more than one thread. It is commonly used as an overclocking benchmark to test the speed and stability of a system.
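A hedged sketch of the same idea (not SuperPrime's actual code; all names are invented for the example): spread a range of candidates across worker processes in batches, so the scan scales with the number of available cores.

```python
from concurrent.futures import ProcessPoolExecutor

def is_prime(n: int) -> bool:
    """Trial division; deliberately simple, since the point is the workload."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

if __name__ == "__main__":
    candidates = range(2, 200_000)
    with ProcessPoolExecutor() as pool:
        # chunksize > 1 hands out work in batches, a crude form of load scheduling
        primes = sum(pool.map(is_prime, candidates, chunksize=1_000))
    print(f"Found {primes} primes.")
```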
Dynamic frequency scaling is a technique in computer architecture whereby the frequency of a microprocessor can be automatically adjusted "on the fly" depending on the actual needs, to conserve power and reduce the amount of heat generated by the chip. Dynamic frequency scaling helps preserve battery on mobile devices and decrease cooling cost and noise on quiet computing settings, or can be useful as a security measure for overheated systems. Dynamic frequency scaling is used in all ranges of computing systems, ranging from mobile systems to data centers to reduce the power at the times of low workload.
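On Linux, the effect of dynamic frequency scaling can be observed through the cpufreq sysfs interface. A short sketch, assuming a Linux system that exposes /sys/devices/system/cpu/*/cpufreq:

```python
from pathlib import Path
import time

def current_freq_khz(cpu: int = 0) -> int:
    """Read the current clock of one CPU from the Linux cpufreq interface."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_cur_freq")
    return int(path.read_text())

if __name__ == "__main__":
    for _ in range(5):
        print(f"cpu0: {current_freq_khz() / 1000:.0f} MHz")
        time.sleep(1)  # watch the frequency step up or down with load
```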
Haswell is the codename for a processor microarchitecture developed by Intel as the "fourth-generation core" successor to the Ivy Bridge microarchitecture. Intel officially announced CPUs based on this microarchitecture on June 4, 2013, at Computex Taipei 2013, while a working Haswell chip was demonstrated at the 2011 Intel Developer Forum. With Haswell, which uses a 22 nm process, Intel also introduced low-power processors designed for convertible or "hybrid" ultrabooks, designated by the "Y" suffix.
Carry-less Multiplication (CLMUL) is an extension to the x86 instruction set used by microprocessors from Intel and AMD which was proposed by Intel in March 2008 and made available in the Intel Westmere processors announced in early 2010. Mathematically, the instruction implements multiplication of polynomials over the finite field GF(2), where the bitstring a₃a₂a₁a₀ represents the polynomial a₃x³ + a₂x² + a₁x + a₀. The CLMUL instruction also allows a more efficient implementation of the closely related multiplication over larger finite fields GF(2ᵏ) than the traditional instruction set.
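Carry-less multiplication is ordinary long multiplication with XOR in place of addition, which is exactly polynomial multiplication over GF(2). A small software sketch of what the instruction computes (illustrative; the real CLMUL instruction handles 64-bit operands in a single instruction):

```python
def clmul(a: int, b: int) -> int:
    """Carry-less multiply: shift-and-XOR instead of shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a   # XOR is addition in GF(2), so no carries propagate
        a <<= 1
        b >>= 1
    return result

# Example: (x^2 + x) * (x + 1) = x^3 + x, i.e. 0b110 * 0b011 == 0b1010
assert clmul(0b110, 0b011) == 0b1010
```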
Intel Core is a line of mid- to high-end consumer, workstation, and enthusiast central processing units (CPU) marketed by Intel Corporation. These processors displaced the existing mid- to high-end Pentium processors of the time, moving the Pentium to the entry level, and bumping the Celeron series of processors to the low end. Identical or more capable versions of Core processors are also sold as Xeon processors for the server and workstation markets.
LGA 2011, also called Socket R, is a CPU socket by Intel. Released on November 14, 2011, it replaces Intel's LGA 1366 and LGA 1567 in the performance and high-end desktop and server platforms. The socket has 2011 protruding pins that touch contact points on the underside of the processor.
LGA 1155, also called Socket H2, is a socket used for Intel microprocessors based on Sandy Bridge and Ivy Bridge microarchitectures.
The Intel X79 is a Platform Controller Hub (PCH) designed and manufactured by Intel for their LGA 2011 and LGA 2011-1 sockets.
Skylake is the codename used by Intel for a processor microarchitecture that was launched in August 2015 succeeding the Broadwell microarchitecture. Skylake is a microarchitecture redesign using the same 14 nm manufacturing process technology as its predecessor, serving as a "tock" in Intel's "tick–tock" manufacturing and design model. According to Intel, the redesign brings greater CPU and GPU performance and reduced power consumption. Skylake CPUs share their microarchitecture with Kaby Lake, Coffee Lake, and Cannon Lake CPUs.
Ivy Bridge is the codename for the "third generation" of the Intel Core processors. Ivy Bridge is a die shrink, to a 22 nanometer manufacturing process, of the 32 nanometer Sandy Bridge (see tick–tock model). The name is also applied more broadly to the 22 nm die shrink of the Sandy Bridge microarchitecture based on FinFET ("3D") Tri-Gate transistors, which is also used in the Xeon and Core i7 Ivy Bridge-EX (Ivytown), Ivy Bridge-EP, and Ivy Bridge-E microprocessors released in 2013.
WPrime is a computer program that calculates the square roots of a set number of values using Newton's method for estimating functions, verifying the results by squaring them and then comparing them with the original numbers.
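A minimal sketch of that scheme (an illustration, not wPrime's implementation; all names are invented): iterate Newton's method on f(x) = x² − n, then square the result to check it against the original number.

```python
def newton_sqrt(n: float, tol: float = 1e-12) -> float:
    """Approximate sqrt(n) by Newton's method on f(x) = x^2 - n."""
    x = n if n > 1 else 1.0          # any positive starting guess works
    while abs(x * x - n) > tol * n:
        x = 0.5 * (x + n / x)        # Newton step for f(x) = x^2 - n
    return x

if __name__ == "__main__":
    for n in (2.0, 10.0, 12345.678):
        root = newton_sqrt(n)
        # Verification step: squaring the result should recover n
        assert abs(root * root - n) <= 1e-9 * n
        print(f"sqrt({n}) ~ {root}")
```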