# Replication (statistics)

Last updated

In engineering, science, and statistics, replication is the repetition of an experimental condition so that the variability associated with the phenomenon can be estimated. ASTM, in standard E1847, defines replication as "the repetition of the set of all the treatment combinations to be compared in an experiment. Each of the repetitions is called a replicate."

Engineering is the use of scientific principles to design and build machines, structures, and other items, including bridges, tunnels, roads, vehicles, and buildings. The discipline of engineering encompasses a broad range of more specialized fields of engineering, each with a more specific emphasis on particular areas of applied mathematics, applied science, and types of application. See glossary of engineering.

Science is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about the universe.

Statistics is the discipline that concerns the collection, organization, displaying, analysis, interpretation and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments. See glossary of probability and statistics.

## Contents

Replication is not the same as repeated measurements of the same item: they are dealt with differently in statistical experimental design and data analysis.

Measurement is the assignment of a number to a characteristic of an object or event, which can be compared with other objects or events. The scope and application of measurement are dependent on the context and discipline. In the natural sciences and engineering, measurements do not apply to nominal properties of objects or events, which is consistent with the guidelines of the International vocabulary of metrology published by the International Bureau of Weights and Measures. However, in other fields such as statistics as well as the social and behavioral sciences, measurements can have multiple levels, which would include nominal, ordinal, interval and ratio scales.

Data analysis is a process of inspecting, cleansing, transforming and modeling data with the goal of discovering useful information, informing conclusion and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains. In today's business world, data analysis plays a role in making decisions more scientific and helping businesses operate more effectively.

For proper sampling, a process or batch of products should be in reasonable statistical control; inherent random variation is present but variation due to assignable (special) causes is not. Evaluation or testing of a single item does not allow for item-to-item variation and may not represent the batch or process. Replication is needed to account for this variation among items and treatments.

In statistics, quality assurance, and survey methodology, sampling is the selection of a subset of individuals from within a statistical population to estimate characteristics of the whole population. Statisticians attempt for the samples to represent the population in question. Two advantages of sampling are lower cost and faster data collection than measuring the entire population.

## Example

As an example, consider a continuous process which produces items. Batches of items are then processed or treated. Finally, tests or measurements are conducted. Several options might be available to obtain ten test values. Some possibilities are:

• One finished and treated item might be measured repeatedly to obtain ten test results. Only one item was measured so there is no replication. The repeated measurements help identify observational error.
• Ten finished and treated items might be taken from a batch and each measured once. This is not full replication because the ten samples are not random and not representative of the continuous nor batch processing.
• Five items are taken from the continuous process based on sound statistical sampling. These are processed in a batch and tested twice each. This includes replication of initial samples but does not allow for batch-to-batch variation in processing. The repeated tests on each provide some measure and control of testing error.
• Five items are taken from the continuous process based on sound statistical sampling. These are processed in five different batches and tested twice each. This plan includes proper replication of initial samples and also includes batch-to-batch variation. The repeated tests on each provide some measure and control of testing error.

Observational error is the difference between a measured value of a quantity and its true value. In statistics, an error is not a "mistake". Variability is an inherent part of the results of measurements and of the measurement process.

Each option would call for different data analysis methods and yield different conclusions.

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.

The design of experiments is the design of any task that aims to describe or explain the variation of information under conditions that are hypothesized to reflect the variation. The term is generally associated with experiments in which the design introduces conditions that directly affect the variation, but may also refer to the design of quasi-experiments, in which natural conditions that influence the variation are selected for observation.

Statistical process control (SPC) is a method of quality control which employs statistical methods to monitor and control a process. This helps to ensure that the process operates efficiently, producing more specification-conforming products with less waste. SPC can be applied to any process where the "conforming product" output can be measured. Key tools used in SPC include run charts, control charts, a focus on continuous improvement, and the design of experiments. An example of a process where SPC is applied is manufacturing lines.

## Bibliography

• ASTM E122-07 Standard Practice for Calculating Sample Size to Estimate, With Specified Precision, the Average for a Characteristic of a Lot or Process
• "Engineering Statistics Handbook", NIST/SEMATEK
• Pyzdek, T, "Quality Engineering Handbook", 2003, ISBN   0-8247-4614-7.
• Godfrey, A. B., "Juran's Quality Handbook", 1999, ISBN   9780070340039.

## Related Research Articles

In measurement of a set, accuracy refers to closeness of the measurements to a specific value, while precision refers to the closeness of the measurements to each other.

A mechanical or physical shock is a sudden acceleration caused, for example, by impact, drop, kick, earthquake, or explosion. Shock is a transient physical excitation.

Reliability in statistics and psychometrics is the overall consistency of a measure. A measure is said to have a high reliability if it produces similar results under consistent conditions. "It is the characteristic of a set of test scores that relates to the amount of random error from the measurement process that might be embedded in the scores. Scores that are highly reliable are accurate, reproducible, and consistent from one testing occasion to another. That is, if the testing process were repeated with a group of test takers, essentially the same results would be obtained. Various kinds of reliability coefficients, with values ranging between 0.00 and 1.00, are usually used to indicate the amount of error in the scores." For example, measurements of people's height and weight are often extremely reliable.

Engineering tolerance is the permissible limit or limits of variation in:

1. a physical dimension;
2. a measured value or physical property of a material, manufactured object, system, or service;
3. other measured values ;
4. in engineering and safety, a physical distance or space (tolerance), as in a truck (lorry), train or boat under a bridge as well as a train in a tunnel ;
5. in mechanical engineering the space between a bolt and a nut or a hole, etc.

Repeatability or test–retest reliability is the closeness of the agreement between the results of successive measurements of the same measure carried out under the same conditions of measurement. In other words, the measurements are taken by a single person or instrument on the same item, under the same conditions, and in a short period of time. A less-than-perfect test–retest reliability causes test–retest variability. Such variability can be caused by, for example, intra-individual variability and intra-observer variability. A measurement may be said to be repeatable when this variation is smaller than a pre-determined acceptance criterion.

In probability theory and statistics, the coefficient of variation (CV), also known as relative standard deviation (RSD), is a standardized measure of dispersion of a probability distribution or frequency distribution. It is often expressed as a percentage, and is defined as the ratio of the standard deviation to the mean . The CV or RSD is widely used in analytical chemistry to express the precision and repeatability of an assay. It is also commonly used in fields such as engineering or physics when doing quality assurance studies and ANOVA gauge R&R. In addition, CV is utilized by economists and investors in economic models.

A process is a unique combination of tools, materials, methods, and people engaged in producing a measurable output; for example a manufacturing line for machine parts. All processes have inherent statistical variability which can be evaluated by statistical methods.

A measurement systems analysis (MSA) is a thorough assessment of a measurement process, and typically includes a specially designed experiment that seeks to identify the components of variation in that measurement process.

ANOVA gage repeatability and reproducibility is a measurement systems analysis technique that uses an analysis of variance (ANOVA) random effects model to assess a measurement system.

A triaxial shear test is a common method to measure the mechanical properties of many deformable solids, especially soil and rock, and other granular materials or powders. There are several variations on the test.

A test method is a method for a test in science or engineering, such as a physical test, chemical test, or statistical test. It is a definitive procedure that produces a test result. In order to ensure accurate and relevant test results, a test method should be "explicit, unambiguous, and experimentally feasible.", as well as effective and reproducible.

Repeated measures design is a research design that involves multiple measures of the same variable taken on the same or matched subjects either under different conditions or over two or more time periods. For instance, repeated measurements are collected in a longitudinal study in which change over time is assessed.

In statistics, restricted randomization occurs in the design of experiments and in particular in the context of randomized experiments and randomized controlled trials. Restricted randomization allows intuitively poor allocations of treatments to experimental units to be avoided, while retaining the theoretical benefits of randomization. For example, in a clinical trial of a new proposed treatment of obesity compared to a control, an experimenter would want to avoid outcomes of the randomization in which the new treatment was allocated only to the heaviest patients.

The following is a glossary of terms. It is not intended to be all-inclusive.

In statistics, dispersion is the extent to which a distribution is stretched or squeezed. Common examples of measures of statistical dispersion are the variance, standard deviation, and interquartile range.

Acceptance sampling uses statistical sampling to determine whether to accept or reject a production lot of material. It has been a common quality control technique used in industry. It is usually done as products leaves the factory, or in some cases even within the factory. Most often a producer supplies a consumer a number of items and a decision to accept or reject the items is made by determining the number of defective items in a sample from the lot. The lot is accepted if the number of defects falls below where the acceptance number or otherwise the lot is rejected.