Scientific control

Take identical growing plants (Argyroxiphium sandwicense) and give fertilizer to half of them. If there are differences between the fertilized treatment and the unfertilized treatment, these differences may be due to the fertilizer as long as there weren't other confounding factors that affected the result. For example, if the fertilizer was spread by a tractor but no tractor was used on the unfertilized treatment, then the effect of the tractor needs to be controlled.

A scientific control is an experiment or observation designed to minimize the effects of variables other than the independent variable (i.e. confounding variables). [1] This increases the reliability of the results, often through a comparison between control measurements and the other measurements. Scientific controls are a part of the scientific method.


Controlled experiments

Controls eliminate alternate explanations of experimental results, especially experimental errors and experimenter bias. Many controls are specific to the type of experiment being performed, as in the molecular markers used in SDS-PAGE experiments, and may simply have the purpose of ensuring that the equipment is working properly. The selection and use of proper controls to ensure that experimental results are valid (for example, absence of confounding variables) can be very difficult. Control measurements may also be used for other purposes: for example, a measurement of a microphone's background noise in the absence of a signal allows the noise to be subtracted from later measurements of the signal, thus producing a processed signal of higher quality.
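The background-noise correction described above can be sketched in a few lines of Python. This is only an illustration: it assumes the noise behaves as a constant offset, which is a simplification, and the function name `subtract_background` and all readings are hypothetical.

```python
from statistics import mean

def subtract_background(signal_readings, background_readings):
    """Subtract the mean of the control (no-signal) readings from each signal reading."""
    baseline = mean(background_readings)
    return [reading - baseline for reading in signal_readings]

# Hypothetical control measurement: microphone output with no signal present.
background = [0.5, 0.25, 0.75]
# Hypothetical measurements taken with the signal present.
signal = [2.5, 3.5, 4.5]

corrected = subtract_background(signal, background)
print(corrected)
```

The control measurement is taken once and reused for every later reading, which is why it has to be recorded under the same conditions as the signal.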

For example, if a researcher feeds an experimental artificial sweetener to sixty laboratory rats and observes that ten of them subsequently become sick, the underlying cause could be the sweetener itself or something unrelated. Other variables, which may not be readily obvious, may interfere with the experimental design. For instance, the artificial sweetener might be mixed with a diluent, and it might be the diluent that causes the effect. To control for the effect of the diluent, the same test is run twice: once with the artificial sweetener in the diluent, and once in exactly the same way but with the diluent alone. Now the experiment is controlled for the diluent, and the experimenter can distinguish between sweetener, diluent, and non-treatment. Controls are most often necessary where a confounding factor cannot easily be separated from the primary treatments. For example, it may be necessary to use a tractor to spread fertilizer where there is no other practicable way to spread it. The simplest solution is to include a treatment in which a tractor is driven over plots without spreading fertilizer; in that way, the effects of tractor traffic are controlled.

The simplest types of control are negative and positive controls, and both are found in many different types of experiments. [2] These two controls, when both are successful, are usually sufficient to eliminate most potential confounding variables: it means that the experiment produces a negative result when a negative result is expected, and a positive result when a positive result is expected.


Where there are only two possible outcomes, e.g. positive or negative, if the treatment group and the negative control both produce a negative result, it can be inferred that the treatment had no effect. If the treatment group and the negative control both produce a positive result, it can be inferred that a confounding variable is involved in the phenomenon under study, and the positive results are not solely due to the treatment.
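The two-outcome reasoning above can be written as a small decision function. This is a sketch, not a standard procedure: the function name `interpret` and the wording of the returned labels are hypothetical, and the last two branches cover cases the paragraph does not discuss explicitly.

```python
def interpret(treatment_positive: bool, negative_control_positive: bool) -> str:
    """Interpret a two-outcome experiment from the treatment and negative-control results."""
    if treatment_positive and negative_control_positive:
        # Both positive: something other than the treatment produces the result.
        return "confounding suspected"
    if not treatment_positive and not negative_control_positive:
        # Both negative: the treatment made no difference.
        return "no treatment effect"
    if treatment_positive:
        # Assumption: only the treated group responded, so the result is
        # consistent with (but does not prove) a treatment effect.
        return "consistent with a treatment effect"
    # Assumption: control positive but treatment negative is not covered by the text.
    return "unexpected: control positive, treatment negative"

print(interpret(False, False))
print(interpret(True, True))
```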

In other examples, outcomes might be measured as lengths, times, percentages, and so forth. In a drug trial, for example, we could measure the percentage of patients cured, with a placebo group serving as the negative control. In this case, the treatment is inferred to have no effect when the treatment group and the negative control produce the same results. Some improvement is expected in the placebo group due to the placebo effect, and this result sets the baseline upon which the treatment must improve. Even if the treatment group shows improvement, it needs to be compared to the placebo group. If the groups show the same effect, then the treatment was not responsible for the improvement (because the same percentage of patients were cured in the absence of the treatment). The treatment is only effective if the treatment group shows more improvement than the placebo group.
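As a rough illustration of comparing a treatment group against a placebo baseline, the sketch below computes the difference in cure percentages. The patient counts are invented for the example, the function names are hypothetical, and a real analysis would also apply a statistical test rather than compare raw percentages.

```python
def cure_rate(cured: int, total: int) -> float:
    """Percentage of patients cured in one group."""
    return 100.0 * cured / total

def improvement_over_placebo(treated_cured, treated_total, placebo_cured, placebo_total):
    """Difference in cure percentage between the treatment and placebo groups."""
    return cure_rate(treated_cured, treated_total) - cure_rate(placebo_cured, placebo_total)

# Hypothetical trial: 30 of 50 treated patients cured, 20 of 50 placebo patients cured.
difference = improvement_over_placebo(30, 50, 20, 50)
print(difference)
```

A difference of zero would mean the treatment group did no better than the placebo baseline, however much both groups improved.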


Positive controls are often used to assess test validity. For example, to assess a new test's ability to detect a disease (its sensitivity), we can compare it against a different test that is already known to work. The well-established test is a positive control, since we already know that the answer to the question (whether the test works) is yes.

Similarly, in an enzyme assay to measure the amount of an enzyme in a set of extracts, a positive control would be an assay containing a known quantity of the purified enzyme (while a negative control would contain no enzyme). The positive control should give a large amount of enzyme activity, while the negative control should give very low to no activity.

If the positive control does not produce the expected result, there may be something wrong with the experimental procedure, and the experiment is repeated. For difficult or complicated experiments, the result from the positive control can also help in comparison to previous experimental results. For example, if the well-established disease test was determined to have the same effect as found by previous experimenters, this indicates that the experiment is being performed in the same way that the previous experimenters did.
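The decision to accept or repeat a run can be expressed as a simple check on the two controls, using the enzyme assay as the example. This is a sketch under stated assumptions: the function name `run_is_valid` and the threshold values are hypothetical, and suitable thresholds depend entirely on the assay.

```python
def run_is_valid(positive_activity, negative_activity, min_positive, max_negative):
    """Return True when both controls behave as expected.

    positive_activity: activity measured for the known-enzyme (positive) control
    negative_activity: activity measured for the no-enzyme (negative) control
    min_positive, max_negative: hypothetical acceptance thresholds
    """
    positive_ok = positive_activity >= min_positive
    negative_ok = negative_activity <= max_negative
    return positive_ok and negative_ok

# Hypothetical readings in arbitrary activity units.
print(run_is_valid(positive_activity=9.5, negative_activity=0.1,
                   min_positive=8.0, max_negative=0.5))
print(run_is_valid(positive_activity=2.0, negative_activity=0.1,
                   min_positive=8.0, max_negative=0.5))
```

A run that fails either check would be repeated before any sample measurements from it are interpreted.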

When possible, multiple positive controls may be used—if there is more than one disease test that is known to be effective, more than one might be tested. Multiple positive controls also allow finer comparisons of the results (calibration, or standardization) if the expected results from the positive controls have different sizes. For example, in the enzyme assay discussed above, a standard curve may be produced by making many different samples with different quantities of the enzyme.
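A standard curve of this kind can be sketched as a straight-line fit through the positive-control readings, which is then inverted to estimate an unknown sample. The sketch assumes the assay response is linear over the range used; the data values and the function names `fit_line` and `estimate_quantity` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_quantity(reading, slope, intercept):
    """Invert the standard curve to estimate the enzyme quantity for a reading."""
    return (reading - intercept) / slope

# Hypothetical positive controls: known enzyme quantities and measured activity.
# The first point (no enzyme) doubles as the negative control.
known_quantities = [0.0, 1.0, 2.0, 3.0]
measured_activity = [0.1, 2.1, 4.1, 6.1]

slope, intercept = fit_line(known_quantities, measured_activity)
unknown = estimate_quantity(3.1, slope, intercept)
print(slope, intercept, unknown)
```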


In randomization, the groups that receive different experimental treatments are determined randomly. While this does not ensure that there are no differences between the groups, it ensures that the differences are distributed equally, thus correcting for systematic errors.

For example, in experiments where crop yield is affected by factors other than the treatment (e.g. soil fertility), the experiment can be controlled by assigning the treatments to randomly selected plots of land. This mitigates the effect of variations in soil composition on the yield.
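Random assignment of plots can be sketched with Python's standard `random` module. The plot numbering, the even split into two groups, and the function name `assign_plots` are assumptions made for the illustration; a fixed seed is used only so that the assignment can be reproduced.

```python
import random

def assign_plots(plot_ids, seed=None):
    """Randomly split plots into a treatment group and a control group."""
    rng = random.Random(seed)      # a seeded generator makes the assignment reproducible
    shuffled = list(plot_ids)      # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }

# Hypothetical field with twelve numbered plots.
plots = list(range(1, 13))
groups = assign_plots(plots, seed=42)
print(groups)
```

Because the shuffle ignores where each plot lies, any gradient in soil fertility is as likely to favour one group as the other.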

Blind experiments

Blinding is the practice of withholding information that may bias an experiment. For example, participants may not know who received an active treatment and who received a placebo. If this information were to become available to trial participants, patients could receive a larger placebo effect, researchers could influence the experiment to meet their expectations (the observer effect), and evaluators could be subject to confirmation bias. A blind can be imposed on any participant of an experiment, including subjects, researchers, technicians, data analysts, and evaluators. In some cases, sham surgery may be necessary to achieve blinding.

During the course of an experiment, a participant becomes unblinded if they deduce or otherwise obtain information that has been masked from them. Unblinding that occurs before the conclusion of a study is a source of experimental error, as the bias that was eliminated by blinding is re-introduced. Unblinding is common in blind experiments and must be measured and reported. Meta-research has revealed high levels of unblinding in pharmacological trials. In particular, antidepressant trials are poorly blinded. Reporting guidelines recommend that all studies assess and report unblinding. In practice, very few studies assess unblinding. [3]

Blinding is an important tool of the scientific method, and is used in many fields of research. In some fields, such as medicine, it is considered essential. [4] In clinical research, a trial that is not blinded is called an open trial.


  1. Life, Vol. II: Evolution, Diversity and Ecology: (Chs. 1, 21–33, 52–57). W. H. Freeman. 2006. p. 15. ISBN 978-0-7167-7674-1. Retrieved 14 February 2015.
  2. Johnson PD, Besselsen DG (2002). "Practical aspects of experimental design in animal research" (PDF). ILAR J. 43 (4): 202–206. doi:10.1093/ilar.43.4.202. PMID 12391395. Archived from the original (PDF) on 2010-05-29.
  3. Bello, Segun; Moustgaard, Helene; Hróbjartsson, Asbjørn (October 2014). "The risk of unblinding was infrequently and incompletely reported in 300 randomized clinical trial publications". Journal of Clinical Epidemiology. 67 (10): 1059–1069. doi:10.1016/j.jclinepi.2014.05.007. ISSN 1878-5921. PMID 24973822.
  4. "Oxford Centre for Evidence-based Medicine – Levels of Evidence (March 2009)". 11 June 2009. Archived from the original on 26 October 2017. Retrieved 2 May 2018.
  5. James Lind (1753). A Treatise of the Scurvy.
  6. Simon, Harvey B. (2002). The Harvard Medical School guide to men's health. New York: Free Press. p. 31. ISBN 0-684-87181-5.