Blocking (statistics)

In the statistical theory of the design of experiments, blocking is the arranging of experimental units that are similar to one another into groups (blocks) based on one or more variables. These variables are chosen carefully to minimize the effect of their variability on the observed outcomes. Blocking can be implemented in different ways, resulting in different confounding effects, but the various methods share the same purpose: to control variability introduced by specific factors that could influence the outcome of an experiment. Blocking has its roots in the work of the statistician Ronald Fisher, following his development of ANOVA. [1]

History

The use of blocking in experimental design has an evolving history that spans multiple disciplines. The foundational concepts of blocking date back to the early 20th century with statisticians like Ronald A. Fisher. His work in developing analysis of variance (ANOVA) set the groundwork for grouping experimental units to control for extraneous variables. Blocking evolved over the years, leading to the formalization of randomized block designs and Latin square designs. [1] Today, blocking still plays a pivotal role in experimental design, and in recent years, advancements in statistical software and computational capabilities have allowed researchers to explore more intricate blocking designs.

Use

Blocking reduces unexplained variability. Its principle lies in the fact that variability which cannot be overcome (e.g. needing two batches of raw material to produce one container of a chemical) is confounded, or aliased, with a higher-order interaction in order to eliminate its influence on the end product. [2] High-order interactions are usually of the least importance (consider that the temperature of a reactor or the batch of raw materials is more important than the combination of the two, especially as more factors – 3, 4, or more – are present); thus it is preferable to confound this variability with a higher-order interaction. [2]

Examples

Nuisance variables

Nuisance variable effect on response variable
Nuisance variable (sex) effect on response variable (weight loss)

In the examples above, a nuisance variable is a variable that is not the primary focus of the study but can affect the outcomes of the experiment. [3] Nuisance variables are potential sources of variability that, if not controlled or accounted for, may confound the interpretation of the relationship between the independent and dependent variables.

To address nuisance variables, researchers can employ different methods such as blocking or randomization. Blocking involves grouping experimental units based on levels of the nuisance variable to control for its influence. Randomization helps distribute the effects of nuisance variables evenly across treatment groups.

By using one of these methods to account for nuisance variables, researchers can enhance the internal validity of their experiments, ensuring that the effects observed are more likely attributable to the manipulated variables rather than extraneous influences.

In the first example provided above, the sex of the patient would be a nuisance variable. For example, consider if the drug was a diet pill and the researchers wanted to test the effect of the diet pills on weight loss. The explanatory variable is the diet pill and the response variable is the amount of weight loss. Although the sex of the patient is not the main focus of the experiment—the effect of the drug is—it is possible that the sex of the individual will affect the amount of weight lost.

Blocking used for nuisance factors that can be controlled

In the statistical theory of the design of experiments, blocking is the arranging of experimental units in groups (blocks) that are similar to one another. Typically, a blocking factor is a source of variability that is not of primary interest to the experimenter. [3] [4]

No blocking (left) vs blocking (right) experimental design

In probability theory, the blocks method consists of splitting a sample into blocks (groups) separated by smaller subblocks so that the blocks can be considered almost independent. [5] The blocks method helps in proving limit theorems in the case of dependent random variables.

The blocks method was introduced by S. Bernstein. [6] It has been successfully applied in the theory of sums of dependent random variables and in extreme value theory. [7] [8] [9]

Example

Without blocking: diet pills vs placebo on weight loss

In our previous diet pill example, a blocking factor could be the sex of a patient. We could put individuals into one of two blocks (male or female), and within each block, randomly assign the patients to either the diet pill (treatment) or a placebo pill (control). By blocking on sex, this source of variability is controlled, allowing clearer interpretation of how the diet pills affect weight loss.
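The within-block randomization just described can be sketched in code. The following is a minimal illustration, not a reference implementation; the function and variable names (`randomize_within_blocks`, `block_of`, the patient labels) are made up for this example:

```python
import random

def randomize_within_blocks(units, block_of, treatments, seed=None):
    """Randomly assign treatments within each block.

    units: list of unit identifiers.
    block_of: dict mapping each unit to its block label (e.g. sex).
    treatments: list of treatment labels, repeated to fill each block.
    """
    rng = random.Random(seed)
    # Group units by block.
    blocks = {}
    for u in units:
        blocks.setdefault(block_of[u], []).append(u)
    assignment = {}
    for members in blocks.values():
        # Repeat the treatment list to cover the block, then shuffle
        # so the assignment within the block is random but balanced.
        labels = (treatments * (len(members) // len(treatments) + 1))[:len(members)]
        rng.shuffle(labels)
        for u, t in zip(members, labels):
            assignment[u] = t
    return assignment

# Eight hypothetical patients, blocked on sex:
patients = [f"p{i}" for i in range(8)]
sex = {u: ("M" if i < 4 else "F") for i, u in enumerate(patients)}
plan = randomize_within_blocks(patients, sex, ["pill", "placebo"], seed=42)
```

Because the shuffle happens separately inside each block, each block ends up with an equal split of treatment and control, which is the point of blocking on sex here.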

With blocking: diet pills vs placebo on weight loss

Definition of blocking factors

A nuisance factor is used as a blocking factor if every level of the primary factor occurs the same number of times with each level of the nuisance factor. [3] The analysis of the experiment will focus on the effect of varying levels of the primary factor within each block of the experiment.
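The balance condition above can be checked mechanically. The sketch below is a hypothetical helper (the name `is_valid_blocking` is invented for this illustration) that verifies every primary-factor level occurs the same number of times with each level of the nuisance factor:

```python
from collections import Counter

def is_valid_blocking(runs):
    """Return True if every level of the primary factor occurs the
    same number of times with each level of the nuisance factor.

    runs: list of (primary_level, nuisance_level) pairs, one per run.
    """
    counts = Counter(runs)
    primary = {p for p, _ in runs}
    nuisance = {b for _, b in runs}
    # In a balanced design, each (primary, nuisance) cell holds the
    # same number of runs.
    expected = len(runs) // (len(primary) * len(nuisance))
    return expected > 0 and all(
        counts[(p, b)] == expected for p in primary for b in nuisance
    )

# Four dosages crossed with three blocks, one run per cell: balanced.
balanced = [(d, b) for d in range(1, 5) for b in range(1, 4)]
```

Dropping any single run from `balanced` breaks the condition, since one cell then has fewer runs than the others.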

Block a few of the most important nuisance factors

The general rule is:

"Block what you can; randomize what you cannot." [3]

Blocking is used to remove the effects of a few of the most important nuisance variables. Randomization is then used to reduce the contaminating effects of the remaining nuisance variables. For important nuisance variables, blocking yields higher statistical significance for the variables of interest than randomization does. [10]

Implementation

Implementing blocking in experimental design involves a series of steps to effectively control for extraneous variables and enhance the precision of treatment effect estimates.

Identify nuisance variables

Identify potential factors that are not the primary focus of the study but could introduce variability.

Select appropriate blocking factors

Carefully choose blocking factors based on their relevance to the study as well as their potential to confound the primary factors of interest. [11]

Define block sizes

Partitioning an experiment of a given size into a given number of blocks has consequences, since the number of blocks determines the number of confounded effects. [12]

Assign treatments to blocks

You may choose to randomly assign experimental units to treatment conditions within each block, which helps ensure that any unaccounted-for variability is spread evenly across treatment groups. However, depending on how treatments are assigned to blocks, a different number of effects may be confounded. [4] Because both the number of confounded effects and which specific effects are confounded can be chosen, deliberate assignment of treatments to blocks is superior to random assignment. [4]

Replication

By running a different design for each replicate, so that a different effect is confounded each time, the interaction effects are only partially confounded instead of one single effect being completely sacrificed. [4] Replication enhances the reliability of results and allows for a more robust assessment of treatment effects. [12]

Example

Table

One useful way to look at a randomized block experiment is to consider it as a collection of completely randomized experiments, each run within one of the blocks of the total experiment. [3]

Randomized block designs (RBD)
Name of design   Number of factors k   Number of runs n
2-factor RBD     2                     L1 * L2
3-factor RBD     3                     L1 * L2 * L3
4-factor RBD     4                     L1 * L2 * L3 * L4
k-factor RBD     k                     L1 * L2 * ... * Lk

with

L1 = number of levels (settings) of factor 1
L2 = number of levels (settings) of factor 2
L3 = number of levels (settings) of factor 3
L4 = number of levels (settings) of factor 4
Lk = number of levels (settings) of factor k
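The run count in the table is just the product of the level counts. A one-line sketch (the function name `rbd_runs` is made up for this illustration):

```python
from math import prod

def rbd_runs(levels):
    """Number of runs n in a k-factor randomized block design:
    the product L1 * L2 * ... * Lk of the factor level counts."""
    return prod(levels)

# A 2-factor design with 4 and 3 levels needs 4 * 3 = 12 runs:
n = rbd_runs([4, 3])
```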

Example

Suppose engineers at a semiconductor manufacturing facility want to test whether different wafer implant material dosages have a significant effect on resistivity measurements after a diffusion process taking place in a furnace. They have four different dosages they want to try and enough experimental wafers from the same lot to run three wafers at each of the dosages.

The nuisance factor they are concerned with is "furnace run" since it is known that each furnace run differs from the last and impacts many process parameters.

An ideal way to run this experiment would be to run all the 4x3=12 wafers in the same furnace run. That would eliminate the nuisance furnace factor completely. However, regular production wafers have furnace priority, and only a few experimental wafers are allowed into any furnace run at the same time.

A non-blocked way to run this experiment would be to run each of the twelve experimental wafers, in random order, one per furnace run. That would increase the experimental error of each resistivity measurement by the run-to-run furnace variability and make it more difficult to study the effects of the different dosages. The blocked way to run this experiment, assuming you can convince manufacturing to let you put four experimental wafers in a furnace run, would be to put four wafers with different dosages in each of three furnace runs. The only randomization would be choosing which of the three wafers with dosage 1 would go into furnace run 1, and similarly for the wafers with dosages 2, 3 and 4.

Description of the experiment

Let X1 be dosage "level" and X2 be the blocking factor furnace run. Then the experiment can be described as follows:

k = 2 factors (1 primary factor X1 and 1 blocking factor X2)
L1 = 4 levels of factor X1
L2 = 3 levels of factor X2
n = 1 replication per cell
N = L1 * L2 = 4 * 3 = 12 runs

Before randomization, the design trials look like:

X1   X2
1    1
1    2
1    3
2    1
2    2
2    3
3    1
3    2
3    3
4    1
4    2
4    3
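The trial list above is simply the Cartesian product of the factor levels, which can be enumerated directly:

```python
from itertools import product

# Enumerate the X1 (dosage, 4 levels) by X2 (furnace run, 3 levels)
# cells of the design before randomization.
trials = list(product(range(1, 5), range(1, 4)))
```

Each tuple in `trials` is one (X1, X2) row of the table, in the same order.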

Matrix representation

An alternate way of summarizing the design trials would be to use a 4x3 matrix whose 4 rows are the levels of the treatment X1 and whose columns are the 3 levels of the blocking variable X2. The cells in the matrix have indices that match the X1, X2 combinations above.

Treatment   Block 1   Block 2   Block 3
1           1         1         1
2           1         1         1
3           1         1         1
4           1         1         1

By extension, note that the trials for any k-factor randomized block design are simply the cell indices of a k-dimensional matrix.

Model

The model for a randomized block design with one nuisance variable is

Yij = μ + Ti + Bj + random error

where

Yij is any observation for which X1 = i and X2 = j
X1 is the primary factor
X2 is the blocking factor
μ is the general location parameter (i.e., the mean)
Ti is the effect for being in treatment i (of factor X1)
Bj is the effect for being in block j (of factor X2)

Estimates

Estimate for μ: Ȳ = the average of all the data
Estimate for Ti: Ȳi· − Ȳ, where Ȳi· = the average of all Y for which X1 = i
Estimate for Bj: Ȳ·j − Ȳ, where Ȳ·j = the average of all Y for which X2 = j
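These estimates are plain row, column, and grand averages, so they are easy to compute. A minimal sketch (the function name `rbd_estimates` and the toy data are invented for this illustration):

```python
def rbd_estimates(data):
    """Estimates for the randomized block model Yij = mu + Ti + Bj + error.

    data: dict mapping (i, j) -> observation Yij, with one
    observation per (treatment i, block j) cell.
    """
    mu = sum(data.values()) / len(data)           # grand mean
    rows = sorted({i for i, _ in data})
    cols = sorted({j for _, j in data})
    # Treatment effect: row average minus grand mean.
    t = {i: sum(data[(i, j)] for j in cols) / len(cols) - mu for i in rows}
    # Block effect: column average minus grand mean.
    b = {j: sum(data[(i, j)] for i in rows) / len(rows) - mu for j in cols}
    return mu, t, b

# Toy data built as Yij = 10 + i + 2*j, so the effects are known.
data = {(i, j): 10 + i + 2 * j for i in (1, 2) for j in (1, 2)}
mu, t, b = rbd_estimates(data)
```

On the toy data the grand mean is 14.5, the treatment effects are ±0.5, and the block effects are ±1, matching the additive structure the data was built with.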


References

  1. Box, Joan Fisher (1980). "R. A. Fisher and the Design of Experiments, 1922–1926". The American Statistician. 34 (1): 1–7. doi:10.2307/2682986. ISSN 0003-1305.
  2. "5.3.3.3.3. Blocking of full factorial designs". www.itl.nist.gov. Retrieved 2023-12-11.
  3. "5.3.3.2. Randomized block designs". www.itl.nist.gov. Retrieved 2023-12-11.
  4. Berger, Paul D.; Maurer, Robert E.; Celli, Giovana B. (2018). Experimental Design. Springer. doi:10.1007/978-3-319-64583-4.
  5. "Randomized Block Design". The Concise Encyclopedia of Statistics. New York, NY: Springer. 2008. pp. 447–448. doi:10.1007/978-0-387-32833-1_344. ISBN 978-0-387-32833-1. Retrieved 2023-12-11.
  6. Bernstein, S. N. (1926). "Sur l'extension du théorème limite du calcul des probabilités aux sommes de quantités dépendantes". Mathematische Annalen. 97: 1–59.
  7. Ibragimov, I. A.; Linnik, Yu. V. (1971). Independent and Stationary Sequences of Random Variables. Groningen: Wolters-Noordhoff.
  8. Leadbetter, M. R.; Lindgren, G.; Rootzén, H. (1983). Extremes and Related Properties of Random Sequences and Processes. New York: Springer-Verlag.
  9. Novak, S. Y. (2011). Extreme Value Methods with Applications to Finance. London: Chapman & Hall/CRC Press.
  10. Karmakar, Bikram (November 2022). "An Approximation Algorithm for Blocking of an Experimental Design". Journal of the Royal Statistical Society: 1726–1750.
  11. Pashley, Nicole E.; Miratrix, Luke W. (2021). "Block What You Can, Except When You Shouldn't". Journal of Educational and Behavioral Statistics. 47 (1): 69–100. arXiv:2010.14078. doi:10.3102/10769986211027240. ISSN 1076-9986.
  12. Ledolter, Johannes; Kardon, Randy H. (2020). "Focus on Data: Statistical Design of Experiments and Sample Size Selection Using Power Analysis". Investigative Ophthalmology & Visual Science. 61 (8): 11. doi:10.1167/iovs.61.8.11. ISSN 0146-0404. PMC 7425741. PMID 32645134.
