Synthetic control method

The synthetic control method is a statistical method used to evaluate the effect of an intervention in comparative case studies. It involves the construction of a weighted combination of groups used as controls, to which the treatment group is compared. [1] This comparison is used to estimate what would have happened to the treatment group if it had not received the treatment. Unlike difference-in-differences approaches, this method can account for the effects of confounders changing over time, by weighting the control group to better match the treatment group before the intervention. [2] Another advantage of the synthetic control method is that it allows researchers to systematically select comparison groups. [3] It has been applied to the fields of political science, [3] health policy, [2] criminology, [4] and economics. [5]

The synthetic control method combines elements from matching and difference-in-differences techniques. Difference-in-differences methods are often-used policy evaluation tools that estimate the effect of an intervention at an aggregate level (e.g. state, country, age group etc.) by averaging over a set of unaffected units. Famous examples include studies of the employment effects of a raise in the minimum wage in New Jersey fast food restaurants, which compared them to fast food restaurants just across the border in Pennsylvania that were unaffected by a minimum wage raise, [6] and studies that compare crime rates in Miami to those in other cities to evaluate the impact of the Mariel boatlift on crime. [7] The control group in this specific scenario can be interpreted as a weighted average, where some units effectively receive zero weight while others get an equal, non-zero weight, as illustrated in the sketch below.
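
To make the weighted-average interpretation concrete, here is a minimal Python sketch with purely hypothetical numbers (none of the values are taken from the cited studies): the control outcome is an equal-weight average over the unaffected units, and the difference-in-differences estimate is the change in the treated group minus the change in the control group.

```python
import numpy as np

# Hypothetical aggregate outcomes (e.g. average employment per restaurant).
treated_pre, treated_post = 20.0, 21.0        # treated region, before/after

controls_pre = np.array([22.0, 23.5, 23.5])   # unaffected units, before
controls_post = np.array([21.5, 20.7, 21.8])  # unaffected units, after

# The implicit control group: equal, non-zero weights on every unaffected
# unit (units outside this set effectively receive zero weight).
weights = np.full(len(controls_pre), 1 / len(controls_pre))
control_pre = weights @ controls_pre
control_post = weights @ controls_post

# Difference in differences: change in treated minus change in control.
did_estimate = (treated_post - treated_pre) - (control_post - control_pre)
print(f"DiD estimate of the treatment effect: {did_estimate:.2f}")
```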

The synthetic control method tries to offer a more systematic way to assign weights to the control group. It typically uses a relatively long time series of the outcome prior to the intervention and estimates weights in such a way that the control group mirrors the treatment group as closely as possible. In particular, assume we have $J + 1$ units observed over $T$ time periods, where the relevant treatment occurs at time $T_0$ with $1 \le T_0 < T$. Let

$$\alpha_{it} = Y_{it} - Y_{it}^N$$

be the treatment effect for unit $i$ at time $t$, where $Y_{it}^N$ is the outcome in absence of the treatment. Without loss of generality, if unit 1 receives the relevant treatment, only $Y_{1t}^N$ is not observed for $t > T_0$. We aim to estimate $(\alpha_{1,T_0+1}, \ldots, \alpha_{1,T})$.

Imposing some structure on the untreated outcomes, such as the factor model

$$Y_{it}^N = \delta_t + \boldsymbol{\theta}_t \mathbf{Z}_i + \lambda_t \mu_i + \varepsilon_{it},$$

and assuming there exist some optimal weights $w_2, \ldots, w_{J+1}$, with $w_j \ge 0$ and $\sum_{j=2}^{J+1} w_j = 1$, such that

$$Y_{1t} = \sum_{j=2}^{J+1} w_j Y_{jt}$$

for $t \in \{1, \ldots, T_0\}$, the synthetic controls approach suggests using these weights to estimate the counterfactual

$$\hat{Y}_{1t}^N = \sum_{j=2}^{J+1} w_j Y_{jt}$$

for $t > T_0$, so that $\hat{\alpha}_{1t} = Y_{1t} - \hat{Y}_{1t}^N$. Under some regularity conditions, such weights provide estimators for the treatment effects of interest. In essence, the method uses the idea of matching: the pre-intervention data serve as training data for setting up the weights, which then define a relevant control post-intervention. [8]
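
As an illustration, the following Python sketch estimates the weights by constrained least squares on simulated data. It is a deliberately simplified version of the estimator: it matches only on pre-intervention outcomes, whereas the method of Abadie et al. also matches on covariates and uses a weighted norm; all variable names and the simulated data are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
T, T0, J = 30, 20, 10  # total periods, pre-treatment periods, donor units

# Simulated donor-pool outcomes, and a treated unit that is (noisily) a
# convex combination of three donors, plus an effect of +2 after T0.
Y_controls = rng.normal(size=(T, J)).cumsum(axis=0)
true_w = np.array([0.5, 0.3, 0.2] + [0.0] * (J - 3))
Y_treated = Y_controls @ true_w + rng.normal(scale=0.1, size=T)
Y_treated[T0:] += 2.0

# Non-negative weights summing to one that minimize the pre-intervention
# mean squared discrepancy between the treated and the synthetic unit.
def pre_loss(w):
    return np.mean((Y_treated[:T0] - Y_controls[:T0] @ w) ** 2)

res = minimize(pre_loss, x0=np.full(J, 1.0 / J), method="SLSQP",
               bounds=[(0.0, 1.0)] * J,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
w_hat = res.x

# Counterfactual outcome and estimated per-period treatment effects.
Y_synthetic = Y_controls @ w_hat
alpha_hat = Y_treated[T0:] - Y_synthetic[T0:]
print("estimated weights:", np.round(w_hat, 2))
print("mean post-intervention effect estimate:", alpha_hat.mean().round(2))
```

Restricting the weights to be non-negative and to sum to one keeps the synthetic control a convex combination of donor units, which avoids extrapolating outside the support of the donor pool.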

Synthetic controls have been used in a number of empirical applications, ranging from studies examining natural catastrophes and growth, [9] to studies that examine the effect of vaccine mandates on childhood immunisation, [10] and studies linking political murders to house prices. [11]

Related Research Articles

Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures used to analyze the differences among means. ANOVA was developed by the statistician Ronald Fisher. ANOVA is based on the law of total variance, where the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal, and therefore generalizes the t-test beyond two means.

<span class="mw-page-title-main">Design of experiments</span> Design of tasks

The design of experiments, also known as experiment design or experimental design, is the design of any task that aims to describe and explain the variation of information under conditions that are hypothesized to reflect the variation. The term is generally associated with experiments in which the design introduces conditions that directly affect the variation, but may also refer to the design of quasi-experiments, in which natural conditions that influence the variation are selected for observation.

<span class="mw-page-title-main">Randomized controlled trial</span> Form of scientific experiment

A randomized controlled trial is a form of scientific experiment used to control factors not under direct experimental control. Examples of RCTs are clinical trials that compare the effects of drugs, surgical techniques, medical devices, diagnostic procedures or other medical treatments.

Analysis of covariance (ANCOVA) is a general linear model that blends ANOVA and regression. ANCOVA evaluates whether the means of a dependent variable (DV) are equal across levels of one or more categorical independent variables (IV) and across one or more continuous variables. For example, the categorical variable(s) might describe treatment and the continuous variable(s) might be covariates or nuisance variables; or vice versa. Mathematically, ANCOVA decomposes the variance in the DV into variance explained by the CV(s), variance explained by the categorical IV, and residual variance. Intuitively, ANCOVA can be thought of as 'adjusting' the DV by the group means of the CV(s).

The Mann–Whitney U test is a nonparametric test of the null hypothesis that, for randomly selected values X and Y from two populations, the probability of X being greater than Y is equal to the probability of Y being greater than X.

In statistics, an effect size is a value measuring the strength of the relationship between two variables in a population, or a sample-based estimate of that quantity. It can refer to the value of a statistic calculated from a sample of data, the value of a parameter for a hypothetical population, or to the equation that operationalizes how statistics or parameters lead to the effect size value. Examples of effect sizes include the correlation between two variables, the regression coefficient in a regression, the mean difference, or the risk of a particular event happening. Effect sizes complement statistical hypothesis testing, and play an important role in power analyses, sample size planning, and in meta-analyses. The cluster of data-analysis methods concerning effect sizes is referred to as estimation statistics.

Alberto Abadie is a Spanish economist who has served as a professor of economics at the Massachusetts Institute of Technology since 2016, where he is also Associate Director of the Institute for Data, Systems, and Society (IDSS). He is principally known for his work in econometrics and empirical microeconomics, and is a specialist in causal inference and program evaluation. He has made fundamental contributions to important areas in econometrics and statistics, including treatment effect models, instrumental variable estimation, matching estimators, difference in differences, and synthetic controls.

<span class="mw-page-title-main">Field experiment</span> Experiment conducted outside the laboratory

Field experiments are experiments carried out outside of laboratory settings.

In the statistical theory of the design of experiments, blocking is the arranging of experimental units that are similar to one another in groups (blocks) based on one or more variables. These variables are chosen carefully to minimize the impact of their variability on the observed outcomes. There are different ways that blocking can be implemented, resulting in different confounding effects. However, the different methods share the same purpose: to control variability introduced by specific factors that could influence the outcome of an experiment. Blocking originated with the statistician Ronald Fisher, following his development of ANOVA.

<span class="mw-page-title-main">Number needed to treat</span> Epidemiological measure

The number needed to treat (NNT) or number needed to treat for an additional beneficial outcome (NNTB) is an epidemiological measure used in communicating the effectiveness of a health-care intervention, typically a treatment with medication. The NNT is the average number of patients who need to be treated to prevent one additional bad outcome. It is defined as the inverse of the absolute risk reduction, and computed as $\text{NNT} = 1/(I_u - I_e)$, where $I_u$ is the incidence in the control (unexposed) group, and $I_e$ is the incidence in the treated (exposed) group. This calculation implicitly assumes monotonicity, that is, no individual can be harmed by treatment. The modern approach, based on counterfactual conditionals, relaxes this assumption and yields bounds on NNT.
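
For example, with hypothetical incidences of $I_u = 0.20$ in the control group and $I_e = 0.15$ in the treated group,

$$\text{NNT} = \frac{1}{I_u - I_e} = \frac{1}{0.20 - 0.15} = 20,$$

i.e. on average 20 patients would need to be treated to prevent one additional bad outcome.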

<span class="mw-page-title-main">Confounding</span> Variable or factor in causal inference

In causal inference, a confounder is a variable that influences both the dependent variable and independent variable, causing a spurious association. Confounding is a causal concept, and as such, cannot be described in terms of correlations or associations. The existence of confounders is an important quantitative explanation why correlation does not imply causation. Some notations are explicitly designed to identify the existence, possible existence, or non-existence of confounders in causal relationships between elements of a system.

Difference in differences is a statistical technique used in econometrics and quantitative research in the social sciences that attempts to mimic an experimental research design using observational study data, by studying the differential effect of a treatment on a 'treatment group' versus a 'control group' in a natural experiment. It calculates the effect of a treatment on an outcome by comparing the average change over time in the outcome variable for the treatment group to the average change over time for the control group. Although it is intended to mitigate the effects of extraneous factors and selection bias, depending on how the treatment group is chosen, this method may still be subject to certain biases.

The Rubin causal model (RCM), also known as the Neyman–Rubin causal model, is an approach to the statistical analysis of cause and effect based on the framework of potential outcomes, named after Donald Rubin. The name "Rubin causal model" was first coined by Paul W. Holland. The potential outcomes framework was first proposed by Jerzy Neyman in his 1923 Master's thesis, though he discussed it only in the context of completely randomized experiments. Rubin extended it into a general framework for thinking about causation in both observational and experimental studies.

The average treatment effect (ATE) is a measure used to compare treatments in randomized experiments, evaluation of policy interventions, and medical trials. The ATE measures the difference in mean (average) outcomes between units assigned to the treatment and units assigned to the control. In a randomized trial, the average treatment effect can be estimated from a sample using a comparison in mean outcomes for treated and untreated units. However, the ATE is generally understood as a causal parameter that a researcher desires to know, defined without reference to the study design or estimation procedure. Both observational studies and experimental study designs with random assignment may enable one to estimate an ATE in a variety of ways.

In statistics, econometrics, political science, epidemiology, and related disciplines, a regression discontinuity design (RDD) is a quasi-experimental pretest–posttest design that aims to determine the causal effects of interventions by assigning a cutoff or threshold above or below which an intervention is assigned. By comparing observations lying closely on either side of the threshold, it is possible to estimate the average treatment effect in environments in which randomisation is unfeasible. However, it remains impossible to make true causal inference with this method alone, as it does not automatically reject causal effects by any potential confounding variable. First applied by Donald Thistlethwaite and Donald Campbell (1960) to the evaluation of scholarship programs, the RDD has become increasingly popular in recent years. Recent study comparisons of randomised controlled trials (RCTs) and RDDs have empirically demonstrated the internal validity of the design.

In the statistical analysis of observational data, propensity score matching (PSM) is a statistical matching technique that attempts to estimate the effect of a treatment, policy, or other intervention by accounting for the covariates that predict receiving the treatment. PSM attempts to reduce the bias due to confounding variables that could be found in an estimate of the treatment effect obtained from simply comparing outcomes among units that received the treatment versus those that did not. Paul R. Rosenbaum and Donald Rubin introduced the technique in 1983.

<span class="mw-page-title-main">Errors-in-variables models</span> Regression models accounting for possible errors in independent variables

In statistics, errors-in-variables models or measurement error models are regression models that account for measurement errors in the independent variables. In contrast, standard regression models assume that those regressors have been measured exactly, or observed without error; as such, those models account only for errors in the dependent variables, or responses.

In medicine, a stepped-wedge trial is a type of randomised controlled trial (RCT). An RCT is a scientific experiment that is designed to reduce bias when testing a new medical treatment, a social intervention, or another testable hypothesis.

In experiments, a spillover is an indirect effect on a subject not directly treated by the experiment. These effects are useful for policy analysis but complicate the statistical analysis of experiments.

In econometrics and related empirical fields, the local average treatment effect (LATE), also known as the complier average causal effect (CACE), is the effect of a treatment for subjects who comply with the experimental treatment assigned to their sample group. It is not to be confused with the average treatment effect (ATE), which includes compliers and non-compliers together. Compliance refers to the human-subject response to a proposed experimental treatment condition. The LATE is calculated similarly to the ATE, but over compliers only, excluding non-compliant parties. If the goal is to evaluate the effect of a treatment in ideal, compliant subjects, the LATE value will give a more precise estimate. However, it may lack external validity by ignoring the effect of non-compliance that is likely to occur in the real-world deployment of a treatment method. The LATE can be estimated as the ratio of the estimated intent-to-treat effect to the estimated proportion of compliers, or alternatively through an instrumental variable estimator.
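
For instance, with hypothetical values, if assignment to treatment raises the average outcome by 1.2 units (the intent-to-treat effect) and 40% of those assigned actually comply, the implied effect on compliers is

$$\widehat{\text{LATE}} = \frac{\widehat{\text{ITT}}}{\widehat{\Pr}(\text{complier})} = \frac{1.2}{0.40} = 3.$$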

References

  1. Abadie, Alberto (2021). "Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects". Journal of Economic Literature. 59 (2): 391–425. doi:10.1257/jel.20191450. hdl:1721.1/144417. ISSN 0022-0515.
  2. Kreif, Noémi; Grieve, Richard; Hangartner, Dominik; Turner, Alex James; Nikolova, Silviya; Sutton, Matt (December 2016). "Examination of the Synthetic Control Method for Evaluating Health Policies with Multiple Treated Units". Health Economics. 25 (12): 1514–1528. doi:10.1002/hec.3258. PMC 5111584. PMID 26443693.
  3. Abadie, Alberto; Diamond, Alexis; Hainmueller, Jens (February 2015). "Comparative Politics and the Synthetic Control Method". American Journal of Political Science. 59 (2): 495–510. doi:10.1111/ajps.12116.
  4. Saunders, Jessica; Lundberg, Russell; Braga, Anthony A.; Ridgeway, Greg; Miles, Jeremy (3 June 2014). "A Synthetic Control Approach to Evaluating Place-Based Crime Interventions". Journal of Quantitative Criminology. 31 (3): 413–434. doi:10.1007/s10940-014-9226-5. S2CID 254702864.
  5. Billmeier, Andreas; Nannicini, Tommaso (July 2013). "Assessing Economic Liberalization Episodes: A Synthetic Control Approach". Review of Economics and Statistics. 95 (3): 983–1001. doi:10.1162/REST_a_00324. S2CID 57561957.
  6. Card, D.; Krueger, A. (1994). "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania". American Economic Review. 84 (4): 772–793. JSTOR 2118030.
  7. Billy, Alexander (2022). "Crime and the Mariel Boatlift". International Review of Law and Economics. 72: 106094. doi:10.1016/j.irle.2022.106094. S2CID 219390309.
  8. Abadie, A.; Diamond, A.; Hainmueller, J. (2010). "Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program" (PDF). Journal of the American Statistical Association. 105 (490): 493–505. doi:10.1198/jasa.2009.ap08746. S2CID 8385861.
  9. Cavallo, E.; Galiani, S.; Noy, I.; Pantano, J. (2013). "Catastrophic Natural Disasters and Economic Growth" (PDF). Review of Economics and Statistics. 95 (5): 1549–1561. doi:10.1162/REST_a_00413. S2CID 16038784.
  10. Li, Ang; Toll, Mathew (2020). "Removing conscientious objection: The impact of 'No Jab No Pay' and 'No Jab No Play' vaccine policies in Australia". Preventive Medicine. 145: 106406. doi:10.1016/j.ypmed.2020.106406. ISSN 0091-7435. PMID 33388333. S2CID 230489130.
  11. Gautier, P. A.; Siegmann, A.; Van Vuuren, A. (2009). "Terrorism and Attitudes towards Minorities: The effect of the Theo van Gogh murder on house prices in Amsterdam". Journal of Urban Economics. 65 (2): 113–126. doi:10.1016/j.jue.2008.10.004. S2CID 190624.