Theory-driven evaluation

Theory-driven evaluation (also theory-based evaluation) is an umbrella term for any approach to program evaluation that develops a theory of change and uses it to design, implement, analyze, and interpret findings from an evaluation. [1] [2] [3] More specifically, an evaluation is theory-driven if it: [4]

  1. formulates a theory of change using some combination of social science, lived experience, and program-related professionals' expertise;
  2. develops and prioritizes evaluation questions using the theory;
  3. uses the theory to guide the design and implementation of the evaluation;
  4. uses the theory to operationalize contextual, process, and outcome variables; and
  5. provides a causal explanation of how and why outcomes were achieved, including whether the program worked and/or had any unintended consequences (desirable or harmful), and what moderates outcomes.

By investigating the mechanisms through which outcomes are achieved, theory-driven approaches facilitate learning to improve programs and how they are implemented, and help knowledge to accumulate across apparently different programs. [5] [6] This is in contrast to methods-driven "black box" evaluations, which focus on following the steps of a method (for instance, a randomized experiment or a focus group) and assess only whether a program leads to its intended outcomes. [7] Theory-driven approaches can also improve the validity of evaluations, for instance by yielding more precise estimates of impact in randomized controlled trials. [8]

History

Theory-driven evaluation emerged in the 1970s and 1980s in response to the limitations of methods-driven "black box" evaluations. The term theory-driven evaluation was coined by Huey T. Chen and Peter H. Rossi. [9] Chen (1990) [10] wrote the first comprehensive introduction to conducting theory-driven evaluations, explaining, for example, how to develop a program theory of change and describing the different types of design. The origins of the approach have been traced [11] to a book by Carol Weiss (1972) [12] and a rarely cited article by Carol Taylor Fitz-Gibbon and Lynn Lyons Morris (1975). [13] However, "the first published use of what we would recognize as program theory" was in an evaluation of training programs by Don Kirkpatrick in 1959. [14]

Funnell and Rogers (2011, pp. 23–24) comment on the confused nomenclature of the field, enumerating 22 approaches, such as theory-based evaluation and program theory-driven evaluation science, that are equivalent to or overlap significantly with theory-driven evaluation. The first definition of theory-based evaluation, by Fitz-Gibbon and Morris (1975), is near-identical to that of theory-driven evaluation: [15]

A theory-based evaluation of a program is one in which the selection of program features to evaluate is determined by an explicit conceptualization of the program in terms of a theory […] which attempts to explain how the program produces the desired effects. The theory might be psychological […] or social psychological […] or philosophical […]. The essential characteristic is that the theory points out a causal relationship between a process A and an outcome B.

Consequently, the terms theory-driven and theory-based evaluation are often used interchangeably in the literature. [16] [17] [18] However, theory-based evaluation is sometimes interpreted more narrowly to mean qualitative or small-n case study-based evaluations conducted without a comparison group, for example using process tracing or qualitative comparative analysis. [19] [20]

What is meant by "theory"?

The "theory" in theory-driven evaluation is meant to stay as close as possible to the proximal causes of a social problem and the site of intervention. It is not, for instance, a "grand" theory that tries to provide an overarching understanding of society, nor a metaphysical theory about the nature of social reality: [21]

It advances evaluation practice very little to adopt one or another of current global theories in attacking, say, the problem of juvenile delinquency, but it does help a great deal to understand the authority structure in schools and the mechanisms of peer group influence and parental discipline in designing and evaluating a program that is supposed to reduce disciplinary problems in schools. [...T]he theory-driven perspective is closer to what econometricians call "model specification" than are more complicated and more abstract and general theories.

A distinction is also drawn between normative theory, concerning what a program is supposed to do and how it should be implemented, and causal theory, which specifies how the program is thought to work. [22] This yields two broad ways in which a program can fail to produce the desired outcomes: (1) the program is implemented as intended according to the normative theory, but the causal theory is incorrect; or (2) the causal theory is correct, but the program was not implemented as intended. [23]

Graphical causal models (GCMs) may be used to formalize causal theories and to design, for example, theory-driven quasi-experiments. [24] One advantage of GCMs is that they can be used to determine automatically which variables need to be statistically adjusted for or matched on in order to estimate the causal effect of a program.
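
A minimal sketch of this idea in Python follows. The tutoring program, its variables, and the helper names are hypothetical and are not taken from the cited source. The sketch relies on one standard result for graphical causal models: when every direct cause (parent) of the program variable is measured, adjusting for those parents satisfies the back-door criterion. Dedicated causal-inference tools search for adjustment sets more generally; this example only reads the parents off the graph.

```python
from typing import Dict, Set

# Hypothetical causal theory for a tutoring program (illustrative assumption).
# Each key maps a variable to the set of its direct causes (parents).
PARENTS: Dict[str, Set[str]] = {
    "prior_attainment": set(),
    "family_income": set(),
    "tutoring": {"prior_attainment", "family_income"},
    "study_skills": {"tutoring"},
    "exam_score": {"study_skills", "prior_attainment", "family_income"},
}


def descendants(graph: Dict[str, Set[str]], node: str) -> Set[str]:
    """Return every variable causally downstream of `node` in an acyclic graph."""
    children = {child for child, parents in graph.items() if node in parents}
    found = set(children)
    for child in children:
        found |= descendants(graph, child)
    return found


def parent_adjustment_set(
    graph: Dict[str, Set[str]], treatment: str, outcome: str
) -> Set[str]:
    """Return the treatment's direct causes as an adjustment set.

    Assumes the graph is acyclic and that all parents of the treatment are
    measured; under those assumptions the parents block every back-door path.
    """
    if outcome not in descendants(graph, treatment):
        raise ValueError("the theory implies no causal path from treatment to outcome")
    return set(graph[treatment])


adjust_for = parent_adjustment_set(PARENTS, "tutoring", "exam_score")
print(sorted(adjust_for))
```

In this hypothetical graph the function returns prior attainment and family income. It leaves out `study_skills`, which lies on the causal path from tutoring to exam scores; adjusting for it would remove part of the effect being estimated.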

Chen's action model/change model schema

Chen's action model/change model schema [25] provides an example of how a program theory and its context are conceptualized. The elements of the schema are then completed for each particular program.

Figure: Chen's (2015) action model/change model schema

The change model specifies how an intervention of a program leads to outcomes via determinants, also known as intermediate or mediating variables.

The action model specifies how staff and delivery organizations deliver the intervention to beneficiaries:

Theory-driven methods

The full range of research methods has been argued to be applicable to theory-driven evaluation. For instance, Chen (2015) provides examples using randomized experiments, quasi-experimental designs, process and outcome monitoring, and qualitative methods. [26] Although proponents of theory-driven evaluation are critical of "black box" experiments, Chen and Rossi (1983, p. 292) [27] argue that theory-driven experiments are possible and desirable:

[A]dvocates of the black box experimental paradigm often neglect the fact that after randomization exogenous variables are still correlated with outcome variables. Knowing how such exogenous factors affect outcomes makes it possible to construct more precise estimates of experimental effects by controlling for such exogenous variables.
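
The precision argument in this quotation can be illustrated with a small simulation. The following is a minimal sketch and not Chen and Rossi's own analysis: the data-generating model, the effect size, the sample size, and all function names are assumptions chosen for illustration. It repeatedly simulates a randomized trial in which an exogenous covariate `x` (for example, a baseline score) influences the outcome, then compares the spread of a plain difference in means with that of a covariate-adjusted estimate.

```python
import random
import statistics


def simulate_trial(rng, n=200, effect=1.0, beta=2.0):
    """Simulate one randomized trial; returns (treated, x, y) rows.

    Assumed model: y = effect * treated + beta * x + noise.
    """
    assignment = [1] * (n // 2) + [0] * (n - n // 2)
    rng.shuffle(assignment)  # random assignment to program or control
    rows = []
    for treated in assignment:
        x = rng.gauss(0, 1)  # exogenous covariate, unaffected by assignment
        y = effect * treated + beta * x + rng.gauss(0, 1)
        rows.append((treated, x, y))
    return rows


def difference_in_means(rows):
    """'Black box' estimate: mean outcome of treated minus mean of controls."""
    treated = [y for t, _, y in rows if t == 1]
    control = [y for t, _, y in rows if t == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def adjusted_estimate(rows):
    """Covariate-adjusted estimate using the pooled within-arm slope of y on x."""
    sxy = sxx = 0.0
    means = {}
    for arm in (0, 1):
        xs = [x for t, x, _ in rows if t == arm]
        ys = [y for t, _, y in rows if t == arm]
        mx, my = statistics.fmean(xs), statistics.fmean(ys)
        means[arm] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    (mx0, my0), (mx1, my1) = means[0], means[1]
    # Remove the part of the outcome gap explained by chance imbalance in x.
    return (my1 - my0) - slope * (mx1 - mx0)


rng = random.Random(0)
unadjusted, adjusted = [], []
for _ in range(500):
    rows = simulate_trial(rng)
    unadjusted.append(difference_in_means(rows))
    adjusted.append(adjusted_estimate(rows))

sd_unadjusted = statistics.stdev(unadjusted)
sd_adjusted = statistics.stdev(adjusted)
print(f"spread of unadjusted estimates: {sd_unadjusted:.3f}")
print(f"spread of adjusted estimates:   {sd_adjusted:.3f}")
```

Under these assumptions both estimators center on the true effect, but the adjusted estimates vary less from trial to trial, because the variation in outcomes attributable to `x` has been accounted for.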

It has been argued that theory-driven evaluation focuses too much on statistical approaches, such as randomized experiments, quasi-experiments, and structural equation modelling; [28] however, a case has also been made for the importance of qualitative methods, particularly when developing program theories and understanding implementation. [29]

There is also methodological debate concerning whether realist evaluations, considered a particular kind of theory-driven approach, may include randomized controlled trials in any form. Some evaluators think they may and conduct what they call "realist trials". [30] [31] [32] Others argue that a realist trial is an "oxymoron", and recommend instead calling them "theory-oriented trials". [33] A 2023 review of purported realist trials concluded that whether they are really realist depends on "ontological and epistemological" commitments of evaluators and that differences "cannot be resolved" by reviewing studies conducted. [34]

Examples

Examples discussed in a 2011 systematic review of 45 theory-driven evaluations include: [35]

A 2014 review of theory-driven evaluation in school psychology [39] highlighted two illustrative examples:

References

  1. Chen, H.-T., & Rossi, P. H. (1980). The Multi-Goal, Theory-Driven Approach to Evaluation: A Model Linking Basic and Applied Social Science. Social Forces, 59, 106–122.
  2. Coryn, C. L. S., Noakes, L. A., Westine, C. D., & Schröter, D. C. (2011, p. 201). A systematic review of theory-driven evaluation practice from 1990 to 2009. American Journal of Evaluation, 32(2), 199–226.
  3. Donaldson, S. I. (2022, p. 9). Introduction to Theory-Driven Program Evaluation (2nd ed.). Routledge.
  4. Coryn, C. L. S., Noakes, L. A., Westine, C. D., & Schröter, D. C. (2011, pp. 203–205). A systematic review of theory-driven evaluation practice from 1990 to 2009. American Journal of Evaluation, 32(2), 199–226.
  5. Chen, H. T. (2012). Theory-driven evaluation: Conceptual framework, application and advancement. In R. Strobl, O. Lobermeier, & W. Heitmeyer (Eds.), Evaluation von Programmen und Projekten für eine demokratische Kultur (pp. 17–40). Springer Fachmedien Wiesbaden. https://doi.org/10.1007/978-3-531-19009-9_2
  6. Weiss, C. H. (1995). Nothing as Practical as Good Theory: Exploring Theory-Based Evaluation for Comprehensive Community Initiatives for Children and Families. In J. P. Connell, A. C. Kubisch, L. B. Schorr, & C. H. Weiss (Eds.), New Approaches to Evaluating Community Initiatives: Concepts, Methods, and Contexts (Issue 7, pp. 65–92). The Aspen Institute.
  7. Chen, H.-T., & Rossi, P. H. (1980). The Multi-Goal, Theory-Driven Approach to Evaluation: A Model Linking Basic and Applied Social Science. Social Forces, 59, 106–122.
  8. Chen, H.-T., & Rossi, P. H. (1983). Evaluating With Sense: The Theory-Driven Approach. Evaluation Review, 7(3), 283–302.
  9. Chen, H.-T., & Rossi, P. H. (1980). The Multi-Goal, Theory-Driven Approach to Evaluation: A Model Linking Basic and Applied Social Science. Social Forces, 59, 106–122.
  10. Chen, H. T. (1990). Theory-driven evaluations. Thousand Oaks, CA: Sage.
  11. Worthen, B. R. (1996). Editor’s Note: The Origins of Theory-Based Evaluation. Evaluation Practice, 17(2), 169–171.
  12. Weiss, C. H. (1972). Evaluation research: Methods for assessing program effectiveness. Englewood Cliffs, NJ: Prentice-Hall.
  13. Fitz-Gibbon, C. T., & Morris, L. L. (1975). Theory-based evaluation. Evaluation Comment, 5(1), 1–4. Reprinted in Fitz-Gibbon, C. T., & Morris, L. L. (1996). Theory-based evaluation. Evaluation Practice, 17(2), 177–184.
  14. Funnell, S. C., & Rogers, P. J. (2011, p. 16). Purposeful Program Theory: Effective Use of Theories of Change and Logic Models. Jossey-Bass.
  15. Fitz-Gibbon, C. T., & Morris, L. L. (1975, p. 1).
  16. Birckmayer, J. D., & Weiss, C. H. (2000). Theory-based evaluation in practice: What do we learn? Evaluation Review, 24(4), 407–431.
  17. Matta, C., Lindvall, J., & Ryve, A. (2024). The Mechanistic Rewards of Data and Theory Integration for Theory-Based Evaluation. American Journal of Evaluation, 45(1), 110–132. https://doi.org/10.1177/10982140221122764
  18. Dahler-Larsen, P. (2018, p. 9). Theory-Based Evaluation Meets Ambiguity: The Role of Janus Variables. American Journal of Evaluation, 39(1), 6–23.
  19. Stern, E., Stame, N., Mayne, J., Forss, K., Davies, R., & Befani, B. (2012). Broadening the range of designs and methods for impact evaluations. Institute for Development Studies.
  20. HM Treasury (2020). The Magenta Book.
  21. Chen, H.-T., & Rossi, P. H. (1983). Evaluating With Sense: The Theory-Driven Approach. Evaluation Review, 7(3), 283–302.
  22. Chen, H. T. (1989). The conceptual framework of the theory-driven perspective. Evaluation and Program Planning, 12(4), 391–396.
  23. Weiss, C. H. (1972, p. 38). Evaluation research: Methods for assessing program effectiveness. Englewood Cliffs, NJ: Prentice-Hall.
  24. Schmidt, R. (2024). A graphical method for causal program attribution in theory-based evaluation. Evaluation, 13563890231223171.
  25. Chen, H. T. (2015, Chapter 3). Practical Program Evaluation: Theory-Driven Evaluation and the Integrated Evaluation Perspective. SAGE Publications Ltd.
  26. Chen, H. T. (2015). Practical program evaluation: Theory-driven evaluation and the integrated evaluation perspective (2nd edition). Sage Publications.
  27. Chen, H.-T., & Rossi, P. H. (1983). Evaluating With Sense: The Theory-Driven Approach. Evaluation Review, 7(3), 283–302.
  28. Smith, N. L. (1994). Clarifying and Expanding the Application of Program Theory-driven Evaluations. Evaluation Practice, 15(1), 83–87.
  29. Chen, H. T. (1994). Theory-driven Evaluations: Need, Difficulties, and Options. American Journal of Evaluation, 15(1), 79–82. https://doi.org/10.1177/109821409401500109
  30. Martin, P., & Tannenbaum, C. (2017). A realist evaluation of patients’ decisions to deprescribe in the EMPOWER trial. BMJ Open, 7(4), e015959.
  31. Bonell, C., Fletcher, A., Morton, M., et al. (2012). Realist randomised controlled trials: A new approach to evaluating complex public health interventions. Social Science & Medicine, 75(12), 2299–2306.
  32. Bonell, C., Melendez-Torres, G. J., & Warren, E. (2024). Realist Trials and Systematic Reviews: Rigorous, Useful Evidence to Inform Health Policy. Cambridge: Cambridge University Press.
  33. Marchal, B., Westhorp, G., Wong, G., Van Belle, S., Greenhalgh, T., Kegels, G., & Pawson, R. (2013). Realist RCTs of complex interventions—An oxymoron. Social Science & Medicine, 94, 124–128.
  34. Nielsen, S. B., Jaspers, S. Ø., & Lemire, S. (2023). The curious case of the realist trial: Methodological oxymoron or unicorn? Evaluation.
  35. Coryn, C. L. S., Noakes, L. A., Westine, C. D., & Schröter, D. C. (2011). A systematic review of theory-driven evaluation practice from 1990 to 2009. American Journal of Evaluation, 32(2), 199–226. https://doi.org/10.1177/1098214010389321
  36. Bickman, L. (1996). The application of program theory to the evaluation of a managed mental health care system. Evaluation and Program Planning, 19, 111–119.
  37. Hense, J., Kriz, W. C., & Wolfe, J. (2009). Putting theory-oriented evaluation into practice: A logic model approach for evaluating SIMGAME. Simulation & Gaming, 40, 110–133.
  38. Chen, H. T., Weng, J. C. S., & Lin, L.-H. (1997). Evaluating the process and outcome of a garbage reduction program in Taiwan. Evaluation Review, 21, 27–42.
  39. Mercer, S. H., Idler, A. M., & Bartfai, J. M. (2014). Theory-Driven Evaluation in School Psychology Intervention Research: 2007–2012. School Psychology Review, 43(2), 119–131.
  40. Sheridan, S. M., Bovaird, J. A., Glover, T. A., Andrew Garbacz, S., Witte, A., & Kwon, K. (2012). A Randomized Trial Examining the Effects of Conjoint Behavioral Consultation and the Mediating Role of the Parent–Teacher Relationship. School Psychology Review, 41(1), 23–46.
  41. Hawkins, R. O., Hale, A., Sheeley, W., & Ling, S. (2011). Repeated reading and vocabulary‐previewing interventions to improve fluency and comprehension for struggling high‐school readers. Psychology in the Schools, 48(1), 59–77.