Multiple baseline design

A multiple baseline design is used in medical, psychological, and biological research. It was first reported in 1960 in basic operant research, and was applied to human experiments in the late 1960s in response to the practical and ethical problems of withdrawing apparently successful treatments from human subjects. In this design, two or more (often three) behaviors, people, or settings are plotted on a staggered graph: a change is made to the first while the others remain in baseline, then to the second while the third remains in baseline, and so on. If each behavior, person, or setting changes only when the change is applied to it, this strengthens what is essentially an AB design, which on its own leaves competing hypotheses open.

Because treatment is started at different times, changes are more plausibly attributable to the treatment than to a chance factor. By gathering data from many subjects (instances), inferences can be made about the likelihood that the measured trait generalizes to a larger population. In multiple baseline designs, the experimenter starts by measuring a trait of interest, then applies a treatment and measures the trait again. Treatment does not begin until a stable baseline has been recorded, and does not finish until the measures regain stability. [1] If a significant change occurs across all participants, the experimenter may infer that the treatment is effective.
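The staggered logic described above can be illustrated with a minimal Python sketch. The data, the participant names, and the helper functions `phase_means`, `baseline_range`, and `summarize` are all hypothetical and chosen only for illustration. The range of the baseline phase is used here as a crude stand-in for a stability criterion, which is an assumption rather than a method prescribed by the cited sources.

```python
from statistics import mean

def phase_means(series, start):
    """Return (baseline mean, treatment mean) for one tier.

    `start` is the index of the first treatment-phase observation.
    """
    return mean(series[:start]), mean(series[start:])

def baseline_range(series, start):
    """Spread (max - min) of the baseline phase; a small value suggests stability."""
    baseline = series[:start]
    return max(baseline) - min(baseline)

# Hypothetical data: three participants, ten sessions each,
# with treatment introduced at sessions 3, 5, and 7 (0-indexed).
tiers = {
    "participant_A": ([2, 3, 2, 7, 8, 8, 9, 8, 9, 9], 3),
    "participant_B": ([3, 2, 3, 3, 2, 8, 9, 8, 9, 9], 5),
    "participant_C": ([2, 2, 3, 2, 3, 2, 3, 8, 9, 9], 7),
}

def summarize(tiers):
    """Map each tier name to its change in mean level (treatment minus baseline)."""
    summary = {}
    for name, (series, start) in tiers.items():
        base, treat = phase_means(series, start)
        summary[name] = treat - base
    return summary

changes = summarize(tiers)
for name, change in changes.items():
    series, start = tiers[name]
    print(name, "baseline range:", baseline_range(series, start),
          "change in mean:", round(change, 2))
```

In this made-up data set each participant's level shifts only after that participant's own start point, which is the pattern that supports attributing the change to the treatment. A change in mean level is only one possible summary; visual analysis of trend and variability is not captured by this sketch.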

Multiple baseline experiments are most commonly used where the dependent variable is not expected to return to normal after the treatment has been applied, or when medical reasons forbid the withdrawal of a treatment. They often employ particular methods of recruiting participants. Multiple baseline designs are associated with potential confounds introduced by experimenter bias, which must be addressed to preserve objectivity. In particular, researchers are advised to develop all test schedules and data collection limits beforehand.

Recruiting participants

Although multiple baseline designs may employ any method of recruitment, they are often associated with "ex post facto" recruitment. This is because multiple baselines can provide data regarding the consensus of a treatment response. Such data often cannot be gathered from ABA (reversal) designs for ethical or learning reasons. Experimenters are advised not to remove cases that do not exactly fit their criteria, as this may introduce sampling bias and threaten validity. [1] Studies that rely on ex post facto recruitment are not considered true experiments, because the experimenter has limited experimental or randomized control over the trait. In particular, a control group may have to be selected from a separate population. This research design is thus considered a quasi-experimental design.

Concurrent designs

Multiple baseline studies are often categorized as either concurrent or nonconcurrent. [1] [2] [3] Concurrent designs are the traditional approach to multiple baseline studies, in which all participants are observed over the same period while treatment is introduced to each at a different point. This strategy is advantageous because it moderates several threats to validity, history effects in particular. [2] [4] Concurrent multiple baseline designs also save time, since all participants are processed at once. The ability to retrieve complete data sets within well-defined time constraints is a valuable asset when planning research.

Nonconcurrent designs

Nonconcurrent multiple baseline studies apply treatment to several individuals at delayed intervals, so that participants need not be observed during the same period. This has the advantage of greater flexibility in recruitment of participants and testing location. Perhaps for this reason, nonconcurrent multiple baseline experiments are recommended for research in educational settings. [3] It is recommended that the experimenter select time frames beforehand to avoid experimenter bias, [1] but even when methods are used to improve validity, inferences may be weakened. [2] There is ongoing debate as to whether history effects pose a real threat to nonconcurrent studies. [2] [5] It is generally agreed, however, that concurrent testing gives more stable results.
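One way to make the distinction concrete is to compare the calendar windows over which each tier was observed. The sketch below is illustrative only: the function `shared_window`, the tier names, and the dates are hypothetical, and treating "all windows overlap" as the test for concurrency is a simplifying assumption rather than a formal definition from the cited sources.

```python
from datetime import date

def shared_window(windows):
    """Return the (start, end) span common to every tier, or None if there is none.

    `windows` maps a tier name to a (first_observation, last_observation) pair.
    """
    latest_start = max(start for start, _ in windows.values())
    earliest_end = min(end for _, end in windows.values())
    if latest_start <= earliest_end:
        return latest_start, earliest_end
    return None

# Hypothetical concurrent study: every tier is observed over the same weeks.
concurrent = {
    "tier_1": (date(2024, 1, 8), date(2024, 3, 1)),
    "tier_2": (date(2024, 1, 8), date(2024, 3, 1)),
    "tier_3": (date(2024, 1, 8), date(2024, 3, 1)),
}

# Hypothetical nonconcurrent study: tiers are run one after another.
nonconcurrent = {
    "tier_1": (date(2024, 1, 8), date(2024, 2, 2)),
    "tier_2": (date(2024, 3, 4), date(2024, 4, 5)),
    "tier_3": (date(2024, 5, 6), date(2024, 6, 7)),
}

print(shared_window(concurrent))
print(shared_window(nonconcurrent))
```

When no shared window exists, an outside event occurring during one tier's observation period cannot be checked against the other tiers' baselines, which is why history effects are the main concern raised about nonconcurrent designs.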

Disadvantages

Although multiple baseline experimental designs compensate for many of the issues inherent in ex post facto recruitment, a trait gathered by this method may not be open to experimental manipulation. Such studies therefore cannot infer causation if they include no phases that demonstrate reversibility. However, if such phases are included (as is the standard of experimentation), they can successfully demonstrate causation.

Managing threats to validity

A priori (beforehand) specification of the hypothesis, time frames, and data limits helps control threats due to experimenter bias. [1] For the same reason, researchers should avoid removing participants based on merit. Multiple probe designs may be useful in identifying extraneous factors that may be influencing the results. Lastly, experimenters should avoid gathering data only during sessions. If in-session data are gathered, each measurement should be tagged with its date in order to provide an accurate timeline for potential reviewers. Such data may represent unnatural behaviour or states of mind, and must be considered carefully during interpretation. [2]
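As an illustration of fixing time frames beforehand, the following Python sketch draws a distinct treatment start session for each participant from a pre-declared range, using a recorded random seed so the schedule can be reproduced by reviewers. Random assignment of start points is an assumption made for this example; the cited sources recommend specifying schedules in advance but this sketch does not reproduce any particular procedure from them. The function `plan_start_points` and its parameters are hypothetical.

```python
import random

def plan_start_points(participants, min_baseline, max_baseline, seed):
    """Assign each participant a distinct treatment start session.

    Start sessions are drawn without replacement from
    min_baseline..max_baseline (inclusive), so no two participants
    begin treatment in the same session. Raises ValueError if the
    range holds fewer sessions than there are participants.
    """
    rng = random.Random(seed)  # fixed seed: the plan can be regenerated later
    candidates = list(range(min_baseline, max_baseline + 1))
    starts = sorted(rng.sample(candidates, k=len(participants)))
    order = list(participants)
    rng.shuffle(order)  # which participant goes first is also left to chance
    return dict(zip(order, starts))

# Hypothetical plan, generated and filed before any data are collected.
plan = plan_start_points(["P1", "P2", "P3"], min_baseline=3, max_baseline=10, seed=42)
for participant, session in sorted(plan.items(), key=lambda item: item[1]):
    print(participant, "starts treatment at session", session)
```

Because the seed and range are declared in advance, the experimenter cannot shift a start point after seeing the baseline data. This conflicts with waiting for a stable baseline, so a real protocol would need to state how the two rules are reconciled.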

References

  1. Christ, T. (2007). Experimental control and threats to internal validity of concurrent and nonconcurrent multiple baseline designs. Psychology in the Schools, 44(5), 451-459. doi:10.1002/pits.20237.
  2. Recommendations for Reporting Multiple-Baseline Designs across Participants. Behavioral Interventions, 20(3), 219-224. doi:10.1002/bin.191.
  3. Harvey, M., May, M., & Kennedy, C. (2004). Nonconcurrent Multiple Baseline Designs and the Evaluation of Educational Systems. Journal of Behavioral Education, 13(4), 267-276. doi:10.1023/B:JOBE.0000044735.51022.5d.
  4. Harris, F., & Jenson, W. (1985). Comparisons of multiple-baseline across persons designs and AB designs with replication: Issues and confusions. Behavioral Assessment, 7(2), 121-127. doi:10.1007/BF00961078.
  5. Watson, P., & Workman, E. (1981). The non-concurrent multiple baseline across-individuals design: An extension of the traditional multiple baseline design. Journal of Behavior Therapy and Experimental Psychiatry, 12(3), 257-259. doi:10.1016/0005-7916(81)90055-0.