This entry describes the narrow, technical meaning of "ecological validity" as proposed by Egon Brunswik as part of the Brunswik Lens Model, explains the relation of "ecological validity" to the "representative design" of research, and outlines common misuses of the term. For a more detailed explanation, see Hammond (1998).
Egon Brunswik defined the term "ecological validity" in the 1940s to describe a cue's informativeness. The ecological validity of a sensory cue in perception is the regression weight of the cue X (something an organism might be able to measure from the proximal stimulus) in predicting a property of the world Y (some aspect of the distal stimulus). When several cues are available, say X1, X2, and X3, the "ecological validity" of X1 is its multiple regression weight when Y is regressed on X1, X2, and X3. For example, the color of a banana is a cue that indicates whether the banana is ripe. This particular cue has high ecological validity because a banana's ripeness is highly correlated with its color. By contrast, the presence of a sticker on the banana is a cue with an ecological validity close to 0 if, as seems likely, ripe and unripe bananas (in a fruit bowl, say) are equally likely to have stickers on them.
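As a rough illustration of the banana example, the sketch below simulates a fruit bowl and estimates the regression weight of each cue. The simulated data, the noise levels, and the function name ecological_validities are assumptions made for illustration only. The weights are standardized, which is also an assumption, since the definition above does not say whether the regression weights are standardized.

```python
import numpy as np

def ecological_validities(cues, criterion):
    """Standardized multiple-regression weights of the cues for the criterion."""
    # z-score cues and criterion so the weights are comparable across cues
    # (standardizing is an assumption of this sketch)
    X = (cues - cues.mean(axis=0)) / cues.std(axis=0)
    y = (criterion - criterion.mean()) / criterion.std()
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

rng = np.random.default_rng(0)
n = 5000
ripeness = rng.normal(size=n)                        # distal property Y
color = ripeness + 0.5 * rng.normal(size=n)          # cue that tracks ripeness
sticker = rng.integers(0, 2, size=n).astype(float)   # cue unrelated to ripeness

color_weight, sticker_weight = ecological_validities(
    np.column_stack([color, sticker]), ripeness)
print(f"color: {color_weight:.2f}, sticker: {sticker_weight:.2f}")
```

In this simulated ecology the color cue receives a large weight and the sticker cue a weight near zero, mirroring the verbal example.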
The concept of ecological validity is closely related to likelihood in Bayesian statistical inference and to cue validity in statistics.
Brunswik's concept of "ecological validity" is tied to his concept of "representative design." In a "representative design", the variances and correlations of some dependent variable Y and independent variables X1, X2, and X3 match their values in some specific real-world ecology. Quoting Hammond (1998), "'Generalizability of results concerning. . . the variables involved [in the experiment] must remain limited unless the range, but better also the distribution. . . of each variable, has been made representative of a carefully defined set of conditions' (1956, p. 53). Brunswik's admonition regarding the representativeness of the formal aspects of the conditions of experiments also includes the (ecological) intercorrelation among the independent variables in the experiment, thus challenging the typical factorial design in which variables are set in orthogonal relation to one another."
To understand why the "ecological validity" of a cue will change if the design is not "representative", consider two admissions officers, at schools A and B. School A is a highly selective university and B is a nonselective college. Admissions officers at A and B may learn to predict the freshman GPA (Y) of applicants to their respective schools on the basis of applicants' high school GPA (X1), ACT score (X2), and a rating of the quality of the student's essay on a 1 to 5 scale (X3). Because, in multiple regression, the weights of X1, X2, and X3 depend on the cues' variances and on their correlations with each other and with Y, one would likely find very different regression weights (and therefore a different ecological validity of X1) for applicants at A versus B.
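The dependence of the weights on the correlational structure can be sketched numerically. The example below computes standardized regression weights directly from correlation matrices by solving the normal equations. All correlation values are hypothetical and chosen only to contrast a range-restricted ecology (school A) with a less restricted one (school B); the function name standardized_weights is likewise an assumption.

```python
import numpy as np

def standardized_weights(cue_corr, cue_criterion_corr):
    """Solve R_xx @ beta = r_xy for the standardized regression weights."""
    return np.linalg.solve(np.asarray(cue_corr), np.asarray(cue_criterion_corr))

# Hypothetical ecology at selective school A: applicants are similar on
# X1 and X2, so the cues correlate weakly with each other and with Y.
corr_a = [[1.0, 0.2, 0.1],
          [0.2, 1.0, 0.1],
          [0.1, 0.1, 1.0]]
validity_a = [0.2, 0.3, 0.4]   # correlations of X1, X2, X3 with freshman GPA

# Hypothetical ecology at nonselective school B: wider ranges, so X1 and X2
# are strongly related to each other and to Y.
corr_b = [[1.0, 0.7, 0.3],
          [0.7, 1.0, 0.3],
          [0.3, 0.3, 1.0]]
validity_b = [0.6, 0.6, 0.3]

beta_a = standardized_weights(corr_a, validity_a)
beta_b = standardized_weights(corr_b, validity_b)
print("weights at A:", np.round(beta_a, 2))
print("weights at B:", np.round(beta_b, 2))
```

With these made-up numbers the weight of X1 comes out near 0.12 at A and near 0.34 at B, so the same cue would have a different ecological validity in the two ecologies.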
Brunswik believed that people learn over time to weight the cues that predict the criterion Y in the particular environment where they operate and receive feedback. If, in the environment where the judge normally operates, X1 and X2 are highly related, the judge can learn to predict Y from a subset of the cues without loss of accuracy. But if the same person is put in a new situation with different ranges of the cues and different correlations among them, performance in predicting the criterion will suffer. For example, admissions officer A might have a hard time using what she had learned from experience at her selective university when attempting to predict the freshman GPAs of applicants to B's college. Brunswik believed similar problems arise when researchers create experiments in which the independent variables are not distributed in a way that matches the participants' local environments, for example by making independent variables uncorrelated or by holding all but one variable constant.
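A small simulation can make this argument concrete. In the sketch below, a hypothetical judge learns to predict Y from X1 alone in an ecology where X1 and X2 are nearly redundant, and is then tested in an orthogonal design where the two cues are uncorrelated. The data-generating model, the noise levels, and the use of the judgment-criterion correlation as the accuracy measure are all assumptions of this sketch, not part of Brunswik's account.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

def simulate(correlated):
    """Draw cues X1, X2 and a criterion Y that depends on both cues."""
    x1 = rng.normal(size=n)
    if correlated:
        x2 = x1 + 0.2 * rng.normal(size=n)   # X2 nearly redundant with X1
    else:
        x2 = rng.normal(size=n)              # orthogonal design
    y = x1 + x2 + 0.5 * rng.normal(size=n)
    return x1, x2, y

def achievement(judgment, criterion):
    """Correlation between judgments and the criterion."""
    return np.corrcoef(judgment, criterion)[0, 1]

# Familiar ecology: the judge learns a rule that uses X1 only.
x1, x2, y = simulate(correlated=True)
slope, intercept = np.polyfit(x1, y, deg=1)
all_cues = np.column_stack([x1, x2, np.ones(n)])
full_weights, *_ = np.linalg.lstsq(all_cues, y, rcond=None)
familiar_one_cue = achievement(slope * x1 + intercept, y)
familiar_all_cues = achievement(all_cues @ full_weights, y)

# New situation: same rule, but the cues are now uncorrelated.
x1_new, x2_new, y_new = simulate(correlated=False)
new_one_cue = achievement(slope * x1_new + intercept, y_new)

print(f"familiar ecology, X1 only:  {familiar_one_cue:.2f}")
print(f"familiar ecology, all cues: {familiar_all_cues:.2f}")
print(f"orthogonal design, X1 only: {new_one_cue:.2f}")
```

In the familiar ecology, ignoring X2 costs almost nothing, whereas the same one-cue rule is markedly less accurate once the cues are decorrelated.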
Brunswik's students have written that the now-common use of "ecological validity" to describe a type of experimental validity is a corruption of his original terminology (see Hammond, 1998). Social scientists routinely refer to the "ecological validity" of an experiment as a rough synonym for Aronson and Carlsmith's (1968) concept of the "mundane realism" of the experimental procedures. Mundane realism refers to the extent to which the experimental situation is similar to situations people are likely to encounter outside of the laboratory. See Hammond's (1998) detailed critique of this misuse. Another common misuse of ecological validity is as a synonym for external validity.
Validity is the extent to which a concept, conclusion or measurement is well-founded and likely corresponds accurately to the real world. The word "valid" is derived from the Latin validus, meaning strong. The validity of a measurement tool is the degree to which the tool measures what it claims to measure. Validity is based on the strength of a collection of different types of evidence.
An ecological fallacy is a formal fallacy in the interpretation of statistical data that occurs when inferences about the nature of individuals are deduced from inferences about the group to which those individuals belong. 'Ecological fallacy' is a term that is sometimes used to describe the fallacy of division, which is not a statistical fallacy. The four common statistical ecological fallacies are: confusion between ecological correlations and individual correlations, confusion between group average and total average, Simpson's paradox, and confusion between higher average and higher likelihood.
In statistics, a spurious relationship or spurious correlation is a mathematical relationship in which two or more events or variables are associated but not causally related, due to either coincidence or the presence of a certain third, unseen factor.
In psychometrics, predictive validity is the extent to which a score on a scale or test predicts scores on some criterion measure.
In the behavioral sciences, ecological validity is often used to refer to the judgment of whether a given study's variables and conclusions are sufficiently relevant to its population. Psychological studies are usually conducted in laboratories, though the goal of these studies is to understand human behavior in the real world. Ideally, an experiment would have generalizable results that predict behavior outside of the lab, and would thus have more ecological validity. Ecological validity can be considered a commentary on the relative strength of a study's implications for policy, society, culture, etc.
In statistics, the coefficient of determination, denoted R² or r² and pronounced "R squared", is the proportion of the variation in the dependent variable that is predictable from the independent variable(s).
In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables (IV) is used to estimate causal relationships when controlled experiments are not feasible or when a treatment is not successfully delivered to every unit in a randomized experiment. Intuitively, IVs are used when an explanatory variable of interest is correlated with the error term, in which case ordinary least squares and ANOVA give biased results. A valid instrument induces changes in the explanatory variable but has no independent effect on the dependent variable, allowing a researcher to uncover the causal effect of the explanatory variable on the dependent variable.
External validity is the validity of applying the conclusions of a scientific study outside the context of that study. In other words, it is the extent to which the results of a study can be generalized to and across other situations, people, stimuli, and times. In contrast, internal validity is the validity of conclusions drawn within the context of a particular study. Because general conclusions are almost always a goal in research, external validity is an important property of any study. Mathematical analysis of external validity concerns a determination of whether generalization across heterogeneous populations is feasible, and devising statistical and computational methods that produce valid generalizations.
In statistics, canonical analysis belongs to the family of regression methods for data analysis. Regression analysis quantifies a relationship between a predictor variable and a criterion variable by the coefficient of correlation r, coefficient of determination r², and the standard regression coefficient β. Multiple regression analysis expresses a relationship between a set of predictor variables and a single criterion variable by the multiple correlation R, multiple coefficient of determination R², and a set of standard partial regression weights β1, β2, etc. Canonical variate analysis captures a relationship between a set of predictor variables and a set of criterion variables by the canonical correlations ρ1, ρ2, ..., and by the sets of canonical weights C and D.
In psychology, the take-the-best heuristic is a heuristic which decides between two alternatives by choosing based on the first cue that discriminates them, where cues are ordered by cue validity. In the original formulation, the cues were assumed to have binary values or have an unknown value. The logic of the heuristic is that it bases its choice on the best cue (reason) only and ignores the rest.
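The decision rule can be sketched in a few lines. The function below is a minimal illustration, not the original formulation: it assumes cue values are coded 1, 0, or None (unknown), that the caller supplies the cue order (most valid first), and that a cue with an unknown value on either side does not discriminate. The function and parameter names are hypothetical.

```python
from typing import Optional, Sequence

def take_the_best(cues_a: Sequence[Optional[int]],
                  cues_b: Sequence[Optional[int]],
                  order: Sequence[int]) -> Optional[str]:
    """Choose 'a' or 'b' using the first cue, in the given order, that discriminates.

    Returns None when no cue discriminates, in which case the caller must guess.
    """
    for i in order:
        a, b = cues_a[i], cues_b[i]
        if a is None or b is None:
            continue                      # unknown value: treated as non-discriminating
        if a != b:
            return "a" if a > b else "b"  # stop at the first discriminating cue
    return None

# Cue 0 ties, cue 1 favors alternative b, so cue 2 is never consulted.
choice = take_the_best([1, 0, 1], [1, 1, 0], order=[0, 1, 2])
print(choice)
```

Because the search stops at the first discriminating cue, the later cue that favors alternative a has no influence on the choice.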
In statistics, a confounder is a variable that influences both the dependent variable and independent variable, causing a spurious association. Confounding is a causal concept, and as such, cannot be described in terms of correlations or associations. The existence of confounders is an important quantitative explanation why correlation does not imply causation.
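In statistics, unit-weighted regression is a simplified and robust version of multiple regression analysis where only the intercept term is estimated. That is, it fits a model of the form Ŷ = b + X1 + X2 + ... + Xk, in which every predictor receives a weight of one and only the intercept b is estimated from the data.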
Egon Brunswik Edler von Korompa was a psychologist who made contributions to functionalism and the history of psychology.
In the philosophy of science, a causal model is a conceptual model that describes the causal mechanisms of a system. Causal models can improve study designs by providing clear rules for deciding which independent variables need to be included/controlled for.
A quasi-experiment is an empirical interventional study used to estimate the causal impact of an intervention on a target population without random assignment. Quasi-experimental research shares similarities with the traditional experimental design or randomized controlled trial, but it specifically lacks the element of random assignment to treatment or control. Instead, quasi-experimental designs typically allow the researcher to control the assignment to the treatment condition, but using some criterion other than random assignment.
In statistics, regression validation is the process of deciding whether the numerical results quantifying hypothesized relationships between variables, obtained from regression analysis, are acceptable as descriptions of the data. The validation process can involve analyzing the goodness of fit of the regression, analyzing whether the regression residuals are random, and checking whether the model's predictive performance deteriorates substantially when applied to data that were not used in model estimation.
Incremental validity is a type of validity that is used to determine whether a new psychometric assessment will increase predictive ability beyond that provided by an existing method of assessment. In other words, incremental validity addresses whether the new test adds information that cannot be obtained with simpler, already existing methods.
In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables. The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.