Ecological correlation

In statistics, an ecological correlation (also spatial correlation) is a correlation between two variables that are group means, in contrast to a correlation between two variables that describe individuals. [1] For example, one might study the correlation between physical activity and weight among sixth-grade children. A study at the individual level might sample 100 children and measure each child's physical activity and weight; the correlation between the two variables would then be at the individual level. By contrast, another study might sample 100 classes of sixth-grade students and measure the mean physical activity and the mean weight of each class. A correlation between these group means would be an example of an ecological correlation.

Because a correlation describes the measured strength of a relationship, and averaging within groups removes much of the variation between individuals, correlations at the group level can be much higher than those at the individual level. Assuming that the two are equal is an example of the ecological fallacy. [2]
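The gap between the two levels can be illustrated with a small simulation. The following is a minimal sketch under assumed conditions, not a description of any real data set: the function names `pearson` and `simulate`, the class size of 30, the noise levels, and the seed are all hypothetical choices made for illustration. Each class is given a shared effect that raises activity and lowers weight, and each child adds much larger individual noise on top of it.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_classes=100, class_size=30, seed=0):
    """Return (individual-level r, ecological r) for simulated classes.

    All parameters are illustrative assumptions, not empirical values.
    """
    rng = random.Random(seed)
    activity, weight = [], []            # one entry per child
    mean_activity, mean_weight = [], []  # one entry per class
    for _ in range(n_classes):
        shared = rng.gauss(0, 1)  # class-level effect (standard deviation 1)
        # Individual noise (standard deviation 3) dominates within a class.
        a = [shared + rng.gauss(0, 3) for _ in range(class_size)]
        w = [-shared + rng.gauss(0, 3) for _ in range(class_size)]
        activity.extend(a)
        weight.extend(w)
        mean_activity.append(mean(a))
        mean_weight.append(mean(w))
    r_individual = pearson(activity, weight)
    r_ecological = pearson(mean_activity, mean_weight)
    return r_individual, r_ecological


r_individual, r_ecological = simulate()
print(f"individual-level r: {r_individual:.2f}")
print(f"ecological r:       {r_ecological:.2f}")
```

Under these assumed parameters the population correlation is -1/10 = -0.1 at the individual level and -1/(1 + 9/30), about -0.77, between class means, so the simulated ecological correlation should come out far stronger than the individual one. Averaging over 30 children shrinks the individual noise variance from 9 to 0.3 while leaving the class-level effect intact, which is why the group-level relationship looks so much tighter.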

References

  1. Robinson, W. S. (1950). "Ecological Correlations and the Behavior of Individuals". American Sociological Review. 15 (3): 351–357. JSTOR 2087176.
  2. Vogt, W. Paul; Johnson, R. Burke (2011). Dictionary of Statistics & Methodology: A Nontechnical Guide for the Social Sciences. Sage. p. 119. ISBN 978-1-4522-3659-9.