Homogeneity (statistics)

For homogeneity of variance see homoscedasticity.

In statistics, homogeneity and its opposite, heterogeneity, arise in describing the properties of a dataset, or of several datasets. They relate to the validity of the often convenient assumption that the statistical properties of any one part of an overall dataset are the same as those of any other part. In meta-analysis, which combines the data from several studies, homogeneity refers to how similar or different the results of the individual studies are (see also Study heterogeneity).


Homogeneity can be studied to several degrees of complexity. For example, considerations of homoscedasticity examine how much the variability of data values changes throughout a dataset. However, questions of homogeneity apply to all aspects of the statistical distributions, including the location parameter. An intermediate-level study might therefore move from looking at the variability to studying changes in the skewness, while a more detailed study would examine changes to the whole of the marginal distribution. Questions of homogeneity also apply to the joint distributions.
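As an illustration of the simplest level, homogeneity of variance, the following minimal sketch computes a Brown-Forsythe statistic (the median-centred variant of Levene's test) for several groups of values. The function name and the example data are hypothetical, and the statistic is only one of several ways to compare variability; larger values indicate stronger evidence that the groups differ in spread.

```python
from statistics import mean, median

def brown_forsythe_w(groups):
    """Brown-Forsythe (median-centred Levene) statistic for equal variances."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each value from the median of its own group.
    z = [[abs(y - median(g)) for y in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical data: same centre, very different spread.
tight = [9.8, 10.1, 10.0, 9.9, 10.2]
wide = [6.0, 14.0, 10.0, 8.0, 12.0]
print(brown_forsythe_w([tight, wide]))
```

Under the hypothesis of equal variances the statistic is usually compared with an F distribution with k - 1 and N - k degrees of freedom; that step is omitted here to keep the sketch free of external libraries.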

The concept of homogeneity can be applied in many different ways and, for certain types of statistical analysis, it is used to look for further properties that might need to be treated as varying within a dataset once some initial types of non-homogeneity have been dealt with.



Differences in the typical values across the dataset might initially be dealt with by constructing a regression model using certain explanatory variables to relate variations in the typical value to known quantities. There should then be a later stage of analysis to examine whether the errors in the predictions from the regression behave in the same way across the dataset. Thus the question becomes one of the homogeneity of the distribution of the residuals, as the explanatory variables change. See regression analysis.
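The later stage of analysis described above can be sketched as follows. This is a crude check under stated assumptions, not a formal test: a straight line is fitted by ordinary least squares, the residuals are ordered by the explanatory variable, and the sample variance of the upper half is divided by that of the lower half. The function names and data are hypothetical. The idea is related to the Goldfeld-Quandt test, which, unlike this sketch, fits separate regressions to the two subsets.

```python
from statistics import mean, variance

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def residual_variance_ratio(x, y):
    """Variance of residuals at high x divided by variance at low x."""
    intercept, slope = fit_line(x, y)
    pairs = sorted(zip(x, y))  # order observations by the explanatory variable
    resid = [yi - (intercept + slope * xi) for xi, yi in pairs]
    half = len(resid) // 2
    return variance(resid[-half:]) / variance(resid[:half])

# Hypothetical data: the error amplitude grows in proportion to x.
x = list(range(1, 21))
y_growing = [2 + 0.5 * xi + (-1) ** xi * 0.1 * xi for xi in x]
# Hypothetical data: the error amplitude is constant.
y_constant = [2 + 0.5 * xi + (-1) ** xi * 0.5 for xi in x]

print(residual_variance_ratio(x, y_growing))
print(residual_variance_ratio(x, y_constant))
```

A ratio far from 1 suggests that the residual distribution is not homogeneous as the explanatory variable changes; a ratio near 1 is consistent with homogeneity of the residual variance, although it says nothing about other aspects of the residual distribution.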

Time series

The initial stages in the analysis of a time series may involve plotting values against time to examine homogeneity of the series in various ways: stability across time as opposed to a trend; stability of local fluctuations over time.
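A numerical counterpart to such plots is to summarise the series in consecutive blocks. The sketch below, with a hypothetical function name and synthetic data, returns the mean and standard deviation of each non-overlapping block: drifting block means point to a trend, while changing block standard deviations point to instability of the local fluctuations.

```python
from statistics import mean, stdev

def block_stats(series, block_size):
    """Mean and standard deviation of consecutive non-overlapping blocks."""
    blocks = [series[i:i + block_size]
              for i in range(0, len(series) - block_size + 1, block_size)]
    return [(mean(b), stdev(b)) for b in blocks]

# Synthetic series: linear trend plus a fluctuation of constant size.
series = [0.5 * i + (1 if i % 2 else -1) for i in range(12)]
for block_mean, block_sd in block_stats(series, 4):
    print(block_mean, block_sd)
```

The block size is a choice left to the analyst: short blocks follow the series closely but give noisy summaries, while long blocks are smoother but may hide abrupt changes.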

Combining information across sites

In hydrology, data series from a number of sites, each consisting of the annual maximum river flow for every year of record, are analysed together. A common model is that the distributions of these values are the same for all sites apart from a simple scaling factor, so that the location and scale are linked in a simple way. The question is then whether the distribution of the scaled values is homogeneous across sites.
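The scaling model can be sketched as follows, assuming that each site's scaling factor is estimated by the mean of its own annual maxima (other choices, such as the median, are possible). The site names and flow values are hypothetical. After scaling, every site has mean 1, so the standard deviation of the scaled values is the site's coefficient of variation, and similar values across sites are consistent with the common-distribution model.

```python
from statistics import mean, stdev

def scale_by_site_mean(site_maxima):
    """Divide each site's annual maxima by that site's mean (assumed scaling factor)."""
    scaled = {}
    for site, values in site_maxima.items():
        site_mean = mean(values)
        scaled[site] = [v / site_mean for v in values]
    return scaled

# Hypothetical annual maximum flows; site "B" is exactly ten times site "A".
site_maxima = {
    "A": [120.0, 95.0, 150.0, 110.0],
    "B": [1200.0, 950.0, 1500.0, 1100.0],
    "C": [40.0, 41.0, 39.0, 40.0],
}
scaled = scale_by_site_mean(site_maxima)
cv = {site: stdev(values) for site, values in scaled.items()}
print(cv)
```

Comparing a single summary such as the coefficient of variation is only a first step; a fuller assessment would compare the whole distribution of the scaled values between sites.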

Combining information sources

In meteorology, weather datasets are acquired over many years of record and, as part of this, measurements at certain stations may cease occasionally while, at around the same time, measurements may start at nearby locations. There are then questions as to whether, if the records are combined to form a single longer set of records, those records can be considered homogeneous over time. An example of homogeneity testing of wind speed and direction data can be found in Romanić et al., 2015. [1]

Homogeneity within populations

Simple population surveys may start from the idea that responses will be homogeneous across the whole of a population. Assessing the homogeneity of the population would involve checking whether the responses of certain identifiable subpopulations differ from those of others. For example, car-owners may differ from non-car-owners, or there may be differences between age-groups.
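For categorical responses, one common way to make this comparison is a chi-squared test of homogeneity on a table of counts, with one row per subpopulation and one column per response. The sketch below computes only the test statistic and its degrees of freedom; the function name and the counts are hypothetical.

```python
def chi_square_homogeneity(table):
    """Pearson chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if every subpopulation shared the same response distribution.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical survey: rows are car-owners and non-car-owners, columns are "yes" and "no".
survey = [[60, 40],
          [30, 70]]
stat, dof = chi_square_homogeneity(survey)
print(stat, dof)
```

With one degree of freedom, a statistic above about 3.84 would lead to rejecting homogeneity at the 5% level. The approximation behind that threshold relies on the expected counts not being too small.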


A test for homogeneity, in the sense of exact equivalence of statistical distributions, can be based on an E-statistic. A location test tests the simpler hypothesis that distributions have the same location parameter.
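A minimal sketch of such a test for two one-dimensional samples is given below. It computes the two-sample energy distance, one form of E-statistic, and assesses it with a permutation test. The function names, the number of permutations and the fixed seed are assumptions made for illustration.

```python
import random

def energy_distance(x, y):
    """Two-sample energy distance: 2 E|X - Y| - E|X - X'| - E|Y - Y'|."""
    def mean_abs_diff(a, b):
        return sum(abs(ai - bj) for ai in a for bj in b) / (len(a) * len(b))
    return 2 * mean_abs_diff(x, y) - mean_abs_diff(x, x) - mean_abs_diff(y, y)

def permutation_p_value(x, y, n_perm=999, seed=0):
    """Share of random relabellings whose statistic is at least the observed one."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    observed = energy_distance(x, y)
    pooled = list(x) + list(y)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy_distance(pooled[:len(x)], pooled[len(x):]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Hypothetical samples with clearly different locations.
sample_a = [float(v) for v in range(10)]
sample_b = [float(v) for v in range(20, 30)]
print(energy_distance(sample_a, sample_b))
print(permutation_p_value(sample_a, sample_b))
```

Because the statistic compares all pairwise distances, it responds to differences in location, spread and shape alike, whereas a location test is sensitive only to the first of these.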

