One-third hypothesis

The one-third hypothesis (OTH) is a sociodynamic theory asserting that a subgroup's prominence increases as it approaches one-third of the total population and diminishes after it exceeds that number. It was first stated by sociologist Hugo O. Engelmann in a letter to the American Sociologist in 1967:

"...we would expect that the most persistent subgroups in any group would be those which approximate one-third or, by similar reasoning, a multiple of [i.e., a power of] one-third of the total group. Being the most persistent, these groups also should be the ones most significantly implicated in ongoing sociocultural transformation. This does not mean that these groups need to be dominant, but they play prominent roles." [1]

The OTH involves two mathematical curves: one represents the likelihood that a subgroup of a specific size will emerge, and the other the probability that it will persist. The product of the two curves peaks at one-third of the total population, matching the prediction of the one-third hypothesis.

Statistical formalization

Statistically speaking, the group comprising one-third of the population is the one most likely to persist, and the group comprising two-thirds is the one most likely to dissolve into splinter groups, as if reacting to the cohesiveness of the one-third group.

According to the binomial coefficient, a group of size r can occur in a population of size n in $\binom{n}{r}$ ways. Because each group of size r can dissolve into $2^r$ subgroups, the total number of ways all groups of size r can emerge and dissolve equals $3^n$, in keeping with the summation

$$\sum_{r=0}^{n} \binom{n}{r}\, 2^{r} = 3^{n}.$$
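
As an illustrative check (a minimal Python sketch, not drawn from the cited papers), the identity can be verified directly; the largest term of the sum also shows which group size has the most ways to emerge and then dissolve:

    from math import comb

    n = 30  # arbitrary population size for the check
    terms = [comb(n, r) * 2**r for r in range(n + 1)]

    # The terms sum to 3**n, as in the identity above.
    assert sum(terms) == 3**n

    # The largest term is at r = 20 = 2n/3: groups of roughly two-thirds of
    # the population have the most ways to emerge and then dissolve.
    print(max(range(n + 1), key=lambda r: terms[r]))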

Put differently, large groups close to two-thirds of the population are more likely than any other groups to dissolve into splinter groups. A corollary is that much smaller groups are the ones most likely to emerge and to persist.

If groups of size r occur with a probability of $\binom{n}{r}\, p^{r} q^{\,n-r}$ and dissolve into subgroups with a probability of $q^{r}$, then the equation reduces to $\binom{n}{r}\, p^{r} q^{\,n-r} q^{r}$, and given that p and q are each equal to 1/2, Engelmann's One-Third Hypothesis can be readily deduced. It takes the form of

$$\frac{2^{\,n-r}\, n!}{3^{n}\, r!\,(n-r)!},$$

where n is the number of people and r is the size of a group, and can be verified for large numbers by using Stirling's approximation formula.
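
A minimal numerical sketch (assuming nothing beyond the formula above) confirms that the resulting distribution over r peaks at one-third of the population:

    from math import comb

    def oth_probability(n, r):
        # 2**(n - r) * n! / (3**n * r! * (n - r)!), written with comb() so the
        # integer arithmetic stays exact until the final division
        return comb(n, r) * 2**(n - r) / 3**n

    n = 300
    probs = [oth_probability(n, r) for r in range(n + 1)]
    print(max(range(n + 1), key=lambda r: probs[r]))  # 100, i.e. n/3
    print(round(sum(probs), 12))                      # 1.0 -- a proper distribution

The same conclusion follows analytically: applying Stirling's approximation to the logarithm of the expression and setting its derivative with respect to r to zero gives ln((n − r)/(2r)) = 0, that is, r = n/3.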

Early research and recent prediction

A near-perfect illustration of the OTH is Wayne Youngquist’s 1968 “Wooden Shoes and the One-Third Hypothesis,” which documented the German population of Milwaukee in the late nineteenth century. As Germans approached one-third of the city’s population, they became more and more prominent; as they exceeded that level, their importance began to abate. [2]

The first empirical test of Engelmann’s OTH came with the 1967 Detroit riot. The hypothesis did not explain the cause of the riot but was aimed at explaining its timing. [1]

Sam Butler, in 2011, explicitly cited Engelmann and the One-Third Hypothesis in his analysis of London's riots and their aetiology. [3]

Criticism

The OTH was never without its critics. Early on, K. S. Srikantan correctly questioned the assumption that p and q are each equal to ½. [4] Even if they are not, however, so long as p + q = 1, the maximum occurs at r = np/(1 + p). The group most likely to emerge and persist will always be smaller than half of the population.
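
A short sketch of Srikantan's point (an illustration under the same emergence-and-persistence assumptions as above, not code from the cited exchange): for any p with q = 1 − p, the maximum of $\binom{n}{r}\, p^{r} q^{\,n-r} q^{r}$ falls near r = np/(1 + p), which is always below n/2.

    from math import comb

    def emerge_and_persist(n, r, p):
        # binom(n, r) * p**r * q**(n - r) for emergence, times q**r for persistence
        q = 1.0 - p
        return comb(n, r) * p**r * q**(n - r) * q**r

    n = 300
    for p in (0.3, 0.5, 0.7):
        r_peak = max(range(n + 1), key=lambda r: emerge_and_persist(n, r, p))
        print(p, r_peak, round(n * p / (1 + p), 1))  # observed peak vs. np/(1 + p)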

In social dynamics the OTH is sometimes referred to as critical mass. The terminology, though appropriate, has become ambiguous because “critical mass” is used in a variety of ways that do not suggest the OTH at all. Similarly, the OTH is sometimes called the two-thirds theory.

References

  1. Hugo O. Engelmann (1967). "Communication to the Editor." American Sociologist, November, p. 21.
  2. Wayne A. Youngquist (1968). "Wooden Shoes and the One-Third Hypothesis." Wisconsin Sociologist, vol. 6, nos. 1–2 (Spring–Summer).
  3. Butler, Sam (2011). "London riots, cruel but not so unusual." http://www.huffingtonpost.co.uk/sam-butler/just-a-little-bit-of-hist_b_922751.html
  4. Srikantan, K. S. (1968). "A Curious Mathematical Property." American Sociologist, May, pp. 154–155.