Threshold model

In mathematical or statistical modeling, a threshold model is any model in which a threshold value, or set of threshold values, is used to distinguish ranges of values where the behaviour predicted by the model varies in some important way. A particularly important instance arises in toxicology, where the model for the effect of a drug may be that there is zero effect for a dose below a critical or threshold value, while an effect of some significance exists above that value. [1] Certain types of regression model may include threshold effects. [1]


Collective behavior

Threshold models are often used to model the behavior of groups, ranging from social insects to animal herds to human society.

Classic threshold models were introduced by Sakoda [2] in his 1949 dissertation and in the Journal of Mathematical Sociology (JMS vol 1 #1, 1971). [3] They were subsequently developed by Schelling, Axelrod, and Granovetter to model collective behavior. Schelling used a special case of Sakoda's model to describe the dynamics of segregation in America driven by individual interactions (JMS vol 1 #2, 1971), [4] constructing two simulation models. Schelling demonstrated that “there is no simple correspondence of individual incentive to collective results,” and that the dynamics of movement influenced patterns of segregation. In doing so, Schelling highlighted the significance of “a general theory of ‘tipping’”.

Mark Granovetter, following Schelling, proposed the threshold model (Granovetter & Soong, 1983, 1986, 1988), which assumes that an individual's behavior depends on the number of other individuals already engaging in that behavior. (Both Schelling and Granovetter describe their notion of “threshold” as a behavioral threshold.) He used the threshold model to explain riots, residential segregation, and the spiral of silence. In Granovetter's model, the “threshold” is “the number or proportion of others who must make one decision before a given actor does so”. Thresholds differ between individuals and may be influenced by many factors: socioeconomic status, education, age, personality, etc. Further, Granovetter relates the threshold to the utility an individual gets from participating or not participating in collective behavior: using a utility function, each individual weighs the costs and benefits of undertaking an action. Because the situation may change the costs and benefits of the behavior, the threshold is situation-specific. The distribution of thresholds determines the outcome of the aggregate behavior (for example, public opinion).
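How strongly the aggregate outcome depends on the distribution of thresholds can be illustrated with a minimal sketch. It assumes that thresholds are expressed as counts, that participation is irreversible, and that everyone observes the current number of participants. The function name `cascade_size` and the example crowds are hypothetical.

```python
def cascade_size(thresholds):
    """Return the number of participants once nobody else wants to join.

    thresholds[i] is the number of people who must already be participating
    before individual i joins (assumption: participants never withdraw).
    """
    active = 0
    while True:
        # Count everyone whose threshold is met by the current participation.
        new_active = sum(1 for t in thresholds if t <= active)
        if new_active == active:
            return active
        active = new_active


# Thresholds 0, 1, ..., 99: each person who joins triggers the next one.
uniform = list(range(100))
# Same crowd, except the person with threshold 1 now has threshold 2.
perturbed = [0, 2] + list(range(2, 100))

print(cascade_size(uniform))    # 100
print(cascade_size(perturbed))  # 1
```

The two crowds differ in a single threshold, yet one ends with everyone participating and the other with a lone participant, so nearly identical individual dispositions can yield very different collective results.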

Segmented regression analysis

The models used in segmented regression analysis are threshold models.
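As a rough sketch of the idea, the snippet below fits a separate least-squares line on each side of every candidate breakpoint and keeps the breakpoint with the smallest total squared error. It assumes a single breakpoint, sorted inputs, and no continuity constraint between the segments. The helper names and the data are hypothetical, and practical segmented-regression methods are more elaborate.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def squared_error(xs, ys):
    """Sum of squared residuals of the least-squares line through the points."""
    slope, intercept = fit_line(xs, ys)
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def find_breakpoint(xs, ys, min_points=3):
    """Return (breakpoint, total squared error); xs must be sorted ascending."""
    best = None
    for i in range(min_points, len(xs) - min_points + 1):
        # Split the data at index i and fit each side independently.
        total = squared_error(xs[:i], ys[:i]) + squared_error(xs[i:], ys[i:])
        if best is None or total < best[1]:
            best = (xs[i], total)
    return best


# Hypothetical data: flat below x = 10, then a jump and a rising line.
xs = [float(x) for x in range(20)]
ys = [1.0 if x < 10 else 5.0 + 2.0 * (x - 10) for x in xs]
breakpoint_x, error = find_breakpoint(xs, ys)
print(breakpoint_x, error)
```

Here the returned breakpoint is the first x value assigned to the upper segment.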


Fractals

Certain deterministic recursive multivariate models which include threshold effects have been shown to produce fractal effects. [5]

Time series analysis

Several classes of nonlinear autoregressive models formulated for time series applications have been threshold models. [5]
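One simple member of this family is a two-regime, first-order threshold autoregression, in which the autoregressive coefficient switches according to whether the previous value lies above or below a threshold. The sketch below is illustrative only: the coefficients, the threshold of 0, and the Gaussian noise are arbitrary assumptions, and the function names are hypothetical.

```python
import random


def setar_step(previous, noise, threshold=0.0, phi_low=0.5, phi_high=-0.4):
    """One step of a two-regime, first-order threshold autoregression."""
    # The autoregressive coefficient depends on which side of the
    # threshold the previous value falls.
    phi = phi_low if previous <= threshold else phi_high
    return phi * previous + noise


def simulate_setar(n_steps, seed=0, noise_sd=1.0, **params):
    """Simulate n_steps values, starting from 0, with Gaussian noise."""
    rng = random.Random(seed)
    series = [0.0]
    for _ in range(n_steps):
        series.append(setar_step(series[-1], rng.gauss(0.0, noise_sd), **params))
    return series[1:]


series = simulate_setar(200)
print(series[:5])
```

Within each regime the process is an ordinary linear autoregression; the nonlinearity comes entirely from the switch at the threshold.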


Toxicology

A threshold model used in toxicology posits that anything above a certain dose of a toxin is dangerous, and anything below it safe. This model is usually applied to non-carcinogenic health hazards.

Edward J. Calabrese and Linda A. Baldwin wrote:

The threshold dose-response model is widely viewed as the most dominant model in toxicology. [6]

An alternative type of model in toxicology is the linear no-threshold model (LNT). Hormesis, by contrast, corresponds to the existence of opposite effects at low vs. high doses, which usually gives a U- or inverted U-shaped dose-response curve.
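The qualitative difference between the three dose-response shapes can be sketched with toy functions. The functional forms below (a linear rise above the threshold, a straight line through the origin, and a quadratic for the U-shaped case) are assumptions chosen only for illustration, as are the parameter names. Real dose-response curves are estimated from data.

```python
def threshold_response(dose, threshold, slope):
    """No effect at or below the threshold, linear increase above it."""
    return 0.0 if dose <= threshold else slope * (dose - threshold)


def lnt_response(dose, slope):
    """Linear no-threshold: any dose, however small, has a proportional effect."""
    return slope * dose


def hormetic_response(dose, curvature, benefit):
    """Toy U-shaped curve: negative (opposite) effect at low doses,
    positive effect at high doses."""
    return curvature * dose ** 2 - benefit * dose


for dose in (0.5, 1.0, 3.0):
    print(
        dose,
        threshold_response(dose, threshold=1.0, slope=2.0),
        lnt_response(dose, slope=2.0),
        hormetic_response(dose, curvature=1.0, benefit=2.0),
    )
```

At a low dose the threshold model predicts no effect, the LNT model predicts a small harmful effect, and the hormetic curve predicts an effect of the opposite sign.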

Liability threshold model

The liability-threshold model is a threshold model of categorical (usually binary) outcomes in which a large number of variables are summed to yield an overall 'liability' score; the observed outcome is determined by whether the latent score is smaller or larger than the threshold. The liability-threshold model is frequently employed in medicine and genetics to model risk factors contributing to disease.

In a genetic context, the variables are all the genes and the different environmental conditions that protect against or increase the risk of a disease, and the threshold z is the biological limit past which disease develops. The threshold can be estimated from the population prevalence of the disease (which is usually low). Because the threshold is defined relative to the population and environment, the liability score is generally considered as an N(0, 1) normally distributed random variable.
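Under the N(0, 1) assumption, the threshold implied by a given prevalence is the point of the standard normal distribution with that much probability mass above it. A minimal sketch using Python's standard library follows; the function name `liability_threshold` is hypothetical.

```python
from statistics import NormalDist


def liability_threshold(prevalence):
    """Return z such that P(liability > z) = prevalence for N(0, 1) liability."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - prevalence)


# A disease affecting 1% of the population corresponds to a threshold
# a little above 2.3 standard deviations from the mean liability.
print(round(liability_threshold(0.01), 3))
```

The rarer the disease, the further into the upper tail of the liability distribution the threshold lies.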

Early genetics models were developed to deal with very rare genetic diseases by treating them as Mendelian diseases caused by 1 or 2 genes: the presence or absence of the gene corresponds to the presence or absence of the disease, and the occurrence of the disease follows predictable patterns within families. Continuous traits like height or intelligence could be modeled as normal distributions, influenced by a large number of genes, and their heritability and the effects of selection easily analyzed. Some diseases, like alcoholism, epilepsy, or schizophrenia, cannot be Mendelian diseases because they are common, do not appear in Mendelian ratios, respond slowly to selection against them, and often occur in families with no prior history of that disease; however, relatives and adoptees of someone with that disease are far more likely (but not certain) to develop it, indicating a strong genetic component. The liability threshold model was developed to deal with these non-Mendelian binary cases; the model proposes that there is a continuous, normally distributed trait expressing risk, polygenically influenced by many genes, such that all individuals above a certain value develop the disease and all below it do not.
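The idea that many small contributions produce an approximately normal liability, which a threshold then converts into a binary outcome, can be checked with a small simulation. This is a sketch under arbitrary assumptions: 50 equally weighted loci, each carried with probability 0.5, plus Gaussian environmental noise with the same variance as the genetic sum. None of these values comes from the genetics literature, and the function name is hypothetical.

```python
import random
from statistics import NormalDist, fmean, pstdev


def simulate_affected_fraction(prevalence, n_people=5000, n_loci=50, seed=1):
    """Simulate polygenic liabilities and return the fraction above threshold."""
    rng = random.Random(seed)
    # Environmental noise gets the same variance as the genetic sum
    # (n_loci * 0.25), an arbitrary choice for this sketch.
    env_sd = (n_loci * 0.25) ** 0.5
    liabilities = []
    for _ in range(n_people):
        # Genetic part: number of risk variants carried across all loci.
        genetic = sum(rng.random() < 0.5 for _ in range(n_loci))
        liabilities.append(genetic + rng.gauss(0.0, env_sd))
    # Standardize so that liability is approximately N(0, 1).
    mean, sd = fmean(liabilities), pstdev(liabilities)
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    affected = sum((x - mean) / sd > threshold for x in liabilities)
    return affected / n_people


print(simulate_affected_fraction(0.05))
```

If the normal approximation holds, the simulated fraction of affected individuals should be close to the prevalence used to set the threshold.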

The first threshold models in genetics were introduced by Sewall Wright, examining the propensity of guinea pig strains to have an extra hind toe, a phenomenon which could not be explained as a dominant or recessive gene, or as continuous "blending inheritance". [7] [8] The modern liability-threshold model was introduced into human research by geneticist Douglas Scott Falconer in his textbook [9] and two papers. [10] [11] Falconer had been asked about the topic of modeling 'threshold characters' by Cyril Clarke, who had diabetes. [12]

An early application of liability-threshold models was to schizophrenia, by Irving Gottesman and James Shields, who found substantial heritability and little shared-environment influence, [13] undermining the "cold mother" theory of schizophrenia.


  1. Dodge, Y. (2003) The Oxford Dictionary of Statistical Terms, OUP. ISBN 0-19-850994-4
  2. Journal of Artificial Societies and Social Simulation 20(3) 15, 2017.
  3. Sakoda, J. M. The Checkerboard Model of Social Interaction. Journal of Mathematical Sociology, 1(1):119–132, 1971.
  4. Schelling, T. C. Dynamic models of segregation. Journal of Mathematical Sociology, 1(2):143–186, 1971.
  5. Tong, H. (1990) Non-linear Time Series: A Dynamical System Approach, OUP. ISBN 0-19-852224-X
  6. Calabrese, E.J.; Baldwin, L.A. (2003). "The Hormetic Dose-Response Model Is More Common than the Threshold Model in Toxicology". Toxicological Sciences. 71 (2): 246–250. doi:10.1093/toxsci/71.2.246. PMID 12563110.
  7. Wright, S (1934). "An Analysis of Variability in Number of Digits in an Inbred Strain of Guinea Pigs". Genetics. 19 (6): 506–36. PMC 1208511. PMID 17246735.
  8. Wright, S (1934b). "The Results of Crosses between Inbred Strains of Guinea Pigs, Differing in Number of Digits". Genetics. 19 (6): 537–51. PMC 1208512. PMID 17246736.
  9. ch18, "Threshold characters", Introduction to Quantitative Genetics, Falconer 1960
  10. "The inheritance of liability to certain diseases, estimated from the incidence among relatives", Falconer 1965
  11. "The inheritance of liability to diseases with variable age of onset, with particular reference to diabetes mellitus", Falconer 1967
  12. "D. S. Falconer and Introduction to Quantitative Genetics", Hill & Mackay 2004
  13. Gottesman, II; Shields, J (1967). "A polygenic theory of schizophrenia". Proc Natl Acad Sci U S A. 58 (1): 199–205. Bibcode:1967PNAS...58..199G. doi:10.1073/pnas.58.1.199. PMC 335617. PMID 5231600.