Threshold model

In mathematical or statistical modeling, a threshold model is any model in which a threshold value, or set of threshold values, is used to distinguish ranges of values where the behaviour predicted by the model varies in some important way. A particularly important instance arises in toxicology, where the model for the effect of a drug may be that there is zero effect for a dose below a critical or threshold value, while an effect of some significance exists above that value. [1] Certain types of regression model may include threshold effects. [1]


Collective behavior

Threshold models are often used to model the behavior of groups, ranging from social insects to animal herds to human society.

Classic threshold models were introduced by Sakoda [2] in his 1949 dissertation and in the Journal of Mathematical Sociology (JMS vol 1 #1, 1971). [3] They were subsequently developed by Schelling, Axelrod, and Granovetter to model collective behavior. Schelling used a special case of Sakoda's model to describe the dynamics of segregation in America driven by individual interactions (JMS vol 1 #2, 1971) [4] by constructing two simulation models. Schelling demonstrated that “there is no simple correspondence of individual incentive to collective results,” and that the dynamics of movement influenced patterns of segregation. In doing so, Schelling highlighted the significance of “a general theory of ‘tipping’”.

Following Schelling, Mark Granovetter proposed the threshold model (Granovetter & Soong, 1983, 1986, 1988), which assumes that an individual's behavior depends on the number of other individuals already engaging in that behavior. Both Schelling and Granovetter describe their notion of “threshold” as a behavioral threshold. Granovetter used the threshold model to explain riots, residential segregation, and the spiral of silence. In his model, the “threshold” is “the number or proportion of others who must make one decision before a given actor does so”. Thresholds differ between individuals and may be influenced by many factors, such as socioeconomic status, education, age, and personality. Granovetter further relates the threshold to the utility an individual gets from participating or not participating in collective behavior: using a utility function, each individual weighs the costs and benefits of undertaking an action. Because the situation can change those costs and benefits, thresholds are situation-specific. The distribution of thresholds determines the outcome of the aggregate behavior (for example, public opinion).
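The dependence of the aggregate outcome on the distribution of thresholds can be illustrated with a minimal sketch. The function name `granovetter_cascade` and the iteration scheme are assumptions made for illustration, not code from Granovetter's papers; each threshold is read as the number of others who must already be participating before that individual joins.

```python
def granovetter_cascade(thresholds):
    """Return the equilibrium number of participants.

    thresholds[i] is the number of others who must already be
    participating before individual i joins (an illustrative reading
    of the behavioral threshold).
    """
    active = 0
    while True:
        # Everyone whose threshold is met by the current count participates.
        new_active = sum(1 for t in thresholds if t <= active)
        if new_active == active:
            return active  # no one else is persuaded to join
        active = new_active


# Thresholds 0, 1, 2, ..., 99: each joiner tips the next person.
uniform = list(range(100))
print(granovetter_cascade(uniform))

# Same crowd, except the person with threshold 1 now has threshold 2.
perturbed = [0, 2] + list(range(2, 100))
print(granovetter_cascade(perturbed))
```

In this sketch the first crowd ends with everyone participating, while the second stops after the single instigator, although the two threshold distributions differ in only one individual.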

Segmented regression analysis

The models used in segmented regression analysis are threshold models.
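As a rough illustration of how a threshold enters such a model, the sketch below fits two separate least-squares lines on either side of a candidate breakpoint and keeps the breakpoint with the smallest total squared error. The helper names (`fit_line`, `fit_two_segments`), the grid search over breakpoints, and the toy data are all assumptions; practical segmented regression often adds constraints, such as continuity at the breakpoint, that are omitted here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def fit_two_segments(xs, ys, min_points=2):
    """Grid-search a single breakpoint; returns (break_x, left_fit, right_fit)."""
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(min_points, len(pairs) - min_points + 1):
        left, right = pairs[:k], pairs[k:]
        if left[-1][0] == right[0][0]:
            continue  # cannot place a breakpoint between equal x values
        left_fit = fit_line(*zip(*left))
        right_fit = fit_line(*zip(*right))
        total_sse = left_fit[2] + right_fit[2]
        if best is None or total_sse < best[0]:
            # Place the breakpoint midway between the neighbouring x values.
            break_x = (left[-1][0] + right[0][0]) / 2
            best = (total_sse, break_x, left_fit, right_fit)
    if best is None:
        raise ValueError("not enough distinct x values for two segments")
    return best[1], best[2], best[3]


# Toy data: flat response up to x = 4, then a jump and a rising line.
xs = list(range(10))
ys = [1.0 if x < 5 else 2.0 * x - 5.0 for x in xs]
break_x, left_fit, right_fit = fit_two_segments(xs, ys)
print(break_x, left_fit[0], right_fit[0])
```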


Fractals

Certain deterministic recursive multivariate models which include threshold effects have been shown to produce fractal effects. [5]

Time series analysis

Several classes of nonlinear autoregressive models formulated for time series applications have been threshold models. [5]
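One simple member of this family is a two-regime autoregressive process in which the autoregressive coefficient switches depending on whether the previous value lies below or above a threshold. The sketch below is a hypothetical example of that idea; the function names, the first-order form, and all parameter values are assumptions chosen for illustration rather than a model taken from the cited text.

```python
import random


def setar_step(prev, threshold, phi_low, phi_high, noise=0.0):
    """One step of a two-regime threshold AR(1) process."""
    # The regime is selected by comparing the previous value with the threshold.
    phi = phi_low if prev <= threshold else phi_high
    return phi * prev + noise


def simulate_setar(n, x0=0.0, threshold=0.0, phi_low=0.8, phi_high=-0.5,
                   sigma=1.0, seed=0):
    """Simulate n steps with Gaussian noise; returns n + 1 values including x0."""
    rng = random.Random(seed)
    series = [x0]
    for _ in range(n):
        shock = rng.gauss(0.0, sigma)
        series.append(setar_step(series[-1], threshold, phi_low, phi_high, shock))
    return series


series = simulate_setar(200)
print(len(series), min(series), max(series))
```

With these illustrative coefficients, values below the threshold decay slowly toward zero while values above it are pushed back across it, which a single linear autoregression cannot reproduce.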


Toxicology

A threshold model used in toxicology posits that anything above a certain dose of a toxin is dangerous, and anything below it safe. This model is usually applied to non-carcinogenic health hazards.

Edward J. Calabrese and Linda A. Baldwin wrote:

The threshold dose-response model is widely viewed as the most dominant model in toxicology. [6]

An alternative type of model in toxicology is the linear no-threshold model (LNT). Hormesis, by contrast, corresponds to the existence of opposite effects at low and high doses, which usually gives a U-shaped or inverted U-shaped dose-response curve.
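The three dose-response shapes can be contrasted with a few toy functions. These are illustrative sketches only: the function names, the linear form above the threshold, and the quadratic used for the hormetic curve are assumptions, not fitted toxicological models.

```python
def threshold_response(dose, threshold, slope):
    """No effect at or below the threshold, linear increase above it."""
    return max(0.0, slope * (dose - threshold))


def lnt_response(dose, slope):
    """Linear no-threshold: effect proportional to dose, starting from zero."""
    return slope * dose


def hormetic_response(dose, zero_equivalent_dose, scale=1.0):
    """Toy U-shaped curve: opposite sign of effect at low versus high doses.

    Negative values (below zero_equivalent_dose) stand for an effect in the
    opposite direction to the high-dose effect.
    """
    return scale * dose * (dose - zero_equivalent_dose)


for dose in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(dose,
          threshold_response(dose, threshold=2.0, slope=3.0),
          lnt_response(dose, slope=3.0),
          hormetic_response(dose, zero_equivalent_dose=2.0))
```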

Liability threshold model

The liability-threshold model is a threshold model of categorical (usually binary) outcomes in which a large number of variables are summed to yield an overall 'liability' score; the observed outcome is determined by whether the latent score is smaller or larger than the threshold. The liability-threshold model is frequently employed in medicine and genetics to model risk factors contributing to disease.

In a genetic context, the variables are all the genes and environmental conditions that protect against or increase the risk of a disease, and the threshold z is the biological limit past which disease develops. The threshold can be estimated from the population prevalence of the disease (which is usually low). Because the threshold is defined relative to the population and environment, the liability score is generally treated as an N(0, 1) normally distributed random variable.
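Under the N(0, 1) assumption, estimating the threshold from prevalence amounts to inverting the standard normal distribution: the threshold is the point above which a fraction of the population equal to the prevalence lies. The following sketch uses Python's standard-library `statistics.NormalDist`; the function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def liability_threshold(prevalence):
    """Threshold z such that P(liability > z) equals the prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return STANDARD_NORMAL.inv_cdf(1.0 - prevalence)


def prevalence_from_threshold(threshold):
    """Fraction of a standard normal population above the threshold."""
    return 1.0 - STANDARD_NORMAL.cdf(threshold)


z = liability_threshold(0.01)  # a disease affecting 1% of the population
print(round(z, 3))             # roughly 2.326 standard deviations above the mean
print(prevalence_from_threshold(z))
```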

Early genetics models were developed to deal with very rare genetic diseases by treating them as Mendelian diseases caused by one or two genes: the presence or absence of the gene corresponds to the presence or absence of the disease, and the occurrence of the disease follows predictable patterns within families. Continuous traits like height or intelligence could be modeled as normal distributions influenced by a large number of genes, and their heritability and the effects of selection were easily analyzed. Some diseases, like alcoholism, epilepsy, or schizophrenia, cannot be Mendelian diseases because they are common, do not appear in Mendelian ratios, respond slowly to selection against them, and often occur in families with no prior history of the disease. However, relatives and adoptees of someone with such a disease are far more likely (but not certain) to develop it, indicating a strong genetic component. The liability threshold model was developed to deal with these non-Mendelian binary cases; the model proposes that there is a continuous, normally distributed trait expressing risk that is polygenically influenced by many genes, and that all individuals above a certain value of this trait develop the disease while all those below it do not.
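A small simulation can make the idea concrete: summing many small, independent contributions gives an approximately normal liability, and cutting that liability at a threshold yields a binary outcome. Everything in this sketch is an assumption made for illustration, including the function names, the equal-sized plus-or-minus contributions, and the number of factors; it is not a model of any real disease.

```python
import random
from statistics import NormalDist, fmean, pstdev


def simulate_liabilities(n_people, n_factors=100, seed=0):
    """Standardized liabilities, each the sum of many small +/-1 effects."""
    rng = random.Random(seed)
    raw = [sum(rng.choice((-1.0, 1.0)) for _ in range(n_factors))
           for _ in range(n_people)]
    mean, sd = fmean(raw), pstdev(raw)
    # Rescale so the population has mean 0 and standard deviation 1.
    return [(value - mean) / sd for value in raw]


def affected_fraction(liabilities, threshold):
    """Share of individuals whose liability exceeds the threshold."""
    return sum(1 for x in liabilities if x > threshold) / len(liabilities)


liabilities = simulate_liabilities(5000, seed=1)
threshold = NormalDist().inv_cdf(0.90)  # threshold chosen for about 10% prevalence
print(affected_fraction(liabilities, threshold))
```

The simulated prevalence should land near, but not exactly on, the target 10%, because the summed liability is discrete and only approximately normal.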

The first threshold models in genetics were introduced by Sewall Wright, who examined the propensity of guinea pig strains to have an extra hind toe, a phenomenon which could not be explained as a dominant or recessive gene, or as continuous "blending inheritance". [7] [8] The modern liability-threshold model was introduced into human research by geneticist Douglas Scott Falconer in his textbook [9] and two papers. [10] [11] Falconer had been asked about the topic of modeling 'threshold characters' by Cyril Clarke, who had diabetes. [12]

An early application of liability-threshold models was to schizophrenia, by Irving Gottesman and James Shields, who found substantial heritability and little shared-environment influence, [13] undermining the "cold mother" theory of schizophrenia.


References

  1. Dodge, Y. (2003) The Oxford Dictionary of Statistical Terms, OUP. ISBN 0-19-850994-4
  2. Journal of Artificial Societies and Social Simulation 20(3) 15, 2017.
  3. Sakoda, J. M. The Checkerboard Model of Social Interaction. Journal of Mathematical Sociology, 1(1):119–132, 1971.
  4. Schelling, T. C. Dynamic models of segregation. Journal of Mathematical Sociology, 1(2):143–186, 1971a.
  5. Tong, H. (1990) Non-linear Time Series: A Dynamical System Approach, OUP. ISBN 0-19-852224-X
  6. Calabrese, E.J.; Baldwin, L.A. (2003). "The Hormetic Dose-Response Model Is More Common than the Threshold Model in Toxicology". Toxicological Sciences. 71 (2): 246–250. doi:10.1093/toxsci/71.2.246. PMID 12563110.
  7. Wright, S (1934). "An Analysis of Variability in Number of Digits in an Inbred Strain of Guinea Pigs". Genetics. 19 (6): 506–36. PMC 1208511. PMID 17246735.
  8. Wright, S (1934b). "The Results of Crosses between Inbred Strains of Guinea Pigs, Differing in Number of Digits". Genetics. 19 (6): 537–51. PMC 1208512. PMID 17246736.
  9. ch. 18, "Threshold characters", Introduction to Quantitative Genetics, Falconer 1960
  10. "The inheritance of liability to certain diseases, estimated from the incidence among relatives", Falconer 1965
  11. "The inheritance of liability to diseases with variable age of onset, with particular reference to diabetes mellitus", Falconer 1967
  12. "D. S. Falconer and Introduction to Quantitative Genetics", Hill & Mackay 2004
  13. Gottesman, II; Shields, J (1967). "A polygenic theory of schizophrenia". Proc Natl Acad Sci U S A. 58 (1): 199–205. Bibcode:1967PNAS...58..199G. doi:10.1073/pnas.58.1.199. PMC 335617. PMID 5231600.