A human disease modifier gene is a modifier gene [1] [2] that alters expression of a human gene at another locus that in turn causes a genetic disease. Whereas medical genetics has tended to distinguish between monogenic traits, governed by simple, Mendelian inheritance, and quantitative traits, with cumulative, multifactorial causes, increasing evidence suggests that human diseases exist on a continuous spectrum between the two. [3]
In the context of human disease, the terms 'modifier gene' and 'oligogene' have similar meanings, and characterization of a particular locus depends on characterization of the phenotype (effects) that it causes or modifies. The term 'modifier gene' may be taken to mean a gene in which genetic variation modifies the effects of mutation at a major locus but has no effect on the normal condition, a requirement not necessarily met for oligogenic interactions. [1] The study of diseases that arise from interactions amongst genes is important for understanding the genetic basis of disease. For these purposes, the study of both modifier genes and oligogenes is useful.
Early theories that established the likely existence of modifier genes, and gene interactions as determinants of phenotypic variation, originated from theories of evolution, notably the evolution of the condition of allelic dominance. While many insightful early theorists contributed to current understanding of modifier genes, emphasized here are the theories of Ronald A. Fisher, Sewall Wright, and John B. S. Haldane. Fisher and Wright proposed somewhat opposing theories of the evolution of dominance in 1928 and 1931 respectively. Both sought to explain the observation that overwhelmingly, wild-type alleles were dominant to a majority of deleterious mutations. [4] Their theories on the evolution of dominance had far-reaching implications for the fields of evolution, population and quantitative genetics, and biochemistry, and laid the early foundation of current understanding of human disease modifier genes.
Fisher theorized that, since negative selective pressures are strongest against dominant, deleterious mutations, de novo mutations that arise are initially codominant, but evolve recessivity through the accumulation of modifier alleles at other loci that attenuate disadvantageous phenotypes. [5] He cited breeding experiments in nasturtium and Drosophila, in which a distinct, mutant phenotype is lost with successive outcrossings and recovered with inbreeding of offspring. [6] By outcrossing, he argued, the breeder selects for modifying factors that attenuate the mutant phenotype, and by inbreeding and limiting the gene pool of modifiers, the dominance of the mutation is recovered. Importantly, Fisher argued that, "even in small isolated stocks, a sufficient variety of modifying factors" exists to observe this evolution of dominance over an observable succession of generations. [6]
Wright challenged Fisher's theory and proposed that dominance of wild-type alleles evolves not by selection of modifiers, but by selection for physiological margins in biochemical pathways, which often allow the pathways to function even when mutations occur, for instance in component enzymes. He argued that the effects of competition and genetic drift on populations of limited genetic diversity would overwhelm the weak selection pressure acting on modifier alleles with nominal phenotypic consequences. [7] [4]
Ultimately, Wright's explanation of the evolution of dominance garnered most support, from experimental biochemists and geneticists, and later theorists, [8] [9] [10] but it was Fisher's theory that first introduced the concept of modifier genes.
For the field of modifier genes, the most important responses to Fisher's theory came from Haldane. Haldane showed that, contrary to Fisher's argument, despite a decreased intensity of selection for dominance in self-fertilized populations, dominance is often more common in inbred than in outbred plant species. [10] In 1941, Haldane applied the genetic model of modifier genes, which originated from theories on the evolution of dominance, to phenotypic disease variation in humans. He analyzed data collected by Julia Bell on the heritability of several human diseases with quantitative phenotypes, specifically variation in age of onset. [11] From these data, Haldane concluded that no single factor could be proven responsible for the observed phenotypic variation in what were considered to be simple, monogenic diseases. He proposed a very simple theory, yet one that remains operative in the study of oligogenic and complex human genetic disorders today. Haldane proposed that there exist three possible sources of observed phenotypic variation in monogenic traits: genetic variation at the major locus, other sources of genetic heterogeneity, which include modifier genes, and environmental influences. [11]
Haldane provided a fundamental, theoretical basis for the existence of modifier genes. Emerging experimental evidence has substantiated the likely existence and importance of modifier genes of human genetic disease, and a few diseases serve as models for the course of this emerging evidence.
Characterization of the metabolic disorder phenylketonuria (PKU) represents a progression from an initial biochemical discovery that informed genetic studies and led to an understanding that genetic heterogeneity, both within a major locus and in addition to it, was responsible for variation in clinical phenotypes. [3] In 1953, G. A. Jervis identified a defect in the hepatic enzyme phenylalanine hydroxylase (PAH) as a cause of PKU. [12] This led to the understanding that aberrant phenylalanine metabolism was responsible for the observed phenotype and to the development of diagnostics measuring blood or urine phenylalanine levels. [13] Importantly, the development of clinical diagnostics facilitated better characterization of phenotypic variation, eliciting theories that genetic variation may underlie these observations. In 1983, Woo et al. mapped and cloned the PAH gene, confirming the theorized allelic heterogeneity. [14] Over time, phenotypic heterogeneity unexplained by allelic heterogeneity in the PAH gene was also observed, and new mutations affecting tetrahydrobiopterin recycling were characterized, such as by Blau et al. in 1993. [15]
Research on cystic fibrosis (CF) exhibits a progression from genetic to molecular characterization of the major locus, CFTR. In 1985, the gene was mapped by linkage analysis by Tsui et al., [16] and in 1989, the gene was cloned by Riordan et al. [17] When analyses of allelic variation in CFTR did not suffice to yield dependable genotype–phenotype correlations or explain the total phenotypic heterogeneity in CF, the existence of modifier loci was suspected and association studies, such as by Zielenski et al. in 1999, have identified and characterized these loci. [18]
Characterization of human disease modifier genes and oligogenic inheritance holds promise for understanding the abundance of phenotypic variation in human disease, which poses a great challenge for therapies that are individually effective. Characterization of these phenotypes, and of the genes responsible for them, has relied on a combination of molecular techniques typically used to characterize simple, monogenic disease and statistical methods typically used to characterize complex traits. [3]
Understanding the mechanism of disease modifier genes and oligogenic inheritance may provide unique insights into the functions of gene-gene interactions that underlie human disease. Currently, techniques for molecular characterization of disease mechanisms are established for many monogenic diseases, but these studies do not improve understanding of genetic interactions and their phenotypic implications. Methods for studying complex traits are primarily statistical, and involve large human populations in which genetic modifiers cannot be experimentally manipulated. Thus, in human populations the roles of genetic modifiers are often established by statistical inference, without evidence for causation or opportunity for functional characterization. The study of human disease modifiers and oligogenic disease is a hybrid field that can yield unique and comprehensive insight into the complex mechanistic causes of human disease, which is necessary for the development of effective therapeutics. [3]
Characterizing complex multi-gene interactions responsible for disease, and further, developing effective therapeutics, is exceedingly difficult, and the molecular mechanisms of few oligogenic diseases have been elucidated. Retinitis pigmentosa (RP) is one disease for which the genes responsible for phenotypic variation have been identified and the molecular mechanisms by which they interact characterized. [3] Mutations in the human RDS gene have been found that reduce the formation of functional Rds-Rds protein homodimers and, with digenic inheritance of a mutation in ROM1, prevent tetramerization of Rds-Rds and Rom1-Rom1 homodimers. In the double mutant condition, dosage of functional tetrameric complex is sufficiently reduced to cause RP. [19] [20]
Short of thoroughly understanding the molecular basis of complex disease, studying modifier genes and oligogenes can improve clinical diagnoses and prognoses, by improving the accuracy of genotype–phenotype correlations and shifting from clinical reliance on epidemiological data to patient-specific genetic and biological data. Ultimately, developing therapeutics for human disease informed by an understanding of its molecular mechanisms is challenging. In the case where multi-gene interactions give rise to disease, a series of thorough methodological steps must be taken to determine these mechanisms.
For the common case in which a major locus is causally associated with disease, but there exists considerable, unexplained phenotypic variation, this process of discovery typically begins with establishing heritability of the disease phenotype, and the reasonable likelihood that modifier genes contribute to the observed phenotypic variation. The process proceeds with the identification of modifier loci and functional determination of the mechanisms of their interactions with other disease-causing genes, and culminates with the discovery and development of effective therapeutics for disease, informed by mechanistic understanding of the functions of modifier genes. For skilled researchers using modern techniques, each step in this process of curing human genetic disease by studying modifier gene interactions poses great challenges.
Characterizing modifier gene interactions may not be the easiest approach to curing disease. This is certainly true in the case that modifier genes do not exist or do not functionally contribute to the disease phenotype of interest. Thus, the first step, establishing the likelihood that modifier genes indeed exist and contribute to observed variation, is crucial to the research process described here. This is typically done by eliminating other sources of heterogeneity as likely causes of the phenotypic variation in the disease of interest. Recalling Haldane's model, these sources are generally classified as 1) genetic variation in the major locus, 2) other sources of genetic heterogeneity, which include but are not limited to modifier genes, and 3) environmental influences. [11]
Crucial to the efficient study of the molecular basis of human genetic disease via genetic modifiers is first establishing that clinical variation in the phenotype cannot be otherwise explained by factors such as major locus heterogeneity or environmental influences. Methods of establishing disease heritability attributable to modifier genes can be categorized into familial studies and whole-population studies.
Familial studies of phenotypic variation and heritability that aim to establish the existence of modifier loci rest on a fundamental principle. If heterogeneity in modifier loci underlies observed phenotypic variation, then, on the whole, the more similar the genetic backgrounds of two patients, the more similar their phenotypes will be. Thus, interfamilial phenotypic variation will be greater than intrafamilial variation. [1] Researchers use sibling and twin studies, for instance, to estimate heritability of a phenotype, controlling for sources of variation such as major locus heterogeneity and environmental influence. In one such example, Vanscoy et al. compare lung function between monozygotic and dizygotic twin pairs, as well as siblings, with cystic fibrosis (CF). By comparing phenotype correlations between relatives, they estimate, using several models, heritability of lung function at values all greater than 0.5, after correcting for variation in major locus (CFTR) genotype and other sources of variation. [21] Such a large heritable share of the variation invites further study to characterize the underlying factors.
Whole-population approaches are another important method of demonstrating the probability of modifier gene interactions. These approaches typically involve more complex statistical modeling, but ultimately serve the same purpose as family studies: they control for sources of phenotypic variance in disease populations to estimate the degree of residual heritability attributable to modifier loci. Wexler et al. use these methods in a sample of about 4,000 Huntington's disease (HD) patients from one of the world's best characterized populations, in Venezuela, to estimate that about 40% of residual variation in age of onset is attributable to unidentified genetic loci other than the HD locus. [22]
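The logic of comparing phenotype correlations between relatives can be sketched with Falconer's formula, a standard textbook estimator that takes heritability as twice the difference between the monozygotic and dizygotic twin correlations. This is only an illustration under the assumption that both kinds of twins share their environment equally; it is not the set of models used by Vanscoy et al., and the lung-function values, pair lists, and function names below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def falconer_h2(mz_pairs, dz_pairs):
    """Falconer's estimate: h^2 = 2 * (r_MZ - r_DZ).

    Each argument is a list of (twin_1, twin_2) phenotype values.
    The estimate is not bounded to [0, 1] in small samples.
    """
    r_mz = pearson(*zip(*mz_pairs))
    r_dz = pearson(*zip(*dz_pairs))
    return 2 * (r_mz - r_dz)


# Hypothetical lung-function scores (made-up data, for illustration only).
mz_pairs = [(62, 65), (80, 78), (45, 50), (91, 88), (70, 73)]
dz_pairs = [(62, 75), (80, 66), (45, 58), (91, 79), (70, 60)]

h2 = falconer_h2(mz_pairs, dz_pairs)
print(f"Falconer heritability estimate: {h2:.2f}")
```

In a real study the phenotype would first be adjusted for known sources of variation, such as major locus genotype, before the correlations are computed.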
Perhaps most obviously important to understanding the roles of modifier loci in causing human genetic disease is identifying them. The size and complexity of the human genome make this a challenging task. Methods of identifying modifier loci can be categorized into three major approaches: genotype–phenotype correlations, linkage and association studies, and experimental phenotypic analyses in animal models. [3]
Commonly, speculation that modifier loci are responsible for phenotypic variation in human disease arises from observations of genotype–phenotype correlations that deviate from simple, Mendelian inheritance of mutations at a single locus. Bardet–Biedl syndrome (BBS) provides one such example. BBS is a genetically heterogeneous disorder with at least 6 known causative loci. It was initially thought that recessive inheritance of mutations at any one of these loci caused the disease, but once the first BBS gene, BBS6, was cloned, mutational and haplotype analysis revealed that some mutations did not conform to the expected monogenic, recessive transmission. [3] Katsanis et al. were the first to demonstrate the phenomenon of 'triallelic' inheritance, a form of digenic inheritance, in BBS, which had initially seemed to be transmitted in a recessive manner. By genotyping 163 BBS pedigrees, they demonstrated that manifestation of the disease likely requires the presence of three mutant alleles, including those in BBS2, BBS6, and possibly in other loci. [23] They thereby identified a digenic mechanism of inheritance in what was previously considered to be a simple, Mendelian disorder.
Genetic linkage analysis and association studies are widely used methods of identifying modifier loci. The advent of new genetic markers and automated genotyping methods has led to great diversity in applications of these studies. [24] Common methodologies used to identify human disease modifier loci include linkage analysis in families, candidate association studies, and whole-genome association studies.
In an example of a familial linkage study, genotyping and linkage analysis conducted in a population of 197 sibpairs and their parents identified a single modifier gene responsible for meconium ileus, an intestinal obstruction phenotype detected in a subset of CF patients. [18] This study illustrates the utility and feasibility of simple genotyping and linkage analysis in identifying modifier loci, especially where the phenotypes under study are binary and a subset of candidate markers can be tested.
Association studies, whether genome-wide or biologically informed candidate approaches, can also reveal modifier loci. To identify genes responsible for disease, association studies commonly compare case and control populations, with and without phenotypic indications of the disease of interest, respectively. To identify modifier loci of diseases of interest, for which causative major loci are commonly established, researchers consider a population composed only of individuals affected by the disease. The distribution of marker genotypes is compared in patients with and without 'modified' phenotypes of interest to detect markers in linkage disequilibrium with potential modifier loci. [25] [26] [27] Association studies for detecting markers linked to modifier loci require large populations and dense maps of genomic markers, especially to detect markers with subtle effects on phenotypes for which quantification or dichotomization may be challenging. Since these types of association studies are constrained to populations of patients affected by a disease, a primary challenge is collecting a population large enough to lend the study sufficient statistical power. [1] While genome-wide approaches to association studies are systematic and comprehensive, candidate approaches spare the researcher some of the statistical burden, for instance by reducing the need for multiple testing corrections, but require informed selection of candidates. [1]
Milet et al. conducted a candidate association study for modifiers of genetic hemochromatosis (GH) to explain its variable penetrance. Because most cases of GH are caused by mutations in the HFE gene, they focus on genes in two candidate pathways, those involved in non-HFE GH and those involved in expression of the iron-regulatory hormone hepcidin, and find evidence that several common SNPs in these candidates modify serum ferritin levels. [25] The identified modifier loci have established roles in iron metabolism, so the results are straightforward to interpret and suggest avenues for future functional study, an advantage of an informed, candidate approach.
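The difference between the expected recessive model and the triallelic model can be made concrete with a deliberately simplified, deterministic toy predicate. It assumes that the triallelic model means two mutant alleles at one locus plus at least one mutant allele at a second locus; real penetrance is not this clean, and the function names and genotype encoding are hypothetical.

```python
def affected_recessive(genotype):
    """Recessive model: two mutant alleles at any single locus suffice.

    `genotype` maps a locus name to its number of mutant alleles (0, 1 or 2).
    """
    return any(count == 2 for count in genotype.values())


def affected_triallelic(genotype):
    """Toy triallelic model: two mutant alleles at one locus AND
    at least one mutant allele at a different locus."""
    for locus, count in genotype.items():
        if count == 2 and any(
            other_count >= 1
            for other_locus, other_count in genotype.items()
            if other_locus != locus
        ):
            return True
    return False


# Hypothetical genotypes at two of the BBS loci named in the text.
two_alleles = {"BBS2": 2, "BBS6": 0}
three_alleles = {"BBS2": 2, "BBS6": 1}

for g in (two_alleles, three_alleles):
    print(g, affected_recessive(g), affected_triallelic(g))
```

Under this sketch, an individual with two mutant alleles at BBS2 alone is predicted affected by the recessive model but not by the triallelic one, which is the kind of discrepancy that pedigree genotyping can expose.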
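The comparison of marker genotype distributions within an affected-only population can be sketched as a 2x2 chi-square test per marker, with a Bonferroni correction for the number of markers tested. This is a minimal illustration, not the analysis used in any of the cited studies; the marker names and counts are hypothetical, and real studies use more elaborate models and adjustments.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). For 1 df the upper-tail probability
    of the chi-square distribution equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, erfc(sqrt(stat / 2))


# Hypothetical counts among patients who all have the disease.
# Order: (modified & carrier, modified & non-carrier,
#         unmodified & carrier, unmodified & non-carrier)
markers = {
    "marker_A": (60, 140, 30, 170),
    "marker_B": (50, 150, 48, 152),
}

alpha = 0.05
threshold = alpha / len(markers)  # Bonferroni-corrected significance level

significant = {}
for name, counts in markers.items():
    stat, p = chi2_2x2(*counts)
    significant[name] = p < threshold
    print(f"{name}: chi2={stat:.2f}, p={p:.4f}, significant={significant[name]}")
```

Because the threshold shrinks as the number of markers grows, a genome-wide scan needs far more patients than a candidate study to detect an effect of the same size, which is the trade-off described above.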
Corvol et al. conducted a whole-genome association study to identify modifiers of CF lung disease, identifying five associated loci. Their methods exemplify the large sample size required to conduct a statistically powerful analysis capable of identifying loci of relatively small effects. To obtain a large population, of 6,365 patients, they combine data on new and previously reported subjects and use linear mixed models that allow for the inclusion of affected siblings. [27] The Genetic Modifiers of Huntington's Disease (GeM-HD) Consortium conducted a similar genome-wide association study, the first to demonstrate modifier loci modulating age of onset phenotype in Huntington's disease (HD). Their analyses were conducted on a population of 1,089 individuals affected by HD, genetic samples for which were collected over a period of almost 30 years. [26]
Genotype-phenotype correlations, and linkage and association analyses in humans can effectively identify modifier genes with statistical support, but do not establish functional or causative effects of modifier genes. While genetic backgrounds cannot be experimentally manipulated in human populations, transgenic expression of modifier genes in animal models has been effectively used to show that variation in specific loci can cause phenotypic variation. For example, Ikeda et al. established that moth1 is a modifier gene of the tub gene, mutations in which cause obesity, retinal degeneration and hearing loss in tubby mice. [28] [3] In the wild-type form, the protein Mtap1a, encoded by neuronally-expressed moth1, is protective against hearing loss. Ikeda et al. showed, by transgenic expression in mice, that sequence polymorphisms in Mtap1a are crucial in causing the hearing-loss phenotype associated with tub gene mutation. [28]
Unexplained variation in human disease phenotype is a complex issue, and with established methodologies for identifying and characterizing modifier genes, it is tempting to pursue them as an explanation, with the potential to lead to a better understanding of the biological mechanisms of disease, such as by characterizing interactions between gene products. [1]
But identifying new modifier loci is challenging, and becomes more so as the variants most easily detectable by current methods are increasingly identified and characterized. Still, for a majority of human genetic diseases, less than 20% of heritability is explained by known variants. [29] Some argue that unexplained heritability may be due to many rare variants, which typical association studies are not designed to detect, or to structural genetic variation. Though the challenging identification of new genetic variants may not be the most efficient approach to developing mechanistically informed drugs, it may lead to the development of safer, more individualized, and more effective interventional strategies, risk assessments and prognoses. [30]
Zuk et al. argue that biomedical research should focus on the interacting molecular mechanisms of genetic variants already discovered. They argue that genetic interactions are common. The widely used estimate of the proportion of heritability explained is the ratio of the narrow-sense heritability attributable to known variants (calculated from their measured effects) to the total narrow-sense heritability (inferred from population data), and the inference of the total assumes that variants contribute additively. For most human traits, this fraction remains below 0.2. They argue that, when variants interact, the additivity assumption overestimates the heritability attributable to all genetic variation underlying disease, and thus underestimates the fraction of heritability attributable to already-discovered variants. Thus, they argue, research should focus on the molecular bases of already-discovered variants, as estimates of discovered sources of heritability are flawed, and there is insufficient evidence that undiscovered genetic modifiers exist. Furthermore, "the proportion of phenotypic variance explained by a variant in the human population is a notoriously poor predictor of the importance of the gene for biology or medicine," and effective therapeutics may target products of genes that explain very little of the clinically observed variation in phenotype. [29]
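The arithmetic behind this argument is easy to sketch. The fraction of heritability explained is a ratio, so an inflated denominator lowers it even when the numerator is measured correctly. All numbers and names below are hypothetical and chosen only to show the direction of the effect; they are not estimates from Zuk et al.

```python
def explained_fraction(h2_known, h2_total):
    """Fraction of total narrow-sense heritability accounted for by
    known variants: h2_known / h2_total."""
    if h2_total <= 0:
        raise ValueError("total heritability must be positive")
    return h2_known / h2_total


# Hypothetical values for a single trait.
h2_known = 0.06           # heritability attributable to discovered variants
h2_total_apparent = 0.40  # total inferred from population data, assuming additivity
h2_total_true = 0.20      # assumed true total if interactions inflate the apparent value

apparent = explained_fraction(h2_known, h2_total_apparent)
corrected = explained_fraction(h2_known, h2_total_true)

print(f"apparent fraction explained:  {apparent:.2f}")
print(f"corrected fraction explained: {corrected:.2f}")
```

With these made-up values, halving the denominator doubles the fraction explained, so the same set of discovered variants accounts for a larger share of heritability than the additive estimate suggests.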