Gene polymorphism

Genes which control hair colour are polymorphic.

A gene is said to be polymorphic if more than one allele occupies that gene's locus within a population. [1] In addition to having more than one allele at a specific locus, each allele must also occur in the population at a rate of at least 1% to generally be considered polymorphic. [2]
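The 1% criterion can be illustrated with a minimal sketch. It assumes diploid genotypes recorded as pairs of allele labels, and it reads the criterion as "at least two alleles each reach the threshold". The function names and the sample counts are hypothetical and do not come from any cited source.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} from a list of diploid genotypes (allele pairs)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def is_polymorphic(genotypes, threshold=0.01):
    """True if at least two alleles each occur at or above the threshold frequency."""
    freqs = allele_frequencies(genotypes)
    common = [allele for allele, f in freqs.items() if f >= threshold]
    return len(common) >= 2

# Hypothetical sample of 100 individuals typed at one locus.
sample = [("A", "A")] * 90 + [("A", "G")] * 8 + [("G", "G")] * 2

print(allele_frequencies(sample))
print(is_polymorphic(sample))
```

In this made-up sample the G allele accounts for 12 of 200 allele copies, so the locus passes the threshold. A variant seen once among 1,000 individuals would not.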

Gene polymorphisms can occur in any region of the genome. The majority of polymorphisms are silent, meaning they do not alter the function or expression of a gene. [3] Some polymorphisms are visible. For example, in dogs the E locus can have any of five different alleles, known as E, Em, Eg, Eh, and e. [4] Varying combinations of these alleles contribute to the pigmentation and patterns seen in dog coats. [5]

A polymorphic variant of a gene can lead to abnormal expression of the gene or to production of an abnormal form of the protein; this abnormality may cause or be associated with disease. For example, a polymorphic variant of the gene encoding the enzyme CYP4A11, in which cytosine replaces thymidine at nucleotide position 8590 of the gene, encodes a CYP4A11 protein in which serine replaces phenylalanine at amino acid position 434. [6] This variant protein has reduced enzyme activity in metabolizing arachidonic acid to the blood pressure-regulating eicosanoid 20-hydroxyeicosatetraenoic acid. A study has shown that humans bearing this variant in one or both of their CYP4A11 genes have an increased incidence of hypertension, ischemic stroke, and coronary artery disease. [6]
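How a single-nucleotide change can swap one amino acid for another can be sketched with a fragment of the standard genetic code. In that code both phenylalanine codons (TTT and TTC) differ from a serine codon (TCT and TCC) only at the middle base. Which codon actually occurs at position 434 of CYP4A11 is not stated here, so both are shown. The helper names are hypothetical.

```python
# Partial codon table: only the codons needed for this illustration.
CODON_TABLE = {"TTT": "Phe", "TTC": "Phe", "TCT": "Ser", "TCC": "Ser"}

def substitute(codon, position, base):
    """Return the codon with the base at a 0-based position replaced."""
    return codon[:position] + base + codon[position + 1:]

def missense_effect(codon, position, base):
    """Return (original amino acid, new amino acid) for a single-base substitution."""
    mutated = substitute(codon, position, base)
    return CODON_TABLE[codon], CODON_TABLE[mutated]

# Replace the middle base (index 1) of each phenylalanine codon with C.
for codon in ("TTT", "TTC"):
    print(codon, "->", substitute(codon, 1, "C"), missense_effect(codon, 1, "C"))
```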

The genes coding for the major histocompatibility complex (MHC) are the most polymorphic genes known. MHC molecules are involved in the immune system and interact with T-cells. There are more than 32,000 different alleles of human MHC class I and II genes, and it has been estimated that there are 200 variants at the HLA-B and HLA-DRB1 loci alone. [7]

Some polymorphism may be maintained by balancing selection.

Differences between gene polymorphism and mutation

A rule of thumb that is sometimes used is to classify genetic variants that occur below 1% allele frequency as mutations rather than polymorphisms. [8] However, since polymorphisms may occur at low allele frequency, this is not a reliable way to tell new mutations from polymorphisms. [9] A mutation is a change to an inherited genetic sequence.

In the case of silent mutations there is no change in fitness, so the selective pressures that would otherwise shift a population away from Hardy-Weinberg equilibrium have no impact on the accumulation of silent polymorphisms over time. Most often, a polymorphism is a variation in a single nucleotide (a single-nucleotide polymorphism, or SNP), but it can also be an insertion or deletion of one or more nucleotides, or a change in the number of times a short or longer sequence is repeated. Insertions, deletions, and repeat-number changes, like SNPs, are common in parts of DNA that do not directly code for a protein, but they can have major effects on gene expression. [11] [12] Polymorphisms which result in a change in fitness are the grist for the mill of evolution by natural selection. All genetic polymorphisms start out as mutations, but only those that are germline and not lethal can spread through a population.

Polymorphisms are classified based on what happens at the level of the individual mutation in the DNA sequence (or RNA sequence in the case of RNA viruses), and on what effect the mutation has on the phenotype (i.e. silent, or resulting in some change in function or fitness). Polymorphisms are also classified based on whether the change is in the sequence of the resulting protein or in the regulation of the expression of the gene. Regulatory changes typically, but not always, occur at sites upstream of and adjacent to the gene. [13] [11]
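The sequence-level classes described above can be sketched in a few lines. This is a minimal illustration that assumes variants are given as a pair of reference and alternate allele strings and that repeats are counted as the longest run of a given motif. The function names and example sequences are hypothetical.

```python
def classify_variant(ref, alt):
    """Classify a variant by comparing reference and alternate allele strings."""
    if ref == alt:
        return "no change"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "multi-nucleotide substitution"

def count_tandem_repeats(sequence, motif):
    """Return the longest run of consecutive copies of motif found in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        run, pos = 0, start
        # Walk forward one motif length at a time while the motif keeps matching.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best

print(classify_variant("C", "T"))
print(classify_variant("A", "ATG"))
print(classify_variant("ATG", "A"))
print(count_tandem_repeats("GGCACACACATT", "CA"))
```

Two individuals whose sequences give different repeat counts for the same motif at the same locus would carry different alleles of a repeat polymorphism.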

Identification

Polymorphisms can be identified in the laboratory using a variety of methods. Many methods employ PCR to amplify the sequence of a gene. Once amplified, polymorphisms and mutations in the sequence can be detected by DNA sequencing, either directly or after screening for variation with a method such as single strand conformation polymorphism analysis. [14]
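Once a region has been amplified and sequenced, detecting a substitution amounts to comparing the sample sequence with a reference. The sketch below assumes the two sequences are already aligned and of equal length, which real pipelines achieve with a separate alignment step. The sequences and the function name are hypothetical.

```python
def find_mismatches(reference, sample):
    """Return (1-based position, reference base, sample base) for each differing site."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]

# Hypothetical aligned sequences differing at a single site.
reference = "ATGGCTTCAGGA"
sample = "ATGGCTCCAGGA"

print(find_mismatches(reference, sample))
```

A site-by-site comparison like this only reveals substitutions. Insertions and deletions shift the downstream sequence and need an alignment that allows gaps.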

Types

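A polymorphism can be any sequence difference. Examples include single-nucleotide polymorphisms (SNPs), [15] small insertions and deletions (indels), [16] [17] and polymorphic repeats such as minisatellites and microsatellites. [18]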

Clinical significance

Many different human diseases result from polymorphisms. Polymorphisms also play a significant role as risk factors for the development of disease. [19] Finally, polymorphisms in proteins involved in drug metabolism (especially cytochrome P450 isoenzymes), in proteins involved in drug transport (whether into the body, into protected areas of the body like the brain, or out of the body by secretion), and in specific cell surface receptor proteins alter the effect of various drugs. [13] This is a rapidly evolving area of drug safety research. [20] [21] Resources such as HapMap, dbSNP, Ensembl, DNA Data Bank of Japan, DrugBank, Kyoto Encyclopedia of Genes and Genomes (KEGG), GenBank, and other parts of the International Nucleotide Sequence Database Collaboration have become crucial in personalized medicine, bioinformatics, and pharmacogenomics. [22]

Lung cancer

Polymorphisms have been discovered in multiple XPD exons. XPD refers to "xeroderma pigmentosum group D" and is involved in a DNA repair mechanism used during DNA replication. XPD works by cutting and removing segments of DNA that have been damaged by factors such as cigarette smoking and inhalation of other environmental carcinogens. [23] Asp312Asn and Lys751Gln are the two common polymorphisms of XPD that result in a change in a single amino acid. [24] The Asn and Gln variant alleles have been associated with reduced DNA repair efficiency. [25] Several studies have been conducted to see if this diminished capacity to repair DNA is related to an increased risk of lung cancer. These studies examined the XPD gene in lung cancer patients of varying age, gender, race, and pack-years. The studies provided mixed results: some concluded that individuals who are homozygous for the Asn allele or homozygous for the Gln allele had an increased risk of developing lung cancer, [26] while others found no statistically significant association between either polymorphism and susceptibility to lung cancer among smokers. [27] Research continues to be conducted to determine the relationship between XPD polymorphisms and lung cancer risk.

As a cornerstone of personalized medicine for cancer, sequence analysis is becoming increasingly important for understanding the specific mutations involved in an individual's cancer, for example when selecting molecular targets such as mutated receptors. It is also important for understanding the polymorphisms the individual has inherited, which play important roles in diagnosis, prognosis, and treatment. One example is the treatment of leukemia with 6-mercaptopurine, where toxicity largely depends on polymorphisms in multiple genes involved in its metabolism. [28]

Asthma

Asthma is an inflammatory disease of the lungs, and more than 100 loci have been identified as contributing to the development and severity of the condition. [29] Asthma-associated genes were first identified in small numbers by traditional linkage analysis, and more have since been identified using genome-wide association studies (GWAS). A number of studies have looked into various polymorphisms of asthma-associated genes and how those polymorphisms interact with the carrier's environment. One example is the gene CD14, which is known to have a polymorphism that is associated with increased amounts of CD14 protein as well as reduced levels of serum IgE. [30] A study of 624 children looked at their serum IgE levels in relation to the polymorphism in CD14. The study found that serum IgE levels in children with the C allele of the CD14/-260 polymorphism differed according to the type of allergens they were regularly exposed to. [31] Children who were in regular contact with house pets showed higher serum levels of IgE, while children who were regularly exposed to stable animals showed lower serum levels of IgE. [31] Continued research into gene-environment interactions may lead to more specialized treatment plans based on an individual's surroundings.

References

  1. "Genetic polymorphism - Biology-Online Dictionary | Biology-Online Dictionary". September 2020.
  2. "Genetic Testing Report-Glossary". National Human Genome Research Institute (NHGRI). Retrieved 2017-11-08.
  3. Chanock, Stephen (2017-05-22). "Technologic Issues in GWAS and Follow-up Studies" (PDF). Genome.gov. Archived from the original (PDF) on 2018-08-22. Retrieved 2017-11-30.
  4. "Dog Coat Colour Genetics".
  5. "E-Locus (Recessive Yellow, Melanistic Mask Allele)". www.animalgenetics.us. Archived from the original on 2017-10-30. Retrieved 2017-11-08.
  6. Wu CC, Gupta T, Garcia V, Ding Y, Schwartzman ML (2014). "20-HETE and blood pressure regulation: clinical implications". Cardiology in Review. 22 (1): 1–12. doi:10.1097/CRD.0b013e3182961659. PMC 4292790. PMID 23584425.
  7. Bodmer, J. G.; Marsh, S. G. E.; Albert, E. D.; Bodmer, W. F.; Bontrop, R. E.; Dupont, B.; Erlich, H. A.; Hansen, J. A.; Mach, B. (1999-04-01). "Nomenclature for factors of the HLA system, 1998". European Journal of Immunogenetics. 26 (2–3): 81–116. doi:10.1046/j.1365-2370.1999.00159.x. ISSN 1365-2370. PMID 10331156.
  8. "Genetic Polymorphism and How It Lasts over Generations".
  9. Karki, Roshan; Pandya, Deep; Elston, Robert C.; Ferlini, Cristiano (2015-07-15). "Defining "mutation" and "polymorphism" in the era of personal genomics". BMC Medical Genomics. 8: 37. doi:10.1186/s12920-015-0115-z. ISSN 1755-8794. PMC 4502642. PMID 26173390.
  10. Karki, Roshan; Pandya, Deep; Elston, Robert C.; Ferlini, Cristiano (2015-07-15). "Defining "mutation" and "polymorphism" in the era of personal genomics". BMC Medical Genomics. 8: 37. doi:10.1186/s12920-015-0115-z. ISSN 1755-8794. PMC 4502642. PMID 26173390.
  11. Chorley, Brian N.; Wang, Xuting; Campbell, Michelle R.; Pittman, Gary S.; Noureddine, Maher A.; Bell, Douglas A. (2008). "Discovery and verification of functional single nucleotide polymorphisms in regulatory genomic regions: Current and developing technologies". Mutation Research. 659 (1–2): 147–157. doi:10.1016/j.mrrev.2008.05.001. ISSN 0027-5107. PMC 2676583. PMID 18565787.
  12. Albert, Paul R. (November 2011). "What is a functional genetic polymorphism? Defining classes of functionality". Journal of Psychiatry & Neuroscience. 36 (6): 363–365. doi:10.1503/jpn.110137. ISSN 1180-4882. PMC 3201989. PMID 22011561.
  13. Sadee, W; Wang, D; Papp, AC; Pinsonneault, JK; Smith, RM; Moyer, RA; Johnson, AD (March 2011). "Pharmacogenomics of the RNA World: Structural RNA Polymorphisms in Drug Therapy". Clinical Pharmacology and Therapeutics. 89 (3): 355–365. doi:10.1038/clpt.2010.314. ISSN 0009-9236. PMC 3251919. PMID 21289622.
  14. Bull, Laura (2013). Genetics, Mutations, and Polymorphisms. Landes Bioscience.
  15. "What are single nucleotide polymorphisms (SNPs)?".
  16. Mills RE, Pittard WS, Mullaney JM, Farooq U, Creasy TH, Mahurkar AA, Kemeza DM, Strassler DS, Ponting CP, Webber C, Devine SE (2011). "Natural genetic variation caused by small insertions and deletions in the human genome". Genome Research. 21 (6): 830–9. doi:10.1101/gr.115907.110. PMC 3106316. PMID 21460062.
  17. Mullaney JM, Mills RE, Pittard WS, Devine SE (2010). "Small insertions and deletions (INDELs) in human genomes". Human Molecular Genetics. 19 (R2): R131–6. doi:10.1093/hmg/ddq400. PMC 2953750. PMID 20858594.
  18. "Difference Between Minisatellite and Microsatellite".
  19. "Polygenic Risk Scores". www.genome.gov. Retrieved 2024-02-17.
  20. Center for Drug Evaluation and Research (2024-02-02). "Table of Pharmacogenomic Biomarkers in Drug Labeling". FDA.
  21. "Genomics and Medicine". www.genome.gov. Retrieved 2024-02-17.
  22. Mizrachi, Ilene (2007-08-22), "GenBank: The Nucleotide Sequence Database", The NCBI Handbook [Internet], National Center for Biotechnology Information (US), retrieved 2024-02-17
  23. Hou, S.-M. (2002-04-01). "The XPD variant alleles are associated with increased aromatic DNA adduct level and lung cancer risk". Carcinogenesis. 23 (4): 599–603. doi:10.1093/carcin/23.4.599. ISSN 0143-3334. PMID 11960912.
  24. Qin, Qin; Zhang, Chi; Yang, Xi; Zhu, Hongcheng; Yang, Baixia; Cai, Jing; Cheng, Hongyan; Ma, Jianxin; Lu, Jing (2013-11-15). "Polymorphisms in XPD Gene Could Predict Clinical Outcome of Platinum-Based Chemotherapy for Non-Small Cell Lung Cancer Patients: A Meta-Analysis of 24 Studies". PLOS ONE. 8 (11): e79864. Bibcode:2013PLoSO...879864Q. doi:10.1371/journal.pone.0079864. ISSN 1932-6203. PMC 3829883. PMID 24260311.
  25. Benhamou S, Sarasin A (2005). "ERCC2/XPD gene polymorphisms and lung cancer: a HuGE review". American Journal of Epidemiology. 161 (1): 1–14. doi:10.1093/aje/kwi018. PMID 15615908.
  26. Liang, Gang; Xing, Deyin; Miao, Xiaoping; Tan, Wen; Yu, Chunyuan; Lu, Wenfu; Lin, Dongxin (2003-07-10). "Sequence variations in the DNA repair gene XPD and risk of lung cancer in a Chinese population". International Journal of Cancer. 105 (5): 669–673. doi:10.1002/ijc.11136. ISSN 1097-0215. PMID 12740916.
  27. Misra, R Rita; Ratnasinghe, Duminda; Tangrea, Joseph A; Virtamo, Jarmo; Andersen, Mark R; Barrett, Michael; Taylor, Philip R; Albanes, Demetrius (2003). "Polymorphisms in the DNA repair genes XPD, XRCC1, XRCC3, and APE/ref-1, and the risk of lung cancer among male smokers in Finland". Cancer Letters. 191 (2): 171–178. doi:10.1016/s0304-3835(02)00638-9. PMID 12618330.
  28. Moradveisi, Borhan; Muwakkit, Samar; Zamani, Fatemeh; Ghaderi, Ebrahim; Mohammadi, Ebrahim; Zgheib, Nathalie K. (2019-08-27). "ITPA, TPMT, and NUDT15 Genetic Polymorphisms Predict 6-Mercaptopurine Toxicity in Middle Eastern Children With Acute Lymphoblastic Leukemia". Frontiers in Pharmacology. 10: 916. doi:10.3389/fphar.2019.00916. ISSN 1663-9812. PMC 6718715. PMID 31507415.
  29. March ME, Sleiman PM, Hakonarson H (2013). "Genetic polymorphisms and associated susceptibility to asthma". International Journal of General Medicine. 6: 253–65. doi:10.2147/IJGM.S28156. PMC 3636804. PMID 23637549.
  30. Baldini, M.; Lohman, I. C.; Halonen, M.; Erickson, R. P.; Holt, P. G.; Martinez, F. D. (May 1999). "A Polymorphism* in the 5' flanking region of the CD14 gene is associated with circulating soluble CD14 levels and with total serum immunoglobulin E". American Journal of Respiratory Cell and Molecular Biology. 20 (5): 976–983. doi:10.1165/ajrcmb.20.5.3494. ISSN 1044-1549. PMID 10226067.
  31. Eder, Waltraud; Klimecki, Walt; Yu, Lizhi; von Mutius, Erika; Riedler, Josef; Braun-Fahrländer, Charlotte; Nowak, Dennis; Martinez, Fernando D.; Allergy And Endotoxin Alex Study Team (September 2005). "Opposite effects of CD 14/-260 on serum IgE levels in children raised in different environments". The Journal of Allergy and Clinical Immunology. 116 (3): 601–607. doi:10.1016/j.jaci.2005.05.003. ISSN 0091-6749. PMID 16159630.