Rare variant (genetics)

A rare variant is a genetic variant that occurs at low frequency in a population. [1] Rare variants play a significant role in both complex and Mendelian disease and account for a portion of the missing heritability of complex diseases. The theoretical argument for this role is that alleles which strongly predispose an individual to disease are kept at low frequencies in populations by purifying selection. [2] Rare variants are increasingly being studied as a consequence of whole-exome and whole-genome sequencing efforts. Although each of these variants is individually infrequent, rare variants are collectively numerous in human populations, and some are unique to specific populations. They are more likely to be deleterious than common variants, as a result of rapid population growth and weak purifying selection. [3] They are suspected of acting independently, or together with common variants, to cause disease states. [4]
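As a rough illustration of what "low frequency" means in practice, the sketch below computes the minor allele frequency (MAF) of a single variant from per-individual genotype counts and flags it as rare. It assumes diploid genotypes coded as 0, 1 or 2 copies of the alternate allele, with `None` for missing calls. The 1% cutoff is only a commonly used convention chosen for this example, not a value stated in the article, and the function names are hypothetical.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of one variant.

    genotypes: per-individual alternate-allele counts (0, 1 or 2),
    with None for missing calls. Diploid samples are assumed.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this variant")
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def is_rare(genotypes, threshold=0.01):
    """Flag a variant as rare: observed at least once, but below the cutoff.

    threshold=0.01 is an illustrative assumption, not a fixed definition.
    """
    maf = minor_allele_frequency(genotypes)
    return 0.0 < maf < threshold


# One heterozygous carrier among 100 individuals: 1 of 200 alleles.
carrier_once = [0] * 99 + [1]
# Ten heterozygous carriers among 100 individuals: 10 of 200 alleles.
carrier_often = [0] * 90 + [1] * 10
print(is_rare(carrier_once), is_rare(carrier_often))
```

Because the frequency is estimated from a sample, whether a variant counts as rare depends on the cohort and on the population it was drawn from.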

Methods of discovery

Some methods, such as genetic burden tests, have been developed specifically to study the genetic association of rare variants. [5] These methods aggregate rare variants over genetic regions, such as genes or whole pathways, and evaluate the cumulative effect of multiple genetic variants. Aggregation may increase statistical power when multiple variants in the region are associated with a disease or a trait. In addition, a region-based or gene-based test performs far fewer tests than a genome-wide association study, so its multiple-hypothesis correction is less stringent than the genome-wide significance threshold. [6] Examples of these methods include SKAT, [7] SKAT-O, [5] the ARIEL test, [8] aSUM [9] and STAAR. [10] SNP annotations help to prioritize rare functional variants, and incorporating them can boost the power of rare-variant association analysis in whole-exome and whole-genome sequencing studies. [11] [12]
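The aggregation idea can be sketched with a deliberately simple burden test. The example collapses all rare variants in one region into a per-individual count of minor alleles, then compares the mean count between cases and controls with a label-permutation test. This is an illustrative unweighted burden test under assumed inputs, not an implementation of SKAT, SKAT-O, ARIEL, aSUM or STAAR; the function names and the toy data are hypothetical.

```python
import random


def burden_scores(genotypes):
    """Collapse a region: sum minor-allele counts across its variants.

    genotypes: one row per individual, one column per rare variant,
    each entry 0, 1 or 2 (assumed coding).
    """
    return [sum(row) for row in genotypes]


def burden_permutation_test(genotypes, is_case, n_perm=2000, seed=0):
    """Return (observed mean burden difference, two-sided permutation p-value)."""
    scores = burden_scores(genotypes)
    n_cases = sum(1 for c in is_case if c)
    n_controls = len(is_case) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    total = sum(scores)

    def mean_difference(labels):
        case_sum = sum(s for s, c in zip(scores, labels) if c)
        return case_sum / n_cases - (total - case_sum) / n_controls

    observed = mean_difference(is_case)
    rng = random.Random(seed)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any genotype-phenotype link
        if abs(mean_difference(labels)) >= abs(observed) - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy region with 6 rare variants: each case carries a different one,
# and no control carries any.
cases = [[1 if j == i else 0 for j in range(6)] for i in range(6)]
controls = [[0] * 6 for _ in range(6)]
diff, p_value = burden_permutation_test(cases + controls, [True] * 6 + [False] * 6)
print(diff, p_value)
```

In the toy data no single variant is seen more than once, so a per-variant test has almost nothing to work with, whereas the collapsed score separates the two groups. A plain count of this kind assumes every variant acts in the same direction; a region that mixes risk-increasing and protective variants would weaken it.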

References

  1. Erjavec SO, Gelfman S, Abdelaziz AR, Lee EY, Monga I, Alkelai A, Ionita-Laza I, Petukhova L, Christiano AM (Feb 2022). "Whole exome sequencing in Alopecia Areata identifies rare variants in KRT82". Nat Commun. 13 (1): 800. Bibcode:2022NatCo..13..800E. doi:10.1038/s41467-022-28343-3. PMC 8831607. PMID 35145093.
  2. Goldstein, DB; Allen, A; Keebler, J; Margulies, EH; Petrou, S; Petrovski, S; Sunyaev, S (July 2013). "Sequencing studies in human genetics: design and interpretation". Nature Reviews Genetics. 14 (7): 460–70. doi:10.1038/nrg3455. PMC 4117319. PMID 23752795.
  3. Nelson, M. R.; Wegmann, D.; Ehm, M. G.; Kessner, D.; St. Jean, P.; Verzilli, C.; Shen, J.; Tang, Z.; Bacanu, S.-A.; Fraser, D.; Warren, L.; Aponte, J.; Zawistowski, M.; Liu, X.; Zhang, H.; Zhang, Y.; Li, J.; Li, Y.; Li, L.; Woollard, P.; Topp, S.; Hall, M. D.; Nangle, K.; Wang, J.; Abecasis, G.; Cardon, L. R.; Zollner, S.; Whittaker, J. C.; Chissoe, S. L.; Novembre, J.; Mooser, V. (17 May 2012). "An Abundance of Rare Functional Variants in 202 Drug Target Genes Sequenced in 14,002 People". Science. 337 (6090): 100–104. Bibcode:2012Sci...337..100N. doi:10.1126/science.1217876. PMC 4319976. PMID 22604722.
  4. Panoutsopoulou, K.; Tachmazidou, I.; Zeggini, E. (6 August 2013). "In search of low-frequency and rare variants affecting complex traits". Human Molecular Genetics. 22 (R1): R16–R21. doi:10.1093/hmg/ddt376. PMC 3782074. PMID 23922232.
  5. Lee, Seunggeun; Emond, Mary J.; Bamshad, Michael J.; Barnes, Kathleen C.; Rieder, Mark J.; Nickerson, Deborah A.; Christiani, David C.; Wurfel, Mark M.; Lin, Xihong (August 2012). "Optimal Unified Approach for Rare-Variant Association Testing with Application to Small-Sample Case-Control Whole-Exome Sequencing Studies". The American Journal of Human Genetics. 91 (2): 224–237. doi:10.1016/j.ajhg.2012.06.007. ISSN 0002-9297. PMC 3415556. PMID 22863193.
  6. Lee, Seunggeun; Abecasis, Gonçalo R.; Boehnke, Michael; Lin, Xihong (July 2014). "Rare-Variant Association Analysis: Study Designs and Statistical Tests". The American Journal of Human Genetics. 95 (1): 5–23. doi:10.1016/j.ajhg.2014.06.009. ISSN 0002-9297. PMC 4085641. PMID 24995866.
  7. Wu, Michael C.; Lee, Seunggeun; Cai, Tianxi; Li, Yun; Boehnke, Michael; Lin, Xihong (July 2011). "Rare-Variant Association Testing for Sequencing Data with the Sequence Kernel Association Test". The American Journal of Human Genetics. 89 (1): 82–93. doi:10.1016/j.ajhg.2011.05.029. ISSN 0002-9297. PMC 3135811. PMID 21737059.
  8. Asimit, Jennifer L.; Day-Williams, Aaron G.; Morris, Andrew P.; Zeggini, Eleftheria (2012). "ARIEL and AMELIA: testing for an accumulation of rare variants using next-generation sequencing data". Human Heredity. 73 (2): 84–94. doi:10.1159/000336982. ISSN 1423-0062. PMC 3477640. PMID 22441326.
  9. Han, Fang; Pan, Wei (June 2010). "A Data-Adaptive Sum Test for Disease Association with Multiple Common or Rare Variants". Human Heredity. 70 (1): 42–54. doi:10.1159/000288704. ISSN 0001-5652. PMC 2912645. PMID 20413981.
  10. Li, Xihao; Li, Zilin; Zhou, Hufeng; Gaynor, Sheila M.; Liu, Yaowu; Chen, Han; Sun, Ryan; Dey, Rounak; Arnett, Donna K.; Aslibekyan, Stella; Ballantyne, Christie M.; Bielak, Lawrence F.; Blangero, John; Boerwinkle, Eric; Bowden, Donald W.; Broome, Jai G.; Conomos, Matthew P.; Correa, Adolfo; Cupples, L. Adrienne; Curran, Joanne E.; Freedman, Barry I.; Guo, Xiuqing; Hindy, George; Irvin, Marguerite R.; Kardia, Sharon L. R.; Kathiresan, Sekar; Khan, Alyna T.; Kooperberg, Charles L.; Laurie, Cathy C.; Liu, X. Shirley; Mahaney, Michael C.; Manichaikul, Ani W.; Martin, Lisa W.; Mathias, Rasika A.; McGarvey, Stephen T.; Mitchell, Braxton D.; Montasser, May E.; Moore, Jill E.; Morrison, Alanna C.; O’Connell, Jeffrey R.; Palmer, Nicholette D.; Pampana, Akhil; Peralta, Juan M.; Peyser, Patricia A.; Psaty, Bruce M.; Redline, Susan; Rice, Kenneth M.; Rich, Stephen S.; Smith, Jennifer A.; Tiwari, Hemant K.; Tsai, Michael Y.; Vasan, Ramachandran S.; Wang, Fei Fei; Weeks, Daniel E.; Weng, Zhiping; Wilson, James G.; Yanek, Lisa R.; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium; TOPMed Lipids Working Group; Neale, Benjamin M.; Sunyaev, Shamil R.; Abecasis, Gonçalo R.; Rotter, Jerome I.; Willer, Cristen J.; Peloso, Gina M.; Natarajan, Pradeep; Lin, Xihong (Sep 2020). "Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale". Nature Genetics. 52 (9): 969–983. doi:10.1038/s41588-020-0676-4. ISSN 1061-4036. PMC 7483769. PMID 32839606.
  11. Li, Zilin; Li, Xihao; Zhou, Hufeng; Gaynor, Sheila M.; Selvaraj, Margaret Sunitha; Arapoglou, Theodore; Quick, Corbin; Liu, Yaowu; Chen, Han; Sun, Ryan; Dey, Rounak; Arnett, Donna K.; Auer, Paul L.; Bielak, Lawrence F.; Bis, Joshua C.; Blackwell, Thomas W.; Blangero, John; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer A.; Cade, Brian E.; Conomos, Matthew P.; Correa, Adolfo; Cupples, L. Adrienne; Curran, Joanne E.; de Vries, Paul S.; Duggirala, Ravindranath; Franceschini, Nora; Freedman, Barry I.; Göring, Harald H. H.; Guo, Xiuqing; Kalyani, Rita R.; Kooperberg, Charles; Kral, Brian G.; Lange, Leslie A.; Lin, Bridget; Manichaikul, Ani; Martin, Lisa W.; Mathias, Rasika A.; Meigs, James B.; Mitchell, Braxton D.; Montasser, May E.; Morrison, Alanna C.; Naseri, Take; O’Connell, Jeffrey R.; Palmer, Nicholette D.; Reupena, Muagututi’a Sefuiva; Rice, Kenneth M.; Rich, Stephen S.; Smith, Jennifer A.; Taylor, Kent D.; Taub, Margaret A.; Vasan, Ramachandran S.; Weeks, Daniel E.; Wilson, James G.; Yanek, Lisa R.; Zhao, Wei; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium; TOPMed Lipids Working Group; Rotter, Jerome I.; Willer, Cristen; Natarajan, Pradeep; Peloso, Gina M.; Lin, Xihong (2022). "A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies". Nature Methods. 19 (12): 1599–1611. doi:10.1038/s41592-022-01640-x. PMC 10008172. PMID 36303018. S2CID 243873361.
  12. Li, Xihao; Quick, Corbin; Zhou, Hufeng; Gaynor, Sheila M.; Liu, Yaowu; Chen, Han; Selvaraj, Margaret Sunitha; Sun, Ryan; Dey, Rounak; Arnett, Donna K.; Bielak, Lawrence F.; Bis, Joshua C.; Blangero, John; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer A.; Cade, Brian E.; Correa, Adolfo; Cupples, L. Adrienne; Curran, Joanne E.; de Vries, Paul S.; Duggirala, Ravindranath; Freedman, Barry I.; Göring, Harald H. H.; Guo, Xiuqing; Haessler, Jeffrey; Kalyani, Rita R.; Kooperberg, Charles; Kral, Brian G.; Lange, Leslie A.; Manichaikul, Ani; Martin, Lisa W.; McGarvey, Stephen T.; Mitchell, Braxton D.; Montasser, May E.; Morrison, Alanna C.; Naseri, Take; O’Connell, Jeffrey R.; Palmer, Nicholette D.; Peyser, Patricia A.; Psaty, Bruce M.; Raffield, Laura M.; Redline, Susan; Reiner, Alexander P.; Reupena, Muagututi’a Sefuiva; Rice, Kenneth M.; Rich, Stephen S.; Sitlani, Colleen M.; Smith, Jennifer A.; Taylor, Kent D.; Vasan, Ramachandran S.; Willer, Cristen J.; Wilson, James G.; Yanek, Lisa R.; Zhao, Wei; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium; TOPMed Lipids Working Group; Rotter, Jerome I.; Natarajan, Pradeep; Peloso, Gina M.; Li, Zilin; Lin, Xihong (January 2023). "Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies". Nature Genetics. 55 (1): 154–164. doi:10.1038/s41588-022-01225-6. PMC 10084891.
