PhenX Toolkit

The PhenX Toolkit
Type of site: Content site
Available in: English
Owner: NHGRI
Created by: RTI International
URL: www.phenxtoolkit.org
Commercial: No
Registration: Optional
Users: 3,700+ registered, 1.6M+ visits
Launched: 6 February 2009
Current status: Active
Content license: Public domain

The PhenX Toolkit is a web-based catalog of high-priority measures related to complex diseases, phenotypic traits and environmental exposures. The measures were selected by working groups of experts using a consensus process. [1] The Toolkit's mission is to provide investigators with standard measurement protocols for use in genomic, epidemiologic, clinical and translational research. Using PhenX measures makes it easier to combine data from different studies and to expand a study design beyond its primary research focus. [2] [3] [4] The Toolkit is funded by the National Human Genome Research Institute (NHGRI) of the National Institutes of Health (NIH), with co-funding from the Office of the Director (OD), the National Institute of Neurological Disorders and Stroke (NINDS), and the National Heart, Lung, and Blood Institute (NHLBI). [5] [6] Funded continuously since 2007, PhenX has received support from a variety of NIH institutes, including the National Institute on Drug Abuse (NIDA), the National Institute of Mental Health (NIMH), the National Cancer Institute (NCI) and the National Institute on Minority Health and Health Disparities (NIMHD). [7] [8] The PhenX Toolkit is available to the scientific community at no cost.

For genome-wide association studies (GWAS) and other studies involving human subjects, the use of standard measures can facilitate cross-study analyses. [9] [10] [11] Such analyses compare independent findings to validate results or combine studies to increase sample size and statistical power. [12] [13] This increased power makes it possible to identify more subtle and complex associations such as gene-gene and gene-environment interactions. [14] [15] [16] [17] PhenX is an NIH Common Data Element (CDE) Repository project and supports the NIH Strategic Plan for Data Science, which promotes FAIR principles (i.e., Findable, Accessible, Interoperable, Reusable) to facilitate data sharing. [18]

In 2020, PhenX collaborated with the NIH's Public Health Emergency and Disaster Research Response Program (DR2) to collect and review research protocols for epidemiologists, clinicians and other scientists studying COVID-19. [19] In October 2020, the Toolkit released six specialty collections of protocols to promote the use of CDEs in COVID-19 research. [20]

References

  1. Maiese DR, Hendershot TP, Strader LC, et al. (2013). PhenX—Establishing a consensus process to select common measures for collaborative research (Technical report). Research Triangle Park, NC: RTI Press. doi:10.3768/rtipress.2013.mr.0027.1310. MR-0027-1310.
  2. Hamilton CM, Strader LC, Pratt JG, et al. (1 August 2011). "The PhenX Toolkit: Get the Most From Your Measures". American Journal of Epidemiology. 174 (3): 253–260. doi:10.1093/aje/kwr193. PMC 3141081. PMID 21749974.
  3. Hendershot T, Pan H, Haines J, et al. (October 2011). "Using the PhenX Toolkit to Add Standard Measures to Your Study". Current Protocols in Human Genetics. 1 (21). Unit 1.21. doi:10.1002/0471142905.hg0121s71. PMID 21975939. S2CID 7273784.
  4. Fortier, Isabel; Doiron, Dany; Burton, Paul; Raina, Parminder (2011-08-01). "Invited Commentary: Consolidating Data Harmonization—How to Obtain Quality and Applicability?". American Journal of Epidemiology. 174 (3): 261–264. doi:10.1093/aje/kwr194. ISSN 0002-9262. PMID 21749975.
  5. RTI (January 30, 2023). "RTI's PhenX Toolkit to be established as biomedical knowledgebase with latest award". RTI International. Retrieved 10 February 2023.
  6. "Resource Summaries". wayback.archive-it.org. Retrieved 2021-05-21.
  7. Eckman, JR; et al. (December 26, 2017). "Standard measures for sickle cell disease research: the PhenX Toolkit sickle cell disease collections" (PDF). Blood Advances. 1 (27): 2703–2711. doi:10.1182/bloodadvances.2017010702. PMC 5745137. PMID 29296922. Retrieved 9 May 2018.
  8. "The PhenX Toolkit". RTI. 2021-12-15. Retrieved 2023-02-10.
  9. Pan H, Tryka KA, Vreeman DJ, et al. (May 2012). "Using PhenX Measures to Identify Opportunities for Cross-Study Analysis". Human Mutation. 33 (5): 849–857. doi:10.1002/humu.22074. PMC 3780790. PMID 22415805.
  10. Manolio TA (February 2009). "Collaborative Genome-wide Association Studies of Diverse Diseases: Programs of the NHGRI's Office of Population Genomics". Pharmacogenomics. 10 (2): 235–241. doi:10.2217/14622416.10.2.235. PMC 2714942. PMID 19207024.
  11. Sheehan, Jerry; Hirschfeld, Steven; Foster, Erin; Ghitza, Udi; Goetz, Kerry; Karpinski, Joanna; Lang, Lisa; Moser, Richard P.; Odenkirchen, Joanne; Reeves, Dianne; Rubinstein, Yaffa (December 2016). "Improving the value of clinical research through the use of Common Data Elements (CDEs)". Clinical Trials (London, England). 13 (6): 671–676. doi:10.1177/1740774516653238. ISSN 1740-7745. PMC 5133155. PMID 27311638.
  12. Sheehan, Jerry; Hirschfeld, Steven; Foster, Erin; Ghitza, Udi; Goetz, Kerry; Karpinski, Joanna; Lang, Lisa; Moser, Richard P.; Odenkirchen, Joanne; Reeves, Dianne; Rubinstein, Yaffa (December 2016). "Improving the value of clinical research through the use of Common Data Elements (CDEs)". Clinical Trials (London, England). 13 (6): 671–676. doi:10.1177/1740774516653238. ISSN 1740-7745. PMC 5133155. PMID 27311638.
  13. Fortier, Isabel; Dragieva, Nataliya; Saliba, Matilda; Craig, Camille; Robson, Paula J. (2019-04-16). "Harmonization of the Health and Risk Factor Questionnaire data of the Canadian Partnership for Tomorrow Project: a descriptive analysis". CMAJ Open. 7 (2): E272–E282. doi:10.9778/cmajo.20180062. ISSN 2291-0026. PMC 6498449. PMID 31018973.
  14. Barrett JC, Hansoul S, Nicolae DL, et al. (August 2008). "Genome-wide association defines more than 30 distinct susceptibility loci for Crohn's disease". Nature Genetics. 40 (8): 955–962. doi:10.1038/ng.175. PMC 2574810. PMID 18587394.
  15. Barrett JC, Clayton DG, Concannon P, et al. (June 2009). "Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes" (PDF). Nature Genetics. 41 (6): 703–707. doi:10.1038/ng.381. PMC 2889014. PMID 19430480.
  16. Cooper JD, Smyth DJ, Smiles AM, et al. (December 2008). "Meta-analysis of genome-wide association study data identifies additional type 1 diabetes risk loci". Nature Genetics. 40 (12): 1399–1401. doi:10.1038/ng.249. PMC 2635556. PMID 18978792.
  17. Hunter DJ (April 2005). "Gene-environment Interactions in Human Diseases". Nature Reviews Genetics. 6 (4): 287–298. doi:10.1038/nrg1578. PMID 15803198. S2CID 27052260.
  18. "NIH Common Data Elements (CDE) Repository". cde.nlm.nih.gov. Retrieved 2021-05-21.
  19. "Covid-19 researchers gain quick access to surveys, protocols (Environmental Factor, May 2020)". National Institute of Environmental Health Sciences. Retrieved 2021-05-20.
  20. Krzyzanowski, Michelle C.; Terry, Ian; Williams, David; West, Pat; Gridley, Lauren N.; Hamilton, Carol M. (2021). "The PhenX Toolkit: Establishing Standard Measures for COVID-19 Research". Current Protocols. 1 (4): e111. doi:10.1002/cpz1.111. ISSN 2691-1299. PMC 8206667. PMID 33905618.