European Genome-phenome Archive

Producer European Molecular Biology Laboratory-European Bioinformatics Institute, Centre for Genomic Regulation (United Kingdom and Spain)
Languages English
Access
Cost Free
Coverage
Disciplines Biomedical sciences
Format coverage Datasets
Links
Website EGA Official Portal

The European Genome-phenome Archive (EGA) is a repository for human biomolecular and phenotypic data [1] based in the United Kingdom and Spain. [2] [3] It securely stores all potentially identifiable genetic, phenotypic, and clinical data generated by biomedical research programs. [4]

Contents

As of March 2022, the EGA stores and harvests data from over 4,500 research studies at over 1,000 institutions worldwide. [2]

History

The EGA was launched in 2008 by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) to support the voluntary archiving and dissemination of human genomic data. Such data require secure storage and may be distributed only to authorized researchers, in a manner that "respects the consent agreements signed by the study subjects." The EGA later expanded its scope through collaboration with the Centre for Genomic Regulation (CRG) in Barcelona. [5]

Controlled access

The EGA provides the security needed to regulate access, safeguard patient confidentiality, and make controlled-access data available to the researchers and clinicians authorized to view it. Decisions about data access, however, are made not by the EGA but by the appropriate data access-granting organization (DAO). [6]


References

  1. Fernández-Orth D, Lloret-Villas A, De Argila JR (June 2019). "European Genome-Phenome Archive (EGA) - Granular Solutions for the Next 10 Years". 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS): 4–6. doi:10.1109/CBMS.2019.00011. ISBN 978-1-7281-2286-1. S2CID 199490012.
  2. Morello F. "Working together towards a federated European Genome-phenome Archive for publishing and re-using sensitive research data". CSC. Retrieved 2022-09-21.
  3. Lappalainen I, Almeida-King J, Kumanduri V, Senf A, Spalding JD, Ur-Rehman S, et al. (July 2015). "The European Genome-phenome Archive of human data consented for biomedical research". Nature Genetics. 47 (7): 692–695. doi:10.1038/ng.3312. PMC 5426533. PMID 26111507.
  4. Freeberg MA, Fromont LA, D'Altri T, Romero AF, Ciges JI, Jene A, et al. (January 2022). "The European Genome-phenome Archive in 2021". Nucleic Acids Research. 50 (D1): D980–D987. doi:10.1093/nar/gkab1059. PMC 8728218. PMID 34791407.
  5. Lappalainen I, Almeida-King J, Kumanduri V, Senf A, Spalding JD, Ur-Rehman S, et al. (July 2015). "The European Genome-phenome Archive of human data consented for biomedical research". Nature Genetics. 47 (7): 692–695. doi:10.1038/ng.3312. PMC 5426533. PMID 26111507.
  6. "The European Genome-phenome Archive". www.re3data.org. Retrieved 2022-09-21.