Genomic and medical data

Genomic and medical data refers to an area within genetics that concerns the recording, sequencing and analysis of an organism's genome.

Background

The foundations of genomics were laid by Fred Sanger, who first sequenced the complete genomes of a virus and of a mitochondrion. The term genomics was coined in 1986 by Tom Roderick, a geneticist at the Jackson Laboratory in Maine, during a meeting about the mapping of the human genome.

The genome is the entire DNA content present within one cell of an organism. Experts in genomics strive to determine complete DNA sequences and perform genetic mapping to help understand disease. Genomics involves the study of all genes at the DNA (deoxyribonucleic acid), mRNA (messenger ribonucleic acid), and proteome level as well as the cellular or tissue level. [1]

Open source tools for medical data

OMOP: An acronym for Observational Medical Outcomes Partnership. The OMOP research program was initially established under the Foundation for the National Institutes of Health (FNIH) and created the first version of the OMOP common data model. The common data model accommodates observational data of different types (both claims and electronic health records) from sources around the world within a single common infrastructure. The program developed and executed large-scale statistical analyses to enable active drug safety surveillance across prescription medications. [2]

OHDSI: It stands for Observational Health Data Sciences and Informatics and was initiated in 2013. It is a multi-stakeholder, interdisciplinary collaborative that is striving to bring out the value of observational health data through large-scale analytics. The main objective of OHDSI is to establish a research community for observational health data sciences that enables active engagement across multiple disciplines spanning multiple stakeholder groups. [3]

OpenClinica: It is the world's most widely used open-source software for clinical research. First released in 2005, OpenClinica is designed to meet the diverse needs of modern research environments. It is built as a lightweight, extensible, and modular application. The software is web-based, and users can access it with a standard web browser and an internet connection. [4]

OpenEHR: It is an open standard specification in health informatics that describes the management, storage, retrieval and exchange of health data in electronic health records. In OpenEHR, all health data for a person is stored in a "one lifetime", vendor-independent, person-centered EHR. [5]

Clinical3PO: Clinical3PO is an open-source big data environment for the Veterans Affairs informatics and computing infrastructure that enables scalable markup of electronic health record events for use in predictive analysis. [6]

Open source tools for genetic data

Beacon Project: Beacon Project is an open web service that tests the willingness of international sites to share genetic data. It is being implemented on the websites of the world's top genomic research organizations. [7]

Matchmaker Exchange: Matchmaker Exchange is a federated network of databases whose goal is to find genetic causes of rare diseases by matching similar phenotypic and genotypic profiles.

BRCA Challenge: The BRCA Challenge aims to advance understanding of the genetic basis of breast and other cancers using data from around the world.
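The profile matching that Matchmaker Exchange relies on can be illustrated with a toy similarity score. The sketch below is not the Matchmaker Exchange API or its actual matching algorithm. It assumes each case is a set of phenotype terms plus a set of candidate genes, and it ranks remote cases by Jaccard overlap. The class name, the 50/50 weighting and all identifiers are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseProfile:
    """Hypothetical patient profile: phenotype terms and candidate genes."""
    case_id: str
    phenotypes: frozenset
    genes: frozenset


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_score(query, other, gene_weight=0.5):
    """Weighted mix of genotype and phenotype overlap (the weight is an assumption)."""
    return (gene_weight * jaccard(query.genes, other.genes)
            + (1 - gene_weight) * jaccard(query.phenotypes, other.phenotypes))


def rank_matches(query, candidates):
    """Return (case_id, score) pairs, best match first."""
    scored = [(c.case_id, match_score(query, c)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder terms and genes, not real ontology or gene identifiers.
query = CaseProfile("Q1", frozenset({"PHENO_1", "PHENO_2"}), frozenset({"GENE_A"}))
remote = [
    CaseProfile("R2", frozenset({"PHENO_1"}), frozenset({"GENE_B"})),
    CaseProfile("R1", frozenset({"PHENO_1", "PHENO_2"}), frozenset({"GENE_A"})),
]
ranking = rank_matches(query, remote)
print(ranking)
```

In a federated setting each database would compute such a score locally and return only its best candidates, so that full records never need to leave the site that holds them.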

Gather and use genetic data in health care

More and more people are getting their DNA sequenced, but the use of genetic data to inform medical decisions is lagging. More than a decade after the Human Genome Project was declared complete, fewer than 60 genetic variants are deemed worthy of use in clinical care, most of them for severe conditions in very young children. These genetic variants can guide medical decisions. Genomic data alone is not very useful; it needs to be examined in the right context in research and health care.

Some health-care organizations are working out ways to collect evidence from ongoing care to improve patient outcomes. Such efforts systematically analyze clinical data from peer-reviewed literature and introduce selected findings into clinical care. Then they track and evaluate outcomes.

One of the best examples of such a 'learning health-care system' is the Geisinger Health System, which serves 2.6 million patients in Pennsylvania. Once diagnosed with chronic conditions such as diabetes, heart failure or depression, patients' care and outcomes are monitored. Metrics such as hospitalization rates are used to recommend that certain individuals receive coordinated attention from a team of specialists including mental-health practitioners and nutritionists. Over three years, this reduced hospital admissions by 18% and seemed to reduce spending by 7%. [8]

Related Research Articles

Genetic testing: Medical test

Genetic testing, also known as DNA testing, is used to identify changes in DNA sequence or chromosome structure. Genetic testing can also include measuring the results of genetic changes, such as RNA analysis as an output of gene expression, or through biochemical analysis to measure specific protein output. In a medical setting, genetic testing can be used to diagnose or rule out suspected genetic disorders, predict risks for specific conditions, or gain information that can be used to customize medical treatments based on an individual's genetic makeup. Genetic testing can also be used to determine biological relatives, such as a child's biological parentage through DNA paternity testing, or be used to broadly predict an individual's ancestry. Genetic testing of plants and animals can be used for similar reasons as in humans, to gain information used for selective breeding, or for efforts to boost genetic diversity in endangered populations.

Health informatics: Computational approaches to health care

Health informatics is the study and implementation of computer structures and algorithms to improve communication, understanding, and management of medical information. It can be viewed as a branch of engineering and applied science.

Pharmacogenomics: Study of the role of the genome in drug response

Pharmacogenomics, often abbreviated "PGx," is the study of the role of the genome in drug response. Its name reflects its combining of pharmacology and genomics. Pharmacogenomics analyzes how the genetic makeup of a patient affects their response to drugs. It deals with the influence of acquired and inherited genetic variation on drug response, by correlating DNA mutations with pharmacokinetic, pharmacodynamic, and/or immunogenic endpoints.

Electronic health record: Digital collection of patient and population electronically stored health information

An electronic health record (EHR) is the systematized collection of patient and population electronically stored health information in a digital format. These records can be shared across different health care settings. Records are shared through network-connected, enterprise-wide information systems or other information networks and exchanges. EHRs may include a range of data, including demographics, medical history, medication and allergies, immunization status, laboratory test results, radiology images, vital signs, personal statistics like age and weight, and billing information.

Personalized medicine: Medical model that tailors medical practices to the individual patient

Personalized medicine, also referred to as precision medicine, is a medical model that separates people into different groups—with medical decisions, practices, interventions and/or products being tailored to the individual patient based on their predicted response or risk of disease. The terms personalized medicine, precision medicine, stratified medicine and P4 medicine are used interchangeably to describe this concept, though some authors and organizations differentiate between these expressions based on particular nuances. P4 is short for "predictive, preventive, personalised and participatory".

Personal genomics or consumer genetics is the branch of genomics concerned with the sequencing, analysis and interpretation of the genome of an individual. The genotyping stage employs different techniques, including single-nucleotide polymorphism (SNP) analysis chips, or partial or full genome sequencing. Once the genotypes are known, the individual's variations can be compared with the published literature to determine likelihood of trait expression, ancestry inference and disease risk.

Translational bioinformatics (TBI) is a field that emerged in the 2010s to study health informatics, focused on the convergence of molecular bioinformatics, biostatistics, statistical genetics and clinical informatics. Its focus is on applying informatics methodology to the increasing amount of biomedical and genomic data to formulate knowledge and medical tools that can be used by scientists, clinicians, and patients. It also involves applying biomedical research to improve human health through the use of computer-based information systems. TBI uses data mining and biomedical informatics analysis to generate clinical knowledge for application. Such knowledge includes finding similarities in patient populations and interpreting biological information to suggest therapies and predict health outcomes.

In electronic health records (EHRs), data masking, or controlled access, is the process of concealing patient health data from certain healthcare providers. Patients have the right to request the masking of their personal information, making it inaccessible to any physician, or to a particular physician, unless a specific reason is provided. Data masking is also performed by healthcare agencies to restrict the amount of information that can be accessed by external bodies such as researchers, health insurance agencies and unauthorised individuals. It is a method used to protect patients’ sensitive information so that privacy and confidentiality are less of a concern. Techniques used to alter information within a patient's EHR include data encryption, obfuscation, hashing, exclusion and perturbation.
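Three of the techniques named above (hashing, exclusion and perturbation) can be sketched on a single record. This is a minimal illustration using only the Python standard library, not a description of how any particular EHR system masks data. The field names, the noise range and the key handling are assumptions made for the example.

```python
import hashlib
import hmac
import random

# Assumption: in practice the key would come from a managed secret store.
SECRET_KEY = b"replace-with-a-managed-secret"


def hash_identifier(value: str) -> str:
    """Hashing: replace an identifier with a keyed SHA-256 digest."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def perturb(value: float, spread: float, rng: random.Random) -> float:
    """Perturbation: add bounded random noise to a numeric value."""
    return value + rng.uniform(-spread, spread)


def mask_record(record: dict, excluded: set, rng: random.Random) -> dict:
    """Return a masked copy of a record; the input is left unchanged."""
    # Exclusion: drop the fields the requester is not allowed to see.
    masked = {k: v for k, v in record.items() if k not in excluded}
    masked["patient_id"] = hash_identifier(record["patient_id"])
    masked["weight_kg"] = round(perturb(record["weight_kg"], 2.0, rng), 1)
    return masked


# Hypothetical record with placeholder values.
record = {
    "patient_id": "P-001",
    "name": "Example Name",
    "weight_kg": 70.0,
    "diagnosis": "DX_001",
}
masked = mask_record(record, excluded={"name"}, rng=random.Random(0))
print(masked)
```

A keyed hash is used instead of a bare hash so that identifiers cannot be recovered by hashing a list of guessed values without the key. The same identifier still maps to the same digest, which keeps records linkable across masked extracts.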

Personalized medicine involves medical treatments based on the characteristics of individual patients, including their medical history, family history, and genetics. Although personal genetic information is becoming increasingly important in healthcare, there is a lack of sufficient education in medical genetics among physicians and the general public. For example, pharmacogenomics is practiced worldwide by only a limited number of pharmacists, although most pharmacy colleges in the United States now include it in their curriculum. It is also increasingly common for genetic testing to be offered directly to consumers, who subsequently seek out educational materials and bring their results to their doctors. Issues involving genetic testing also invariably lead to ethical and legal concerns, such as the potential for inadvertent effects on family members, increased insurance rates, or increased psychological stress.

A variant of uncertain significance (VUS) is a genetic variant that has been identified through genetic testing but whose significance to the function or health of an organism is not known. Two related terms are "gene of uncertain significance" (GUS), which refers to a gene that has been identified through genome sequencing but whose connection to a human disease has not been established, and "insignificant mutation", referring to a gene variant that has no impact on the health or function of an organism. The term "variant" is favored in clinical practice over "mutation" because it can be used to describe an allele more precisely. When the variant has no impact on health, it is called a "benign variant". When it is associated with a disease, it is called a "pathogenic variant". A "pharmacogenomic variant" has an effect only when an individual takes a particular drug and therefore is neither benign nor pathogenic.

Clinicogenomics, also referred to as clinical genomics, is the study of clinical outcomes with genomic data. Genomic factors have a causal effect on clinical data. Clinicogenomics uses the entire genome of a patient in order to diagnose diseases or adjust medications exclusively for that patient. Whole genome testing can detect more mutations and structural anomalies than targeted gene testing. Furthermore, targeted gene testing can only test for the diseases for which the doctor screens, whereas testing the whole genome screens for all diseases with known markers at once.

DNA encryption is the process of hiding or obscuring genetic information by a computational method in order to improve genetic privacy in DNA sequencing processes. The human genome is long and complex, but important and identifying information can be inferred from small variations without reading the entire genome. A whole human genome is a string of 3.2 billion nucleotide base pairs, yet genetic variation between individuals amounts to only about 0.5%, and that 0.5% accounts for all of human diversity, the pathology of different diseases, and ancestral history. Emerging strategies incorporate different methods, such as randomization algorithms and cryptographic approaches, to de-identify the genetic sequence from the individual and, fundamentally, to isolate only the necessary information while protecting the rest of the genome from unnecessary inquiry. The priority now is to ascertain which methods are robust, and how policy should ensure the ongoing protection of genetic privacy.
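The scale implied by the figures above can be checked with a short calculation. The snippet takes the 3.2 billion base pairs and the 0.5% variation quoted in the paragraph at face value and uses integer arithmetic to avoid rounding error. The constant names are chosen for this example only.

```python
GENOME_LENGTH_BP = 3_200_000_000  # genome length quoted in the text
VARIABLE_PER_MILLE = 5            # 0.5% expressed as parts per thousand

# Positions that may differ between two individuals, and those that do not.
variable_positions = GENOME_LENGTH_BP * VARIABLE_PER_MILLE // 1000
shared_positions = GENOME_LENGTH_BP - variable_positions

print(f"{variable_positions:,} variable positions")
print(f"{shared_positions:,} shared positions")
```

On these figures roughly 16 million positions carry the identifying signal, which is why privacy methods tend to concentrate on protecting variant positions instead of the whole sequence.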

In genetics and genetic epidemiology, a phenome-wide association study, abbreviated PheWAS, is a study design in which the association between single-nucleotide polymorphisms (SNPs) or other types of DNA variants and a large number of different phenotypes is tested. The aim of PheWAS studies is to examine the causal linkage between known sequence differences and any type of trait, including molecular, biochemical, cellular, and especially clinical diagnoses and outcomes. It is a complementary approach to the genome-wide association study, or GWAS, methodology. A fundamental difference between GWAS and PheWAS designs is the direction of inference: in a PheWAS it is from exposure to many possible outcomes, that is, from SNPs to differences in phenotypes and disease risk. In a GWAS, the polarity of analysis is from one or a few phenotypes to many possible DNA variants. The approach has proven useful in rediscovering previously reported genotype-phenotype associations, as well as in identifying new ones.
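The direction of inference in a PheWAS, from one variant to many phenotypes, can be sketched as a loop over phenotypes for a fixed carrier status. The example below is a simplified illustration, not a production method: it computes a plain odds ratio from a 2x2 table with a 0.5 continuity correction added to every cell, whereas real studies typically use regression models with covariates and multiple-testing correction. All person and phenotype names are hypothetical.

```python
def odds_ratio(carriers_with, carriers_without, noncarriers_with, noncarriers_without):
    """Odds ratio from a 2x2 table, adding 0.5 to every cell to avoid division by zero."""
    a, b, c, d = (x + 0.5 for x in (carriers_with, carriers_without,
                                    noncarriers_with, noncarriers_without))
    return (a * d) / (b * c)


def phewas_scan(carrier_status, phenotypes):
    """For one variant, compute an odds ratio against every phenotype.

    carrier_status: {person_id: True if the person carries the variant}
    phenotypes: {phenotype_name: set of person_ids with that phenotype}
    """
    results = {}
    for name, affected in phenotypes.items():
        # Cells: [carrier+pheno, carrier only, non-carrier+pheno, neither]
        table = [0, 0, 0, 0]
        for person, is_carrier in carrier_status.items():
            has_phenotype = person in affected
            index = (0 if is_carrier else 2) + (0 if has_phenotype else 1)
            table[index] += 1
        results[name] = odds_ratio(*table)
    return results


# Toy cohort: three carriers and three non-carriers of a single variant.
carrier_status = {"p1": True, "p2": True, "p3": True,
                  "p4": False, "p5": False, "p6": False}
phenotypes = {
    "phenotype_A": {"p1", "p2", "p3"},  # seen only in carriers
    "phenotype_B": {"p1", "p4"},        # evenly spread
}
results = phewas_scan(carrier_status, phenotypes)
print(results)
```

A GWAS would invert the loops: the phenotype is held fixed and the scan runs over many variants.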

The Global Alliance for Genomics and Health (GA4GH) is an international consortium that is developing standards for responsibly collecting, storing, analyzing, and sharing genomic data in order to enable an "internet of genomics". GA4GH was founded in 2013.

Personalized onco-genomics (POG) is the field of oncology and genomics that is focused on using whole genome analysis to make personalized clinical treatment decisions. The program was devised at British Columbia's BC Cancer Agency and is currently being led by Marco Marra and Janessa Laskin. Genome instability has been identified as one of the underlying hallmarks of cancer. The genetic diversity of cancer cells promotes multiple other cancer hallmark functions that help them survive in their microenvironment and eventually metastasise. The pronounced genomic heterogeneity of tumours has led researchers to develop an approach that assesses each individual's cancer to identify targeted therapies that can halt cancer growth. Identification of these "drivers" and corresponding medications used to possibly halt these pathways are important in cancer treatment.

Sophia Genetics is a data-driven medicine software company with headquarters in Lausanne, Switzerland and Boston, Massachusetts. It provides genomic and radiomic analysis for hospitals, laboratories, and biopharma institutions. The company was ranked among the 50 smartest companies by the MIT Technology Review in 2017. The company went public on the Nasdaq in 2021, floating at $1.1B.

Biological data

Biological data refers to a compound or information derived from living organisms and their products. A medicinal compound made from living organisms, such as a serum or a vaccine, could be characterized as biological data. Biological data is highly complex when compared with other forms of data. There are many forms of biological data, including text, sequence data, protein structure, genomic data and amino acids, and links among others.

Personalized genomics is a branch of human genetics that analyzes and interprets an individual's genetic information, obtained by genome sequencing, to identify genetic variations relative to a library of known sequences. International genetics communities have long collaborated on research projects to determine the DNA sequence of the human genome using DNA sequencing techniques. The most commonly used methods are whole exome sequencing and whole genome sequencing; both are used to identify genetic variations. As genome sequencing became more cost-effective over time, it became applicable in the medical field, allowing scientists to understand which genes are associated with specific diseases.

A common data model (CDM) can refer to any standardised data model which allows for data and information exchange between different applications and data sources. Common data models aim to standardise logical infrastructure so that related applications can "operate on and share the same data", and can be seen as a way to "organize data from many sources that are in different formats into a standard structure".
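The idea of organizing differently formatted sources into one standard structure can be shown with two small adapters. The sketch below assumes a claims feed and an EHR export with different field names and date formats, and maps both into one record type. The `ConditionRecord` layout and every field name are invented for illustration and do not reproduce the table definitions of OMOP or any other published model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ConditionRecord:
    """Hypothetical common structure shared by all sources."""
    person_id: str
    condition_code: str
    start_date: date
    source: str


def from_claim(claim: dict) -> ConditionRecord:
    """Adapter for an assumed claims format with ISO date strings."""
    return ConditionRecord(
        person_id=claim["member_id"],
        condition_code=claim["dx_code"],
        start_date=date.fromisoformat(claim["service_date"]),
        source="claims",
    )


def from_ehr(entry: dict) -> ConditionRecord:
    """Adapter for an assumed EHR format with nested patient data."""
    return ConditionRecord(
        person_id=entry["patient"]["id"],
        condition_code=entry["diagnosis"],
        start_date=date(entry["year"], entry["month"], entry["day"]),
        source="ehr",
    )


def people_with(records, code):
    """One query that runs unchanged over every source once data is standardized."""
    return sorted({r.person_id for r in records if r.condition_code == code})


claims = [{"member_id": "A1", "dx_code": "DX_001", "service_date": "2015-03-02"}]
ehr = [{"patient": {"id": "B7"}, "diagnosis": "DX_001",
        "year": 2015, "month": 4, "day": 9}]

records = [from_claim(c) for c in claims] + [from_ehr(e) for e in ehr]
cohort = people_with(records, "DX_001")
print(cohort)
```

The design choice is that source-specific logic lives only in the adapters. Analyses written against the common structure do not need to know where a record came from.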

Daniel Richard Masys is an American biotechnologist and academic. He is an Affiliate Professor of Biomedical and Health Informatics at the University of Washington.

References

  1. "Genetics, genomics and the patenting of DNA" (PDF).
  2. "Observational Medical Outcomes Partnership".
  3. "Who We Are | OHDSI". www.ohdsi.org. Retrieved 2015-11-24.
  4. "Clinical Trial Software Product Editions | OpenClinica". www.openclinica.com. Retrieved 2015-11-24.
  5. "openEHR - Foundation". www.openehr.org. Retrieved 2015-11-24.
  6. "Clinical Personalized Pragmatic Predictions of Outcomes (Clinical 3PO)". Clinical Personalized Pragmatic Predictions of Outcomes (Clinical 3PO). Retrieved 2015-11-24.
  7. "About the Global Alliance | Global Alliance for Genomics and Health". genomicsandhealth.org. Retrieved 2015-10-06.
  8. "Medical genomics: Gather and use genetic data in health care". Retrieved 2015-10-06.