Computational epigenetics

DNA methylation is an epigenetic mechanism that can be studied with bioinformatics.

Computational epigenetics [1] uses statistical methods and mathematical modelling in epigenetic research. Due to the recent explosion of epigenome datasets, computational methods play an increasing role in all areas of epigenetic research.

Research in computational epigenetics comprises the development and application of bioinformatics methods for solving epigenetic questions, as well as computational data analysis and theoretical modeling in the context of epigenetics. This includes modelling of the effects of histone and DNA CpG island methylation.

Current research areas

Importance

Computational methods and next-generation sequencing (NGS) technologies are being employed to study DNA methylation and histone modifications, which are essential in cancer research. High-throughput sequencing offers valuable insights into epigenetic changes, and the growing volume of these datasets drives the continuous development of bioinformatics techniques for their effective management and analysis. [2]

There is a need for data integration tools that can merge various types of epigenetic modifications and -omics data (including transcriptomics, genomics, epigenomics, and proteomics) to gain a comprehensive understanding of biological processes. This requires the standardization, annotation, and harmonization of epigenetic data, along with the enhancement of computational and machine learning approaches. [3]

Understanding the functional implications of epigenetics in diseases can be greatly advanced by using epigenetic editing tools, such as CRISPR-dCas9 technology. These tools enable precise modifications of epigenetic marks at specific loci, allowing researchers to assess the effects of these alterations in cellular and animal models, thus complementing insights obtained from computational analyses. [3]

Data processing and analysis

ChIP-on-chip technique

Various experimental techniques have been developed for genome-wide mapping of epigenetic information, [4] the most widely used being ChIP-on-chip, ChIP-seq and bisulfite sequencing. All of these methods generate large amounts of data and require efficient ways of data processing and quality control by bioinformatic methods.
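
As a minimal sketch of one such processing step, the snippet below turns per-site bisulfite sequencing read counts into methylation levels and applies a simple coverage filter as quality control. The `CpGSite` record, the `min_coverage` default of 5, and the example coordinates are illustrative assumptions, not part of any particular pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CpGSite:
    """Read counts at one CpG site (hypothetical record layout)."""
    chrom: str
    pos: int
    methylated: int    # reads still showing C (protected from conversion)
    unmethylated: int  # reads showing T (converted by bisulfite treatment)


def methylation_level(site: CpGSite, min_coverage: int = 5) -> Optional[float]:
    """Return methylated / total reads, or None if coverage is too low.

    The coverage threshold is an assumed quality-control cutoff.
    """
    coverage = site.methylated + site.unmethylated
    if coverage < min_coverage:
        return None
    return site.methylated / coverage


# Made-up example data: three neighbouring CpG sites.
sites = [
    CpGSite("chr1", 10468, methylated=18, unmethylated=2),
    CpGSite("chr1", 10470, methylated=1, unmethylated=1),
    CpGSite("chr1", 10483, methylated=0, unmethylated=12),
]
levels = [methylation_level(s) for s in sites]
```

Sites that fail the coverage filter are reported as `None` rather than as a noisy ratio, so downstream steps can decide how to handle missing values.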

Predictions

A substantial amount of bioinformatic research has been devoted to the prediction of epigenetic information from characteristics of the genome sequence. Such predictions serve a dual purpose. First, accurate epigenome predictions can substitute for experimental data, to some degree, which is particularly relevant for newly discovered epigenetic mechanisms and for species other than human and mouse. Second, prediction algorithms build statistical models of epigenetic information from training data and can therefore act as a first step toward quantitative modeling of an epigenetic mechanism. Successful computational prediction of DNA methylation and of lysine methylation and acetylation has been achieved by combining multiple features. [5] [6]
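
To illustrate what "characteristics of the genome sequence" can look like in practice, the sketch below computes two classic sequence features for a DNA window: GC content and the observed/expected CpG ratio. These are only example features chosen for illustration; the cited prediction studies combine their own feature sets, and the function names here are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in the window that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def sequence_features(seq: str) -> dict:
    """Bundle the features into a record a statistical model could consume."""
    return {"gc_content": gc_content(seq), "cpg_obs_exp": cpg_obs_exp(seq)}


# Toy window; real analyses would use windows of hundreds of bases.
features = sequence_features("CGCGCGATCG")
```

A feature record like this would typically be computed for many genomic windows and paired with experimentally measured labels to train a predictor.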

Applications in cancer epigenetics

The important role of epigenetic defects in cancer opens up new opportunities for improved diagnosis and therapy. These active areas of research give rise to two questions that are particularly amenable to bioinformatic analysis. First, given a list of genomic regions exhibiting epigenetic differences between tumor cells and controls (or between different disease subtypes), can we detect common patterns or find evidence of a functional relationship of these regions to cancer? Second, can we use bioinformatic methods to improve diagnosis and therapy by detecting and classifying important disease subtypes?
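
One way to approach the first question is an over-representation test: given a list of differentially methylated regions, ask whether more of them carry some annotation (for example, overlap with promoters of known cancer genes) than chance would predict. The sketch below uses a hypergeometric tail probability for this. It is an illustrative choice of test, not a method named in the sources, and the function and parameter names are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(total: int, annotated: int,
                           selected: int, overlap: int) -> float:
    """P(X >= overlap) for a hypergeometric draw.

    total     -- number of regions in the background set
    annotated -- background regions carrying the annotation
    selected  -- regions in the list under study (e.g. tumor-specific)
    overlap   -- selected regions that carry the annotation
    """
    upper = min(annotated, selected)
    tail = sum(
        comb(annotated, k) * comb(total - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total, selected)


# Toy numbers: 10 background regions, 4 annotated, 3 selected, 2 overlapping.
p_value = hypergeom_enrichment_p(total=10, annotated=4, selected=3, overlap=2)
```

A small p-value suggests the annotation is over-represented in the region list. In a real analysis the background set must be chosen carefully, and testing many annotations requires multiple-testing correction.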

Emerging topics

The first wave of research in computational epigenetics was driven by rapid progress in experimental methods for data generation. This progress required adequate computational methods for data processing and quality control, prompted epigenome prediction studies as a means of understanding the genomic distribution of epigenetic information, and provided the foundation for initial projects on cancer epigenetics. These topics will continue to be major areas of research, and the sheer quantity of epigenetic data arising from epigenome projects poses a significant bioinformatic challenge, but several additional topics are currently emerging.

Data portals and projects

Epigenomic Data Portals and Projects

| Name | Description | Link |
| --- | --- | --- |
| IHEC Data Portal | Offers a comprehensive list of reference epigenomes for humans (hg19, hg38) and mice (mm10). | IHEC Portal [2] |
| NIH ROADMAP Epigenomics Mapping Consortium | Provides genome-wide maps of histone modifications, chromatin accessibility, DNA methylation, and mRNA expression across various human cell types and tissues. | ROADMAP Portal [2] |
| CEEHRC | A Canadian initiative that provides detailed information on human epigenomes from different tissues. | CEEHRC Portal [2] |
| BLUEPRINT | A European project generating epigenomic maps for 100 different blood cell types. | BLUEPRINT Portal [2] |
| IHEC CREST | Focuses on reference genomes for human epithelial, vascular endothelial, and reproductive cells. | CREST Portal [2] |
| DeepBlue | A central database designed for programmatic operations on epigenetic data, including data overlapping and aggregation. | DeepBlue Portal [2] |
| Epigenome Browser | Supplies reference sequences and draft assemblies for a diverse range of genomes. | Epigenome Browser [2] |
| WashU Epigenome Browser | Offers extensive epigenomic information for various species, including cows and fruit flies, in addition to humans and mice. | WashU Portal [2] |
| ENCODE Project | An NIH-funded initiative aimed at mapping all functional elements of the human genome. | ENCODE Portal [2] |
| GenExp | An interactive genome browser that integrates data from the Distributed Annotation System (DAS). | GenExp Portal [2] |
| AHEAD Task Force | A systematic effort to map the human epigenome, creating bioinformatics networks for reference guides related to normal tissues. | [2] |
| HEP Project Consortium | Provides high-resolution epigenome data for analyzing DNA methylation in 43 individuals. | HEP [2] |
| HEROIC Project Consortium | Focuses on high-throughput studies of epigenetic regulation using various genomic assays. | HEROIC Portal [2] |
| dbEM | A database that examines the role of epigenetic proteins in oncogenesis, featuring data on mutations and gene expression across tumor samples. | dbEM Portal [2] |
| EpiFactors | A database linking specific epigenetic factors to corresponding genes. | EpiFactors Portal [2] |
| HEDD | Concentrates on the storage and integration of datasets related to epigenetic drugs. | HEDD Portal [2] |

Databases

DNA Methylation and Epigenetic Databases

| Name | Description | Citation |
| --- | --- | --- |
| MethDB | Contains information on 19,905 DNA methylation content data and 5,382 methylation patterns for 48 species, 1,511 individuals, 198 tissues and cell lines, and 79 phenotypes. | [8] |
| Pubmeth | Contains over 5,000 records on methylated genes in various cancer types. | [9] |
| REBASE | Contains over 22,000 DNA methyltransferase genes derived from GenBank. | [10] |
| DeepBlue Epigenomic Database | Contains epigenomic data from more than 60,000 experiments from different IHEC members, divided into various epigenetic marks. DeepBlue also provides an API for access and processing of the data. | [11] |
| MeInfoText | Contains gene methylation information across 205 human cancer types. | [12] |
| MethPrimerDB | Contains 259 primer sets from human, mouse, and rat for DNA methylation analysis. | [13] |
| The Histone Database | Contains 254 sequences from histone H1, 383 from histone H2, 311 from histone H2B, 1043 from histone H3, and 198 from histone H4, representing at least 857 species. | [14] |
| ChromDB | Contains 9,341 chromatin-associated proteins, including RNAi-associated proteins, for a broad range of organisms. | [15] |
| CREMOFAC | Contains 1,725 redundant and 720 non-redundant chromatin-remodeling factor sequences in eukaryotes. | [16] |
| The Krembil Family Epigenetics Laboratory | Contains DNA methylation data of human chromosomes 21, 22, male germ cells, and DNA methylation profiles in monozygotic and dizygotic twins. | [17] |
| MethyLogiX DNA methylation database | Contains DNA methylation data of human chromosomes 21 and 22, male germ cells, and late-onset Alzheimer's disease. | [18] |

References

  1. Bock C, Lengauer T (January 2008). "Computational epigenetics". Bioinformatics. 24 (1): 1–10. doi:10.1093/bioinformatics/btm546. PMID 18024971.
  2. Arora I, Tollefsbol TO (March 2021). "Computational methods and next-generation sequencing approaches to analyze epigenetics data: Profiling of methods and applications". Methods. 187: 92–103. doi:10.1016/j.ymeth.2020.09.008. PMC 7914156. PMID 32941995.
  3. Santaló J, Berdasco M (March 2022). "Ethical implications of epigenetics in the era of personalized medicine". Clinical Epigenetics. 14 (1): 44. doi:10.1186/s13148-022-01263-1. PMC 8953972. PMID 35337378.
  4. Madrigal P, Krajewski P (July 2015). "Uncovering correlated variability in epigenomic datasets using the Karhunen-Loeve transform". BioData Mining. 8: 20. doi:10.1186/s13040-015-0051-7. PMC 4488123. PMID 26140054.
  5. Shi SP, Qiu JD, Sun XY, Suo SB, Huang SY, Liang RP (April 2012). "PLMLA: prediction of lysine methylation and lysine acetylation by combining multiple features". Molecular BioSystems. 8 (5): 1520–1527. doi:10.1039/C2MB05502C. PMID 22402705. S2CID 6172534.
  6. Zheng H, Jiang SW, Wu H (2011). "Enhancement on the Predictive Power of the Prediction Model for Human Genomic DNA Methylation". Biocomp'11: The 2011 International Conference on Bioinformatics and Computational Biology. S2CID 14599625.
  7. Roznovăţ IA, Ruskin HJ (September 2013). "A computational model for genetic and epigenetic signals in colon cancer". Interdisciplinary Sciences, Computational Life Sciences. 5 (3): 175–186. doi:10.1007/s12539-013-0172-y. PMID 24307409. S2CID 11867110.
  8. DNA Methylation Database
  9. Pubmeth.Org
  10. "Official REBASE Homepage | the Restriction Enzyme Database | NEB".
  11. "DeepBlue Epigenomic Data Server".
  12. "MeInfoText: associated gene methylation and cancer information from text mining". Archived from the original on 2016-03-03. Retrieved 2010-01-29.
  13. "methPrimerDB: the DNA methylation analysis PCR primer database". Archived from the original on 2014-07-15. Retrieved 2010-01-29.
  14. "Histone Database - Histone Database". Archived from the original on 2015-09-05. Retrieved 2010-01-29.
  15. "ChromDB::Chromatin Database". Archived from the original on 2019-04-10. Retrieved 2010-01-29.
  16. Cremofac
  17. "Home". epigenomics.ca.
  18. Methylation Database Archived 2008-12-03 at the Wayback Machine