Computational epigenetics

DNA methylation is an epigenetic mechanism that can be studied with bioinformatics.

Computational epigenetics [1] uses statistical methods and mathematical modelling in epigenetic research. Due to the recent explosion of epigenome datasets, computational methods play an increasing role in all areas of epigenetic research.

Definition

Research in computational epigenetics comprises the development and application of bioinformatics methods for solving epigenetic questions, as well as computational data analysis and theoretical modeling in the context of epigenetics. This includes modelling of the effects of histone and DNA CpG island methylation.

Current research areas

Epigenetic data processing and analysis

ChIP-on-chip technique

Various experimental techniques have been developed for genome-wide mapping of epigenetic information, [2] the most widely used being ChIP-on-chip, ChIP-seq and bisulfite sequencing. All of these methods generate large amounts of data and require efficient ways of data processing and quality control by bioinformatic methods.
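
As an illustration of the kind of processing such data require, the following minimal Python sketch computes a per-site methylation level from bisulfite sequencing read counts and masks sites with too few reads. The `CpGCall` record, the `min_coverage` default of 5 reads, and the example positions and counts are assumptions made for this sketch, not part of any particular pipeline.

```python
from typing import NamedTuple, Optional


class CpGCall(NamedTuple):
    """Read counts at one cytosine after bisulfite conversion (hypothetical record)."""
    chrom: str
    pos: int
    methylated: int    # reads still showing C (protected from conversion)
    unmethylated: int  # reads showing T (converted, hence unmethylated)


def methylation_level(call: CpGCall, min_coverage: int = 5) -> Optional[float]:
    """Return the fraction of methylated reads, or None if coverage is too low."""
    coverage = call.methylated + call.unmethylated
    if coverage < min_coverage:
        return None  # too few reads for a reliable estimate
    return call.methylated / coverage


# Made-up example calls, for illustration only.
calls = [
    CpGCall("chr1", 10468, 9, 1),
    CpGCall("chr1", 10470, 2, 1),
    CpGCall("chr1", 10483, 0, 12),
]
levels = [methylation_level(c) for c in calls]
```

Returning `None` for low-coverage sites, rather than a number, keeps unreliable estimates from silently entering downstream statistics; a suitable threshold depends on the experiment.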

Epigenome prediction

A substantial amount of bioinformatic research has been devoted to the prediction of epigenetic information from characteristics of the genome sequence. Such predictions serve a dual purpose. First, accurate epigenome predictions can substitute for experimental data, to some degree, which is particularly relevant for newly discovered epigenetic mechanisms and for species other than human and mouse. Second, prediction algorithms build statistical models of epigenetic information from training data and can therefore act as a first step toward quantitative modeling of an epigenetic mechanism. Successful computational prediction of DNA and lysine methylation and acetylation has been achieved by combinations of various features. [3] [4]
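
A minimal sketch of sequence-derived features that such a predictor might take as input: GC content and the observed-to-expected CpG ratio, two quantities commonly used to characterise CpG-rich regions. The function name `cpg_features` and the example sequence are hypothetical, and published predictors typically combine many more features than these two.

```python
def cpg_features(seq: str) -> dict:
    """Compute GC content and the observed/expected CpG ratio of a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")

    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact

    gc_content = (c_count + g_count) / n
    # Expected number of CpG dinucleotides if C and G were placed independently.
    expected_cpg = c_count * g_count / n
    obs_exp = cpg_count / expected_cpg if expected_cpg > 0 else 0.0

    return {"gc_content": gc_content, "cpg_obs_exp": obs_exp}


# Illustrative input only; real feature windows are typically hundreds of bases long.
features = cpg_features("CGCGCGCG")
```

Feature vectors of this kind, computed over windows along the genome, could then be passed to any standard classifier trained on experimentally measured methylation states.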

Applications in cancer epigenetics

The important role of epigenetic defects in cancer opens up new opportunities for improved diagnosis and therapy. These active areas of research give rise to two questions that are particularly amenable to bioinformatic analysis. First, given a list of genomic regions exhibiting epigenetic differences between tumor cells and controls (or between different disease subtypes), can we detect common patterns or find evidence of a functional relationship between these regions and cancer? Second, can we use bioinformatic methods to improve diagnosis and therapy by detecting and classifying important disease subtypes?
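
One simple way to approach the first question is an enrichment test: does a list of differentially methylated regions overlap a functional annotation more often than expected by chance? The sketch below uses a one-sided hypergeometric test. The choice of test, the function name `hypergeom_enrichment_p`, and all counts in the usage example are assumptions for illustration; in practice the definition of the background set strongly affects the result.

```python
from math import comb


def hypergeom_enrichment_p(total: int, annotated: int, selected: int, overlap: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    total     -- number of regions in the background set
    annotated -- background regions carrying the annotation of interest
    selected  -- regions in the list under study (e.g. differentially methylated)
    overlap   -- selected regions that also carry the annotation
    """
    upper = min(annotated, selected)
    # Count the ways to draw at least `overlap` annotated regions (exact integers).
    favourable = sum(
        comb(annotated, k) * comb(total - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(total, selected)


# Hypothetical counts: 20,000 background regions, 1,500 of them annotated,
# 200 differentially methylated regions, 40 of which are annotated.
p_value = hypergeom_enrichment_p(total=20000, annotated=1500, selected=200, overlap=40)
```

A small p-value suggests that the overlap is larger than expected under random selection, but it does not by itself establish a functional relationship, and testing many annotations requires correction for multiple comparisons.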

Emerging topics

The first wave of research in computational epigenetics was driven by rapid progress in experimental methods for data generation. That progress required adequate computational methods for data processing and quality control, prompted epigenome prediction studies as a means of understanding the genomic distribution of epigenetic information, and provided the foundation for initial projects on cancer epigenetics. These topics will remain major areas of research, and the sheer quantity of epigenetic data arising from epigenome projects poses a significant bioinformatic challenge, but several additional topics are currently emerging.

Epigenetics databases

  1. MethDB [6] Contains information on 19,905 DNA methylation content data and 5,382 methylation patterns for 48 species, 1,511 individuals, 198 tissues and cell lines and 79 phenotypes.
  2. Pubmeth [7] Contains over 5,000 records on methylated genes in various cancer types.
  3. REBASE [8] Contains over 22,000 DNA methyltransferase genes derived from GenBank.
  4. DeepBlue Epigenomic Database [9] Contains epigenomic data from more than 60,000 experiments contributed by different IHEC members, organized by epigenetic mark. DeepBlue also provides an API for accessing and processing the data on the server.
  5. MeInfoText [10] Contains gene methylation information across 205 human cancer types.
  6. MethPrimerDB [11] Contains 259 primer sets from human, mouse and rat for DNA methylation analysis.
  7. The Histone Database [12] Contains 254 sequences from histone H1, 383 from histone H2, 311 from histone H2B, 1,043 from histone H3 and 198 from histone H4, altogether representing at least 857 species.
  8. ChromDB [13] Contains 9,341 chromatin-associated proteins, including RNAi-associated proteins, for a broad range of organisms.
  9. CREMOFAC [14] Contains 1,725 redundant and 720 non-redundant chromatin-remodeling factor sequences in eukaryotes.
  10. The Krembil Family Epigenetics Laboratory [15] Contains DNA methylation data of human chromosomes 21, 22, male germ cells and DNA methylation profiles in monozygotic and dizygotic twins.
  11. MethyLogiX DNA methylation database [16] Contains DNA methylation data of human chromosomes 21 and 22, male germ cells and late-onset Alzheimer's disease.

Sources and further reading

Related Research Articles

<span class="mw-page-title-main">Epigenetics</span> Study of DNA modifications that do not change its sequence

In biology, epigenetics is the study of heritable traits, or a stable change of cell function, that happen without changes to the DNA sequence. The Greek prefix epi- in epigenetics implies features that are "on top of" or "in addition to" the traditional genetic mechanism of inheritance. Epigenetics usually involves a change that is not erased by cell division, and affects the regulation of gene expression. Such effects on cellular and physiological phenotypic traits may result from environmental factors, or be part of normal development. They can lead to cancer.

<span class="mw-page-title-main">Epigenome</span> Biological term

An epigenome consists of a record of the chemical changes to the DNA and histone proteins of an organism; these changes can be passed down to an organism's offspring via transgenerational stranded epigenetic inheritance. Changes to the epigenome can result in changes to the structure of chromatin and changes to the function of the genome.

Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome. The field is analogous to genomics and proteomics, which are the study of the genome and proteome of a cell. Epigenetic modifications are reversible modifications on a cell's DNA or histones that affect gene expression without altering the DNA sequence. Epigenomic maintenance is a continuous process and plays an important role in the stability of eukaryotic genomes by taking part in crucial biological mechanisms such as DNA repair. Plant flavones are reported to inhibit epigenomic marks that cause cancers. Two of the best-characterized epigenetic modifications are DNA methylation and histone modification. Epigenetic modifications play an important role in gene expression and regulation, and are involved in numerous cellular processes such as differentiation, development and tumorigenesis. The study of epigenetics on a global level has been made possible only recently through the adaptation of genomic high-throughput assays.

The Epigenomics database at the National Center for Biotechnology Information was a database for whole-genome epigenetics data sets. It was retired on 1 June 2016.

The International Human Epigenome Consortium (IHEC) is a scientific organization, founded in 2010, that helps to coordinate global efforts in the field of Epigenomics. The initial goal was to generate at least 1,000 reference (baseline) human epigenomes from different types of normal and disease-related human cell types.

<span class="mw-page-title-main">Epigenome editing</span>

Epigenome editing or epigenome engineering is a type of genetic engineering in which the epigenome is modified at specific sites using engineered molecules targeted to those sites. Whereas gene editing involves changing the actual DNA sequence itself, epigenetic editing involves modifying and presenting DNA sequences to proteins and other DNA binding factors that influence DNA function. By "editing" epigenomic features in this manner, researchers can determine the exact biological role of an epigenetic modification at the site in question.

H3K4me3 is an epigenetic modification to the DNA packaging protein Histone H3 that indicates tri-methylation at the 4th lysine residue of the histone H3 protein and is often involved in the regulation of gene expression. The name denotes the addition of three methyl groups (trimethylation) to the lysine 4 on the histone H3 protein.

H3K27me3 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the tri-methylation of lysine 27 on histone H3 protein.

<span class="mw-page-title-main">Epigenome-wide association study</span>

An epigenome-wide association study (EWAS) is an examination of a genome-wide set of quantifiable epigenetic marks, such as DNA methylation, in different individuals to derive associations between epigenetic variation and a particular identifiable phenotype or trait. When patterns such as DNA methylation at specific loci differ in a way that discriminates phenotypically affected cases from control individuals, this is considered an indication that an epigenetic perturbation has taken place that is associated, causally or consequentially, with the phenotype.

<span class="mw-page-title-main">Thomas Jenuwein</span> German scientist

Thomas Jenuwein is a German scientist working in the fields of epigenetics, chromatin biology, gene regulation and genome function.

Pharmacoepigenetics is an emerging field that studies the underlying epigenetic marking patterns that lead to variation in an individual's response to medical treatment.

H3K9me3 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the tri-methylation at the 9th lysine residue of the histone H3 protein and is often associated with heterochromatin.

H3K9me2 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the di-methylation at the 9th lysine residue of the histone H3 protein. H3K9me2 is strongly associated with transcriptional repression. H3K9me2 levels are higher at silent genes than at active genes in a 10 kb region surrounding the transcriptional start site. H3K9me2 represses gene expression both passively, by prohibiting acetylation and therefore the binding of RNA polymerase or its regulatory factors, and actively, by recruiting transcriptional repressors. H3K9me2 has also been found in megabase blocks, termed Large Organised Chromatin K9 domains (LOCKS), which are primarily located within gene-sparse regions but also encompass genic and intergenic intervals. Its synthesis is catalyzed by G9a, G9a-like protein, and PRDM2. H3K9me2 can be removed by a wide range of histone lysine demethylases (KDMs) including KDM1, KDM3, KDM4 and KDM7 family members. H3K9me2 is important for various biological processes including cell lineage commitment, the reprogramming of somatic cells to induced pluripotent stem cells, regulation of the inflammatory response, and addiction to drug use.

The human epigenome is the complete set of structural modifications of chromatin and chemical modifications of histones and nucleotides. These modifications vary according to cell type and developmental stage. Various studies show that the epigenome depends on exogenous factors.

H3K4me1 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the mono-methylation at the 4th lysine residue of the histone H3 protein and is often associated with gene enhancers.

H3K79me2 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the di-methylation at the 79th lysine residue of the histone H3 protein. H3K79me2 is detected in the transcribed regions of active genes.

H4K20me is an epigenetic modification to the DNA packaging protein Histone H4. It is a mark that indicates the mono-methylation at the 20th lysine residue of the histone H4 protein. This mark can be di- and tri-methylated. It is critical for genome integrity including DNA damage repair, DNA replication and chromatin compaction.

H4K16ac is an epigenetic modification to the DNA packaging protein Histone H4. It is a mark that indicates the acetylation at the 16th lysine residue of the histone H4 protein.

H3K36me2 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the di-methylation at the 36th lysine residue of the histone H3 protein.

H3K36me is an epigenetic modification to the DNA packaging protein Histone H3, specifically, the mono-methylation at the 36th lysine residue of the histone H3 protein.

References

  1. Bock C, Lengauer T (January 2008). "Computational epigenetics". Bioinformatics. 24 (1): 1–10. doi:10.1093/bioinformatics/btm546. PMID 18024971.
  2. Madrigal P, Krajewski P (July 2015). "Uncovering correlated variability in epigenomic datasets using the Karhunen-Loeve transform". BioData Mining. 8: 20. doi:10.1186/s13040-015-0051-7. PMC 4488123. PMID 26140054.
  3. Shi SP, Qiu JD, Sun XY, Suo SB, Huang SY, Liang RP (April 2012). "PLMLA: prediction of lysine methylation and lysine acetylation by combining multiple features". Molecular BioSystems. 8 (5): 1520–1527. doi:10.1039/C2MB05502C. PMID 22402705. S2CID 6172534.
  4. Zheng H, Jiang SW, Wu H (2011). "Enhancement on the Predictive Power of the Prediction Model for Human Genomic DNA Methylation". Biocomp'11: The 2011 International Conference on Bioinformatics and Computational Biology. S2CID 14599625.
  5. Roznovăţ IA, Ruskin HJ (September 2013). "A computational model for genetic and epigenetic signals in colon cancer". Interdisciplinary Sciences, Computational Life Sciences. 5 (3): 175–186. doi:10.1007/s12539-013-0172-y. PMID 24307409. S2CID 11867110.
  6. DNA Methylation Database
  7. Pubmeth.Org
  8. "Official REBASE Homepage | the Restriction Enzyme Database | NEB".
  9. "DeepBlue Epigenomic Data Server".
  10. "MeInfoText: associated gene methylation and cancer information from text mining". Archived from the original on 2016-03-03. Retrieved 2010-01-29.
  11. "methPrimerDB: the DNA methylation analysis PCR primer database". Archived from the original on 2014-07-15. Retrieved 2010-01-29.
  12. "Histone Database - Histone Database". Archived from the original on 2015-09-05. Retrieved 2010-01-29.
  13. "ChromDB::Chromatin Database". Archived from the original on 2019-04-10. Retrieved 2010-01-29.
  14. Cremofac
  15. "Home". epigenomics.ca.
  16. Methylation Database Archived 2008-12-03 at the Wayback Machine