Computational epigenetics [1] uses statistical methods and mathematical modelling in epigenetic research. Due to the recent explosion of epigenome datasets, computational methods play an increasing role in all areas of epigenetic research.
Research in computational epigenetics comprises the development and application of bioinformatics methods for solving epigenetic questions, as well as computational data analysis and theoretical modeling in the context of epigenetics. This includes modelling of the effects of histone and DNA CpG island methylation.
Computational methods and next-generation sequencing (NGS) technologies are being employed to study DNA methylation and histone modifications, which are essential in cancer research. High-throughput sequencing offers valuable insights into epigenetic changes, and the growing volume of these datasets drives the continuous development of bioinformatics techniques for their effective management and analysis. [2]
There is a need for data integration tools that can merge various types of epigenetic modifications and -omics data (including transcriptomics, genomics, epigenomics, and proteomics) to gain a comprehensive understanding of biological processes. This requires the standardization, annotation, and harmonization of epigenetic data, along with the enhancement of computational and machine learning approaches. [3]
Understanding the functional implications of epigenetics in diseases can be greatly advanced by using epigenetic editing tools, such as CRISPR-dCas9 technology. These tools enable precise modifications of epigenetic marks at specific loci, allowing researchers to assess the effects of these alterations in cellular and animal models, thus complementing insights obtained from computational analyses. [3]
Various experimental techniques have been developed for genome-wide mapping of epigenetic information, [4] the most widely used being ChIP-on-chip, ChIP-seq and bisulfite sequencing. All of these methods generate large amounts of data and require efficient ways of data processing and quality control by bioinformatic methods.
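As a rough illustration of the kind of processing and quality control such data require, the sketch below computes a per-site methylation level from bisulfite sequencing read counts and discards sites with too few reads. The `CpGCount` record, the `min_coverage` default of 5, and the example coordinates are assumptions made for illustration; real pipelines apply further filters.

```python
from typing import NamedTuple, Optional


class CpGCount(NamedTuple):
    """Hypothetical record of read counts at one CpG site."""
    chrom: str
    pos: int
    methylated: int    # reads supporting a methylated cytosine
    unmethylated: int  # reads supporting an unmethylated cytosine


def methylation_level(site: CpGCount, min_coverage: int = 5) -> Optional[float]:
    """Return methylated / total reads, or None when coverage is too low.

    The coverage threshold is an illustrative quality-control choice.
    """
    coverage = site.methylated + site.unmethylated
    if coverage < min_coverage:
        return None
    return site.methylated / coverage


# Made-up example sites; the second one fails the coverage filter.
sites = [
    CpGCount("chr1", 10468, 9, 1),
    CpGCount("chr1", 10470, 2, 1),
    CpGCount("chr1", 10483, 0, 12),
]
levels = [methylation_level(s) for s in sites]
```

Returning `None` for low-coverage sites, rather than a noisy ratio, keeps unreliable estimates out of downstream statistics.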
A substantial amount of bioinformatic research has been devoted to the prediction of epigenetic information from characteristics of the genome sequence. Such predictions serve a dual purpose. First, accurate epigenome predictions can substitute for experimental data, to some degree, which is particularly relevant for newly discovered epigenetic mechanisms and for species other than human and mouse. Second, prediction algorithms build statistical models of epigenetic information from training data and can therefore act as a first step toward quantitative modeling of an epigenetic mechanism. Successful computational prediction of DNA and lysine methylation and acetylation has been achieved by combinations of various features. [5] [6]
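One simple example of a sequence-derived characteristic is CpG density. The sketch below computes GC content and the observed/expected CpG ratio for a sequence window and applies commonly cited CpG island thresholds (length of at least 200 bp, GC content above 50%, ratio above 0.6). It is a minimal feature-extraction sketch, not the method of the cited prediction studies; the function names and default thresholds are assumptions.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: count(CG) / (count(C) * count(G) / length)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    observed = seq.count("CG")
    expected = c * g / len(seq)
    return observed / expected


def looks_like_cpg_island(
    seq: str, min_len: int = 200, min_gc: float = 0.5, min_ratio: float = 0.6
) -> bool:
    """Apply the illustrative length, GC-content, and CpG-ratio thresholds."""
    return (
        len(seq) >= min_len
        and gc_content(seq) > min_gc
        and cpg_obs_exp(seq) > min_ratio
    )
```

Features like these would typically be combined with others and fed to a statistical or machine learning model rather than thresholded directly.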
The important role of epigenetic defects for cancer opens up new opportunities for improved diagnosis and therapy. These active areas of research give rise to two questions that are particularly amenable to bioinformatic analysis. First, given a list of genomic regions exhibiting epigenetic differences between tumor cells and controls (or between different disease subtypes), can we detect common patterns or find evidence of a functional relationship of these regions to cancer? Second, can we use bioinformatic methods in order to improve diagnosis and therapy by detecting and classifying important disease subtypes?
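For the first question, one common starting point is an enrichment test: do the differentially modified regions overlap a given annotation (for example, promoters of a gene set) more often than expected by chance? The sketch below computes a one-sided hypergeometric p-value. The function name and the counts in the usage note are hypothetical, and the test assumes regions are drawn from a fixed background set.

```python
from math import comb


def hypergeom_enrichment_p(total: int, annotated: int, selected: int, overlap: int) -> float:
    """P(X >= overlap) for a hypergeometric draw.

    total:     number of background regions
    annotated: background regions carrying the annotation
    selected:  regions showing an epigenetic difference
    overlap:   selected regions that also carry the annotation
    """
    upper = min(annotated, selected)
    denominator = comb(total, selected)
    numerator = sum(
        comb(annotated, k) * comb(total - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return numerator / denominator
```

For instance, `hypergeom_enrichment_p(10, 5, 5, 5)` gives the probability that all five selected regions are annotated when half of a ten-region background is. In practice the background set must be chosen carefully, since an inappropriate background can inflate the apparent enrichment.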
The first wave of research in the field of computational epigenetics was driven by rapid progress of experimental methods for data generation, which required adequate computational methods for data processing and quality control, prompted epigenome prediction studies as a means of understanding the genomic distribution of epigenetic information, and provided the foundation for initial projects on cancer epigenetics. While these topics will continue to be major areas of research and the mere quantity of epigenetic data arising from epigenome projects poses a significant bioinformatic challenge, several additional topics are currently emerging.
Name | Description | Link |
---|---|---|
IHEC Data Portal | Offers a comprehensive list of reference epigenomes for humans (hg19, hg38) and mice (mm10). | IHEC Portal [2] |
NIH ROADMAP Epigenomics Mapping Consortium | Provides genome-wide maps of histone modifications, chromatin accessibility, DNA methylation, and mRNA expression across various human cell types and tissues. | ROADMAP Portal [2] |
CEEHRC | A Canadian initiative that provides detailed information on human epigenomes from different tissues. | CEEHRC Portal [2] |
BLUEPRINT | A European project generating epigenomic maps for 100 different blood cell types. | BLUEPRINT Portal [2] |
IHEC CREST | Focuses on reference genomes for human epithelial, vascular endothelial, and reproductive cells. | CREST Portal [2] |
DeepBlue | A central database designed for programmatic operations on epigenetic data, including data overlapping and aggregation. | DeepBlue Portal [2] |
Epigenome Browser | Supplies reference sequences and draft assemblies for a diverse range of genomes. | Epigenome Browser [2] |
WashU Epigenome Browser | Offers extensive epigenomic information for various species, including cows and fruit flies, in addition to humans and mice. | WashU Portal [2] |
ENCODE Project | An NIH-funded initiative aimed at mapping all functional elements of the human genome. | ENCODE Portal [2] |
GenExp | An interactive genome browser that integrates data from the Distributed Annotation System (DAS). | GenExp Portal [2] |
AHEAD Task Force | A systematic effort to map the human epigenome, creating bioinformatics networks for reference guides related to normal tissues. | [2] |
HEP Project Consortium | Provides high-resolution epigenome data for analyzing DNA methylation in 43 individuals. | HEP [2] |
HEROIC Project Consortium | Focuses on high-throughput studies of epigenetic regulation using various genomic assays. | HEROIC Portal [2] |
dbEM | A database that examines the role of epigenetic proteins in oncogenesis, featuring data on mutations and gene expression across tumor samples. | dbEM Portal [2] |
EpiFactors | A database linking specific epigenetic factors to corresponding genes. | EpiFactors Portal [2] |
HEDD | Concentrates on the storage and integration of datasets related to epigenetic drugs. | HEDD Portal [2] |
Name | Description | Citation |
---|---|---|
MethDB | Contains information on 19,905 DNA methylation content data and 5,382 methylation patterns for 48 species, 1,511 individuals, 198 tissues and cell lines, and 79 phenotypes. | [8] |
Pubmeth | Contains over 5,000 records on methylated genes in various cancer types. | [9] |
REBASE | Contains over 22,000 DNA methyltransferase genes derived from GenBank. | [10] |
DeepBlue Epigenomic Database | Contains epigenomic data from more than 60,000 experiments from different IHEC members, divided into various epigenetic marks. DeepBlue also provides an API for access and processing of the data. | [11] |
MeInfoText | Contains gene methylation information across 205 human cancer types. | [12] |
MethPrimerDB | Contains 259 primer sets from human, mouse, and rat for DNA methylation analysis. | [13] |
The Histone Database | Contains 254 sequences from histone H1, 383 from histone H2, 311 from histone H2B, 1,043 from histone H3, and 198 from histone H4, representing at least 857 species. | [14] |
ChromDB | Contains 9,341 chromatin-associated proteins, including RNAi-associated proteins, for a broad range of organisms. | [15] |
CREMOFAC | Contains 1,725 redundant and 720 non-redundant chromatin-remodeling factor sequences in eukaryotes. | [16] |
The Krembil Family Epigenetics Laboratory | Contains DNA methylation data of human chromosomes 21, 22, male germ cells, and DNA methylation profiles in monozygotic and dizygotic twins. | [17] |
MethyLogiX DNA methylation database | Contains DNA methylation data of human chromosomes 21 and 22, male germ cells, and late-onset Alzheimer's disease. | [18] |
In biology, epigenetics is the study of heritable traits, or a stable change of cell function, that happen without changes to the DNA sequence. The Greek prefix epi- in epigenetics implies features that are "on top of" or "in addition to" the traditional genetic mechanism of inheritance. Epigenetics usually involves a change that is not erased by cell division, and affects the regulation of gene expression. Such effects on cellular and physiological phenotypic traits may result from environmental factors, or be part of normal development. Epigenetic factors can also lead to cancer.
In biology, the epigenome of an organism is the collection of chemical changes to its DNA and histone proteins that affects when, where, and how the DNA is expressed; these changes can be passed down to an organism's offspring via transgenerational epigenetic inheritance. Changes to the epigenome can result in changes to the structure of chromatin and changes to the function of the genome. The human epigenome, including DNA methylation and histone modification, is maintained through cell division. The epigenome is essential for normal development and cellular differentiation, enabling cells with the same genetic code to perform different functions. The human epigenome is dynamic and can be influenced by environmental factors such as diet, stress, and toxins.
Bisulfite sequencing (also known as bisulphite sequencing) is the use of bisulfite treatment of DNA before routine sequencing to determine the pattern of methylation. DNA methylation was the first discovered epigenetic mark, and remains the most studied. In animals it predominantly involves the addition of a methyl group to the carbon-5 position of cytosine residues of the dinucleotide CpG, and is implicated in repression of transcriptional activity.
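The principle can be sketched in a few lines: bisulfite treatment converts unmethylated cytosines so that they are read as thymine after amplification, while methylated cytosines are protected and still read as cytosine. The example below simulates this on a single strand and then infers methylation by comparing the converted read with the reference. It is a simplified illustration that ignores the reverse strand, sequencing errors, and incomplete conversion; the function names and the toy sequence are assumptions.

```python
def bisulfite_convert(seq: str, methylated_positions: set[int]) -> str:
    """Simulate conversion: unmethylated C reads as T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference: str, read: str) -> dict[int, bool]:
    """For each reference C, report True if the read shows C (methylated)
    and False if it shows T (unmethylated). Other bases are skipped."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base == "C" and read_base in ("C", "T"):
            calls[i] = read_base == "C"
    return calls


reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1, 5})
calls = call_methylation(reference, read)
```

Aggregating such calls over many reads covering the same position yields the per-site methylation level.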
Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome. The field is analogous to genomics and proteomics, which are the study of the genome and proteome of a cell. Epigenetic modifications are reversible modifications on a cell's DNA or histones that affect gene expression without altering the DNA sequence. Epigenomic maintenance is a continuous process and plays an important role in the stability of eukaryotic genomes by taking part in crucial biological mechanisms such as DNA repair. Plant flavones have been reported to inhibit epigenomic marks that cause cancers. Two of the best-characterized epigenetic modifications are DNA methylation and histone modification. Epigenetic modifications play an important role in gene expression and regulation, and are involved in numerous cellular processes such as differentiation, development, and tumorigenesis. The study of epigenetics on a global level has been made possible only recently through the adaptation of genomic high-throughput assays.
Methylated DNA immunoprecipitation is a large-scale purification technique in molecular biology that is used to enrich for methylated DNA sequences. It consists of isolating methylated DNA fragments via an antibody raised against 5-methylcytosine (5mC). This technique was first described by Weber M. et al. in 2005 and has helped pave the way for viable methylome-level assessment efforts, as the purified fraction of methylated DNA can be input to high-throughput DNA detection methods such as high-resolution DNA microarrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq). Nonetheless, understanding of the methylome remains rudimentary; its study is complicated by the fact that, like other epigenetic properties, patterns vary from cell-type to cell-type.
The Epigenomics database at the National Center for Biotechnology Information was a database for whole-genome epigenetics data sets. It was retired on 1 June 2016.
The International Human Epigenome Consortium (IHEC) is a scientific organization, founded in 2010, that helps to coordinate global efforts in the field of epigenomics. The initial goal was to generate at least 1,000 reference (baseline) human epigenomes from different types of normal and disease-related human cell types.
H3K27ac is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates acetylation of the lysine residue at N-terminal position 27 of the histone H3 protein.
H3K27me3 is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the tri-methylation of lysine 27 on histone H3 protein.
An epigenome-wide association study (EWAS) is an examination of a genome-wide set of quantifiable epigenetic marks, such as DNA methylation, in different individuals to derive associations between epigenetic variation and a particular identifiable phenotype or trait. When patterns such as DNA methylation at specific loci differ in a way that discriminates phenotypically affected cases from control individuals, this is taken as an indication that an epigenetic perturbation has occurred that is associated, causally or consequentially, with the phenotype.
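A toy version of the per-site comparison is sketched below: for each CpG site, methylation levels in cases and controls are compared with an exact two-sided permutation test on the difference in means, followed by a Bonferroni correction across sites. Actual EWAS analyses usually rely on regression models with covariates such as age, sex, and cell-type composition; the function names, site identifiers, and values here are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p(cases: list[float], controls: list[float]) -> float:
    """Exact two-sided permutation p-value for the difference in group means.

    Enumerates every way of relabelling the pooled samples, so it is only
    practical for small groups.
    """
    pooled = cases + controls
    n_cases, n_controls = len(cases), len(controls)
    observed = abs(mean(cases) - mean(controls))
    total = sum(pooled)
    hits = count = 0
    for idx in combinations(range(len(pooled)), n_cases):
        case_sum = sum(pooled[i] for i in idx)
        diff = abs(case_sum / n_cases - (total - case_sum) / n_controls)
        hits += diff >= observed - 1e-12  # small tolerance for float rounding
        count += 1
    return hits / count


def ewas(sites: dict[str, tuple[list[float], list[float]]]) -> dict[str, float]:
    """Return Bonferroni-adjusted p-values for each site."""
    n_tests = len(sites)
    return {
        name: min(1.0, permutation_p(cases, controls) * n_tests)
        for name, (cases, controls) in sites.items()
    }


# Made-up methylation levels (cases, controls) at two hypothetical sites.
example = {
    "site_A": ([0.80, 0.85, 0.90, 0.82], [0.30, 0.35, 0.28, 0.33]),
    "site_B": ([0.50, 0.50], [0.50, 0.50]),
}
adjusted = ewas(example)
```

A statistically significant association found this way does not by itself indicate whether the epigenetic difference is a cause or a consequence of the phenotype.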
Thomas Jenuwein is a German scientist working in the fields of epigenetics, chromatin biology, gene regulation and genome function.
H3K9me3 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the tri-methylation at the 9th lysine residue of the histone H3 protein and is often associated with heterochromatin.
H3K4me1 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the mono-methylation at the 4th lysine residue of the histone H3 protein and is often associated with gene enhancers.
H3K36me3 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the tri-methylation at the 36th lysine residue of the histone H3 protein and is often associated with gene bodies.
H3K79me2 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the di-methylation at the 79th lysine residue of the histone H3 protein. H3K79me2 is detected in the transcribed regions of active genes.
H4K20me is an epigenetic modification to the DNA packaging protein Histone H4. It is a mark that indicates the mono-methylation at the 20th lysine residue of the histone H4 protein. This mark can be di- and tri-methylated. It is critical for genome integrity including DNA damage repair, DNA replication and chromatin compaction.
H4K16ac is an epigenetic modification to the DNA packaging protein Histone H4. It is a mark that indicates the acetylation at the 16th lysine residue of the histone H4 protein.
H3K9ac is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the acetylation at the 9th lysine residue of the histone H3 protein.
H3K36me2 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the di-methylation at the 36th lysine residue of the histone H3 protein.
H3K36me is an epigenetic modification to the DNA packaging protein Histone H3, specifically, the mono-methylation at the 36th lysine residue of the histone H3 protein.