Human epigenome

The human epigenome encompasses all chemical modifications to DNA and its associated histone proteins that regulate gene expression without altering the DNA sequence. These modifications, including DNA methylation and histone modification, can be inherited through cell division (mitosis) or passed from parent to offspring (meiosis). [1]

Such changes in gene activity are essential for normal development and cellular differentiation, enabling cells with the same genetic code to perform different functions. The human epigenome is dynamic and can be influenced by environmental factors such as diet, stress, and toxins.

Chemical modifications

Several types of chemical modification exist, and they can be studied with the ChIP-seq experimental procedure. The epigenetic profiles of human tissues reveal the following distinct histone modifications in different functional areas: [2]

Active Promoters   Active Enhancers   Transcribed Gene Bodies   Silenced Regions
H3K4me3            H3K4me1            H3K36me3                  H3K27me3
H3K27ac            H3K27ac                                      H3K9me3
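The association between functional areas and histone marks listed above can be written as a small lookup table. The following is only an illustrative sketch: the names `REGION_MARKS` and `regions_sharing_marks` are hypothetical, and matching on shared marks is a simplification rather than the chromatin-state analysis used in the cited study.

```python
# Histone marks per functional area, taken from the list above.
REGION_MARKS = {
    "active promoter": {"H3K4me3", "H3K27ac"},
    "active enhancer": {"H3K4me1", "H3K27ac"},
    "transcribed gene body": {"H3K36me3"},
    "silenced region": {"H3K27me3", "H3K9me3"},
}


def regions_sharing_marks(observed_marks):
    """Return the functional areas that share at least one mark with the input.

    Hypothetical helper: maps each matching area to the sorted list of shared marks.
    """
    observed = set(observed_marks)
    shared = {}
    for region, marks in REGION_MARKS.items():
        overlap = sorted(marks & observed)
        if overlap:
            shared[region] = overlap
    return shared


if __name__ == "__main__":
    # H3K27ac alone is ambiguous: it is listed for both promoters and enhancers.
    print(regions_sharing_marks(["H3K27ac"]))
    # Adding H3K4me1 gives the enhancer entry two shared marks.
    print(regions_sharing_marks(["H3K4me1", "H3K27ac"]))
```

The example shows why a single mark is rarely decisive: H3K27ac appears in two functional areas, so it is the combination of marks that distinguishes them.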

Methylation

DNA functionally interacts with a variety of epigenetic marks, such as cytosine methylation, which yields 5-methylcytosine (5mC). This epigenetic mark is widely conserved and plays major roles in the regulation of gene expression and in the silencing of transposable elements and repeat sequences. [3]

Individuals differ in their epigenetic profiles; for example, the variance in CpG methylation among individuals is about 42%. By contrast, the epigenetic profile (including the methylation profile) of each individual is constant over the course of a year, reflecting the constancy of phenotype and metabolic traits. The methylation profile, in particular, is quite stable over a 12-month period and appears to change more over decades. [4]

Methylation sites

CoRSIVs are Correlated Regions of Systemic Interindividual Variation in DNA methylation. They span only 0.1% of the human genome, so they are very rare, and they can be inter-correlated over long genomic distances (>50 kbp). CoRSIVs are also associated with genes involved in many human disorders, including tumors, mental disorders and cardiovascular diseases. Disease-associated CpG sites have been observed to be 37% enriched in CoRSIVs compared to control regions and 53% enriched in CoRSIVs relative to tDMRs (tissue-specific differentially methylated regions). [5]

Most CoRSIVs are only 200–300 bp long and include 5–10 CpG dinucleotides; the largest span several kb and involve hundreds of CpGs. These regions tend to occur in clusters, and the two genomic areas of high CoRSIV density are observed at the major histocompatibility (MHC) locus on chromosome 6 and at the pericentromeric region on the long arm of chromosome 20. [5]

CoRSIVs are enriched in intergenic and quiescent regions (e.g. subtelomeric regions) and contain many transposable elements, but few CpG islands (CGI) and transcription factor binding sites. CoRSIVs are under-represented in the proximity of genes, in heterochromatic regions, active promoters, and enhancers. They are also usually not present in highly conserved genomic regions. [5]

CoRSIVs have a useful application: measurements of CoRSIV methylation in one tissue can provide some information about epigenetic regulation in other tissues. Because systemic epigenetic variants are generally consistent across tissues and cell types, such measurements can help predict the expression of associated genes. [6]

Factors affecting methylation pattern

Quantification of the heritable basis underlying population epigenomic variation is also important to delineate its cis- and trans-regulatory architecture. In particular, most studies state that inter-individual differences in DNA methylation are mainly determined by cis-regulatory sequence polymorphisms, probably involving mutations in TFBSs (Transcription Factor Binding Sites) with downstream consequences on the local chromatin environment. The sparsity of trans-acting polymorphisms in humans suggests that such effects are highly deleterious. Indeed, trans-acting effects are expected to be caused by mutations in chromatin control genes or other highly pleiotropic regulators. If trans-acting variants do exist in human populations, they probably segregate as rare alleles or originate from somatic mutations and present with clinical phenotypes, as is the case in many cancers. [3]

Correlation between methylation and gene expression

DNA methylation (in particular in CpG regions) is able to affect gene expression: hypermethylated regions tend to be differentially expressed. In fact, people with a similar methylation profile also tend to have a similar transcriptome. Moreover, one key observation from human methylation studies is that most functionally relevant changes in CpG methylation occur in regulatory elements, such as enhancers.

However, differential expression concerns only a small fraction of methylated genes: only one fifth of genes with CpG methylation show variable expression according to their methylation state. It is important to note that methylation is not the only factor affecting gene regulation. [4]

Methylation in embryos

Immunostaining experiments have revealed a global DNA demethylation process in human preimplantation embryos. After fertilisation, the DNA methylation level decreases sharply in the early pronuclei as a consequence of active DNA demethylation at this stage. Global demethylation is not irreversible, however: de novo methylation occurs from the early to mid-pronuclear stage and from the 4-cell to the 8-cell stage. [7]

The percentage of DNA methylation differs between oocytes and sperm: the mature oocyte has an intermediate level of DNA methylation (72%), whereas sperm has a high level (86%). Demethylation of the paternal genome occurs quickly after fertilisation, whereas the maternal genome is quite resistant to demethylation at this stage. Maternal differentially methylated regions (DMRs) are more resistant to the preimplantation demethylation wave. [7]

CpG methylation is similar at the germinal vesicle (GV), intermediate metaphase I (MI) and mature metaphase II (MII) stages, whereas non-CpG methylation continues to accumulate across these stages. [7]

Chromatin accessibility in the germline has been evaluated by different approaches, such as scATAC-seq, sciATAC-seq, scCOOL-seq, scNOMe-seq and scDNase-seq. Stage-specific proximal and distal regions of accessible chromatin were identified. Global chromatin accessibility gradually decreases from the zygote to the 8-cell stage and then increases. Parental allele-specific analysis shows that the paternal genome becomes more open than the maternal genome from the late zygote stage to the 4-cell stage, which may reflect decondensation of the paternal genome as protamines are replaced by histones. [7]

Sequence-Dependent Allele-Specific Methylation

DNA methylation imbalances between homologous chromosomes show sequence-dependent behavior: differences in the methylation state of neighboring cytosines on the same chromosome arise from differences in DNA sequence between the chromosomes. Whole-genome bisulfite sequencing (WGBS) is used to explore sequence-dependent allele-specific methylation (SD-ASM) at single-chromosome resolution and with comprehensive whole-genome coverage. WGBS of 49 methylomes revealed CpG methylation imbalances exceeding 30% differences in 5% of the loci. [8]
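The notion of an allelic methylation imbalance can be made concrete with a small calculation. The sketch below is illustrative only: it assumes that bisulfite reads at a heterozygous locus have already been assigned to one of the two alleles and counted as methylated or unmethylated. The function names are hypothetical, and the 30% default threshold is taken from the figure quoted above. No statistical test is included, which a real analysis would need.

```python
def methylation_fraction(methylated, unmethylated):
    """Fraction of reads on one allele that report a methylated cytosine."""
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("no reads cover this allele")
    return methylated / total


def allelic_imbalance(allele_1, allele_2):
    """Absolute difference in methylation fraction between two alleles.

    Each allele is a (methylated_reads, unmethylated_reads) tuple.
    """
    return abs(methylation_fraction(*allele_1) - methylation_fraction(*allele_2))


def exceeds_threshold(allele_1, allele_2, threshold=0.30):
    """True if the imbalance is larger than the chosen threshold (assumed 30%)."""
    return allelic_imbalance(allele_1, allele_2) > threshold


if __name__ == "__main__":
    # Toy counts: allele 1 is mostly methylated, allele 2 mostly unmethylated.
    reference_allele = (18, 2)
    alternate_allele = (5, 15)
    print(allelic_imbalance(reference_allele, alternate_allele))
    print(exceeds_threshold(reference_allele, alternate_allele))
```

With these toy counts the two alleles are 90% and 25% methylated, so the locus would be flagged as imbalanced under the assumed threshold.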

At gene regulatory loci bound by transcription factors, random switching between methylated and unmethylated states of DNA was observed. This is also referred to as stochastic switching, and it is linked to selective buffering of gene regulatory circuits against mutations and genetic diseases. Only rare genetic variants show the stochastic type of gene regulation.

The study by Onuchic et al. aimed to construct maps of allelic imbalances in DNA methylation, gene transcription and histone modifications. It examined 71 epigenomes from 36 cell and tissue types obtained from 13 donors. Stochastic switching occurred at thousands of heterozygous regulatory loci that were bound by transcription factors. The intermediate methylation state reflects the relative frequencies of methylated and unmethylated epialleles, and the epiallele frequency variations are correlated with the allele affinity for transcription factors.

The study's analysis suggests that the human epigenome harbors, on average, approximately 200 adverse SD-ASM variants. The sensitivity of genes with tissue-specific expression patterns provides an opportunity for evolutionary innovation in gene regulation. [8]

A haplotype reconstruction strategy is used to trace chromatin chemical modifications (using ChIP-seq) in a variety of human tissues. Haplotype-resolved epigenomic maps can trace allelic biases in chromatin configuration, and substantial variation among different tissues and individuals is observed. This allows a deeper understanding of cis-regulatory relationships between genes and control sequences. [2]

Structural modifications

During the last few years, several methods have been developed to study the structural, and consequently the functional, modifications of chromatin. The first project that used epigenomic profiling to identify regulatory elements in the human genome was ENCODE (Encyclopedia of DNA Elements), which focused on profiling histone modifications in cell lines. A few years later ENCODE was included in the International Human Epigenome Consortium (IHEC), which aims to coordinate international epigenome studies. [9]

The structural modifications that these projects aim to study can be divided into five main groups:

Topologically associating domains (TADs)

Topologically associating domains are a level of structural organization of the cell's genome. They are formed by regions of chromatin, ranging in size from 100 kilobases up to megabases, that interact strongly with themselves. The domains are linked by other genomic regions which, depending on their size, are called either “topological boundary regions” or “unorganized chromatin”. These boundary regions separate the topological domains from heterochromatin and prevent the spreading of the latter. Topological domains are widespread in mammals, and similar genome partitions have also been identified in Drosophila. [10]

Topological domains in humans, as in other mammals, have many functions in gene expression and transcriptional control. Inside these domains the chromatin interacts densely with itself, while in the boundary regions chromatin interactions are far less frequent. [11] The boundary areas in particular show several features that determine the functions of the topological domains.

Firstly, they contain insulator regions and barrier elements, both of which inhibit further transcription by RNA polymerase. [12] Such elements are characterized by a high abundance of the insulator-binding protein CTCF.

Secondly, boundary regions block heterochromatin spreading, thus preventing the loss of useful genetic information. This conclusion derives from the observation that the heterochromatin mark H3K9me3 is clearly interrupted near boundary sequences. [13]

Thirdly, transcription start sites (TSS), housekeeping genes and tRNA genes are particularly abundant in boundary regions, indicating that these areas have high transcriptional activity owing to structural characteristics that differ from those of other topological regions. [14] [15]

Finally, the border areas of the topological domains and their surroundings are enriched in Alu/B1 and B2 SINE retrotransposons. In recent years, these sequences have been reported to alter CTCF binding sites, thus interfering with the expression of some genomic areas. [16]

Further evidence for a role in genetic modulation and transcription regulation comes from the strong conservation of the boundary pattern across mammalian evolution, with a limited range of small differences between cell types, suggesting that these topological domains take part in cell-type-specific regulatory events. [11]
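The contrast described in this section, with dense interactions inside a domain and sparse interactions across a boundary, can be illustrated with a toy contact matrix. The sketch below computes a simplified insulation-style score: for each position between two bins, it averages the contacts that cross that position within a small window. This is an assumption-laden illustration, not the boundary-calling method of the cited studies. The matrix values and the names `toy_contacts` and `insulation_scores` are hypothetical.

```python
def toy_contacts(domain_sizes, inside=10, outside=1):
    """Build a toy symmetric contact matrix with block-like domains.

    Bins in the same domain get `inside` contacts, all other pairs `outside`.
    """
    labels = [d for d, size in enumerate(domain_sizes) for _ in range(size)]
    return [[inside if a == b else outside for b in labels] for a in labels]


def insulation_scores(contacts, window=2):
    """Mean contact count crossing each position between bins pos-1 and pos.

    Low scores indicate few interactions across the position, as expected
    at a domain boundary.
    """
    n = len(contacts)
    scores = {}
    for pos in range(1, n):
        upstream = range(max(0, pos - window), pos)
        downstream = range(pos, min(n, pos + window))
        crossing = [contacts[a][b] for a in upstream for b in downstream]
        scores[pos] = sum(crossing) / len(crossing)
    return scores


if __name__ == "__main__":
    contacts = toy_contacts([3, 3])  # two domains of three bins each
    scores = insulation_scores(contacts)
    boundary = min(scores, key=scores.get)
    print(scores)
    print("lowest-scoring position:", boundary)
```

In this toy example the lowest score falls at position 3, between the two simulated domains. Real Hi-C data are noisy and need normalization before any such score is meaningful.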

Correlation between methylation and 3D structure

The 4D Nucleome project aims to produce 3D maps of mammalian genomes in order to develop predictive models that correlate epigenomic modifications with genetic variation. In particular, the goal is to link genetic and epigenomic modifications with the enhancers and promoters with which they interact in three-dimensional space, thus discovering gene-set interactomes and pathways as new candidates for functional analysis and therapeutic targeting.

Hi-C [17] is an experimental method used to map the connections between DNA fragments in three-dimensional space on a genome-wide scale. This technique combines chemical crosslinking of chromatin with restriction enzyme digestion and next-generation DNA sequencing. [18]
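After sequencing, Hi-C read pairs are typically summarized as a matrix of contact counts between genomic bins. The sketch below shows that binning step for a single chromosome. It assumes the read pairs have already been mapped to 0-based coordinates. The function name `contact_matrix`, the bin size and the example coordinates are hypothetical, and mapping, filtering and normalization are omitted.

```python
def contact_matrix(read_pairs, chrom_length, bin_size):
    """Count Hi-C contacts between fixed-size bins of one chromosome.

    read_pairs: iterable of (position_a, position_b), 0-based and below
    chrom_length. Returns a symmetric list-of-lists matrix of counts.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in read_pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            # Keep the matrix symmetric without double-counting the diagonal.
            matrix[j][i] += 1
    return matrix


if __name__ == "__main__":
    # Toy read pairs on a 300 kb chromosome, binned at 100 kb.
    pairs = [(100, 250_000), (120_000, 130_000), (50, 260_000)]
    for row in contact_matrix(pairs, chrom_length=300_000, bin_size=100_000):
        print(row)
```

Each entry of the resulting matrix counts how often two bins were ligated together, which serves as a proxy for their spatial proximity.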

Studies of this kind are currently limited by the lack or unavailability of raw data. [9]

References

  1. Delcuve GP, Rastegar M, Davie JR (May 2009). "Epigenetic control". Journal of Cellular Physiology. 219 (2): 243–250. doi:10.1002/jcp.21678. PMID 19127539.
  2. Leung D, Jung I, Rajagopal N, Schmitt A, Selvaraj S, Lee AY, et al. (February 2015). "Integrative analysis of haplotype-resolved epigenomes across human tissues". Nature. 518 (7539): 350–354. Bibcode:2015Natur.518..350L. doi:10.1038/nature14217. PMC 4449149. PMID 25693566.
  3. Taudt A, Colomé-Tatché M, Johannes F (June 2016). "Genetic sources of population epigenomic variation". Nature Reviews. Genetics. 17 (6): 319–332. doi:10.1038/nrg.2016.45. PMID 27156976. S2CID 336906.
  4. Tabassum R, Sivadas A, Agrawal V, Tian H, Arafat D, Gibson G (August 2015). "Omic personality: implications of stable transcript and methylation profiles for personalized medicine". Genome Medicine. 7 (1): 88. doi:10.1186/s13073-015-0209-4. PMC 4578259. PMID 26391122.
  5. Gunasekara CJ, Scott CA, Laritsky E, Baker MS, MacKay H, Duryea JD, et al. (June 2019). "A genomic atlas of systemic interindividual epigenetic variation in humans". Genome Biology. 20 (1): 105. doi:10.1186/s13059-019-1708-1. PMC 6545702. PMID 31155008.
  6. Waterland RA, Michels KB (2007). "Epigenetic epidemiology of the developmental origins hypothesis". Annual Review of Nutrition. 27 (1): 363–388. doi:10.1146/annurev.nutr.27.061406.093705. PMID 17465856.
  7. Wen L, Tang F (October 2019). "Human Germline Cell Development: from the Perspective of Single-Cell Sequencing". Molecular Cell. 76 (2): 320–328. doi:10.1016/j.molcel.2019.08.025. PMID 31563431.
  8. Onuchic V, Lurie E, Carrero I, Pawliczek P, Patel RY, Rozowsky J, et al. (September 2018). "Allele-specific epigenome maps reveal sequence-dependent stochastic switching at regulatory loci". Science. 361 (6409). doi:10.1126/science.aar3146. PMC 6198826. PMID 30139913.
  9. Stricker SH, Köferle A, Beck S (January 2017). "From profiles to function in epigenomics". Nature Reviews. Genetics. 18 (1): 51–66. doi:10.1038/nrg.2016.138. PMID 27867193. S2CID 4461801.
  10. Sexton T, Yaffe E, Kenigsberg E, Bantignies F, Leblanc B, Hoichman M, et al. (February 2012). "Three-dimensional folding and functional organization principles of the Drosophila genome". Cell. 148 (3): 458–472. doi:10.1016/j.cell.2012.01.010. PMID 22265598.
  11. Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, et al. (April 2012). "Topological domains in mammalian genomes identified by analysis of chromatin interactions". Nature. 485 (7398): 376–380. Bibcode:2012Natur.485..376D. doi:10.1038/nature11082. PMC 3356448. PMID 22495300.
  12. Kim YJ, Cecchini KR, Kim TH (May 2011). "Conserved, developmentally regulated mechanism couples chromosomal looping and heterochromatin barrier activity at the homeobox gene A locus". Proceedings of the National Academy of Sciences of the United States of America. 108 (18): 7391–7396. Bibcode:2011PNAS..108.7391K. doi:10.1073/pnas.1018279108. PMC 3088595. PMID 21502535.
  13. Hawkins RD, Hon GC, Lee LK, Ngo Q, Lister R, Pelizzola M, et al. (May 2010). "Distinct epigenomic landscapes of pluripotent and lineage-committed human cells". Cell Stem Cell. 6 (5): 479–491. doi:10.1016/j.stem.2010.03.018. PMC 2867844. PMID 20452322.
  14. Min IM, Waterfall JJ, Core LJ, Munroe RJ, Schimenti J, Lis JT (April 2011). "Regulating RNA polymerase pausing and transcription elongation in embryonic stem cells". Genes & Development. 25 (7): 742–754. doi:10.1101/gad.2005511. PMC 3070936. PMID 21460038.
  15. Ebersole T, Kim JH, Samoshkin A, Kouprina N, Pavlicek A, White RJ, et al. (August 2011). "tRNA genes protect a reporter gene from epigenetic silencing in mouse cells". Cell Cycle. 10 (16): 2779–2791. doi:10.4161/cc.10.16.17092. PMC 3219543. PMID 21822054.
  16. Schmidt D, Schwalie PC, Wilson MD, Ballester B, Gonçalves A, Kutter C, et al. (January 2012). "Waves of retrotransposon expansion remodel genome organization and CTCF binding in multiple mammalian lineages". Cell. 148 (1–2): 335–348. doi:10.1016/j.cell.2011.11.058. PMC 3368268. PMID 22244452.
  17. Kumasaka N, Knights AJ, Gaffney DJ (January 2019). "High-resolution genetic mapping of putative causal interactions between regions of open chromatin". Nature Genetics. 51 (1): 128–137. doi:10.1038/s41588-018-0278-6. PMC 6330062. PMID 30478436.
  18. Eagen KP (June 2018). "Principles of Chromosome Architecture Revealed by Hi-C". Trends in Biochemical Sciences. 43 (6): 469–478. doi:10.1016/j.tibs.2018.03.006. PMC 6028237. PMID 29685368.