Monoallelic gene expression (MAE) is the phenomenon in which only one of the two copies (alleles) of a gene is actively expressed (transcribed), while the other is silent. [1] [2] [3] Diploid organisms bear two homologous copies of each chromosome (one from each parent), so a gene can be expressed from both chromosomes (biallelic expression) or from only one (monoallelic expression). [4] MAE can be random monoallelic expression (RME) or constitutive monoallelic expression. Constitutive monoallelic expression occurs from the same specific allele throughout the whole organism or tissue, as a result of genomic imprinting. [5] RME is a broader class of monoallelic expression, defined by random allelic choice in somatic cells, so that different cells of a multicellular organism express different alleles.
X-chromosome inactivation (XCI) is the most striking and well-studied example of RME. XCI leads to the transcriptional silencing of one of the X chromosomes in female cells, so that genes are expressed only from the other, remaining active X chromosome. XCI is critical for balanced gene expression in female mammals. The allelic choice of XCI by individual cells takes place randomly in epiblasts of the preimplantation embryo, [6] which leads to mosaic gene expression of the paternal and maternal X chromosome in female tissues. [7] [8] XCI is a chromosome-wide form of monoallelic expression that covers the genes located on the X chromosome, in contrast to autosomal RME (aRME), which affects single genes interspersed across the genome. aRME can be fixed [9] or dynamic, depending on whether or not the allele-specific expression is conserved in daughter cells after mitotic cell division.
Fixed aRME is established either by silencing one allele of a gene that was previously expressed biallelically, or by activating a single allele of a previously silent gene. Activation of the silent allele is coupled with a feedback mechanism that prevents expression of the second allele. Alternatively, a limited time window of low-probability initiation could lead to high frequencies of cells with single-allele expression. An estimated 2% [10] [11] to 10% [12] of all genes show fixed aRME. Studies of fixed aRME require either expansion of monoclonal cultures or lineage-traced in vivo or in vitro cells that are mitotically related.
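The low-probability initiation scenario can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that each of the two alleles initiates expression independently and with the same probability `p` during the time window; the independence assumption and the function name `monoallelic_fraction` are hypothetical and are not taken from the cited studies.

```python
def monoallelic_fraction(p: float) -> float:
    """Fraction of expressing cells that express exactly one allele.

    Assumes (hypothetically) that each of the two alleles initiates
    independently with probability p within the time window.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    one_allele = 2 * p * (1 - p)    # exactly one of the two alleles initiated
    any_allele = 1 - (1 - p) ** 2   # at least one allele initiated
    return one_allele / any_allele


for p in (0.05, 0.5, 0.9):
    print(f"p={p}: {monoallelic_fraction(p):.3f}")
```

Under these assumptions the fraction simplifies to 2(1 - p) / (2 - p), so the lower the initiation probability, the larger the share of expressing cells that are monoallelic.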
Dynamic aRME occurs as a consequence of stochastic allelic expression. Transcription happens in bursts, with RNA molecules synthesized from each allele separately, so over time both alleles have a probability of initiating transcription. Transcriptional bursts are allelically stochastic and lead to RNA from either the maternal or the paternal allele accumulating in the cell. The burst frequency and intensity of a gene, combined with its RNA degradation rate, shape the RNA distribution at the moment of observation and thus determine whether the gene appears bi- or monoallelic. Studies that distinguish fixed and dynamic aRME require single-cell analyses of clonally related cells. [13]
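How burst frequency, burst size and RNA degradation combine into a mono- or biallelic snapshot can be sketched with a toy simulation. The discrete-time model below is an assumption made for illustration only (it is not the model used in the cited work): at every time step each RNA molecule is degraded with probability `decay_prob`, and each allele independently fires a burst of `burst_size` molecules with probability `burst_prob`. All function and parameter names are hypothetical.

```python
import random
from collections import Counter


def simulate_cell(rng, burst_prob, burst_size, decay_prob, steps):
    """Return (maternal, paternal) RNA counts after `steps` time steps."""
    counts = [0, 0]  # index 0: maternal allele, index 1: paternal allele
    for _ in range(steps):
        for allele in (0, 1):
            # Each existing molecule survives the step with probability 1 - decay_prob.
            counts[allele] = sum(
                rng.random() >= decay_prob for _ in range(counts[allele])
            )
            # Each allele bursts independently of the other allele.
            if rng.random() < burst_prob:
                counts[allele] += burst_size
    return tuple(counts)


def classify(maternal, paternal):
    """Label a cell by which alleles have RNA at the moment of observation."""
    if maternal and paternal:
        return "biallelic"
    if maternal:
        return "maternal only"
    if paternal:
        return "paternal only"
    return "not detected"


def snapshot(n_cells, seed=0, **params):
    """Simulate n_cells independent cells and tally their labels."""
    rng = random.Random(seed)
    return Counter(classify(*simulate_cell(rng, **params)) for _ in range(n_cells))


rare = snapshot(200, burst_prob=0.02, burst_size=10, decay_prob=0.2, steps=100)
frequent = snapshot(200, burst_prob=0.5, burst_size=10, decay_prob=0.2, steps=100)
print("rare bursts:    ", dict(rare))
print("frequent bursts:", dict(frequent))
```

In this toy model, infrequent bursts combined with fast degradation leave many cells with RNA from only one allele at the time of sampling, whereas frequent bursts make most cells appear biallelic, even though neither allele is permanently silenced.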
Allelic exclusion is a mode of gene expression in which one allele is expressed and the other is kept silent. The two most studied cases of allelic exclusion are monoallelic expression of antigen receptors (immunoglobulins in B cells and T-cell receptors in T cells) [14] [15] [16] and of olfactory receptors in sensory neurons. [17] Allelic exclusion is cell-type specific (as opposed to organism-wide XCI), which increases intercellular diversity and thus specificity towards certain antigens or odors.
Allele-biased expression is a skewed expression level of one allele over the other in which both alleles are still expressed (in contrast to allelic exclusion). This phenomenon is often observed in cells of immune function. [18] [19]
Methods of MAE detection rely on differences between alleles, which can be distinguished either by the sequence of the expressed mRNA or by protein structure. They can be divided into single-gene and genome-wide MAE analysis. Genome-wide MAE analysis cannot yet be performed on the basis of protein structure, so these techniques rely on transcript sequence, chiefly through next-generation sequencing (NGS).
Single-gene analysis
Methods of detection | Synopsis |
---|---|
RT-qPCR | can be used to detect RME by using allele-specific primers, SNP-sensitive hybridization probes or allele-specific restriction sites. Can be used for single cells or clonal cell populations. |
Nascent RNA FISH | visualizes nascent RNA (RNA that is currently being synthesized) in situ. The read-out is one, two or zero fluorescent dots, indicating monoallelic, biallelic or no expression, respectively, at single-cell resolution. |
Cell sorting | if the gene encodes a surface protein and an allele-specific antibody is available, this technique can be used to detect fixed or dynamic RME by measuring the same cells over time. Single-cell resolution. |
Live cell imaging | reports expression dynamics over time. Requires the insertion of an allele-specific fluorescent protein tag (for example GFP) in order to detect a signal. |
Genome-wide analysis
Methods of detection | Synopsis |
---|---|
SNP-sensitive microarrays | can be used to estimate fixed RME for a predefined set of transcripts in clonally expanded cell populations. |
RNA-seq | like the previous method, gives an estimate of fixed RME in clonally expanded cell populations, but for all transcripts. |
Single-cell RNA sequencing | similar to the previous methods, but allows analysis of individual cells. If multiple clonally related cells are analysed, it can distinguish between fixed and dynamic RME. [20] |
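The logic of separating fixed from dynamic RME with allele-resolved single-cell read counts can be sketched as follows. This is a simplified heuristic under stated assumptions, not a published pipeline: the thresholds `min_reads` and `mono_cutoff`, the function names and the example read counts are all hypothetical, and real analyses must also account for technical allelic dropout.

```python
def call_cell(maternal_reads, paternal_reads, min_reads=10, mono_cutoff=0.95):
    """Label one cell's expression of one gene from allele-resolved read counts."""
    total = maternal_reads + paternal_reads
    if total < min_reads:
        return "no call"  # too few reads to judge allelic status
    maternal_share = maternal_reads / total
    if maternal_share >= mono_cutoff:
        return "maternal"
    if maternal_share <= 1 - mono_cutoff:
        return "paternal"
    return "biallelic"


def call_clone(cells, **kwargs):
    """Summarise one gene across clonally related cells.

    `cells` is a list of (maternal_reads, paternal_reads) tuples.
    """
    calls = {call_cell(m, p, **kwargs) for m, p in cells} - {"no call"}
    if not calls:
        return "no call"
    if calls == {"biallelic"}:
        return "biallelic"
    if calls in ({"maternal"}, {"paternal"}):
        # Every informative cell of the clone uses the same single allele.
        return "consistent with fixed aRME"
    # Allelic status differs between cells that share a common ancestor.
    return "consistent with dynamic aRME"


# Made-up read counts for one gene in three hypothetical clones.
clone_a = [(40, 0), (25, 1), (31, 0)]
clone_b = [(30, 0), (0, 22), (12, 15)]
clone_c = [(20, 18), (9, 14)]

for name, clone in (("A", clone_a), ("B", clone_b), ("C", clone_c)):
    print(name, call_clone(clone))
```

The key design choice is that the comparison is made within a clone: the same allele in all related cells suggests a mitotically inherited state, while differing allelic status among related cells points to a dynamic, burst-driven pattern.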
Genomic imprinting is an epigenetic phenomenon that causes genes to be expressed or not, depending on whether they are inherited from the mother or the father. Genes can also be partially imprinted. Partial imprinting occurs when alleles from both parents are differently expressed rather than complete expression and complete suppression of one parent's allele. Forms of genomic imprinting have been demonstrated in fungi, plants and animals. In 2014, there were about 150 imprinted genes known in mice and about half that in humans. As of 2019, 260 imprinted genes have been reported in mice and 228 in humans.
In genetics, an enhancer is a short region of DNA that can be bound by proteins (activators) to increase the likelihood that transcription of a particular gene will occur. These proteins are usually referred to as transcription factors. Enhancers are cis-acting. They can be located up to 1 Mbp away from the gene, upstream or downstream from the start site. There are hundreds of thousands of enhancers in the human genome. They are found in both prokaryotes and eukaryotes.
Dosage compensation is the process by which organisms equalize the expression of genes between members of different biological sexes. Across species, different sexes are often characterized by different types and numbers of sex chromosomes. In order to neutralize the large difference in gene dosage produced by differing numbers of sex chromosomes among the sexes, various evolutionary branches have acquired various methods to equalize gene expression among the sexes. Because sex chromosomes contain different numbers of genes, different species of organisms have developed different mechanisms to cope with this inequality. Replicating the actual gene is impossible; thus organisms instead equalize the expression from each gene. For example, in humans, female (XX) cells randomly silence the transcription of one X chromosome, and transcribe all information from the other, expressed X chromosome. Thus, human females have the same number of expressed X-linked genes per cell as do human males (XY), both sexes having essentially one X chromosome per cell, from which to transcribe and express genes.
X-inactivation is a process by which one of the copies of the X chromosome is inactivated in therian female mammals. The inactive X chromosome is silenced by being packaged into a transcriptionally inactive structure called heterochromatin. As nearly all female mammals have two X chromosomes, X-inactivation prevents them from having twice as many X chromosome gene products as males, who only possess a single copy of the X chromosome.
In biology, the word gene can have several different meanings. The Mendelian gene is a basic unit of heredity and the molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and non-coding genes.
Allelic exclusion is a process by which only one allele of a gene is expressed while the other allele is silenced. This phenomenon is most notable for playing a role in the development of B lymphocytes, where allelic exclusion allows for each mature B lymphocyte to express only one type of immunoglobulin. This subsequently results in each B lymphocyte being able to recognize only one antigen. This is significant as the co-expression of both alleles in B lymphocytes is associated with autoimmunity and the production of autoantibodies.
Myc is a family of regulator genes and proto-oncogenes that code for transcription factors. The Myc family consists of three related human genes: c-myc (MYC), l-myc (MYCL), and n-myc (MYCN). c-myc was the first gene to be discovered in this family, due to homology with the viral gene v-myc.
H19 is a gene for a long noncoding RNA, found in humans and elsewhere. H19 has a role in the negative regulation of body weight and cell proliferation. This gene also has a role in the formation of some cancers and in the regulation of gene expression.
Xist is a non-coding RNA transcribed from the X chromosome of the placental mammals that acts as a major effector of the X-inactivation process. It is a component of the Xic – X-chromosome inactivation centre – along with two other RNA genes and two protein genes.
Paternally-expressed gene 3 protein is a protein that in humans is encoded by the PEG3 gene. PEG3 is an imprinted gene expressed exclusively from the paternal allele; it plays important roles in controlling fetal growth rates and nurturing behaviors and has potential roles in mammalian reproduction. PEG3 is a transcription factor that binds to DNA [11-13] via the sequence motif AGTnnCnnnTGGCT, which it binds using multiple Kruppel-like factors. It also regulates the expression of Pgm2l1 by binding this motif.
Long non-coding RNAs are a type of RNA, generally defined as transcripts more than 200 nucleotides that are not translated into protein. This arbitrary limit distinguishes long ncRNAs from small non-coding RNAs, such as microRNAs (miRNAs), small interfering RNAs (siRNAs), Piwi-interacting RNAs (piRNAs), small nucleolar RNAs (snoRNAs), and other short RNAs. Given that some lncRNAs have been reported to have the potential to encode small proteins or micro-peptides, the latest definition of lncRNA is a class of RNA molecules of over 200 nucleotides that have no or limited coding capacity. Long intervening/intergenic noncoding RNAs (lincRNAs) are sequences of lncRNA which do not overlap protein-coding genes.
TOX high mobility group box family member 3, also known as TOX3, is a human gene.
Maternal to zygotic transition (MZT), also known as embryonic genome activation, is the stage in embryonic development during which development comes under the exclusive control of the zygotic genome rather than the maternal (egg) genome. The egg contains stored maternal genetic material (mRNA), which controls embryo development until the onset of MZT. After MZT the diploid embryo takes over genetic control. This requires both zygotic genome activation (ZGA) and degradation of maternal products. This process is important because it is the first time that the new embryonic genome is utilized and the paternal and maternal genomes are used in combination. The zygotic genome now drives embryo development.
Cellular noise is random variability in quantities arising in cellular biology. For example, cells which are genetically identical, even within the same tissue, are often observed to have different expression levels of proteins, different sizes and structures. These apparently random differences can have important biological and medical consequences.
Genome instability refers to a high frequency of mutations within the genome of a cellular lineage. These mutations can include changes in nucleic acid sequences, chromosomal rearrangements or aneuploidy. Genome instability does occur in bacteria. In multicellular organisms genome instability is central to carcinogenesis, and in humans it is also a factor in some neurodegenerative diseases such as amyotrophic lateral sclerosis or the neuromuscular disease myotonic dystrophy.
Tsix is a non-coding RNA gene that is antisense to the Xist RNA. Tsix binds Xist during X chromosome inactivation. The name Tsix comes from the reverse of Xist, which stands for X-inactive specific transcript.
Epigenetics of human development is the study of how epigenetics affects human development.
X chromosome inactivation (XCI) is a phenomenon that was selected during evolution to balance X-linked gene dosage between XX females and XY males.
The human epigenome is the complete set of structural modifications of chromatin and chemical modifications of histones and nucleotides. These modifications affect gene expression according to cell type and developmental status. Various studies show that the epigenome depends on exogenous factors.
X chromosome reactivation (XCR) is the process by which the inactive X chromosome (the Xi) is re-activated in the cells of eutherian female mammals. Therian female mammalian cells have two X chromosomes, while males have only one, requiring X-chromosome inactivation (XCI) for sex-chromosome dosage compensation. In eutherians, XCI is the random inactivation of one of the X chromosomes, silencing its expression. Much of the current scientific knowledge about XCR comes from research limited to mouse models or stem cells.