Cis-natural antisense transcript

Natural antisense transcripts (NATs) are a group of RNAs encoded within a cell that have transcript complementarity to other RNA transcripts. [1] They have been identified in multiple eukaryotes, including humans, mice, yeast and Arabidopsis thaliana. [2] This class of RNAs includes both protein-coding and non-coding RNAs. [3] Current evidence suggests a variety of regulatory roles for NATs, such as RNA interference (RNAi), alternative splicing, genomic imprinting, and X-chromosome inactivation. [4] NATs are broadly grouped into two categories based on whether they act in cis or in trans. [5] Trans-NATs are transcribed from a different location than their targets and usually have complementarity to multiple transcripts with some mismatches. [6] MicroRNAs (miRNAs) are an example of trans-NATs that can target multiple transcripts with a few mismatches. [6] Cis-natural antisense transcripts (cis-NATs), on the other hand, are transcribed from the same genomic locus as their target but from the opposite DNA strand, and form perfect pairs with it. [7]

Orientation

Figure 1: Orientations of cis-NATs within the genome

Cis-NATs have a variety of orientations and differing lengths of overlap between pairs. [7] Five orientations have been identified for cis-NATs to date. [8] The most common orientation is head-to-head, where the 5' ends of the two transcripts overlap. [3] This orientation would result in the greatest knockdown of gene expression if transcriptional collision is the reason for transcript inhibition. [1] Some studies, however, have suggested that tail-to-tail orientations are the most common NAT pairs. [1] The other orientations, namely tail-to-tail, completely overlapping, nearby head-to-head, and nearby tail-to-tail, are less frequently encountered. [1] In completely overlapping NATs, one gene lies entirely within the span of its antisense partner. [3] In the nearby head-to-head and nearby tail-to-tail orientations, the two transcripts do not physically overlap but lie very close to each other. [1] Current evidence suggests that NAT pairs are overrepresented among genes that have catalytic activity. [3] Some property of these genes may make them more prone to this type of regulation.
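The five orientations described above can be told apart from genomic coordinates alone. The following Python sketch is illustrative only: the `Transcript` record, the 0-based half-open coordinate convention, and the `nearby_max_gap` cut-off of 1000 bp are assumptions made for the example and are not taken from the cited studies. Both transcripts are assumed to lie on the same chromosome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record: genomic span of one transcript."""
    name: str
    start: int   # 0-based, inclusive (leftmost genomic position)
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def classify_orientation(a, b, nearby_max_gap=1000):
    """Classify two transcripts on the same chromosome.

    nearby_max_gap is an assumed threshold for calling
    non-overlapping transcripts "nearby".
    """
    if a.strand == b.strand:
        return "same strand"
    plus, minus = (a, b) if a.strand == "+" else (b, a)

    # Positive value: shared bases. Zero or negative: size of the gap.
    overlap = min(plus.end, minus.end) - max(plus.start, minus.start)

    if overlap > 0:
        plus_contains = plus.start <= minus.start and minus.end <= plus.end
        minus_contains = minus.start <= plus.start and plus.end <= minus.end
        if plus_contains or minus_contains:
            return "fully overlapping"
        # The 3' end of a "+" transcript is its right edge; the 3' end of a
        # "-" transcript is its left edge. If the "+" transcript starts
        # first, the overlap covers both 3' ends (tail-to-tail).
        return "tail-to-tail" if plus.start < minus.start else "head-to-head"

    if -overlap > nearby_max_gap:
        return "not paired"
    # No overlap: "+" lying to the left means the 3' ends face each other.
    if plus.end <= minus.start:
        return "nearby tail-to-tail"
    return "nearby head-to-head"


sense = Transcript("geneA", 100, 500, "+")
antisense = Transcript("geneA-AS", 400, 900, "-")
print(classify_orientation(sense, antisense))  # tail-to-tail
```

Real annotation pipelines usually compare exons rather than whole transcript spans, so this span-based version should be read as a simplification.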

Identification approach

Identification of NATs in whole genomes is possible due to the large collection of sequence data available from multiple organisms. In silico methods for detecting NATs suffer from several shortcomings depending on the source of sequence information. [7] Studies that use mRNA work with sequences whose orientations are known, but the amount of mRNA sequence information available is small. [3] Gene models predicted by algorithms trained to look for genes give increased coverage of the genome at the cost of confidence in the identified genes. [7] Another resource is the extensive expressed sequence tag (EST) libraries, but these short sequences must first be assigned an orientation before useful information can be extracted from them. [3] Some studies have used sequence features in the ESTs, such as the poly(A) signal, poly(A) tail, and splice sites, both to filter the ESTs and to give them the correct transcriptional orientation. [1] Combining the different sequence sources is an attempt to maximize coverage while maintaining the integrity of the data.
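As a rough illustration of how a poly(A) tail and poly(A) signal can be used to orient an EST, the sketch below checks whether a read, or its reverse complement, ends in a run of adenines preceded by the canonical AATAAA signal. The function names and the thresholds `min_tail` and `signal_window` are hypothetical values chosen for the example, not parameters reported by the cited studies, and splice-site evidence is not considered.

```python
POLYA_SIGNAL = "AATAAA"  # canonical poly(A) signal hexamer
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def infer_est_orientation(seq, min_tail=10, signal_window=40):
    """Guess whether an EST reads in the sense ("forward") or
    antisense ("reverse") direction of its transcript.

    min_tail and signal_window are assumed cut-offs for this sketch.
    Returns "unknown" when neither reading shows both features.
    """
    seq = seq.upper()
    readings = (("forward", seq), ("reverse", reverse_complement(seq)))
    for label, candidate in readings:
        body = candidate.rstrip("A")
        tail_length = len(candidate) - len(body)
        if tail_length < min_tail:
            continue
        # Require the signal within the last signal_window bases
        # upstream of the poly(A) tail.
        if POLYA_SIGNAL in body[-signal_window:]:
            return label
    return "unknown"


est = "GATTACAGGC" + "AATAAA" + "CTGACTGACTGACTG" + "A" * 12
print(infer_est_orientation(est))                      # forward
print(infer_est_orientation(reverse_complement(est)))  # reverse
```

ESTs that return "unknown" would be dropped or oriented by other evidence, which mirrors the filtering role these features play in the text above.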

Pairs of NATs are identified when they form overlapping clusters. The cut-off values used vary between studies, but generally ~20 nucleotides of sequence overlap is considered the minimum for transcripts to be considered an overlapping cluster. [1] In addition, a transcript must map to only one other mRNA molecule for the two to be considered a NAT pair. [1] [7] A variety of web and software resources can currently be used to look for antisense pairs. NATsdb, the Natural Antisense Transcript database, is a rich tool for searching for antisense pairs from multiple organisms.
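The pairing criteria above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any cited study or of NATsdb: the `Transcript` record and function names are hypothetical, overlap is measured on whole transcript spans rather than exons, and the "only one partner" rule is interpreted as discarding any transcript that overlaps more than one opposite-strand transcript.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Transcript(NamedTuple):
    """Hypothetical record; coordinates are 0-based, half-open."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def overlap_length(a, b):
    """Number of shared genomic positions between two spans."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def find_cis_nat_pairs(transcripts, min_overlap=20):
    """Return (name, name) pairs of opposite-strand transcripts on the
    same chromosome that overlap by at least min_overlap nucleotides
    and each have exactly one such partner."""
    candidates = [
        (a, b)
        for a, b in combinations(transcripts, 2)
        if a.chrom == b.chrom
        and a.strand != b.strand
        and overlap_length(a, b) >= min_overlap
    ]

    # Count how many antisense partners each transcript has.
    partner_count = defaultdict(int)
    for a, b in candidates:
        partner_count[a.name] += 1
        partner_count[b.name] += 1

    return [
        (a.name, b.name)
        for a, b in candidates
        if partner_count[a.name] == 1 and partner_count[b.name] == 1
    ]


transcripts = [
    Transcript("A", "chr1", 100, 500, "+"),
    Transcript("B", "chr1", 450, 900, "-"),    # overlaps A by 50 nt
    Transcript("C", "chr1", 890, 1500, "+"),   # overlaps B by only 10 nt
    Transcript("E", "chr1", 2000, 3000, "+"),
    Transcript("F", "chr1", 2500, 2600, "-"),  # E has two partners: F and G
    Transcript("G", "chr1", 2900, 3500, "-"),
]
print(find_cis_nat_pairs(transcripts))  # [('A', 'B')]
```

The all-against-all comparison is quadratic in the number of transcripts; a genome-scale analysis would normally sort by position or use an interval index instead.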

Mechanisms

Transcription collision model for expression inhibition

The molecular mechanisms behind the regulatory role of cis-NATs are not currently well understood. [3] Three models have been proposed to explain the regulatory effects that cis-NATs have on gene expression. The first model proposes that base pairing between the cis-NAT and its complementary transcript results in a knockdown of mRNA expression. [9] This model assumes a precise alignment of at least 6 base pairs between the members of the cis-NAT pair to form double-stranded RNA. [1] Epigenetic modifications such as DNA methylation and post-translational modification of core histones form the basis of the second model. [1] Although it is not yet clearly understood, the antisense transcript is thought to guide methylation complexes and/or histone-modifying complexes to the promoter region of the sense transcript, inhibiting expression of the gene. [1] It is not currently known which attributes of cis-NATs are crucial for the epigenetic model of regulation. [1] The final proposed model, which has gained favour due to recent experimental evidence, is the transcriptional collision model. During transcription of cis-NATs, the transcriptional complexes assemble in the promoter regions of the genes. RNA polymerases then begin transcribing each gene at its transcription initiation site, laying down nucleotides in a 5' to 3' direction. [6] In the areas of overlap between the cis-NATs, the RNA polymerases collide and stop at the crash site. [1] Transcription is inhibited because the RNA polymerases stop prematurely and their incomplete transcripts are degraded. [10]

Importance

Regulation of many biological processes, such as development and metabolism, requires careful co-ordination between many different genes; this is usually referred to as a gene regulatory network. A flurry of interest in gene regulatory networks has been sparked by the advent of sequenced genomes of multiple organisms. The next step is to use this information to work out how genes function together rather than in isolation. During mammalian development, the extra X-chromosome in females is inactivated. A NAT pair, Xist and Tsix, has been shown to be involved in the hypermethylation of the chromosome. [11] As many as 20–30% of mammalian genes have been shown to be targets of miRNAs, which highlights the importance of these molecules as regulators across a wide number of genes. [12] An evolutionary reason for using RNA to regulate genes may be that it is less costly and faster than synthesizing proteins not needed by the cell. [1] This type of transcriptional regulation could have conferred a selective advantage on early eukaryotes.

Disease

Figure 3: Aberrant transcription of antisense transcripts can result in inhibition of oncogenes and allow the cell to continue past cell cycle checkpoints. Putative new oncogenes and tumor suppressor genes can be found by looking for upregulated antisense transcripts in cancer cells.

Antisense transcription might contribute to disease through chromosomal changes that result in the production of aberrant antisense transcripts. [4] A documented case of cis-NAT involvement in human disease comes from an inherited form of α-thalassemia in which the hemoglobin α-2 gene is silenced through the action of a cis-NAT. [4] It is thought that activated transposable elements in malignant cancer cells create a large amount of transcriptional noise. [4] Aberrant antisense RNA transcripts resulting from this transcriptional noise are likely to cause stochastic methylation of CpG islands associated with oncogenes and tumor suppressor genes. [4] This inhibition would further the malignancy of the cells, since they lose key regulator genes. [4] By looking at upregulated antisense transcripts in tumor cells, researchers can search for more candidate tumor suppressor genes. [4] Aberrant cis-NATs have also been implicated in neurological diseases such as Parkinson's disease. [4]

References

  1. Osato N, Suzuki Y, Ikeo K, Gojobori T (2007). "Transcriptional Interferences in cis Natural Antisense Transcripts of Humans and Mice". Genetics. 176 (12): 1299–1306. doi:10.1534/genetics.106.069484. PMC 1894591. PMID 17409075.
  2. Vanhée-Brossollet C, Vaquero C (1998). "Do natural antisense transcripts make sense in eukaryotes?". Gene. 211 (1): 1–9. doi:10.1016/S0378-1119(98)00093-6. PMID 9573333.
  3. Lavorgna G, Dahary D, Lehner B, Sorek R, Sanderson CM, Casari G (2004). "In search of antisense". Trends Biochem. Sci. 29 (2): 88–94. doi:10.1016/j.tibs.2003.12.002. PMID 15102435.
  4. Zhang Y, Liu XS, Liu QR, Wei L (2006). "Genome-wide in silico identification and analysis of cis natural antisense transcripts (cis-NATs) in ten species". Nucleic Acids Res. 34 (12): 3465–3475. doi:10.1093/nar/gkl473. PMC 1524920. PMID 16849434.
  5. Chen J, Sun M, Kent WJ, Huang X, Xie H, Wang W, Zhou G, Shi RZ, Rowley JD (2004). "Over 20% of human transcripts might form sense–antisense pairs". Nucleic Acids Res. 32 (16): 4812–4820. doi:10.1093/nar/gkh818. PMC 519112. PMID 15356298.
  6. Carmichael GG (2003). "Antisense starts making more sense". Nat Biotechnol. 21 (4): 371–372. doi:10.1038/nbt0403-371. PMID 12665819. S2CID 3137487.
  7. Wang XJ, Gaasterland T, Chua NH (2005). "Genome-wide prediction and identification of cis-natural antisense transcripts in Arabidopsis thaliana". Genome Biol. 6 (4): R30. doi:10.1186/gb-2005-6-4-r30. PMC 1088958. PMID 15833117.
  8. Fahey ME, Moore TF, Higgins DG (2002). "Overlapping Antisense Transcription in the Human Genome". Comparative and Functional Genomics. 3 (3): 244–253. doi:10.1002/cfg.173. PMC 2447278. PMID 18628857.
  9. Borsani O, Zhu J, Verslues PE, Sunkar R, Zhu JK (2005). "Endogenous siRNAs Derived from a Pair of Natural cis-Antisense Transcripts Regulate Salt Tolerance in Arabidopsis". Cell. 123 (7): 1279–1291. doi:10.1016/j.cell.2005.11.035. PMC 3137516. PMID 16377568.
  10. Røsok O, Sioud M (2005). "Systematic search for natural antisense transcripts in eukaryotes (review)". Int J Mol Med. 15 (2): 197–203. doi:10.3892/ijmm.15.2.197. PMID 15647831.
  11. Li YY, Qin L, Guo ZM, Liu L, Xu H, Hao P, Su J, Shi Y, He WZ, Li YX (2006). "In silico discovery of human natural antisense transcripts". BMC Bioinformatics. 7: 18. doi:10.1186/1471-2105-7-18. PMC 1369008. PMID 16409644.
  12. Lehner B, Williams G, Campbell RD, Sanderson CM (2002). "Antisense transcripts in the human genome". Trends Genet. 18 (2): 63–65. doi:10.1016/S0168-9525(02)02598-2. PMID 11818131.