Translation complex profile sequencing (TCP-seq) is a molecular biology method for obtaining snapshots of the momentary distribution of protein synthesis complexes along messenger RNA (mRNA) chains. [1]
Expression of the genetic code in all life forms consists of two major processes: transcription, in which the genetic code recorded in DNA is copied into mRNA, and translation (protein synthesis itself), in which the code copies in mRNA are decoded into the amino acid sequences of the respective proteins. Both transcription and translation are highly regulated processes that control essentially everything that happens in live cells (and, consequently, in multicellular organisms).
Control of translation is especially important in eukaryotic cells, where it forms part of the post-transcriptional regulatory networks of gene expression. This additional functionality is reflected in the increased complexity of the translation process, which makes it difficult to investigate. Yet details on when and which mRNAs are translated, and on the mechanisms responsible for this control, are key to understanding normal and pathological cell function. TCP-seq can be used to obtain this information.
With the advent of high-throughput DNA and RNA sequencing methods (such as Illumina sequencing), it became possible to efficiently analyse the nucleotide sequences of large numbers of relatively short DNA and RNA fragments. Sequences of these fragments can be superimposed to reconstruct the source. Alternatively, if the source sequence is already known, the fragments can be found within it (“mapped”) and their individual numbers counted. Thus, if there is an initial stage at which the fragments are differentially present or selected (“enriched”), this approach can be used to describe that stage quantitatively, even over a very large number or length of input sequences, most usually encompassing the entire DNA or RNA of the cell.
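The mapping-and-counting idea described above can be illustrated with a minimal sketch. It assumes exact matches only and a toy reference sequence; the function name `map_fragments` and all sequences are hypothetical, and real analyses rely on dedicated read aligners that tolerate mismatches and handle ambiguous matches more carefully.

```python
from collections import Counter


def map_fragments(reference, fragments):
    """Count fragments by their start position in a known reference.

    Exact matching only (an assumption made for illustration). A fragment
    that occurs at several places is counted once at each place.
    """
    counts = Counter()
    for frag in fragments:
        start = reference.find(frag)
        while start != -1:
            counts[start] += 1
            start = reference.find(frag, start + 1)
    return counts


# Toy data, invented for this example.
reference = "AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG"
fragments = ["GCCAUUGUA", "GCCAUUGUA", "GGGCCGCUG", "CCCGAUAG"]

start_counts = map_fragments(reference, fragments)
for position in sorted(start_counts):
    print(position, start_counts[position])
```

Positions with higher counts correspond to fragments that were more abundant (or more strongly enriched) in the input.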
TCP-seq builds on these capabilities of high-throughput RNA sequencing and additionally exploits the phenomenon of nucleic acid protection. Protection manifests as resistance to depolymerisation or modification of stretches of nucleic acid (particularly RNA) that are tightly bound to or engulfed by other biomolecules, which thus leave their “footprints” on the nucleic acid strand. These “footprint” fragments therefore mark the locations on the nucleic acid chain where the interaction occurs. By sequencing the fragments and mapping them back to the source sequence, it is possible to identify precisely the locations and counts of these intermolecular contacts.
In the case of TCP-seq, ribosomes and ribosomal subunits engaged in interaction with mRNA are first rapidly crosslinked to it chemically with formaldehyde, to preserve the existing state of interactions (a “snapshot” of the distribution) and to block any possible non-equilibrium processes. The crosslinking can be performed directly in live cells, but is not restricted to them. The RNA is then partially degraded (e.g. with ribonuclease) so that only fragments protected by the ribosomes or ribosomal subunits are left. The protected fragments are then purified according to the sedimentation dynamics of the attached ribosomes or ribosomal subunits, de-blocked, sequenced and mapped to the source transcriptome, giving the original locations of the translation complexes on mRNA.
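Once footprints from each purified fraction have been mapped, their distribution along a transcript can be summarised as per-nucleotide coverage. The following is a minimal sketch under stated assumptions: footprints are given as hypothetical (start, length) pairs on a single transcript, and the fraction labels and coordinates are invented for illustration rather than taken from the published method.

```python
def footprint_coverage(transcript_length, footprints):
    """Return per-nucleotide coverage for (start, length) footprints.

    Coordinates are 0-based; footprints running past the transcript end
    are truncated (an assumption of this sketch).
    """
    coverage = [0] * transcript_length
    for start, length in footprints:
        end = min(start + length, transcript_length)
        for position in range(max(start, 0), end):
            coverage[position] += 1
    return coverage


# Hypothetical mapped footprints, grouped by sedimentation fraction.
fractions = {
    "small_subunit": [(0, 5), (2, 5)],
    "full_ribosome": [(10, 6)],
}

profiles = {
    name: footprint_coverage(20, footprints)
    for name, footprints in fractions.items()
}
for name, profile in profiles.items():
    print(name, profile)
```

Keeping a separate profile per fraction preserves the information about which type of complex protected each region of the mRNA.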
TCP-seq merges several elements typical of other transcriptome-wide analyses of its kind. In particular, polysome profiling [2] [3] and ribosome (translation) profiling [4] are also used to identify, respectively, mRNAs involved in polysome formation and the locations of elongating ribosomes over the coding regions of transcripts. These methods, however, do not use chemical stabilisation of translation complexes or purification of covalently bound intermediates from live cells. TCP-seq can thus be considered more of a functional equivalent of ChIP-seq and similar methods for investigating momentary interactions of DNA, redesigned to be applicable to translation.
The advantages of the method include:
The disadvantages include:
The method is still under development and has been applied to investigate translation dynamics in live yeast cells; it extends, rather than simply combines, the capabilities of the previous techniques. [1] The only other transcriptome-wide method for mapping ribosome positions on mRNA with nucleotide precision is ribosome (translation) profiling. However, it captures the positions of elongating ribosomes only, and most of the dynamic and functionally important intermediates of translation at the initiation stage are not detected.
TCP-seq was designed specifically to target these blind spots. It can provide essentially the same level of detail for the elongation phase as ribosome (translation) profiling, but it also records initiation, termination and recycling intermediates of protein synthesis that previously remained out of reach (and, in principle, any other translation complexes, as long as the ribosome or its subunits contact and protect the mRNA). TCP-seq therefore provides a single approach for complete insight into the translation process of a biological sample. This aspect of the method can be expected to be developed further, as the dynamics of ribosomal scanning on mRNA during translation initiation are largely unknown for most forms of life. The current TCP-seq dataset for translation initiation is available for the yeast Saccharomyces cerevisiae, [5] [6] and is likely to be extended to other organisms in the future.