In molecular biology, a stop codon (or termination codon) is a codon (nucleotide triplet within messenger RNA) that signals the termination of the translation process of the current protein. [1] Most codons in messenger RNA correspond to the addition of an amino acid to a growing polypeptide chain, which may ultimately become a protein; stop codons signal the termination of this process by binding release factors, which cause the ribosomal subunits to dissociate, releasing the amino acid chain.
While start codons need nearby sequences or initiation factors to start translation, a stop codon alone is sufficient to initiate termination.
In the standard genetic code, there are three different termination codons:
Codon (DNA) | Codon (RNA) | Standard code (Translation table 1) | Name
---|---|---|---
TAG | UAG | STOP = Ter (*) | "amber"
TAA | UAA | STOP = Ter (*) | "ochre"
TGA | UGA | STOP = Ter (*) | "opal" (or "umber")
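The table above can be turned into a small lookup. The following is a minimal sketch, not a full translation routine: the function name `first_stop` and the example sequences are illustrative assumptions, and the input is assumed to be a coding-strand DNA or mRNA string.

```python
# Stop codons of the standard genetic code (translation table 1),
# written in the DNA alphabet and mapped to their historical names.
STANDARD_STOPS = {"TAG": "amber", "TAA": "ochre", "TGA": "opal"}


def first_stop(seq, frame=0):
    """Return (index, codon, name) of the first in-frame stop codon, or None.

    `frame` is the 0-based offset at which reading starts (hypothetical
    parameter name, chosen for this sketch).
    """
    dna = seq.upper().replace("U", "T")  # accept RNA input as well
    # Step through the sequence three nucleotides at a time.
    for i in range(frame, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STANDARD_STOPS:
            return i, codon, STANDARD_STOPS[codon]
    return None


print(first_stop("ATGGCCTGGTAAGGC"))  # (9, 'TAA', 'ochre')
print(first_stop("AUGGCCUGA"))        # (6, 'TGA', 'opal')
```

Only triplets in the chosen frame are inspected, so stop-like triplets in the other two frames are ignored.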
There are variations on the standard genetic code, and alternative stop codons have been found in the mitochondrial genomes of vertebrates, [2] Scenedesmus obliquus, [3] and Thraustochytrium. [4]
Genetic code | Translation table | Codon (DNA) | Codon (RNA) | Translation with this code | Standard translation
---|---|---|---|---|---
Vertebrate mitochondrial | 2 | AGA | AGA | STOP = Ter (*) | Arg (R)
Vertebrate mitochondrial | 2 | AGG | AGG | STOP = Ter (*) | Arg (R)
Scenedesmus obliquus mitochondrial | 22 | TCA | UCA | STOP = Ter (*) | Ser (S)
Thraustochytrium mitochondrial | 23 | TTA | UUA | STOP = Ter (*) | Leu (L)
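As a rough illustration, the reassignments listed above can be stored per translation table. This sketch encodes only the codons named in the table; it is not the complete stop-codon set of each code, and the names `REASSIGNED_TO_STOP` and `standard_meaning_if_reassigned` are hypothetical.

```python
# Codons that the variant mitochondrial codes above read as stop, keyed by
# translation-table number. Values give the standard-code amino acid.
# Assumption: only the reassignments named in the text are included.
REASSIGNED_TO_STOP = {
    2: {"AGA": "Arg", "AGG": "Arg"},  # vertebrate mitochondrial
    22: {"TCA": "Ser"},               # Scenedesmus obliquus mitochondrial
    23: {"TTA": "Leu"},               # Thraustochytrium mitochondrial
}


def standard_meaning_if_reassigned(codon, table):
    """Return the standard amino acid if `codon` is a reassigned stop in
    translation table `table`, otherwise None."""
    dna = codon.upper().replace("U", "T")  # accept RNA codons as well
    return REASSIGNED_TO_STOP.get(table, {}).get(dna)


print(standard_meaning_if_reassigned("AGA", 2))   # Arg
print(standard_meaning_if_reassigned("UCA", 22))  # Ser
print(standard_meaning_if_reassigned("AGA", 1))   # None
```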
The nuclear genetic code is flexible as illustrated by variant genetic codes that reassign standard stop codons to amino acids. [5]
Genetic code | Translation table | Codon (DNA) | Codon (RNA) | Conditional translation | Standard translation
---|---|---|---|---|---
Karyorelict nuclear | 27 | TGA | UGA | Ter (*) or Trp (W) | Ter (*)
Condylostoma nuclear | 28 | TAA | UAA | Ter (*) or Gln (Q) | Ter (*)
Condylostoma nuclear | 28 | TAG | UAG | Ter (*) or Gln (Q) | Ter (*)
Condylostoma nuclear | 28 | TGA | UGA | Ter (*) or Trp (W) | Ter (*)
Blastocrithidia nuclear | 31 | TAA | UAA | Ter (*) or Glu (E) | Ter (*)
Blastocrithidia nuclear | 31 | TAG | UAG | Ter (*) or Glu (E) | Ter (*)
In 1986, convincing evidence was provided that selenocysteine (Sec) was incorporated co-translationally. Moreover, the codon partially directing its incorporation in the polypeptide chain was identified as UGA, also known as the opal termination codon. [6] Different mechanisms for overriding the termination function of this codon have been identified in prokaryotes and in eukaryotes. [7] A particular difference between these groups is that cis elements seem restricted to the neighborhood of the UGA codon in prokaryotes, while in eukaryotes this restriction is not present; instead, such locations seem disfavored, albeit not prohibited. [8]
In 2003, a landmark paper described the identification of all known selenoproteins in humans: 25 in total. [9] Similar analyses have been run for other organisms.
The UAG codon can translate into pyrrolysine (Pyl) in a similar manner.
The distribution of stop codons within the genome of an organism is non-random and can correlate with GC-content. [10] [11] For example, the E. coli K-12 genome contains 2705 TAA (63%), 1257 TGA (29%), and 326 TAG (8%) stop codons (GC content 50.8%). [12] Also, the substrates for the stop-codon release factors, release factor 1 and release factor 2, are strongly correlated to the abundance of stop codons. [11] A large-scale study of bacteria with a broad range of GC-contents shows that, while the frequency of TAA is negatively correlated with GC-content and the frequency of TGA is positively correlated with GC-content, the frequency of the TAG stop codon, which is often the least used stop codon in a genome, is not influenced by GC-content. [13]
Recognition of stop codons in bacteria has been associated with the so-called 'tripeptide anticodon', [14] a highly conserved amino acid motif in RF1 (PxT) and RF2 (SPF). Even though this is supported by structural studies, it was shown that the tripeptide anticodon hypothesis is an oversimplification. [15]
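The percentages quoted for E. coli K-12 follow directly from the counts. A minimal sketch of the arithmetic, with the hypothetical helper name `stop_codon_shares`:

```python
# Stop-codon counts for the E. coli K-12 genome, as quoted in the text.
counts = {"TAA": 2705, "TGA": 1257, "TAG": 326}


def stop_codon_shares(counts):
    """Return each stop codon's share of all stop codons, as a whole-number
    percentage."""
    total = sum(counts.values())  # 4288 annotated stop codons in this example
    return {codon: round(100 * n / total) for codon, n in counts.items()}


print(stop_codon_shares(counts))  # {'TAA': 63, 'TGA': 29, 'TAG': 8}
```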
Stop codons were historically given many different names, as they each corresponded to a distinct class of mutants that all behaved in a similar manner. These mutants were first isolated within bacteriophages (T4 and lambda), viruses that infect the bacterium Escherichia coli. Mutations in viral genes weakened their infectious ability, sometimes creating viruses that were able to infect and grow within only certain varieties of E. coli.
Amber mutations (UAG) were the first set of nonsense mutations to be discovered, isolated by Richard H. Epstein and Charles Steinberg and named after their friend, Caltech graduate student Harris Bernstein, whose last name means "amber" in German (cf. Bernstein). [16] [17] [18]
Viruses with amber mutations are characterized by their ability to infect only certain strains of bacteria, known as amber suppressors. These bacteria carry their own mutation that allows a recovery of function in the mutant viruses. For example, a mutation in the tRNA that recognizes the amber stop codon allows translation to "read through" the codon and produce a full-length protein, thereby recovering the normal form of the protein and "suppressing" the amber mutation. [19] Thus, amber mutants are an entire class of virus mutants that can grow in bacteria that contain amber suppressor mutations. Similar suppressors are known for ochre and opal stop codons as well.
tRNA molecules carrying unnatural amino acids have been designed to recognize the amber stop codon in bacterial RNA. This technology allows for incorporation of orthogonal amino acids (such as p-azidophenylalanine) at specific locations of the target protein.
The ochre mutation (UAA) was the second stop codon mutation to be discovered. Reminiscent of the usual yellow-orange-brown color associated with amber, this second stop codon was given the name of "ochre", an orange-reddish-brown mineral pigment. [17]
Ochre mutant viruses had a property similar to amber mutants in that they recovered infectious ability within certain suppressor strains of bacteria. The set of ochre suppressors was distinct from amber suppressors, so ochre mutants were inferred to correspond to a different nucleotide triplet. Through a series of mutation experiments comparing these mutants with each other and other known amino acid codons, Sydney Brenner concluded that the amber and ochre mutations corresponded to the nucleotide triplets "UAG" and "UAA". [20]
The third and last stop codon in the standard genetic code was discovered soon after, and corresponds to the nucleotide triplet "UGA". [21]
To continue matching with the theme of colored minerals, the third nonsense codon came to be known as "opal", which is a type of silica showing a variety of colors. [17] Nonsense mutations that created this premature stop codon were later called opal mutations or umber mutations.
Nonsense mutations are changes in DNA sequence that introduce a premature stop codon, causing any resulting protein to be abnormally shortened. This often causes a loss of function in the protein, as critical parts of the amino acid chain are no longer assembled. Because of this terminology, stop codons have also been referred to as nonsense codons.
A nonstop mutation, also called a stop-loss variant, is a point mutation that occurs within a stop codon. Nonstop mutations cause the continued translation of an mRNA strand into what should be an untranslated region. Most polypeptides resulting from a gene with a nonstop mutation lose their function due to their extreme length and the impact on normal folding. Nonstop mutations differ from nonsense mutations in that they do not create a stop codon but, instead, delete one. Nonstop mutations also differ from missense mutations, which are point mutations where a single nucleotide is changed to cause replacement by a different amino acid. Nonstop mutations have been linked with many inherited diseases including endocrine disorders, [22] eye disease, [23] and neurodevelopmental disorders. [24] [25]
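The distinctions drawn above between nonsense, nonstop, and missense mutations can be expressed as a small classifier over single-nucleotide codon changes. This is a sketch under stated assumptions: it uses the standard genetic code only, considers substitutions within one codon, and the function name `classify_substitution` is hypothetical.

```python
from itertools import product

# Standard genetic code (translation table 1). Codons are enumerated in
# T, C, A, G order at each position; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def classify_substitution(ref, alt):
    """Classify a single-nucleotide change from codon `ref` to codon `alt`."""
    ref = ref.upper().replace("U", "T")
    alt = alt.upper().replace("U", "T")
    # A point substitution changes exactly one of the three positions.
    if sum(a != b for a, b in zip(ref, alt)) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref], CODON_TABLE[alt]
    if ref_aa == alt_aa:
        return "synonymous"  # includes stop -> stop, e.g. TAA -> TAG
    if alt_aa == "*":
        return "nonsense"    # sense codon becomes a premature stop
    if ref_aa == "*":
        return "nonstop"     # stop codon is lost; translation continues
    return "missense"        # one amino acid replaced by another


print(classify_substitution("TGG", "TGA"))  # nonsense (Trp -> stop)
print(classify_substitution("TAA", "CAA"))  # nonstop  (stop -> Gln)
print(classify_substitution("GAA", "GTA"))  # missense (Glu -> Val)
```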
Hidden stops are non-stop codons that would be read as stop codons if they were frameshifted +1 or −1. These prematurely terminate translation if the corresponding frame-shift (such as due to a ribosomal RNA slip) occurs before the hidden stop. It is hypothesised that this decreases resource wastage on nonfunctional proteins and the production of potential cytotoxins. Researchers at Louisiana State University propose the ambush hypothesis, that hidden stops are selected for. Codons that can form hidden stops are used in genomes more frequently compared to synonymous codons that would otherwise code for the same amino acid. Unstable rRNA in an organism correlates with a higher frequency of hidden stops. [26] However, this hypothesis could not be validated with a larger data set. [27]
Stop codons and hidden stops together are collectively referred to as stop-signals. Researchers at the University of Memphis found that the ratios of the stop-signals on the three reading frames of a genome (referred to as translation stop-signals ratio or TSSR) of genetically related bacteria, despite their great differences in gene contents, are much alike. This nearly identical genomic-TSSR value of genetically related bacteria may suggest that bacterial genome expansion is limited by their unique stop-signals bias of that bacterial species. [28]
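Hidden stops can be located by scanning a coding sequence for stop triplets that sit off-frame. The sketch below assumes the sequence starts in the normal reading frame (offset 0), treats a +1 shift as reading from offset 1 and a −1 shift as reading from offset 2, and uses the standard stop codons; the function name `hidden_stops` is hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}  # standard-code stop codons, DNA alphabet


def hidden_stops(cds):
    """Return start positions of off-frame stop triplets in `cds`.

    Keys "+1" and "-1" name the frameshift that would expose each stop.
    In-frame stops (offset 0) are ordinary stop codons and are not reported.
    """
    dna = cds.upper().replace("U", "T")  # accept RNA input as well
    found = {"+1": [], "-1": []}
    for i in range(len(dna) - 2):
        if dna[i:i + 3] in STOPS:
            if i % 3 == 1:
                found["+1"].append(i)
            elif i % 3 == 2:
                found["-1"].append(i)
    return found


# ATG CTG AAA TAG: the TGA spanning codons 2 and 3 is exposed by a +1 shift.
print(hidden_stops("ATGCTGAAATAG"))  # {'+1': [4], '-1': []}
# ATG CCT AAG: the TAA spanning codons 2 and 3 is exposed by a -1 shift.
print(hidden_stops("ATGCCTAAG"))     # {'+1': [], '-1': [5]}
```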
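One way to picture a per-frame stop-signal ratio is to count stop triplets at each of the three offsets of a sequence and normalise the counts. This is only an illustrative sketch: the text does not give the published TSSR formula, so the normalisation, the single-strand scan, and the name `stop_signal_fractions` are all assumptions.

```python
STOP_TRIPLETS = {"TAA", "TAG", "TGA"}  # standard-code stop codons


def stop_signal_fractions(seq):
    """Return the fraction of stop triplets found at offsets 0, 1 and 2.

    Assumption: fractions are taken over all stop triplets on the given
    strand; the published TSSR may be defined differently.
    """
    dna = seq.upper().replace("U", "T")
    counts = [0, 0, 0]
    for i in range(len(dna) - 2):
        if dna[i:i + 3] in STOP_TRIPLETS:
            counts[i % 3] += 1
    total = sum(counts)
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(c / total for c in counts)


# One in-frame stop (TAG at 9) and one +1 hidden stop (TGA at 4).
print(stop_signal_fractions("ATGCTGAAATAG"))  # (0.5, 0.5, 0.0)
```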
Stop codon suppression or translational readthrough occurs when in translation a stop codon is interpreted as a sense codon, that is, when a (standard) amino acid is 'encoded' by the stop codon. Mutated tRNAs can be the cause of readthrough, as can certain nucleotide motifs close to the stop codon. Translational readthrough is very common in viruses and bacteria, and has also been found as a gene regulatory principle in humans, yeasts, bacteria and Drosophila. [29] [30] This kind of endogenous translational readthrough constitutes a variation of the genetic code, because a stop codon codes for an amino acid. In the case of human malate dehydrogenase, the stop codon is read through with a frequency of about 4%. [31] The amino acid inserted at the stop codon depends on the identity of the stop codon itself: Gln, Tyr, and Lys have been found for the UAA and UAG codons, while Cys, Trp, and Arg have been identified for the UGA codon by mass spectrometry. [32] The extent of readthrough in mammals varies widely, and readthrough can broadly diversify the proteome and affect cancer progression. [33]
In 2010, when Craig Venter unveiled the first fully functioning, reproducing cell controlled by synthetic DNA, he described how his team used frequent stop codons to create watermarks in RNA and DNA to help confirm the results were indeed synthetic (and not contaminated or otherwise), using them to encode authors' names and website addresses. [34]
The genetic code is the set of rules used by living cells to translate information encoded within genetic material into proteins. Translation is accomplished by the ribosome, which links proteinogenic amino acids in an order specified by messenger RNA (mRNA), using transfer RNA (tRNA) molecules to carry amino acids and to read the mRNA three nucleotides at a time. The genetic code is highly similar among all organisms and can be expressed in a simple table with 64 entries.
In biology, translation is the process in living cells in which proteins are produced using RNA molecules as templates. The generated protein is a sequence of amino acids. This sequence is determined by the sequence of nucleotides in the RNA. The nucleotides are considered three at a time. Each such triple results in addition of one specific amino acid to the protein being generated. The matching from nucleotide triple to amino acid is called the genetic code. The translation is performed by a large complex of functional RNA and proteins called ribosomes. The entire process is called gene expression.
In molecular biology, a reading frame is a way of dividing the sequence of nucleotides in a nucleic acid molecule into a set of consecutive, non-overlapping triplets. Where these triplets equate to amino acids or stop signals during translation, they are called codons.
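The three forward reading frames of a sequence can be sketched as three ways of slicing the same string into triplets. The function name `codons_in_frame` and the example sequence are illustrative assumptions.

```python
def codons_in_frame(seq, frame=0):
    """Split `seq` into consecutive, non-overlapping triplets starting at
    0-based offset `frame` (0, 1 or 2). Trailing bases that do not fill a
    complete triplet are dropped."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


mrna = "AUGGCCUAA"
print(codons_in_frame(mrna, 0))  # ['AUG', 'GCC', 'UAA']
print(codons_in_frame(mrna, 1))  # ['UGG', 'CCU']
print(codons_in_frame(mrna, 2))  # ['GGC', 'CUA']
```

Only frame 0 of this example contains a stop codon (UAA); the other two frames read entirely different triplets from the same nucleotides.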
A frameshift mutation is a genetic mutation caused by indels of a number of nucleotides in a DNA sequence that is not divisible by three. Due to the triplet nature of gene expression by codons, the insertion or deletion can change the reading frame, resulting in a completely different translation from the original. The earlier in the sequence the deletion or insertion occurs, the more altered the protein. A frameshift mutation is not the same as a single-nucleotide polymorphism in which a nucleotide is replaced, rather than inserted or deleted. A frameshift mutation will in general cause the reading of the codons after the mutation to code for different amino acids. The frameshift mutation will also alter the first stop codon encountered in the sequence. The polypeptide being created could be abnormally short or abnormally long, and will most likely not be functional.
A point mutation is a genetic mutation where a single nucleotide base is changed, inserted or deleted from a DNA or RNA sequence of an organism's genome. Point mutations have a variety of effects on the downstream protein product—consequences that are moderately predictable based upon the specifics of the mutation. These consequences can range from no effect to deleterious effects, with regard to protein production, composition, and function.
The Nirenberg and Matthaei experiment was a scientific experiment performed in May 1961 by Marshall W. Nirenberg and his post-doctoral fellow, J. Heinrich Matthaei, at the National Institutes of Health (NIH). The experiment deciphered the first of the 64 triplet codons in the genetic code by using nucleic acid homopolymers to translate specific amino acids.
The Nirenberg and Leder experiment was a scientific experiment performed in 1964 by Marshall W. Nirenberg and Philip Leder. The experiment elucidated the triplet nature of the genetic code and allowed the remaining ambiguous codons in the genetic code to be deciphered.
In genetics, a nonsense mutation is a point mutation in a sequence of DNA that results in a nonsense codon, or a premature stop codon in the transcribed mRNA, and leads to a truncated, incomplete, and possibly nonfunctional protein product. Nonsense mutations are not always harmful; the functional effect of a nonsense mutation depends on many aspects, such as the location of the stop codon within the coding DNA. For example, the effect of a nonsense mutation depends on the proximity of the nonsense mutation to the original stop codon, and the degree to which functional subdomains of the protein are affected. As nonsense mutations lead to premature termination of polypeptide chains, they are also called chain termination mutations.
Silent mutations, also called synonymous or samesense mutations, are mutations in DNA that do not have an observable effect on the organism's phenotype. The phrase silent mutation is often used interchangeably with the phrase synonymous mutation; however, synonymous mutations are not always silent, nor vice versa. Synonymous mutations can affect transcription, splicing, mRNA transport, and translation, any of which could alter phenotype, rendering the synonymous mutation non-silent. The substrate specificity of the tRNA to the rare codon can affect the timing of translation, and in turn the co-translational folding of the protein. This is reflected in the codon usage bias that is observed in many species. Mutations that cause the altered codon to produce an amino acid with similar functionality are often classified as silent; if the properties of the amino acid are conserved, this mutation does not usually significantly affect protein function.
In genetics, a missense mutation is a point mutation in which a single nucleotide change results in a codon that codes for a different amino acid. It is a type of nonsynonymous substitution.
Genetics, a discipline of biology, is the science of heredity and variation in living organisms.
The start codon is the first codon of a messenger RNA (mRNA) transcript translated by a ribosome. The start codon always codes for methionine in eukaryotes and archaea and for N-formylmethionine (fMet) in bacteria, mitochondria and plastids.
Nonsense-mediated mRNA decay (NMD) is a surveillance pathway that exists in all eukaryotes. Its main function is to reduce errors in gene expression by eliminating mRNA transcripts that contain premature stop codons. Translation of these aberrant mRNAs could, in some cases, lead to deleterious gain-of-function or dominant-negative activity of the resulting proteins.
A suppressor mutation is a second mutation that alleviates or reverts the phenotypic effects of an already existing mutation, in a process termed synthetic rescue. Genetic suppression therefore restores the phenotype seen prior to the original background mutation. Suppressor mutations are useful for identifying new genetic sites which affect a biological process of interest. They also provide evidence of functional interactions between molecules and of intersecting biological pathways.
A nonsense suppressor is a factor which can inhibit the effect of the nonsense mutation. Nonsense suppressors can be generally divided into two classes: a) a mutated tRNA which can bind with a termination codon on mRNA; b) a mutation on ribosomes decreasing the effect of a termination codon. It is believed that nonsense suppressors keep a low concentration in the cell and do not disrupt normal translation most of the time. In addition, many genes do not have only one termination codon, and cells commonly use ochre codons as the termination signal, whose nonsense suppressors are usually inefficient.
Eukaryotic translation termination factor 1 (eRF1), also referred to as TB3-1 or SUP45L1, is a protein that is encoded by the ERF1 gene. In eukaryotes, eRF1 is an essential protein involved in stop codon recognition in translation, termination of translation, and nonsense-mediated mRNA decay via the SURF complex.
Missense mRNA is a messenger RNA bearing one or more mutated codons that yield polypeptides with an amino acid sequence different from the wild-type or naturally occurring polypeptide. Missense mRNA molecules are created when template DNA strands or the mRNA strands themselves undergo a missense mutation in which a protein coding sequence is mutated and an altered amino acid sequence is coded for.
An expanded genetic code is an artificially modified genetic code in which one or more specific codons have been re-allocated to encode an amino acid that is not among the 22 common naturally-encoded proteinogenic amino acids.
A nonsynonymous substitution is a nucleotide mutation that alters the amino acid sequence of a protein. Nonsynonymous substitutions differ from synonymous substitutions, which do not alter amino acid sequences and are (sometimes) silent mutations. As nonsynonymous substitutions result in a biological change in the organism, they are subject to natural selection.
A codon table can be used to translate a genetic code into a sequence of amino acids. The standard genetic code is traditionally represented as an RNA codon table, because when proteins are made in a cell by ribosomes, it is messenger RNA (mRNA) that directs protein synthesis. The mRNA sequence is determined by the sequence of genomic DNA. In this context, the standard genetic code is referred to as translation table 1. It can also be represented in a DNA codon table. The DNA codons in such tables occur on the sense DNA strand and are arranged in a 5′-to-3′ direction. Different tables with alternate codons are used depending on the source of the genetic code, such as from a cell nucleus, mitochondrion, plastid, or hydrogenosome.