List of genetic codes

While much of the genetic code is shared, different parts of the tree of life use slightly different genetic codes. [1] Using the correct genetic code is essential when translating a genome into protein. The mitochondrial codes are relatively well-known examples of such variation. The list of translation tables below follows the numbering and designations used by NCBI. [2] Shulgina and Eddy discovered four novel alternative genetic codes in bacterial genomes using their codon assignment software Codetta, and validated them by analysis of tRNA anticodons and identity elements; [3] these codes are not currently adopted by NCBI, but are numbered 34-37 here and specified in the table below.

Contents

  1. The standard code
  2. The vertebrate mitochondrial code
  3. The yeast mitochondrial code
  4. The mold, protozoan, and coelenterate mitochondrial code and the mycoplasma/spiroplasma code
  5. The invertebrate mitochondrial code
  6. The ciliate, dasycladacean and hexamita nuclear code
  7. The deleted kinetoplast code; cf. table 4.
  8. deleted, cf. table 1.
  9. The echinoderm and flatworm mitochondrial code
  10. The euplotid nuclear code
  11. The bacterial, archaeal and plant plastid code
  12. The alternative yeast nuclear code
  13. The ascidian mitochondrial code
  14. The alternative flatworm mitochondrial code
  15. The Blepharisma nuclear code [4]
  16. The chlorophycean mitochondrial code
  17. (none)
  18. (none)
  19. (none)
  20. (none)
  21. The trematode mitochondrial code
  22. The Scenedesmus obliquus mitochondrial code
  23. The Thraustochytrium mitochondrial code
  24. The Pterobranchia mitochondrial code
  25. The candidate division SR1 and gracilibacteria code
  26. The Pachysolen tannophilus nuclear code
  27. The karyorelict nuclear code
  28. The Condylostoma nuclear code
  29. The Mesodinium nuclear code
  30. The peritrich nuclear code
  31. The Blastocrithidia nuclear code
  32. The Balanophoraceae plastid code (not shown on web) [4] [5]
  33. The Cephalodiscidae mitochondrial code
  34. The Enterosoma code [3]
  35. The Peptacetobacter code [3]
  36. The Anaerococcus and Onthovivens code [3]
  37. The Absconditabacterales code [3]

The alternative translation tables (2 to 37) involve codon reassignments that are recapitulated in the DNA and RNA codon tables.

Table summary

Comparison of alternative translation tables for all codons (using IUPAC amino acid codes):

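Key: * = termination (stop codon). Column headings are translation table IDs (see the list above).

Codon  1  2  3  4  5  6  9 10 11 12 13 14 15 16 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
TTT    F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F
TTC    F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F
TTA    L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  *  L  L  L  L  L  L  L  L  L  L  L  L  L  L
TTG    L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L
TCT    S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S
TCC    S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S
TCA    S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  *  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S
TCG    S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S
TAT    Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y
TAC    Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y
TAA    *  *  *  *  *  Q  *  *  *  *  *  Y  *  *  *  *  *  *  *  *  Q  Q  Y  E  E  *  Y  *  *  *  *
TAG    *  *  *  *  *  Q  *  *  *  *  *  *  Q  L  *  L  *  *  *  *  Q  Q  Y  E  E  W  *  *  *  *  *
TGT    C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C
TGC    C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C  C
TGA    *  W  W  W  W  *  W  C  *  *  W  W  *  *  W  *  *  W  G  *  W  W  *  *  W  *  W  *  *  *  G
TGG    W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W
CTT    L  L  T  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L
CTC    L  L  T  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L
CTA    L  L  T  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L
CTG    L  L  T  L  L  L  L  L  L  S  L  L  L  L  L  L  L  L  L  A  L  L  L  L  L  L  L  L  L  L  L
CCT    P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P
CCC    P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P
CCA    P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P
CCG    P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P  P
CAT    H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H
CAC    H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H  H
CAA    Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q
CAG    Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q  Q
CGT    R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R
CGC    R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R
CGA    R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  W
CGG    R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  R  Q  W  W
ATT    I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I
ATC    I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I
ATA    I  M  M  I  M  I  I  I  I  I  M  I  I  I  M  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I  I
ATG    M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M  M
ACT    T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T
ACC    T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T
ACA    T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T
ACG    T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T
AAT    N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N
AAC    N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N  N
AAA    K  K  K  K  K  K  N  K  K  K  K  N  K  K  N  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K
AAG    K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K  K
AGT    S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S
AGC    S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S
AGA    R  *  R  R  S  R  S  R  R  R  G  S  R  R  S  R  R  S  R  R  R  R  R  R  R  R  S  R  R  R  R
AGG    R  *  R  R  S  R  S  R  R  R  G  S  R  R  S  R  R  K  R  R  R  R  R  R  R  R  K  M  R  R  R
GTT    V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V
GTC    V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V
GTA    V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V
GTG    V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V  V
GCT    A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A
GCC    A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A
GCA    A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A
GCG    A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A  A
GAT    D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D
GAC    D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D  D
GAA    E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E
GAG    E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E  E
GGT    G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G
GGC    G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G
GGA    G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G
GGG    G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G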
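
As a minimal sketch of how the reassignments in the summary table can be applied, the Python below builds translation table 1 from its 64 amino-acid assignments and overlays the differences for a handful of tables. The names `REASSIGNMENTS`, `code_table` and `translate` are illustrative and not part of any NCBI tool, and only tables 1, 2, 3, 4, 6 and 25 are filled in here.

```python
from itertools import product

BASES = "TCAG"  # same row order as the summary table: TTT, TTC, TTA, TTG, TCT, ...
# Amino acids of translation table 1 in that codon order ('*' = stop).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

# Differences from table 1, copied from the summary table (illustrative subset).
REASSIGNMENTS = {
    1: {},
    2: {"TGA": "W", "ATA": "M", "AGA": "*", "AGG": "*"},
    3: {"TGA": "W", "ATA": "M", "CTT": "T", "CTC": "T", "CTA": "T", "CTG": "T"},
    4: {"TGA": "W"},
    6: {"TAA": "Q", "TAG": "Q"},
    25: {"TGA": "G"},
}


def code_table(table_id):
    """Return the full 64-codon mapping for one translation table."""
    table = dict(STANDARD)
    table.update(REASSIGNMENTS[table_id])
    return table


def translate(dna, table_id=1):
    """Translate a sense-strand DNA sequence codon by codon, keeping '*' for stops."""
    table = code_table(table_id)
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))


print(translate("ATGATATGAAGA", 1))  # standard code
print(translate("ATGATATGAAGA", 2))  # vertebrate mitochondrial code
```

For the sequence `ATGATATGAAGA`, table 1 gives `MI*R` while table 2 gives `MMW*`, which shows why choosing the wrong table can both truncate a protein and change its residues.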

Notes

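Three translation tables have a peculiar status: table 7 (the kinetoplast code) has been deleted, see table 4; table 8 has been deleted, see table 1; and table 32 is not shown on the NCBI web page. [4]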

Other mechanisms also play a part in protein biosynthesis, such as post-transcriptional modification.

Related Research Articles

Genetic code: Rules by which information encoded within genetic material is translated into proteins

The genetic code is the set of rules used by living cells to translate information encoded within genetic material into proteins. Translation is accomplished by the ribosome, which links proteinogenic amino acids in an order specified by messenger RNA (mRNA), using transfer RNA (tRNA) molecules to carry amino acids and to read the mRNA three nucleotides at a time. The genetic code is highly similar among all organisms and can be expressed in a simple table with 64 entries.

Stop codon: Codon that marks the end of a protein-coding sequence

In molecular biology, a stop codon is a codon that signals the termination of translation of the current protein. Most codons in messenger RNA correspond to the addition of an amino acid to a growing polypeptide chain, which may ultimately become a protein; stop codons signal the termination of this process by binding release factors, which cause the ribosomal subunits to dissociate, releasing the amino acid chain.

Translation (biology): Cellular process of protein synthesis

In biology, translation is the process in living cells in which proteins are produced using RNA molecules as templates. The generated protein is a sequence of amino acids, determined by the sequence of nucleotides in the RNA. The nucleotides are read three at a time, and each such triplet results in the addition of one specific amino acid to the protein being generated. The mapping from nucleotide triplet to amino acid is called the genetic code. Translation is performed by ribosomes, large complexes of functional RNA and proteins. It is one stage of the broader process called gene expression.

Transfer RNA: RNA that facilitates the addition of amino acids to a new protein

Transfer RNA is an adaptor molecule composed of RNA, typically 76 to 90 nucleotides in length. In a cell, it provides the physical link between the genetic code in messenger RNA (mRNA) and the amino acid sequence of proteins, carrying the correct sequence of amino acids to be combined by the protein-synthesizing machinery, the ribosome. Each three-nucleotide codon in mRNA is complemented by a three-nucleotide anticodon in tRNA. As such, tRNAs are a necessary component of translation, the biological synthesis of new proteins in accordance with the genetic code.

Start codon: First codon of a messenger RNA translated by a ribosome

The start codon is the first codon of a messenger RNA (mRNA) transcript translated by a ribosome. The start codon always codes for methionine in eukaryotes and archaea and for N-formylmethionine (fMet) in bacteria, mitochondria and plastids.

DNA and RNA codon tables: List of standard rules to translate DNA encoded information into proteins

A codon table can be used to translate a genetic code into a sequence of amino acids. The standard genetic code is traditionally represented as an RNA codon table, because when proteins are made in a cell by ribosomes, it is messenger RNA (mRNA) that directs protein synthesis. The mRNA sequence is determined by the sequence of genomic DNA. In this context, the standard genetic code is referred to as translation table 1. It can also be represented in a DNA codon table. The DNA codons in such tables occur on the sense DNA strand and are arranged in a 5′-to-3′ direction. Different tables with alternate codons are used depending on the source of the genetic code, such as a cell nucleus, mitochondrion, plastid, or hydrogenosome.
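
The relationship between RNA and DNA codon tables described above can be illustrated with a short sketch. It assumes only that an RNA codon and its sense-strand DNA codon differ by U versus T; the function names and the three-entry table fragment are hypothetical and chosen for illustration.

```python
def rna_to_dna_codon(codon):
    """Rewrite an RNA codon as the matching sense-strand DNA codon (U -> T)."""
    return codon.upper().replace("U", "T")


def rna_table_to_dna_table(rna_table):
    """Convert an RNA codon table (dict) into the equivalent DNA codon table."""
    return {rna_to_dna_codon(codon): aa for codon, aa in rna_table.items()}


def template_strand(sense_dna):
    """Return the template (antisense) strand, written 5'-to-3'."""
    complement = str.maketrans("ACGT", "TGCA")
    return sense_dna.upper().translate(complement)[::-1]


# A three-entry fragment of the standard code, used only for illustration.
rna_fragment = {"AUG": "M", "UGG": "W", "UAA": "*"}
dna_fragment = rna_table_to_dna_table(rna_fragment)
print(dna_fragment)
print(template_strand("ATGTGA"))
```

Because DNA codon tables are written for the sense strand, no complementing is needed to move between the two kinds of table; the reverse complement only comes into play when a sequence is given as the template strand.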

The Pterobranchia mitochondrial code is a genetic code used by the mitochondrial genome of Rhabdopleura compacta (Pterobranchia). The Pterobranchia are one of the two groups in the Hemichordata, which together with the Echinodermata and Chordata form the three major lineages of deuterostomes. AUA translates to isoleucine in Rhabdopleura, as it does in the Echinodermata and Enteropneusta, while AUA encodes methionine in the Chordata. The assignment of AGG to lysine is not found elsewhere in deuterostome mitochondria, but it occurs in some taxa of Arthropoda. This code shares with many other mitochondrial codes the reassignment of the UGA stop codon to tryptophan, and of AGG and AGA to an amino acid other than arginine. The initiation codons in Rhabdopleura compacta are ATG and GTG.
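
The following sketch shows one way the Pterobranchia code and its two initiation codons could be combined when translating an open reading frame. It uses the table 24 assignments from the summary table (TGA to W, AGA to S, AGG to K) and assumes that the initiator codon is read as methionine even when it is GTG. The function name `translate_first_orf` is hypothetical, and scanning for the first start codon in any frame is a deliberate simplification.

```python
from itertools import product

BASES = "TCAG"
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

# Table 24 differences from table 1, as given in the summary table.
TABLE_24 = {**STANDARD, "TGA": "W", "AGA": "S", "AGG": "K"}
START_CODONS = ("ATG", "GTG")  # initiation codons reported for R. compacta


def translate_first_orf(dna):
    """Translate from the first start codon found to the first in-frame stop."""
    dna = dna.upper()
    for start in range(len(dna) - 2):
        if dna[start:start + 3] in START_CODONS:
            break
    else:
        return ""  # no start codon found
    protein = ["M"]  # assumption: the initiator codon is read as methionine
    for pos in range(start + 3, len(dna) - 2, 3):
        aa = TABLE_24[dna[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate_first_orf("CCGTGAGGTGAAGATAA"))
```

In this example the reading frame starts at GTG and continues through AGG, TGA and AGA before stopping at TAA, giving `MKWS`; under the standard code the same frame would end at TGA.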

The yeast mitochondrial code is a genetic code used by the mitochondrial genome of yeasts, notably Saccharomyces cerevisiae, Candida glabrata, Hansenula saturnus, and Kluyveromyces thermotolerans.

The bacterial, archaeal and plant plastid code is the genetic code used by bacteria, archaea, prokaryotic viruses and chloroplasts. It is essentially the same as the standard code; however, there are some variations in alternative start codons.

The mold, protozoan, and coelenterate mitochondrial code and the mycoplasma/spiroplasma code is the genetic code used by various organisms, in some cases with slight variations, notably the use of UGA as a tryptophan codon rather than a stop codon.

The invertebrate mitochondrial code is a genetic code used by the mitochondrial genome of invertebrates. Mitochondria contain their own DNA and reproduce independently from their host cell. Variation in the translation of the mitochondrial genetic code, in which codons are read as amino acids other than those assigned by the standard code, has been identified in invertebrates, most notably arthropods. This variation has been a helpful tool for improving the phylogenetic tree of invertebrates, such as flatworms.

The echinoderm and flatworm mitochondrial code is a genetic code used by the mitochondria of certain echinoderm and flatworm species.

The alternative yeast nuclear code is a genetic code found in certain yeasts. However, other yeasts, including Saccharomyces cerevisiae, Candida azyma, Candida diversa, Candida magnoliae, Candida rugopelliculosa, Yarrowia lipolytica, and Zygoascus hellenicus, definitely use the standard (nuclear) code.

The candidate division SR1 and gracilibacteria code is used in two groups of uncultivated bacteria found in marine and fresh-water environments and in the intestines and oral cavities of mammals, among others. It differs from the standard and bacterial codes in that UGA represents an additional glycine codon and does not code for termination. A survey of many genomes with the codon assignment software Codetta, analyzed through the GTDB taxonomy system, shows that this genetic code is limited to the Patescibacteria order BD1-5, not what are now termed Gracilibacteria, and that the SR1 genome assembly GCA_000350285.1, for which the table 25 code was originally defined, actually uses the Absconditabacterales genetic code and has the associated three special recoding tRNAs. Thus this code may now be better named the "BD1-5 code".
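
How table 25 relates to the bacterial code (table 11) and to the Absconditabacterales code (table 37) can be made concrete by comparing the codon tables directly. The sketch below is illustrative only: `with_changes` and `diff_codes` are hypothetical helper names, and the reassignments are those listed in the summary table of this list.

```python
from itertools import product

BASES = "TCAG"
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}


def with_changes(changes):
    """Return a copy of the standard code with the given codons reassigned."""
    return {**STANDARD, **changes}


TABLE_11 = with_changes({})  # same amino-acid assignments as table 1
TABLE_25 = with_changes({"TGA": "G"})
TABLE_37 = with_changes({"TGA": "G", "CGA": "W", "CGG": "W"})


def diff_codes(a, b):
    """Map each codon that differs to its (assignment in a, assignment in b)."""
    return {codon: (a[codon], b[codon]) for codon in a if a[codon] != b[codon]}


print(diff_codes(TABLE_11, TABLE_25))
print(diff_codes(TABLE_25, TABLE_37))
```

The first comparison reports only TGA, and the second only CGA and CGG. Under these assumptions, an assembly whose CGA and CGG codons are read as tryptophan would therefore be a better fit for table 37 than for table 25.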

The ascidian mitochondrial code is a genetic code found in the mitochondria of Ascidia.

The alternative flatworm mitochondrial code is a genetic code found in the mitochondria of Platyhelminthes and Nematodes.

The Condylostoma nuclear code is a genetic code used by the nuclear genome of the heterotrich ciliate Condylostoma magnum. This code, along with translation tables 27 and 31, is remarkable in that every one of the 64 possible codons can be a sense codon. Experimental evidence suggests that translation termination relies on context, specifically proximity to the poly(A) tail. Near such a tail, PABP could help terminate the protein by recruiting eRF1 and eRF3 to prevent the cognate tRNA from binding.
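
The statement that tables 27, 28 and 31 leave no codon reserved for termination can be checked against the summary table with a few lines of Python. This is a sketch over a hand-copied subset: only the three codons that are stops in the standard code are listed, and the dictionary name `ASSIGNMENTS` is hypothetical. It does not model the context-dependent termination described above.

```python
# Assignments of the three standard stop codons, copied from the summary table.
ASSIGNMENTS = {
    1:  {"TAA": "*", "TAG": "*", "TGA": "*"},
    6:  {"TAA": "Q", "TAG": "Q", "TGA": "*"},
    27: {"TAA": "Q", "TAG": "Q", "TGA": "W"},
    28: {"TAA": "Q", "TAG": "Q", "TGA": "W"},
    31: {"TAA": "E", "TAG": "E", "TGA": "W"},
}


def has_dedicated_stop(table_id):
    """True if at least one of the listed codons is assigned only to termination."""
    return "*" in ASSIGNMENTS[table_id].values()


all_sense = [t for t in ASSIGNMENTS if not has_dedicated_stop(t)]
print(all_sense)
```

For the tables included here, no other codon is a stop in the summary table, so checking these three codons is sufficient; that shortcut would not hold for tables such as 2, 22 or 23, which assign termination to other codons.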

The Enterosoma genetic code translates AGG to methionine, as determined by the codon assignment software Codetta; it was further shown that this recoding is associated with a special tRNA with the appropriate anticodon and tRNA identity elements. The code is found in a small clade of species within the Enterosoma genus, according to the GTDB taxonomy system release 220. Codetta called the Enterosoma code for the following genome assemblies: GCA_002431755.1, GCA_002439645.1, GCA_002436825.1, GCA_002451385.1, GCA_002297105.1, GCA_002297045.1, GCA_002404995.1, and GCA_900549915.1.

The Anaerococcus and Onthovivens genetic code translates CGG to tryptophan, as determined by the codon assignment software Codetta; it was further shown that this recoding is associated with a special tRNA with the appropriate anticodon and tRNA identity elements. As currently known, this code is limited to two distinct clades, the genus Anaerococcus in the class Clostridia and the genus Onthovivens in the class Bacilli, as defined by the GTDB taxonomy system release 220. Codetta called the Anaerococcus and Onthovivens code for the following genome assemblies: GCA_000024105.1, GCA_900445285.1, GCA_902500265.1, GCA_900258475.1, GCA_002399785.1, GCA_004558005.1, GCA_900540365.1, GCA_900540395.1, and GCA_900545015.1.

The Absconditabacterales genetic code translates UGA to glycine, and CGA and CGG to tryptophan, as determined by the codon assignment software Codetta; it was further shown that these recodings are associated with three special tRNAs with appropriate anticodons and tRNA identity elements. Codetta called the Absconditabacterales code for the following genome assemblies: GCA_002792495.1, GCA_001007975.1, GCA_003488625.1, GCA_003260355.1, GCA_003242865.1, GCA_000350285.1, GCA_002746475.1, GCA_007116275.1, GCA_007115995.1, GCA_002361595.1, GCA_000503875.1, GCA_003543185.1, GCA_002441085.1, and GCA_002791215.1. Review of the GTDB taxonomy system for the order Absconditabacterales left two questionable genome assemblies; spot-checking these two genomes shows that they both have all three special tRNAs, suggesting that the code is universal across the order.

References

  1. Watanabe, Kimitsuna; Suzuki, Tsutomu (2001). "Genetic Code and its Variants". Encyclopedia of Life Sciences. doi:10.1038/npg.els.0000810. ISBN 047001590X.
  2. Elzanowski, Andrzej; Ostell, Jim (7 July 2010). "The Genetic Codes". National Center for Biotechnology Information. Retrieved 6 May 2013.
  3. Shulgina, Yekaterina; Eddy, Sean R. (9 November 2021). "A computational screen for alternative genetic codes in over 250,000 genomes". eLife. 10. doi:10.7554/eLife.71402. PMC 8629427. PMID 34751130.
  4. "NCBI genetic code table in ASN-1 format, with changelog: gc.prt".
  5. Su, Huei-Jiun; Barkman, Todd J.; Hao, Weilong; Jones, Samuel S.; Naumann, Julia; Skippington, Elizabeth; Wafula, Eric K.; Hu, Jer-Ming; Palmer, Jeffrey D.; DePamphilis, Claude W. (15 January 2019). "Novel genetic code and record-setting AT-richness in the highly reduced plastid genome of the holoparasitic plant Balanophora". Proceedings of the National Academy of Sciences of the United States of America. 116 (3): 934–943. Bibcode:2019PNAS..116..934S. doi:10.1073/pnas.1816822116. PMC 6338844. PMID 30598433.
