Alloprotein

An alloprotein is a novel synthetic protein containing one or more "non-natural" amino acids. "Non-natural" in this context means an amino acid that either does not occur in nature (novel and synthesised amino acids), [1] or occurs in nature but not within natural proteins (natural but non-proteinogenic amino acids). [2]

The possibility of novel amino acids and proteins arises because, in nature, the genetic code responsible for protein structure has 64 possible codons available for encoding the amino acids used in proteins (each codon has 3 base positions, and each position holds one of 4 nucleotides; 4 x 4 x 4 gives 64 possible combinations [3]); however, in human beings and other eukaryotes, these encode just 20 standard amino acids. [4] This redundancy within the codon table is known in biochemistry as degeneracy, and it opens the door for new amino acids to be encoded. [4]
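The codon arithmetic above can be checked with a short sketch. It assumes the standard genetic code written as the usual 64-letter string in U, C, A, G order (with "*" marking a stop signal); the variable names are illustrative only and are not taken from any cited source.

```python
from collections import Counter
from itertools import product

# Assumption: standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Enumerate every 3-base combination: 4 x 4 x 4 = 64 codons.
codons = ["".join(c) for c in product(BASES, repeat=3)]
codon_table = dict(zip(codons, AMINO))

# Degeneracy: how many codons map to each amino acid (or to "stop").
degeneracy = Counter(codon_table.values())
amino_acids = sorted(a for a in degeneracy if a != "*")
stop_codons = sorted(c for c, a in codon_table.items() if a == "*")

print(len(codons))       # number of possible codons
print(len(amino_acids))  # number of standard amino acids encoded
print(stop_codons)       # codons that signal "stop"
print(degeneracy["L"], degeneracy["M"])  # leucine vs. methionine codon counts
```

In this table, 61 codons are shared among 20 amino acids and 3 are stop signals, so most amino acids are specified by more than one codon.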

One approach takes advantage of the redundancy of the 3 codons that encode a "stop" signal. If one of these can be replaced throughout by another stop codon, then the freed codon can in principle be "reassigned" (along with the requisite tRNA, release factor and enzyme modifications) to code for a novel amino acid without affecting other existing codings. [5] [6] Using this approach, alloproteins and novel amino acids can be created by techniques that "expand" the genetic code to include additional novel codings, using newly devised codons together with matching tRNA (transfer RNA) and aminoacyl tRNA synthetase enzymes. The usual mechanisms, which produce amino acids and combine them into proteins, then produce novel or non-proteinogenic amino acids and incorporate them into novel proteins in the same way. In 2010 this technique was used to reassign a codon in the genetic code of the bacterium E. coli, modifying it to produce and incorporate a novel amino acid without adversely affecting existing encodings or the organism itself. [5] [6]
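As a toy illustration of stop-codon reassignment, the sketch below translates the same short mRNA with the standard table and with a hypothetical expanded table in which the stop codon UAG is reassigned to a novel amino acid, written here as "X". The choice of UAG, the symbol "X", the example sequence and the function names are all assumptions made for illustration; the sketch models only the codon table, not the tRNA, synthetase or release factor changes a real system needs.

```python
from itertools import product

# Assumption: standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical expanded code: UAG no longer means "stop" but encodes
# a novel amino acid, represented by the placeholder letter "X".
EXPANDED = {**STANDARD, "UAG": "X"}


def translate(mrna: str, table: dict[str, str]) -> str:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = table[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Illustrative mRNA: AUG GCU UAG GGA UAA
mrna = "AUGGCUUAGGGAUAA"
print(translate(mrna, STANDARD))  # translation halts at UAG
print(translate(mrna, EXPANDED))  # UAG is read through as "X"; UAA still stops
```

With the standard table the chain ends at UAG, whereas with the expanded table translation continues to the remaining stop codon UAA, which is why the other stop codons must still be available to terminate genes.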

Alloprotein uses include the incorporation of unusual or heavy atoms for diffractive structure analysis, photo-reactive linkers (photocrosslinkers), fluorescent groups (used as labelled probes), and molecular switches for signaling pathways. [1] [7]

Definition and history

Modern alloprotein techniques were first developed in the late 1980s by Miyazawa and Yokoyama at the University of Tokyo to address the limitations of existing methods: genetic manipulation was limited to the 20 standard amino acids, while chemical synthesis was limited to small scale and low yield. [2]

An early use of the term is found in a 1990 paper "Biosynthesis of alloprotein", by Koide, Yokoyama and Miyazawa. [8]

A working description is provided by Budisa et al.: [9]

"Genetic code engineering is [a] new research field that intent to reprogram protein synthesis by reassignment of specific codons to non-canonical (mainly synthetic) amino acids. The resulting proteins are alloproteins with tailor-made properties that are of outstanding interest for both, academia and industrial biotechnology."

References

  1. Expanded Genetic Code System Research Team, Yokohama Institute, Japan
  2. Method for producing protein containing nonprotein amino acids - 1988, Miyazawa & Yokoyama et al. Description states: The present invention relates to a method for producing proteins comprising nonprotein amino acids (hereinafter referred to as non-natural proteins) using protein-producing organisms. The term "nonprotein amino acids" as used herein implicates all amino acids excluding the aforementioned 20 natural amino acids. Thus, all amino acids but the aforementioned 20 amino acids are referred to as nonprotein amino acids even if they are naturally present.
  3. Crick, Francis (1988). "Chapter 8: The genetic code". What mad pursuit: a personal view of scientific discovery. New York: Basic Books. pp. 89–101. ISBN 0-465-09138-5.
  4. Hahn, Ulrich (2004). "Old Codons, New Amino Acids". Angewandte Chemie International Edition. 43: 1190–1193. doi:10.1002/anie.200301720.
  5. First Genetic Code of an Organism Revised in a Research Laboratory - RIKEN
  6. Codon reassignment in the Escherichia coli genetic code - 2010
  7. Riken Systems and Structural Biology Center: protein synthesis and functional studies
  8. Koide, H; Yokoyama, S; Miyazawa, T. "[Biosynthesis of alloprotein]". Nihon Rinsho. 48: 208–13. PMID 2406480.
  9. A holistic approach to genetic code engineering - Wiltschi, Merkel and Budisa, Max Planck Institute of Biochemistry