Arginylation is a post-translational modification in which the enzyme arginyltransferase (ATE1) adds arginine (Arg) to the N-terminal amino group of a protein or to the side chains of reactive amino acids. Recent studies have revealed that hundreds of proteins are arginylated in vivo and that these proteins are essential for many biological pathways. Although arginylation is still poorly understood in a biological setting, ATE1 is highly conserved, which suggests that the modification is biologically important.
Examples of identified ATE1 targets include ornithine decarboxylase, [1] thyroglobulin, [2] insulin, [3] and neurotensin. [4]
In 1963, a group of researchers observed that specific radioactive amino acids were being incorporated into proteins obtained from ribosome-free cell and tissue extracts. [5] This ribosome-independent incorporation was first observed in prokaryotes using leucine (Leu) and phenylalanine (Phe), and was then found in mammalian liver extracts using arginine. Other amino acids were not incorporated in the same way, suggesting that the mechanism was specific to leucine and phenylalanine in bacteria and to arginine in mammals. [6] One of the most interesting aspects of arginylation is that the amino acid is transferred from an aminoacyl tRNA onto the target protein without the use of any other translational components. No other post-translational addition of an amino acid to proteins, such as glycylation, [7] glutamylation, [8] or tyrosination, [9] [10] works this way, which makes arginylation unique.
Once the modification and its mechanism had been discovered, further research sought the enzyme or enzymes that catalyze it. After the responsible enzyme was identified in both plants [11] and guinea-pig hair follicles, [12] it was cloned and characterized in yeast and named ATE1 [13] for its arginyltransferase activity. Later studies identified genes coding for ATE1 enzymes in multiple species, leading to the conclusion that ATE1 is present in all eukaryotes. [11] [13] [14]
As the early targets of arginylation by ATE1 were identified (in vitro and in vivo), a pattern emerged: ATE1 displayed a high affinity for proteins and peptides in which the acidic amino acids aspartate or glutamate were exposed at the N-terminus. Further studies aided by high-precision mass spectrometry have revealed hundreds of arginylated proteins from different cells and tissues. [15] [16] Several of these proteins were arginylated at their N-termini even though the N-terminal residue was not aspartate or glutamate. [5] Arginylation studies are therefore still in their early stages, and further research into the specificity of arginylation is needed.
However, the assumption that arginylation occurs only at the N-terminus severely limited the number of proteins considered likely to be arginylated. If the assumption were true, arginylation could never occur on intact proteins, because protein sequences begin with methionine at the N-terminus rather than the preferred aspartate or glutamate. The assumption was soon proved false when a protein was discovered with an arginylated residue in the middle of its sequence.
Although the N-terminus was originally thought to be the only site targeted by ATE1 enzymes, it has since been discovered that arginylation may also occur in the middle of the peptide chain of a protein. The first example of this unprecedented modification came from neurotensin, a biological peptide found in the central nervous system: when it was isolated from cells, arginine was found attached to the side chain of a mid-chain glutamate residue. [4] This discovery changed the view of how arginylation occurs, because it meant that intact proteins might also be arginylated.
To determine the prevalence of mid-chain arginylation, a mass spectrometry screen of various peptides was performed. The screen revealed many proteins containing modified aspartate and glutamate residues in the middle of their peptide chains, and further studies determined that ATE1 could also be mediating this reaction. This discovery widened the biological scope of arginylation by suggesting that it can occur on fully intact proteins, not just at the N-terminus of protein fragments or pre-processed proteins. [5]
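The source does not state which mass shift the screen searched for. As a rough sketch, assuming that each arginylation event adds exactly one arginine residue (free arginine, C6H14N4O2, minus the water released when the amide bond forms), the expected monoisotopic shift can be computed from elemental masses. The names `MONOISOTOPIC`, `formula_mass`, and `arg_shift` are illustrative and not taken from any cited study.

```python
# Monoisotopic masses (Da) of the elements involved.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}


def formula_mass(formula):
    """Sum monoisotopic masses for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


# Assumption: one arginylation adds one Arg residue, i.e. free arginine
# (C6H14N4O2) minus the water (H2O) lost on amide bond formation.
ARG_FREE = {"C": 6, "H": 14, "N": 4, "O": 2}
WATER = {"H": 2, "O": 1}

arg_shift = formula_mass(ARG_FREE) - formula_mass(WATER)
print(f"{arg_shift:.4f}")  # 156.1011
```

A real search would also have to allow for further modifications of the added arginine, which this sketch ignores.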
In 1986, the N-end rule was elucidated. It states that the identity of the amino acid at the N-terminus of a protein determines the protein's half-life. To determine the effects of arginylation on protein half-life, several studies were performed using modified yeast proteins. These studies revealed that proteins engineered to carry arginylated N-termini were metabolically unstable. [17] [18] [19] It was also discovered that a protein becomes more likely to be ubiquitinated and degraded once it is arginylated. [20] The evidence from these experiments indicates that arginylation in vivo leads to the degradation of proteins with aspartate and glutamate residues at their N-termini.
However, several recent studies have shown that protein degradation may not be the only function of arginylation, and that the modification may also be needed for certain proteins to function correctly. For instance, when beta amyloid proteins are arginylated, they are guided into their proper alpha-helical shape and are prevented from misfolding and aggregating. [21] Another protein that benefits from arginylation is calreticulin: when it is modified, its role during endoplasmic reticulum stress is facilitated, rather than the protein being removed from cells entirely. [22] [23] Because both degradative and facilitating effects of arginylation have been identified, it is clear that arginylation has an important role in protein regulation within cells.
Because it is one of the less well understood post-translational modifications, arginylation and its regulation in vivo remain largely obscure. The expression of ATE1 can vary significantly between tissues; its levels peak at mid-development [24] and then decline as the organism ages. [5] [25] A variety of physiological compounds and drugs have also been observed to affect the incorporation of arginine in vivo, although this is hypothesized to occur in a non-specific manner. [26] It has therefore been theorized that inhibitors and activators which regulate ATE1 activity, and thus arginylation, may exist in vivo.
Because arginylation makes proteins metabolically unstable, as observed in yeast, proteins modified in this way are an attractive target for removal. One well-characterized regulator of arginylation is the ubiquitin-dependent protein degradation pathway, which quickly degrades and removes harmful proteins. This regulator supports the specificity of the modification by efficiently removing proteins that were not meant to be arginylated in vivo. [27]
Lastly, an unproven but attractive mechanism for regulating arginylation in vivo involves de-arginylation enzymes that would remove an arginine added post-translationally to a protein. Enzymes such as Aminopeptidase B and Carboxypeptidase B are able to remove arginine from a protein's N-terminus and from side chain carboxyl groups, respectively, but do not specifically target arginylated sites. The proposed de-arginylation enzymes are theorized to act in the same way as these two enzymes, but to target arginylated protein substrates specifically. No such enzymes have been discovered yet, and the search for them is an exciting path for further studies.
Arginylation was initially written off as a non-essential process because yeast remained viable after ATE1 knockout, but later studies have shown that it plays a significant role in several biological processes. The knockout of ATE1 in mice and Drosophila resulted in embryonic lethality in both species. Further studies of ATE1 knockout during mouse development revealed that loss of the gene resulted in abnormal cardiac and craniofacial morphogenesis, impaired angiogenesis, and an impaired ability of cells to undergo meiosis. Postnatally, ATE1 knockout resulted in weight loss, infertility, and mental retardation. In addition, ATE1 deletion in Arabidopsis thaliana, a model plant organism, caused defective shoot and leaf development, abnormal seed germination, and delayed leaf senescence. The dysfunctions resulting from ATE1 knockout therefore suggest that arginylation is necessary for many physiological pathways in eukaryotes.
Protein primary structure is the linear sequence of amino acids in a peptide or protein. By convention, the primary structure of a protein is reported starting from the amino-terminal (N) end to the carboxyl-terminal (C) end. Protein biosynthesis is most commonly performed by ribosomes in cells. Peptides can also be synthesized in the laboratory. Protein primary structures can be directly sequenced, or inferred from DNA sequences.
Protein biosynthesis is a core biological process, occurring inside cells, balancing the loss of cellular proteins through the production of new proteins. Proteins perform a number of critical functions as enzymes, structural proteins or hormones. Protein synthesis is a very similar process for both prokaryotes and eukaryotes but there are some distinct differences.
Asparagine is an α-amino acid that is used in the biosynthesis of proteins. It contains an α-amino group, an α-carboxylic acid group, and a side chain carboxamide, classifying it as a polar, aliphatic amino acid. It is non-essential in humans, meaning the body can synthesize it. It is encoded by the codons AAU and AAC.
Arginine is the amino acid with the formula (H2N)(HN)CN(H)(CH2)3CH(NH2)CO2H. The molecule features a guanidino group appended to a standard amino acid framework. At physiological pH, the carboxylic acid is deprotonated (−CO2−) and both the amino and guanidino groups are protonated, resulting in a cation. Only the l-arginine (symbol Arg or R) enantiomer is found naturally. Arg residues are common components of proteins. It is encoded by the codons CGU, CGC, CGA, CGG, AGA, and AGG. The guanidine group in arginine is the precursor for the biosynthesis of nitric oxide. Like all amino acids, it is a white, water-soluble solid.
Post-translational modification (PTM) is the covalent process of changing proteins following protein biosynthesis. PTMs may involve enzymes or occur spontaneously. Proteins are created by ribosomes translating mRNA into polypeptide chains, which may then change to form the mature protein product. PTMs are important components in cell signalling, as for example when prohormones are converted to hormones.
Proteinogenic amino acids are amino acids that are incorporated biosynthetically into proteins during translation. The word "proteinogenic" means "protein creating". Throughout known life, there are 22 genetically encoded (proteinogenic) amino acids, 20 in the standard genetic code and an additional 2 that can be incorporated by special translation mechanisms.
The N-terminus (also known as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus) is the start of a protein or polypeptide, referring to the free amine group (-NH2) located at the end of a polypeptide. Within a peptide, the amine group is bonded to the carboxylic group of another amino acid, making it a chain. That leaves a free carboxylic group at one end of the peptide, called the C-terminus, and a free amine group on the other end called the N-terminus. By convention, peptide sequences are written N-terminus to C-terminus, left to right (in LTR writing systems). This correlates the translation direction to the text direction, because when a protein is translated from messenger RNA, it is created from the N-terminus to the C-terminus, as amino acids are added to the carboxyl end of the protein.
Myelin basic protein (MBP) is a protein believed to be important in the process of myelination of nerves in the nervous system. The myelin sheath is a multi-layered membrane, unique to the nervous system, that functions as an insulator to greatly increase the velocity of axonal impulse conduction. MBP maintains the correct structure of myelin, interacting with the lipids in the myelin membrane.
Histone H4 is one of the five main histone proteins involved in the structure of chromatin in eukaryotic cells. Featuring a main globular domain and a long N-terminal tail, H4 is involved with the structure of the nucleosome of the 'beads on a string' organization. Histone proteins are highly post-translationally modified. Covalently bonded modifications include acetylation and methylation of the N-terminal tails. These modifications may alter expression of genes located on DNA associated with its parent histone octamer. Histone H4 is an important protein in the structure and function of chromatin, where its sequence variants and variable modification states are thought to play a role in the dynamic and long term regulation of genes.
N-Formylmethionine is a derivative of the amino acid methionine in which a formyl group has been added to the amino group. It is specifically used for initiation of protein synthesis from bacterial and organellar genes, and may be removed post-translationally.
In molecular biology, SUMO proteins are a family of small proteins that are covalently attached to and detached from other proteins in cells to modify their function. This process is called SUMOylation. SUMOylation is a post-translational modification involved in various cellular processes, such as nuclear-cytosolic transport, transcriptional regulation, apoptosis, protein stability, response to stress, and progression through the cell cycle.
Citrullination or deimination is the conversion of the amino acid arginine in a protein into the amino acid citrulline. Citrulline is not one of the 20 standard amino acids encoded by DNA in the genetic code. Instead, it is the result of a post-translational modification. Citrullination is distinct from the formation of the free amino acid citrulline as part of the urea cycle or as a byproduct of enzymes of the nitric oxide synthase family.
Deamidation is a chemical reaction in which an amide functional group in the side chain of the amino acids asparagine or glutamine is removed or converted to another functional group. Typically, asparagine is converted to aspartic acid or isoaspartic acid. Glutamine is converted to glutamic acid or pyroglutamic acid (5-oxoproline). In a protein or peptide, these reactions are important because they may alter its structure, stability or function and may lead to protein degradation. The net chemical change is the addition of a water group and removal of an ammonia group, which corresponds to a +1 (0.98402) Da mass increase. Although deamidation occurs on glutamine, glycosylated asparagine and other amides, these are negligible under typical proteolysis conditions.
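The +0.98402 Da figure quoted above can be checked with a short calculation. This is a minimal sketch, assuming the net change is exactly one water gained and one ammonia lost and using standard monoisotopic element masses; the names `MONOISOTOPIC`, `formula_mass`, and `deamidation_shift` are illustrative.

```python
# Monoisotopic masses (Da) of the elements involved.
MONOISOTOPIC = {
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}


def formula_mass(formula):
    """Sum monoisotopic masses for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


WATER = {"H": 2, "O": 1}    # gained during deamidation
AMMONIA = {"N": 1, "H": 3}  # lost during deamidation

# Net mass change: add H2O, remove NH3.
deamidation_shift = formula_mass(WATER) - formula_mass(AMMONIA)
print(round(deamidation_shift, 5))  # 0.98402
```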
A peptide library is a tool for studying proteins. Peptide libraries typically contain a large number of peptides that have a systematic combination of amino acids. Usually, solid phase synthesis, e.g. resin as a flat surface or beads, is used for peptide library generation. Peptide libraries are a popular tool for experiments in drug design, protein–protein interactions, and other biochemical and pharmaceutical applications.
An isopeptide bond is a type of amide bond formed between a carboxyl group of one amino acid and an amino group of another. An isopeptide bond is the linkage between the side chain amino or carboxyl group of one amino acid to the α-carboxyl, α-amino group, or the side chain of another amino acid. In a typical peptide bond, also known as eupeptide bond, the amide bond always forms between the α-carboxyl group of one amino acid and the α-amino group of the second amino acid. Isopeptide bonds are rarer than regular peptide bonds. Isopeptide bonds lead to branching in the primary sequence of a protein. Proteins formed from normal peptide bonds typically have a linear primary sequence.
ADP-ribosylation is the addition of one or more ADP-ribose moieties to a protein. It is a reversible post-translational modification that is involved in many cellular processes, including cell signaling, DNA repair, gene regulation and apoptosis. Improper ADP-ribosylation has been implicated in some forms of cancer. It is also the basis for the toxicity of bacterial compounds such as cholera toxin, diphtheria toxin, and others.
In enzymology, an alpha-tubulin N-acetyltransferase is an enzyme which is encoded by the ATAT1 gene.
In enzymology, an arginyltransferase is an enzyme that catalyzes the chemical reaction
(ADP-ribosyl)hydrolase 1, also termed [Protein ADP-ribosylarginine] hydrolase and protein-Nω-(ADP-D-ribosyl)-L-arginine ADP-ribosylhydrolase, is an enzyme that in humans is encoded by the ADPRH gene. This enzyme is a specific mono(ADP-ribosyl)hydrolase that catalyses the removal of an ADP-ribosyl modification from target arginine residues of protein substrates. The chemical reactions can formally be described as follows:
N-glycosyltransferases are enzymes in prokaryotes that transfer individual hexoses onto asparagine side chains in substrate proteins, using a nucleotide-bound intermediary, within the cytoplasm. They are distinct from regular N-glycosylating enzymes, which are oligosaccharyltransferases that transfer pre-assembled oligosaccharides. Both enzyme families, however, target a shared amino acid sequence: asparagine, then any amino acid except proline, then serine or threonine (N–x–S/T), with some variations.
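The N–x–S/T motif described above is simple enough to search for with a regular expression. The following is a minimal sketch that ignores the "some variations" mentioned in the text; the function name `find_sequons` and the example sequence are hypothetical and do not correspond to a real protein.

```python
import re

# Asparagine, then any residue except proline, then serine or threonine.
# The lookahead lets overlapping motifs be reported as separate hits.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (0-based position, motif) for each N-x-S/T match in a one-letter sequence."""
    return [(m.start(), m.group(1)) for m in SEQUON.finditer(sequence.upper())]


# Hypothetical sequence: "NPT" at position 5 is skipped because x is proline.
print(find_sequons("MNGSANPTNKT"))  # [(1, 'NGS'), (8, 'NKT')]
```

A match only marks a candidate site; whether the asparagine is actually glycosylated cannot be read from the sequence alone.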