The N-end rule governs the rate of protein degradation through recognition of the N-terminal residue of proteins. The rule states that the N-terminal amino acid of a protein determines its half-life (the time after which half of the total amount of a given polypeptide is degraded). The rule applies to both eukaryotic and prokaryotic organisms, but with different strength, rules, and outcome. [1] In eukaryotic cells, these N-terminal residues are recognized and targeted by ubiquitin ligases, which mediate ubiquitination and thereby mark the protein for degradation. [2] The rule was initially discovered by Alexander Varshavsky and co-workers in 1986. [3] However, only rough estimates of protein half-life can be deduced from this 'rule', as modification of the N-terminal amino acid can lead to variability and anomalies, and the impact of a given amino acid can also differ from organism to organism. Other degradation signals, known as degrons, can also be found within the protein sequence.
The rule may operate differently in different organisms.
N-terminal residues - approximate half-life of proteins for S. cerevisiae [3]
N-terminal residues - approximate half-life of proteins in mammalian systems [4]
In Escherichia coli, positively charged and some aliphatic and aromatic residues at the N-terminus, such as arginine, lysine, leucine, phenylalanine, tyrosine, and tryptophan, confer short half-lives of around 2 minutes, and proteins bearing them are rapidly degraded. [5] These residues (when located at the N-terminus of a protein) are referred to as destabilising residues. In bacteria, destabilising residues can be further defined as primary destabilising residues (leucine, phenylalanine, tyrosine, and tryptophan) or secondary destabilising residues (arginine, lysine and, in a special case, methionine [6] ). Secondary destabilising residues are modified by the attachment of a primary destabilising residue by the enzyme leucyl/phenylalanyl-tRNA-protein transferase. [5] [6] All other amino acids, when located at the N-terminus of a protein, are referred to as stabilising residues and confer half-lives of more than 10 hours. [5] Proteins bearing an N-terminal primary destabilising residue are specifically recognised by the bacterial N-recognin (recognition component) ClpS. [7] [8] ClpS acts as a specific adaptor protein for the ATP-dependent AAA+ protease ClpAP, and hence ClpS delivers N-degron substrates to ClpAP for degradation.
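The bacterial classification described above can be sketched in a few lines of Python. This is only an illustration of the categories named in the text, not a predictor of real half-lives: the function name and the use of one-letter amino acid codes are assumptions made for the example, and the special case of methionine as a secondary destabilising residue is deliberately not modelled.

```python
# Residue sets taken from the paragraph above, written as one-letter codes.
PRIMARY_DESTABILISING = frozenset("LFYW")   # Leu, Phe, Tyr, Trp
SECONDARY_DESTABILISING = frozenset("RK")   # Arg, Lys (Met special case omitted)


def classify_n_terminal_residue(sequence: str) -> str:
    """Classify the first residue of a protein under the E. coli N-end rule.

    Hypothetical helper: it looks only at the N-terminal residue and
    ignores every other degradation signal in the protein.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    residue = sequence[0].upper()
    if residue in PRIMARY_DESTABILISING:
        # Recognised directly by ClpS and delivered to ClpAP.
        return "primary destabilising"
    if residue in SECONDARY_DESTABILISING:
        # First receives a primary destabilising residue from
        # leucyl/phenylalanyl-tRNA-protein transferase.
        return "secondary destabilising"
    return "stabilising"
```

For example, `classify_n_terminal_residue("LAKT")` returns `"primary destabilising"`, whereas a sequence beginning with alanine is classed as `"stabilising"`.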
A complicating issue is that bacterial proteins are normally synthesised with an N-terminal formylmethionine (f-Met). The formyl group of this methionine is quickly removed, and the methionine itself is then removed by methionyl aminopeptidase. The removal of the methionine is more efficient when the second residue is small and uncharged (for example alanine), but inefficient when it is bulky and charged, such as arginine. Once the f-Met is removed, the second residue becomes the N-terminal residue and is subject to the N-end rule. Proteins with a residue with a middle-sized side-chain, such as leucine, in the second position may therefore have a short half-life. [9]
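The processing step described above can be illustrated with a deliberately simplified sketch. It treats methionine removal as all-or-nothing, whereas the text describes a graded efficiency, and the default set of second residues that permit removal contains only alanine, the single example given above. The function name and the parameter `permits_removal` are hypothetical.

```python
def exposed_n_terminal_residue(sequence: str,
                               permits_removal: frozenset = frozenset("A")) -> str:
    """Return the residue left at the N-terminus after initiator Met processing.

    Simplifying assumption: methionyl aminopeptidase either removes the
    initiator methionine completely or not at all, depending only on
    whether the second residue is in `permits_removal`.
    """
    seq = sequence.upper()
    if len(seq) < 2 or seq[0] != "M":
        raise ValueError("expected a sequence of at least two residues starting with M")
    second = seq[1]
    if second in permits_removal:
        # Methionine is cleaved off; the second residue is now exposed
        # and becomes subject to the N-end rule.
        return second
    # Methionine is retained as the N-terminal residue.
    return "M"
```

Under these assumptions, `exposed_n_terminal_residue("MAKT")` returns `"A"` and `exposed_n_terminal_residue("MRKT")` returns `"M"`. A caller who wants to explore intermediate cases such as leucine can pass a broader `permits_removal` set.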
There are several reasons to think that the N-end rule also functions in the chloroplasts of plant cells. [10] The first piece of evidence comes from the endosymbiotic theory, which holds that chloroplasts are derived from cyanobacteria, photosynthetic organisms that can convert light into energy. [11] [12] It is thought that the chloroplast developed from an endosymbiosis between a eukaryotic cell and a cyanobacterium, because chloroplasts share several features with the bacterium, including photosynthetic capabilities. [11] [12] The bacterial N-end rule is already well documented; it involves the Clp protease system, which consists of the adaptor protein ClpS and the ClpA/P chaperone and protease core. [5] [7] [13] A similar Clp system is present in the chloroplast stroma, suggesting that the N-end rule might function similarly in chloroplasts and bacteria. [10] [14]
Additionally, a 2013 study in Arabidopsis thaliana revealed the protein ClpS1, a possible plastid homolog of the bacterial ClpS recognin. [15] ClpS is a bacterial adaptor protein that is responsible for recognizing protein substrates via their N-terminal residues and delivering them to a protease core for degradation. [7] This study suggests that ClpS1 is functionally similar to ClpS, also playing a role in substrate recognition via specific N-terminal residues (degrons) like its bacterial counterpart. [15] It is posited that upon recognition, ClpS1 binds to these substrate proteins and brings them to the ClpC chaperone of the protease core machinery to initiate degradation. [15]
In another study, Arabidopsis thaliana stromal proteins were analyzed to determine the relative abundance of specific N-terminal residues. [16] This study revealed that alanine, serine, threonine, and valine were the most abundant N-terminal residues, while leucine, phenylalanine, tryptophan, and tyrosine (all triggers for degradation in bacteria) were among the residues that were rarely detected. [16]
Furthermore, an affinity assay using ClpS1 and N-terminal residues was performed to determine whether ClpS1 did indeed have specific binding partners. [17] This study revealed that phenylalanine and tryptophan bind specifically to ClpS1, making them prime candidates for N-degrons in chloroplasts. [17]
Further research is currently being conducted to confirm whether the N-end rule operates in chloroplasts. [10] [17]
An apicoplast is a derived non-photosynthetic plastid found in most Apicomplexa, including Toxoplasma gondii, Plasmodium falciparum and other Plasmodium spp. (parasites causing malaria). Similar to plants, several apicomplexan species, including Plasmodium falciparum, contain all of the necessary components [18] [19] required for an apicoplast-localized Clp protease, including a potential homolog of the bacterial ClpS N-recognin. [20] [21] In vitro data demonstrate that Plasmodium falciparum ClpS is able to recognize a variety of N-terminal primary destabilizing residues: not only the classic bacterial primary destabilizing residues (leucine, phenylalanine, tyrosine and tryptophan) but also N-terminal isoleucine. It hence exhibits broad specificity in comparison to its bacterial counterpart. [21]
Chymotrypsin (EC 3.4.21.1, chymotrypsins A and B, alpha-chymar ophth, avazyme, chymar, chymotest, enzeon, quimar, quimotrase, alpha-chymar, alpha-chymotrypsin A, alpha-chymotrypsin) is a digestive enzyme component of pancreatic juice acting in the duodenum, where it performs proteolysis, the breakdown of proteins and polypeptides. Chymotrypsin preferentially cleaves peptide amide bonds where the side chain of the amino acid N-terminal to the scissile amide bond (the P1 position) is a large hydrophobic amino acid (tyrosine, tryptophan, and phenylalanine). These amino acids contain an aromatic ring in their side chain that fits into a hydrophobic pocket (the S1 position) of the enzyme. It is activated in the presence of trypsin. The hydrophobic and shape complementarity between the peptide substrate P1 side chain and the enzyme S1 binding cavity accounts for the substrate specificity of this enzyme. Chymotrypsin also hydrolyzes other amide bonds in peptides at slower rates, particularly those containing leucine at the P1 position.
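The P1 preference described above can be illustrated with a minimal in-silico digestion sketch. It cuts after every phenylalanine, tyrosine or tryptophan and ignores everything else that influences real cleavage (neighbouring residues, kinetics, substrate folding). The function name and the `p1_residues` parameter are hypothetical; passing `frozenset("FYWL")` adds the slower leucine sites mentioned in the text.

```python
def chymotrypsin_fragments(sequence: str,
                           p1_residues: frozenset = frozenset("FYW")) -> list:
    """Split a peptide after each residue that occupies the P1 position.

    Simplified model: every listed residue is cleaved completely, and the
    scissile bond is the one on its C-terminal side.
    """
    if not sequence:
        return []
    fragments = []
    start = 0
    last_index = len(sequence) - 1
    for i, residue in enumerate(sequence.upper()):
        # A P1 residue at the very end has no bond after it to cleave.
        if residue in p1_residues and i < last_index:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments
```

With the default settings, `chymotrypsin_fragments("AAFGGYKKW")` returns `["AAF", "GGY", "KKW"]`.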
Proteolysis is the breakdown of proteins into smaller polypeptides or amino acids. Uncatalysed, the hydrolysis of peptide bonds is extremely slow, taking hundreds of years. Proteolysis is typically catalysed by cellular enzymes called proteases, but may also occur by intra-molecular digestion.
A protease is an enzyme that catalyzes proteolysis, breaking down proteins into smaller polypeptides or single amino acids, and spurring the formation of new protein products. They do this by cleaving the peptide bonds within proteins by hydrolysis, a reaction where water breaks bonds. Proteases are involved in numerous biological pathways, including digestion of ingested proteins, protein catabolism, and cell signaling.
Phenylalanine hydroxylase (PAH) (EC 1.14.16.1) is an enzyme that catalyzes the hydroxylation of the aromatic side-chain of phenylalanine to generate tyrosine. PAH is one of three members of the biopterin-dependent aromatic amino acid hydroxylases, a class of monooxygenase that uses tetrahydrobiopterin (BH4, a pteridine cofactor) and a non-heme iron for catalysis. During the reaction, molecular oxygen is heterolytically cleaved with sequential incorporation of one oxygen atom into BH4 and phenylalanine substrate. In humans, mutations in its encoding gene, PAH, can lead to the metabolic disorder phenylketonuria.
The N-terminus (also known as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus) is the start of a protein or polypeptide, referring to the free amine group (-NH2) located at the end of a polypeptide. Within a peptide, the amine group is bonded to the carboxylic group of another amino acid, making it a chain. That leaves a free carboxylic group at one end of the peptide, called the C-terminus, and a free amine group on the other end called the N-terminus. By convention, peptide sequences are written N-terminus to C-terminus, left to right (in LTR writing systems). This correlates the translation direction to the text direction, because when a protein is translated from messenger RNA, it is created from the N-terminus to the C-terminus, as amino acids are added to the carboxyl end of the protein.
A ubiquitin ligase is a protein that recruits an E2 ubiquitin-conjugating enzyme that has been loaded with ubiquitin, recognizes a protein substrate, and assists or directly catalyzes the transfer of ubiquitin from the E2 to the protein substrate. In simple and more general terms, the ligase enables movement of ubiquitin from a ubiquitin carrier to another protein by some mechanism. The ubiquitin, once it reaches its destination, ends up being attached by an isopeptide bond to a lysine residue, which is part of the target protein. E3 ligases interact with both the target protein and the E2 enzyme, and so impart substrate specificity to the E2. Commonly, E3s polyubiquitinate their substrate with Lys48-linked chains of ubiquitin, targeting the substrate for destruction by the proteasome. However, many other types of linkages are possible and alter a protein's activity, interactions, or localization. Ubiquitination by E3 ligases regulates diverse areas such as cell trafficking, DNA repair, and signaling and is of profound importance in cell biology. E3 ligases are also key players in cell cycle control, mediating the degradation of cyclins, as well as cyclin dependent kinase inhibitor proteins. The human genome encodes over 600 putative E3 ligases, allowing for tremendous diversity in substrates.
Serine proteases are enzymes that cleave peptide bonds in proteins. Serine serves as the nucleophilic amino acid at the (enzyme's) active site. They are found ubiquitously in both eukaryotes and prokaryotes. Serine proteases fall into two broad categories based on their structure: chymotrypsin-like (trypsin-like) or subtilisin-like.
Plasmepsins are a class of at least 10 enzymes produced by the Plasmodium falciparum parasite. There are ten different isoforms of these proteins and ten genes coding them respectively in Plasmodium. It has been suggested that the plasmepsin family is smaller in other human Plasmodium species. Expression of Plm I, II, IV, V, IX, X and HAP occurs in the erythrocytic cycle, and expression of Plm VI, VII, VIII, occurs in the exoerythrocytic cycle. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. Consequently, this family of enzymes is a potential target for antimalarial drugs.
N-Formylmethionine is a derivative of the amino acid methionine in which a formyl group has been added to the amino group. It is specifically used for initiation of protein synthesis from bacterial and organellar genes, and may be removed post-translationally.
Alexander J. Varshavsky is a Russian-American biochemist and geneticist. He works at the California Institute of Technology (Caltech) as the Morgan Professor of Biology. Varshavsky left Russia in 1977, emigrating to the United States.
A degron is a portion of a protein that is important in regulation of protein degradation rates. Known degrons include short amino acid sequences, structural motifs and exposed amino acids located anywhere in the protein. Some proteins contain multiple degrons. Degrons are present in a variety of organisms, from the N-degrons first characterized in yeast to the PEST sequence of mouse ornithine decarboxylase. Degrons have been identified in prokaryotes as well as eukaryotes. While there are many different types of degrons, and a high degree of variability even within these groups, all degrons are alike in their involvement in regulating the rate of a protein's degradation. Just as protein degradation mechanisms are categorized by their dependence or lack thereof on ubiquitin, a small protein involved in proteasomal protein degradation, degrons may also be referred to as "ubiquitin-dependent" or "ubiquitin-independent".
The heat shock proteins HslV and HslU are expressed in many bacteria, such as E. coli, in response to cell stress. The HslV protein is a protease and the HslU protein is an ATPase; the two form a symmetric assembly of four stacked rings, consisting of an HslV dodecamer bound to an HslU hexamer, with a central pore in which the protease and ATPase active sites reside. The HslV protein degrades unneeded or damaged proteins only when in complex with the HslU protein in the ATP-bound state. HslV is thought to resemble the hypothetical ancestor of the proteasome, a large protein complex specialized for regulated degradation of unneeded proteins in eukaryotes, many archaea, and a few bacteria. HslV bears high similarity to core subunits of proteasomes.
Carboxypeptidase A usually refers to the pancreatic exopeptidase that hydrolyzes peptide bonds of C-terminal residues with aromatic or aliphatic side-chains. Most scientists in the field now refer to this enzyme as CPA1, and to a related pancreatic carboxypeptidase as CPA2.
In enzymology, arginine kinase is an enzyme that catalyzes the chemical reaction
The human gene UBR1 encodes the enzyme ubiquitin-protein ligase E3 component n-recognin 1.
ATP-dependent Clp protease proteolytic subunit (ClpP) is an enzyme that in humans is encoded by the CLPP gene. This protein is an essential component to form the protein complex of Clp protease.
Johanson–Blizzard syndrome (JBS) is a rare, sometimes fatal autosomal recessive multisystem congenital disorder featuring abnormal development of the pancreas, nose and scalp, with intellectual disability, hearing loss and growth failure. It is sometimes described as a form of ectodermal dysplasia.
ClpS is an N-recognin in the N-end rule pathway. ClpS interacts with protein substrates that have a bulky hydrophobic residue at the N-terminus. The protein substrate is then degraded by the ClpAP protease.
Acyldepsipeptides or cyclic acyldepsipeptides (ADEPs) are a class of potential antibiotics that were first isolated from bacteria and act by deregulating the ClpP protease. Natural ADEPs were originally found as products of aerobic fermentation in Streptomyces hawaiiensis, A54556A and B, and in the culture broth of Streptomyces species, enopeptin A and B. ADEPs are of great interest in drug development due to their antibiotic properties and are therefore being modified in an attempt to achieve greater antimicrobial activity.
Arginylation is a post-translational modification in which proteins are modified by the addition of arginine (Arg) at the N-terminal amino group or side chains of reactive amino acids by the enzyme arginyltransferase (ATE1). Recent studies have also revealed that hundreds of proteins in vivo are arginylated, proteins which are essential for many biological pathways. While still poorly understood in a biological setting, the ATE1 enzyme is highly conserved, which suggests that arginylation is an important biological post-translational modification.