Asparagine peptide lyases are one of the seven groups into which proteases, also termed proteolytic enzymes, peptidases, or proteinases, are classified according to their catalytic residue. In their catalytic mechanism, an asparagine residue acts as the nucleophile in a nucleophilic elimination reaction, rather than hydrolysis, to break a peptide bond. [1]
The existence of this seventh catalytic type of protease, in which peptide bond cleavage occurs by self-processing instead of hydrolysis, was demonstrated when the crystal structure of the self-cleaving precursor of the Tsh autotransporter from E. coli was determined. [2]
These enzymes are synthesized as precursors or propeptides, which cleave themselves by an autoproteolytic reaction. [2]
The self-cleaving nature of asparagine peptide lyases conflicts with the general definition of an enzyme, because the enzymatic activity destroys the enzyme. However, the self-processing is the action of a proteolytic enzyme, even though the enzyme is not recoverable from the reaction. [1]
The proteolytic activity of asparagine peptide lyases is limited to self-cleavage; no further peptidase activity occurs afterwards. [3]
The key active site residue is the asparagine. Other residues are also involved in the catalytic mechanism, and these differ between the families of asparagine peptide lyases. [2] [4] [5]
The cleavage mechanism consists of the cyclization of the asparagine, assisted by other active site residues. Under certain conditions, the cyclizing asparagine nucleophilically attacks its C-terminal peptide bond to the main chain, forming a new bond that creates a stable succinimide. This cleaves the asparagine from the main chain and consequently releases the two halves of the product. [6] [7]
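The outcome of the cleavage described above can be illustrated at the sequence level. The following is a minimal sketch, not a chemical model: the function name `self_cleave`, the one-letter sequence and the position used are hypothetical, and the code only shows that the bond broken is the one on the C-terminal side of the catalytic asparagine, so that the asparagine (as a succinimide) ends the first fragment.

```python
from typing import Tuple


def self_cleave(sequence: str, asn_position: int) -> Tuple[str, str]:
    """Split a one-letter protein sequence at the peptide bond that lies
    C-terminal to the asparagine at the 1-based position `asn_position`.

    Hypothetical helper for illustration only; no chemistry is modelled.
    """
    # The asparagine needs at least one residue after it, otherwise
    # there is no C-terminal peptide bond to cleave.
    if not 1 <= asn_position < len(sequence):
        raise ValueError("no peptide bond C-terminal to this position")
    if sequence[asn_position - 1] != "N":
        raise ValueError("residue at this position is not asparagine (N)")
    # First fragment ends with the asparagine (succinimide in the real
    # reaction); second fragment starts with the following residue.
    return sequence[:asn_position], sequence[asn_position:]


# Toy sequence, assumed for the example only.
n_fragment, c_fragment = self_cleave("MKTNGAV", 4)
print(n_fragment, c_fragment)
```

Because the reaction is a single self-cleavage, the sketch returns exactly two fragments and has no notion of turnover.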
No inhibitors are known. [3]
The MEROPS protease database includes the following ten families of asparagine peptide lyases, which belong to six different clans of proteases. [3]
Proteolytic enzymes are classified into families based on sequence similarity. Each family includes proteolytic enzymes with homologous sequences and a common catalytic type. Clans are groups of proteolytic enzyme families with related structures, in which the catalytic type is not necessarily conserved.
Clan | Family | MEROPS ID | Peptidases and homologues | NC-IUBMB | PDB ID |
---|---|---|---|---|---|
NA | N1 | N01.001 | nodavirus coat protein | 3.4.23.44 | 2BBV |
NA | N1 | unassigned | family N1 unassigned peptide lyases | * | - |
NA | N2 | N02.001 | tetravirus coat protein | * | 1OHF |
NA | N2 | non-peptidase homologue | family N2 non-lyase homologues | * | - |
NA | N2 | unassigned | family N2 unassigned peptide lyases | * | - |
NA | N8 | N08.001 | picornavirus capsid VP0-type self-cleaving protein | * | 1NCQ |
NA | N8 | non-peptidase homologue | family N8 non-lyase homologues | * | - |
NA | N8 | unassigned | family N8 unassigned peptide lyases | * | - |
NB | N6 | N06.001 | YscU protein (Yersinia pseudotuberculosis) | * | 2JLJ |
NB | N6 | N06.002 | SpaS protein (Salmonella sp.) | * | 3C01; 2VT1 |
NB | N6 | N06.003 | EscU protein (Escherichia coli) | * | 3BZO |
NB | N6 | N06.004 | HrcU protein (Xanthomonas sp.) | * | - |
NB | N6 | N06.A01 | FlhB protein (Escherichia coli) | * | - |
NB | N6 | non-peptidase homologue | family N6 non-lyase homologues | * | - |
NB | N6 | unassigned | family N6 unassigned peptide lyases | * | - |
NC | N7 | N07.001 | reovirus type 1 coat protein | * | 1JMU |
NC | N7 | N07.002 | aquareovirus coat protein | * | - |
NC | N7 | unassigned | family N7 unassigned peptide lyases | * | - |
ND | N4 | N04.001 | Tsh-associated self-cleaving domain (Escherichia coli) and similar | * | 3AEH |
ND | N4 | N04.002 | EspP gamma protein autotransporter domain (Escherichi-type) | * | 2QOM |
ND | N4 | non-peptidase homologue | family N4 non-lyase homologues | * | - |
ND | N4 | unassigned | family N4 unassigned peptide lyases | * | - |
NE | N5 | N05.001 | picobirnavirus self-cleaving protein | * | 2VF1 |
NE | N5 | unassigned | family N5 unassigned peptide lyases | * | - |
PD | N9 | N09.001 | intein-containing V-type proton ATPase catalytic subunit A | 3.6.3.14 | 1VDE |
PD | N9 | non-peptidase homologue | family N9 non-lyase homologues | * | - |
PD | N9 | unassigned | family N9 unassigned peptide lyases | * | - |
PD | N10 | N10.001 | intein-containing DNA gyrase subunit A precursor | * | - |
PD | N10 | N10.002 | intein-containing replicative DNA helicase precursor | * | 1MI8 |
PD | N10 | N10.003 | intein-containing DNA polymerase III subunit alpha precursor | 2.7.7.7 | 2KEQ |
PD | N10 | N10.004 | intein-containing translation initiation factor IF-2 precursor | | - |
PD | N10 | N10.005 | intein-containing DNA polymerase II large subunit DP2 precursor Mername-AA281 | * | - |
PD | N10 | N10.006 | intein-containing DNA polymerase II large subunit DP2 precursor Mername-AA282 | 2.7.7.7 | - |
PD | N10 | N10.007 | intein-containing DNA-dependent DNA polymerase precursor | * | 2CW7; 2CW8 |
PD | N10 | N10.008 | intein-containing DNA gyrase subunit A (Mycobacterium xenopi) | * | 1AM2; 4OZ6 |
PD | N10 | N10.009 | Mtu recA intein (Mycobacterium sp.) | * | 2IN9 |
PD | N10 | non-peptidase homologue | family N10 non-lyase homologues | * | - |
PD | N10 | unassigned | family N10 unassigned peptide lyases | * | - |
PD | N11 | N11.001 | intein-containing chloroplast ATP-dependent peptide lyase | * | - |
PD | N11 | non-peptidase homologue | family N11 non-lyase homologues | * | - |
PD | N11 | unassigned | family N11 unassigned peptide lyases | * | - |
*Not yet included in IUBMB recommendations.
The ten different families of asparagine peptide lyases are distributed in three different types:
There are five families of viral coat proteins (N1, N2, N8, N7 and N5), two families of autotransporter proteins (N6 and N4) and three families of intein-containing proteins (N9, N10 and N11).
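The grouping of families into clans and into the three functional types can be written down as a small lookup structure. The sketch below only restates the assignments given in this article; the dictionary and function names (`CLAN_TO_FAMILIES`, `TYPE_TO_FAMILIES`, `clan_of`, `type_of`) are hypothetical and are not part of MEROPS or any library.

```python
# Clan and type assignments as listed in this article.
CLAN_TO_FAMILIES = {
    "NA": ["N1", "N2", "N8"],
    "NB": ["N6"],
    "NC": ["N7"],
    "ND": ["N4"],
    "NE": ["N5"],
    "PD": ["N9", "N10", "N11"],
}

TYPE_TO_FAMILIES = {
    "viral coat protein": ["N1", "N2", "N8", "N7", "N5"],
    "autotransporter protein": ["N6", "N4"],
    "intein-containing protein": ["N9", "N10", "N11"],
}


def _lookup(groups: dict, family: str) -> str:
    """Return the key of `groups` whose family list contains `family`."""
    for group, families in groups.items():
        if family in families:
            return group
    raise KeyError(f"unknown family: {family}")


def clan_of(family: str) -> str:
    return _lookup(CLAN_TO_FAMILIES, family)


def type_of(family: str) -> str:
    return _lookup(TYPE_TO_FAMILIES, family)


all_families = sorted(f for fams in CLAN_TO_FAMILIES.values() for f in fams)
print(len(all_families), "families in", len(CLAN_TO_FAMILIES), "clans")
print("N7:", clan_of("N7"), "/", type_of("N7"))
```

Keeping the two groupings separate reflects that clan membership and functional type are independent axes: clan NA, for example, contains only viral coat protein families, but viral coat protein families are also found in clans NC and NE.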
There are five families of viral coat proteins in which processing occurs at an asparagine residue. These five families are included in three clans: clan NA (families N1, N2 and N8), clan NC (family N7) and clan NE (family N5). [8]
Family N1: The known autolytic cleavage, mediated by the nodavirus endopeptidase, is from the C-terminus of the coat protein and occurs only within the assembled virion. [9]
Family N2: Includes tetravirus endopeptidases. The known autolytic cleavage is from the C-terminus of the coat protein and occurs during the late stages of virion assembly. [10]
Family N8: The known autolytic cleavage is that of the poliovirus capsid protein VP0 into VP2 and VP4 in the provirion. [11]
Family N7: The known autolytic cleavage is from the N-terminus of the coat protein. [12]
Family N5: The known autolytic cleavage is from the N-terminus of the coat protein. [13]
Autotransporter proteins are outer membrane or secreted proteins found in a broad variety of Gram-negative bacteria. These proteins contain three structural motifs: a signal sequence, a passenger domain located at the N-terminus, and a translocator or autotransporter domain located at the C-terminus, which forms a beta barrel structure. Together, these structures enable the protein to transport itself. Autotransporter proteins are usually related to virulence functions. This role, together with their interaction with host cells and the broad occurrence of autotransporter-encoding genes, makes them possible therapeutic targets for the design of vaccines against Gram-negative pathogens. [14]
Two of the families in which the MEROPS database classifies asparagine peptide lyases are autotransporter proteins, families N4 and N6. [3]
Family N4 includes secreted virulence factors, or autotransporters, from enterobacteria. Their only proteolytic activity is releasing the virulence factor from the precursor, enabling it to be secreted. The active site residues in family N4 asparagine peptide lyases are N1100, Y1227, E1249 and R1282.
Family N6 includes autoprocessing endopeptidases involved in the type III protein secretion system, in which autoproteolysis is essential for mediating the secretion of proteins. The type III secretion system delivers proteins directly into host cells through an injectisome, a hollow tubular structure that penetrates the host cell. Secreted proteins pass through the injectisome into the host cell cytoplasm. The conserved active site residue in family N6 asparagine peptide lyases is N263.
An intein is a protein contained within another protein, the extein. Parasitic DNA infects an intein gene, which encodes an endonuclease. The resulting cDNA (complementary DNA) encodes the extein along with the intein. The intein contains a self-cleaving domain, which has the endonuclease nested within it. The intein domain performs two proteolytic cleavages, at its own N-terminus and C-terminus, and is released from the extein, leaving the extein as two fragments. These two fragments are then spliced together, and the extein remains as a completely functional protein.
The N-terminal residue of the intein domain must be a serine, threonine or cysteine, and it attacks its preceding peptide bond to form an ester or a thioester. The first residue of the second portion of the extein must also be a serine, threonine or cysteine, and this second nucleophile forms a branched intermediate. The C-terminal residue of the intein domain is always an asparagine, which cyclizes to form a succinimide, cleaving its own peptide bond and releasing the intein from the extein. Finally, the ester or thioester bond in the extein is rearranged to form a normal peptide bond. [15]
There are three known families of intein-containing proteins (N9, N10 and N11), all of them included in clan PD, which contains proteolytic enzymes of different catalytic types. The tertiary structure has been solved for the intein-containing V-type proton ATPase catalytic subunit A (Saccharomyces cerevisiae), a member of family N9, and for several inteins from family N10.
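The residue requirements described above can be expressed as a simple check on a precursor split into its three parts. This is a minimal sketch under stated assumptions: the function name `splice`, the constant `NUCLEOPHILES` and the toy sequences are hypothetical, sequences use one-letter amino acid codes, and the intermediate chemical steps (ester or thioester, branched intermediate, succinimide) are not modelled; only the preconditions and the final spliced product are shown.

```python
# Residues allowed as the nucleophile: serine, threonine or cysteine.
NUCLEOPHILES = {"S", "T", "C"}


def splice(n_extein: str, intein: str, c_extein: str) -> str:
    """Return the spliced extein if the precursor satisfies the residue
    rules for intein splicing, otherwise raise ValueError.

    Hypothetical helper for illustration only.
    """
    if not intein or not c_extein:
        raise ValueError("intein and C-terminal extein must be non-empty")
    # First residue of the intein attacks the preceding peptide bond.
    if intein[0] not in NUCLEOPHILES:
        raise ValueError("intein must start with S, T or C")
    # Last residue of the intein cyclizes to a succinimide.
    if intein[-1] != "N":
        raise ValueError("intein must end with asparagine (N)")
    # First residue of the second extein portion is the second nucleophile.
    if c_extein[0] not in NUCLEOPHILES:
        raise ValueError("C-terminal extein must start with S, T or C")
    # The intein is excised and the two extein fragments are joined.
    return n_extein + c_extein


# Toy precursor, assumed for the example only.
print(splice("MKV", "CLSFGN", "SAGE"))
```

The check is deliberately strict about the terminal asparagine, following the statement above that the C-terminal residue of the intein domain is always an asparagine.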