MEROPS

Content
Description: MEROPS: the database of proteolytic enzymes, their substrates and inhibitors.
Contact
Research center: The Wellcome Trust Sanger Institute
Authors: Neil D Rawlings
Primary citation: Rawlings et al. (2012) [1]
Release date: 2010
Access
Website: https://www.ebi.ac.uk/merops/

MEROPS is an online database for peptidases (also known as proteases, proteinases and proteolytic enzymes) and their inhibitors. [2] The classification scheme for peptidases was published by Rawlings & Barrett in 1993, [3] and that for protein inhibitors by Rawlings et al. in 2004. [4] The most recent version, MEROPS 12.4, was released in late October 2021. [5]

Overview

The classification is based on similarities at the tertiary and primary structural levels. Comparisons are restricted to the part of the sequence directly involved in the reaction, which for a peptidase must include the active site and for a protein inhibitor must include the reactive site. The classification is hierarchical: sequences are assembled into families, and families are assembled into clans. Each peptidase, family, and clan has a unique identifier.

Classification

Family

The families of peptidases are constructed by comparing amino acid sequences. A family is assembled around a type example, the sequence of a well-characterized peptidase or inhibitor. Every other sequence in the family must be related to the family type example, either directly or through a transitive relationship involving one or more sequences already shown to be family members. Typically, FASTA or BLASTP is used to establish sequence relationships, with an expect value of 0.001 or lower taken to be statistically significant. HMMER or PSI-BLAST searches are used to add sequences that are only distantly related to a family. Each family is identified by a letter representing the catalytic type of the peptidases it contains, followed by an arbitrary unique number.
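The transitive membership rule described above can be illustrated with a short sketch. It is not the MEROPS pipeline: the input format (tuples of two sequence names and an expect value, as might be parsed from FASTA or BLASTP output), the function name `assemble_family`, and the sequence names are all hypothetical. Only the 0.001 cutoff comes from the text.

```python
from collections import deque

E_VALUE_CUTOFF = 0.001  # significance threshold quoted in the text


def assemble_family(type_example, hits, cutoff=E_VALUE_CUTOFF):
    """Return every sequence linked to the type example by significant hits.

    hits is an iterable of (seq_a, seq_b, e_value) tuples; this input
    format is an assumption made for the sketch.
    """
    # Keep only pairs whose expect value is at or below the cutoff.
    neighbours = {}
    for seq_a, seq_b, e_value in hits:
        if e_value <= cutoff:
            neighbours.setdefault(seq_a, set()).add(seq_b)
            neighbours.setdefault(seq_b, set()).add(seq_a)

    # Breadth-first search: a sequence joins the family if it is related
    # to the type example directly or via sequences already accepted.
    members = {type_example}
    queue = deque([type_example])
    while queue:
        current = queue.popleft()
        for other in neighbours.get(current, ()):
            if other not in members:
                members.add(other)
                queue.append(other)
    return members


# Made-up hits: seq2 reaches the type example only through seq1,
# and seq3 has no significant hit at all.
example_hits = [
    ("type_example", "seq1", 1e-40),
    ("seq1", "seq2", 5e-4),
    ("type_example", "seq2", 0.5),
    ("seq3", "type_example", 0.02),
]
print(sorted(assemble_family("type_example", example_hits)))
```

Here `seq2` is accepted even though its direct comparison with the type example is not significant, because it is significantly related to `seq1`, which is already a member.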

Some families are divided into subfamilies due to evidence of very ancient divergence within the family. The divergence corresponds to more than 150 accepted point mutations per 100 amino acid residues.

Clan

Similarities in three-dimensional structure provide evidence that many families share common ancestry with others. The term "clan" is used to describe such a group of families. A clan is also assembled around a type example, in this case the structure of a well-characterized peptidase or inhibitor. A family is included in a clan if the tertiary structure of a family member can be shown to be related to that of the clan type example. Typically, DALI is used to establish clan membership, with a z-score of 6.00 standard deviation units or above considered statistically significant. For peptidases, when no tertiary structure is available, other evidence that families are related includes the same order of catalytic residues in the sequences.
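The clan membership criterion can be sketched in the same spirit. The example below assumes that structural comparisons against the clan type example have already been run and are available as (family, structure, z-score) tuples; that input format, the function name `families_in_clan`, and the placeholder family names are hypothetical. Only the cutoff of 6.00 comes from the text.

```python
Z_SCORE_CUTOFF = 6.0  # significance threshold quoted in the text


def families_in_clan(comparisons, cutoff=Z_SCORE_CUTOFF):
    """Return the families with at least one structure related to the clan type example.

    comparisons is an iterable of (family, structure, z_score) tuples,
    each giving the score of one family member's structure against the
    clan type example (an assumed, precomputed input).
    """
    # Track the best score seen for each family: one significant
    # member structure is enough to place the family in the clan.
    best = {}
    for family, _structure, z_score in comparisons:
        best[family] = max(z_score, best.get(family, float("-inf")))
    return sorted(family for family, z in best.items() if z >= cutoff)


# Placeholder families and made-up scores, for illustration only.
example_comparisons = [
    ("X1", "structure_a", 12.3),
    ("X2", "structure_b", 4.1),
    ("X2", "structure_c", 6.0),
    ("X3", "structure_d", 5.9),
]
print(families_in_clan(example_comparisons))
```

Family `X2` qualifies through its second structure, while `X3` falls just below the cutoff and is left out.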

Identifier

Each family, clan, peptidase, and inhibitor has a unique identifier. The table below describes each identifier type and gives an example.

Type | Identifier description | Identifier example
Family | Letter indicating the catalytic type followed by a serial number of up to two digits | A1
Clan | The first letter indicating the catalytic type followed by a serial letter | AA
Peptidase | Family name (padded with zeros if necessary to make it three characters long), a dot and a three-digit serial number | A01.001
Inhibitor | The initial 'I' followed by family name (padded with zeros), a dot and three-digit serial number | I10.001
Compound inhibitor | The initial 'L' is for any compound inhibitor. 'L' is followed by the family name of the inhibitor units (padded to three characters). Following the dash, there is a three-digit serial number | LI01-001
Small-molecule inhibitor | Each SMI is assigned an identifier consisting of an initial J followed by a five-digit number | J00095
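
The identifier formats in the table can be expressed as regular expressions. The sketch below is derived only from the descriptions and examples above, not from any official MEROPS specification; the function names `classify_identifier` and `peptidase_identifier` are hypothetical, and the patterns accept any capital letter as a catalytic type because the table does not list the permitted letters.

```python
import re

# Patterns derived from the table; order matters because an inhibitor
# identifier such as I10.001 also fits the general peptidase shape.
IDENTIFIER_PATTERNS = [
    ("compound inhibitor", r"LI\d{2}-\d{3}"),    # e.g. LI01-001
    ("small-molecule inhibitor", r"J\d{5}"),     # e.g. J00095
    ("inhibitor", r"I\d{2}\.\d{3}"),             # e.g. I10.001
    ("peptidase", r"[A-Z]\d{2}\.\d{3}"),         # e.g. A01.001
    ("clan", r"[A-Z]{2}"),                       # e.g. AA
    ("family", r"[A-Z]\d{1,2}"),                 # e.g. A1
]


def classify_identifier(identifier):
    """Return the identifier type named in the table, or None if nothing matches."""
    for kind, pattern in IDENTIFIER_PATTERNS:
        if re.fullmatch(pattern, identifier):
            return kind
    return None


def peptidase_identifier(family, serial):
    """Build a peptidase identifier from a family name and a serial number.

    The family number is zero-padded so the family part is three
    characters long, then a dot and a three-digit serial are appended.
    """
    letter, number = family[0], family[1:]
    return f"{letter}{number.zfill(2)}.{serial:03d}"


for example in ["A1", "AA", "A01.001", "I10.001", "LI01-001", "J00095"]:
    print(example, "->", classify_identifier(example))
print(peptidase_identifier("A1", 1))
```

Checking the inhibitor pattern before the peptidase pattern is a design choice of this sketch: both share the "three characters, dot, three digits" shape, and only the leading 'I' distinguishes them.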


References

  1. Rawlings ND, Barrett AJ, Bateman A (January 2012). "MEROPS: the database of proteolytic enzymes, their substrates and inhibitors". Nucleic Acids Res. 40 (Database issue): D343–50. doi:10.1093/nar/gkr987. PMC 3245014. PMID 22086950.
  2. Rawlings ND, Barrett AJ, Bateman A (January 2010). "MEROPS: the peptidase database". Nucleic Acids Res. 38 (Database issue): D227–33. doi:10.1093/nar/gkp971. PMC 2808883. PMID 19892822.
  3. Rawlings ND, Barrett AJ (1993). "Evolutionary families of peptidases". Biochem J. 290: 205–218.
  4. Rawlings ND, Tolle DP, Barrett AJ (2004). "Evolutionary families of peptidase inhibitors". Biochem J. 378: 705–716.
  5. "What's New in MEROPS?".