MEROPS

MEROPS
Content
Description: MEROPS: the database of proteolytic enzymes, their substrates and inhibitors
Contact
Research center: The Wellcome Trust Sanger Institute
Authors: Neil D Rawlings
Primary citation: Rawlings et al. (2012) [1]
Release date: 2010
Access
Website: https://www.ebi.ac.uk/merops/

MEROPS is an online database for peptidases (also known as proteases, proteinases and proteolytic enzymes) and their inhibitors. [2] The classification scheme for peptidases was published by Rawlings & Barrett in 1993, [3] and that for protein inhibitors by Rawlings et al. in 2004. [4] The most recent version, MEROPS 12.4, was released in late October 2021. [5]

Overview

The classification is based on similarities at the tertiary and primary structural levels. Comparisons are restricted to that part of the sequence directly involved in the reaction, which in the case of a peptidase must include the active site, and for a protein inhibitor the reactive site. The classification is hierarchical: sequences are assembled into families, and families are assembled into clans. Each peptidase, family, and clan has a unique identifier.

Classification

Family

The families of peptidases are constructed by comparing amino acid sequences. A family is assembled around a type example, the sequence of a well-characterized peptidase or inhibitor. Every other sequence in the family must be related to the family type example, either directly or through a transitive relationship involving one or more sequences already shown to be family members. Typically, FASTA or BLASTP is used to establish sequence relationships, with an expect value of 0.001 or lower taken to be statistically significant. HMMER or PSI-BLAST searches are used to add sequences that are only distantly related to a family. Each family is identified by a letter representing the catalytic type of the peptidases it contains, followed by an arbitrary unique number.
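The membership rule above (related to the type example directly, or transitively through sequences already in the family) amounts to a graph traversal. The following is a minimal sketch of that idea, not the MEROPS pipeline itself: the function name `assemble_family`, the `hits` tuple format, and the sequence labels are all hypothetical, and only the 0.001 expect-value cutoff comes from the text.

```python
from collections import deque

E_VALUE_CUTOFF = 0.001  # expect-value threshold quoted in the text


def assemble_family(type_example, hits, cutoff=E_VALUE_CUTOFF):
    """Collect every sequence linked to the type example by significant hits.

    `hits` is a hypothetical iterable of (seq_a, seq_b, e_value) tuples,
    for instance parsed from pairwise search output.
    """
    # Keep only statistically significant relationships, as an undirected graph.
    neighbours = {}
    for seq_a, seq_b, e_value in hits:
        if e_value <= cutoff:
            neighbours.setdefault(seq_a, set()).add(seq_b)
            neighbours.setdefault(seq_b, set()).add(seq_a)

    # Breadth-first search outward from the type example.
    family = {type_example}
    queue = deque([type_example])
    while queue:
        current = queue.popleft()
        for other in neighbours.get(current, ()):
            if other not in family:
                family.add(other)
                queue.append(other)
    return family


# Illustrative (made-up) hits: seqB joins only through seqA; seqC never joins.
hits = [
    ("type", "seqA", 1e-30),
    ("seqA", "seqB", 5e-4),
    ("type", "seqB", 0.2),
    ("seqC", "type", 0.05),
]
family = assemble_family("type", hits)
```

Here `seqB` has no significant direct hit to the type example but is admitted through `seqA`, which is the transitive case the paragraph describes.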

Some families are divided into subfamilies due to evidence of very ancient divergence within the family. The divergence corresponds to more than 150 accepted point mutations per 100 amino acid residues.

Clan

Similarity in three-dimensional structure provides evidence that many families share common ancestry with others, and such a group of families is called a "clan". A clan is also assembled around a type example, in this case the structure of a well-characterized peptidase or inhibitor. A family is included in a clan if the tertiary structure of a family member can be shown to be related to that of the clan type example. Typically, DALI is used to establish clan membership, with a Z-score of 6.00 standard deviation units or above considered statistically significant. For peptidases, when no tertiary structure is available, other evidence that families are related includes the same order of catalytic residues in the sequences.
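The clan rule can be sketched in the same spirit. The example below assumes that structural comparison scores have already been computed and collected into a dictionary; the input layout, the function name `assign_clans`, and the labels are hypothetical. Choosing the highest-scoring clan when several pass the cutoff is an assumption made for this illustration and is not stated in the text.

```python
Z_SCORE_CUTOFF = 6.0  # significance threshold quoted in the text


def assign_clans(family_z_scores, cutoff=Z_SCORE_CUTOFF):
    """Map each family to a clan, or to None if no comparison is significant.

    `family_z_scores` is a hypothetical mapping
    {family: {clan: best Z-score of any family member vs. the clan type example}}.
    """
    assignments = {}
    for family, scores in family_z_scores.items():
        significant = {clan: z for clan, z in scores.items() if z >= cutoff}
        # Assumption: if several clans pass, take the best-scoring one.
        assignments[family] = (
            max(significant, key=significant.get) if significant else None
        )
    return assignments


# Made-up scores for two placeholder families and two placeholder clans.
scores = {
    "family_1": {"clan_X": 14.2, "clan_Y": 3.1},
    "family_2": {"clan_X": 2.0, "clan_Y": 5.9},
}
clans = assign_clans(scores)
```

With these made-up numbers, `family_1` is placed in `clan_X`, while `family_2` stays unassigned because its best score falls just under the cutoff.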

Identifier

Each family, clan, peptidase, and inhibitor has a unique identifier. The table below describes each identifier type and gives an example.

| Type | Identifier description | Identifier example |
| --- | --- | --- |
| Family | Letter indicating the catalytic type followed by a serial number of up to two digits | A1 |
| Clan | Letter indicating the catalytic type followed by a serial letter | AA |
| Peptidase | Family name (padded with zeros if necessary to make it three characters long), a dot, and a three-digit serial number | A01.001 |
| Inhibitor | The initial 'I' followed by the family name (padded with zeros), a dot, and a three-digit serial number | I10.001 |
| Compound inhibitor | The initial 'L', used for any compound inhibitor, followed by the family name of the inhibitor units (padded to three characters), a dash, and a three-digit serial number | LI01-001 |
| Small-molecule inhibitor | The initial 'J' followed by a five-digit number | J00095 |
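The identifier formats in the table can be checked mechanically. The sketch below uses regular expressions inferred only from the descriptions and examples in the table; it is not an official MEROPS validator, and the names `classify_identifier` and `pad_family_name` are hypothetical. Identifier shapes the table does not mention would need extra patterns.

```python
import re

# Patterns inferred from the table above; order matters because the
# inhibitor pattern is a special case of the peptidase pattern.
PATTERNS = [
    ("small-molecule inhibitor", re.compile(r"J\d{5}")),
    ("compound inhibitor", re.compile(r"LI\d{2}-\d{3}")),
    ("inhibitor", re.compile(r"I\d{2}\.\d{3}")),
    ("peptidase", re.compile(r"[A-Z]\d{2}\.\d{3}")),
    ("clan", re.compile(r"[A-Z]{2}")),
    ("family", re.compile(r"[A-Z]\d{1,2}")),
]


def classify_identifier(identifier):
    """Return the identifier type from the table, or None if nothing matches."""
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(identifier):
            return kind
    return None


def pad_family_name(family):
    """Pad a family name such as 'A1' with zeros to three characters ('A01')."""
    return family[0] + family[1:].zfill(2)


examples = ["A1", "AA", "A01.001", "I10.001", "LI01-001", "J00095"]
kinds = [classify_identifier(e) for e in examples]
```

For instance, `pad_family_name("A1") + ".001"` gives `A01.001`, the peptidase example in the table.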

Related Research Articles

Protease: enzyme that cleaves other proteins into smaller peptides

A protease is an enzyme that catalyzes proteolysis, breaking down proteins into smaller polypeptides or single amino acids, and spurring the formation of new protein products. Proteases do this by cleaving the peptide bonds within proteins by hydrolysis, a reaction in which water breaks bonds. Proteases are involved in many biological functions, including digestion of ingested proteins, protein catabolism, and cell signaling.

In biology and biochemistry, protease inhibitors, or antiproteases, are molecules that inhibit the function of proteases. Many naturally occurring protease inhibitors are proteins.

Serine protease: class of enzymes

Serine proteases are enzymes that cleave peptide bonds in proteins. Serine serves as the nucleophilic amino acid at the enzyme's active site. They are found ubiquitously in both eukaryotes and prokaryotes. Serine proteases fall into two broad categories based on their structure: chymotrypsin-like (trypsin-like) or subtilisin-like.

A metalloproteinase, or metalloprotease, is any protease enzyme whose catalytic mechanism involves a metal. An example is ADAM12, which plays a significant role in the fusion of muscle cells during embryonic development, a process known as myogenesis.

Catalytic triad: set of three coordinated amino acids

A catalytic triad is a set of three coordinated amino acids that can be found in the active site of some enzymes. Catalytic triads are most commonly found in hydrolase and transferase enzymes. An acid-base-nucleophile triad is a common motif for generating a nucleophilic residue for covalent catalysis. The residues form a charge-relay network to polarise and activate the nucleophile, which attacks the substrate, forming a covalent intermediate which is then hydrolysed to release the product and regenerate free enzyme. The nucleophile is most commonly a serine or cysteine amino acid, but occasionally threonine or even selenocysteine. The 3D structure of the enzyme brings together the triad residues in a precise orientation, even though they may be far apart in the sequence.

Cysteine protease: class of enzymes

Cysteine proteases, also known as thiol proteases, are hydrolase enzymes that degrade proteins. These proteases share a common catalytic mechanism that involves a nucleophilic cysteine thiol in a catalytic triad or dyad.

Aspartic protease

Aspartic proteases are a catalytic type of protease enzymes that use an activated water molecule bound to one or more aspartate residues for catalysis of their peptide substrates. In general, they have two highly conserved aspartates in the active site and are optimally active at acidic pH. Nearly all known aspartyl proteases are inhibited by pepstatin.

TEV protease: highly specific protease

TEV protease is a highly sequence-specific cysteine protease from Tobacco Etch Virus (TEV). It is a member of the PA clan of chymotrypsin-like proteases. Due to its high sequence specificity it is frequently used for the controlled cleavage of fusion proteins in vitro and in vivo.

Cathepsin L1: protein-coding gene in the species Homo sapiens

Cathepsin L1 is a protein that in humans is encoded by the CTSL1 gene. The protein is a cysteine cathepsin, a lysosomal cysteine protease that plays a major role in intracellular protein catabolism.

The Proteolysis MAP (PMAP) was an integrated web resource focused on proteases. Its former domain now points to a scam/spam browser extension.

Subtilase

Subtilases are a family of subtilisin-like serine proteases. They appear to have independently and convergently evolved an Asp/Ser/His catalytic triad, as in the trypsin serine proteases. The structure of proteins in this family shows that they have an alpha/beta fold containing a 7-stranded parallel beta sheet.

Bowman–Birk protease inhibitor

In molecular biology, the Bowman–Birk protease inhibitor family of proteins consists of eukaryotic proteinase inhibitors, belonging to MEROPS inhibitor family I12, clan IF. They mainly inhibit serine peptidases of the S1 family, but also inhibit S3 peptidases.

Clp protease family: a protein-targeting ATP-dependent enzyme family

In molecular biology, the Clp protease family is a family of serine peptidases belonging to the MEROPS peptidase family S14. ClpP is an ATP-dependent protease that cleaves a number of proteins, such as casein and albumin. It exists as a heterodimer of ATP-binding regulatory A and catalytic P subunits, both of which are required for effective levels of protease activity in the presence of ATP, although the P subunit alone does possess some catalytic activity.

Ecotin

In molecular biology, ecotin is a protease inhibitor which belongs to MEROPS inhibitor family I11, clan IN. Ecotins are dimeric periplasmic proteins from Escherichia coli and related Gram-negative bacteria that have been shown to be potent inhibitors of many trypsin-fold serine proteases of widely varying substrate specificity, which belong to MEROPS peptidase family S1. Phylogenetic analysis suggested that ecotin has an exogenous target, possibly neutrophil elastase. Ecotins from E. coli, Yersinia pestis, and Pseudomonas aeruginosa, all species that encounter the mammalian immune system, inhibit neutrophil elastase strongly, while ecotin from the plant pathogen Pantoea citrea inhibits neutrophil elastase 1000-fold less potently. All ecotins potently inhibit the pancreatic digestive peptidases trypsin and chymotrypsin, while showing more variable inhibition of the blood peptidases Factor Xa, thrombin, and urokinase-type plasminogen activator.

A protein superfamily is the largest grouping (clade) of proteins for which common ancestry can be inferred. Usually this common ancestry is inferred from structural alignment and mechanistic similarity, even if no sequence similarity is evident. Sequence homology can then be deduced even if not apparent. Superfamilies typically contain several protein families which show sequence similarity within each family. The term protein clan is commonly used for protease and glycosyl hydrolase superfamilies based on the MEROPS and CAZy classification systems.

PA clan of proteases

The PA clan is the largest group of proteases with common ancestry as identified by structural homology. Members have a chymotrypsin-like fold and similar proteolysis mechanisms but can share less than 10% sequence identity. The clan contains both cysteine and serine proteases. PA clan proteases can be found in plants, animals, fungi, eubacteria, archaea and viruses.

Cospin is a serine protease inhibitor from the mushroom species Coprinopsis cinerea in the phylum Basidiomycota.

Glutamic protease

Glutamic proteases are a group of proteolytic enzymes containing a glutamic acid residue within the active site. This type of protease was first described in 2004 and became the sixth catalytic type of protease. Members of this group had previously been assumed to be aspartate proteases, but structural determination showed that they belong to a novel protease family. The first structure determined for this group was that of scytalidoglutamic peptidase, the active site of which contains a catalytic dyad, glutamic acid (E) and glutamine (Q), which gives rise to the name eqolisin. These proteases are found primarily in pathogenic fungi affecting plants and humans.

Asparagine peptide lyases are one of the seven groups into which proteases, also termed proteolytic enzymes, peptidases, or proteinases, are classified according to their catalytic residue. The catalytic mechanism of the asparagine peptide lyases involves an asparagine residue acting as nucleophile to perform a nucleophilic elimination reaction, rather than hydrolysis, to catalyse the breaking of a peptide bond.

Papain-like protease

Papain-like proteases are a large protein family of cysteine protease enzymes that share structural and enzymatic properties with the group's namesake member, papain. They are found in all domains of life. In animals, the group is often known as cysteine cathepsins or, in older literature, lysosomal peptidases. In the MEROPS protease enzyme classification system, papain-like proteases form Clan CA. Papain-like proteases share a common catalytic dyad active site featuring a cysteine amino acid residue that acts as a nucleophile.

References

  1. Rawlings ND, Barrett AJ, Bateman A (January 2012). "MEROPS: the database of proteolytic enzymes, their substrates and inhibitors". Nucleic Acids Res. 40 (Database issue): D343–50. doi:10.1093/nar/gkr987. PMC 3245014. PMID 22086950.
  2. Rawlings ND, Barrett AJ, Bateman A (January 2010). "MEROPS: the peptidase database". Nucleic Acids Res. 38 (Database issue): D227–33. doi:10.1093/nar/gkp971. PMC 2808883. PMID 19892822.
  3. Rawlings ND, Barrett AJ (1993). "Evolutionary families of peptidases". Biochem J. 290: 205–218.
  4. Rawlings ND, Tolle DP, Barrett AJ (2004). "Evolutionary families of peptidase inhibitors". Biochem J. 378: 705–716.
  5. "What's New in MEROPS?".