Content | |
---|---|
Description | TopFIND is the Termini oriented protein Function Inferred Database, a central resource of protein data integrated with knowledge on protein termini, proteolytic processing by proteases, terminal amino acid modifications and inferred functional implications created by combining community contributions with the UniProt and MEROPS databases. |
Data types captured | Protein annotation |
Organisms | H. sapiens, M. musculus, A. thaliana, S. cerevisiae, E. coli |
Contact | |
Research center | University of British Columbia (UBC), Canada |
Laboratory | Christopher Overall |
Authors | Philipp F. Lange |
Primary citation | TopFIND 2.0--linking protein termini with proteolytic processing and modifications altering protein function [1] |
Release date | 2011 |
Access | |
Data format | Custom comma separated file, SQL, XML. |
Website | clipserve |
Miscellaneous | |
License | Creative Commons Attribution-NoDerivs |
Curation policy | Yes - manual and automatic. Rules for automatic annotation generated by Database Curators and computational algorithms. |
The Termini oriented protein Function Inferred Database (TopFIND) is an integrated knowledgebase focused on protein termini, their formation by proteases and the functional implications. It contains information about the processing and processing state of proteins, and the functional implications thereof, derived from research literature, contributions by the scientific community and biological databases. [2]
Among the most fundamental characteristics of a protein are the N- and C-termini, which define the start and end of the polypeptide chain. Although termini are genetically encoded, terminal isoforms are also often generated during translation, after which termini remain highly dynamic and are frequently trimmed by a large array of exopeptidases. Neo-termini can also be generated by endopeptidases through precise and limited proteolysis, termed processing. Processing is necessary for the maturation of many proteins, but it can also occur afterwards, often with dramatic functional consequences. Aberrant proteolysis can cause a wide range of diseases, such as arthritis [3] or cancer. [4] Hence, the proteolytic generation of pleiotropic stable forms of proteins, the universal susceptibility of proteins to proteolysis, and its irreversibility distinguish proteolysis from many highly studied posttranslational modifications. Proteases are tightly interconnected in the protease web [5] [6] and their aberrant activity in disease can lead to diagnostic fragment profiles with characteristic protein termini. [7] Following proteolysis, the newly formed protein termini can be further modified, [8] a process that affects protein function and stability. [9]
TopFIND is a resource for comprehensive coverage of protein N- and C-termini discovered by all available in silico, in vitro and in vivo methodologies. It makes use of existing knowledge by integrating data from UniProt and MEROPS, and provides access to new data from community submission and manual literature curation. It makes modifications of protein termini, such as acetylation and citrullination, easily accessible and searchable, and provides the means to identify and analyse the extent and distribution of terminal modifications across a protein. Since its inception TopFIND has been expanded to further species. [1]
The data is presented to the user with a strong emphasis on its relation to curated background information and to the underlying evidence that led to the observation of a terminus, its modification or a proteolytic cleavage. In brief, the entry lists the protein information, its domain structure, protein termini, terminus modifications, and proteolytic processing of and by other proteins. All information is accompanied by metadata such as its original source, method of identification, confidence measurement or related publication. A positional cross-correlation evaluation matches termini and cleavage sites with protein features (such as amino acid variants) and domains to highlight potential effects and dependencies. A network view of all proteins, showing their functional dependency as protease, substrate or protease inhibitor together with protein interactions, is also provided for the evaluation of network-wide effects. A filtering mechanism allows the presented data to be restricted by parameters such as the methodology used, in vivo relevance, confidence or data source (e.g. limited to a single laboratory or publication). This provides a means to assess physiologically relevant data and to deduce functional information and hypotheses relevant to the bench scientist. In a later release, analysis tools for the evaluation of proteolytic pathways in experimental data were added. [10]
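The filtering and positional cross-correlation described above can be illustrated with a minimal Python sketch. It is not TopFIND's implementation: the `Feature` and `Terminus` records, the field names, the confidence scale and the example positions are all hypothetical and chosen only to show the idea of matching terminus positions against annotated sequence ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """An annotated sequence range (hypothetical record, not a TopFIND type)."""
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


@dataclass(frozen=True)
class Terminus:
    """An observed terminus with illustrative evidence metadata."""
    position: int      # 1-based residue position
    kind: str          # "N" or "C"
    method: str        # assumed evidence label, e.g. "in vivo"
    confidence: float  # assumed score between 0 and 1


def filter_termini(termini, min_confidence=0.0, methods=None):
    """Keep termini that meet a confidence cutoff and, optionally, a method set."""
    return [
        t for t in termini
        if t.confidence >= min_confidence
        and (methods is None or t.method in methods)
    ]


def cross_correlate(termini, features):
    """Map each terminus to the names of the features that contain its position."""
    return {
        t: [f.name for f in features if f.start <= t.position <= f.end]
        for t in termini
    }


# Made-up example data.
features = [Feature("signal peptide", 1, 20), Feature("catalytic domain", 100, 260)]
termini = [
    Terminus(21, "N", "in vivo", 0.9),
    Terminus(150, "N", "in vitro", 0.6),
    Terminus(300, "C", "in silico", 0.2),
]

selected = filter_termini(termini, min_confidence=0.5)
hits = cross_correlate(selected, features)
for terminus, names in hits.items():
    print(terminus.kind, terminus.position, names)
```

In this toy data, the N-terminus at position 150 falls inside the catalytic domain, which is the kind of overlap that would flag a cleavage as potentially affecting function, while the terminus at position 21 lies just past the signal peptide and overlaps no feature.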
Proteolysis is the breakdown of proteins into smaller polypeptides or amino acids. Uncatalysed, the hydrolysis of peptide bonds is extremely slow, taking hundreds of years. Proteolysis is typically catalysed by cellular enzymes called proteases, but may also occur by intra-molecular digestion.
A protease is an enzyme that catalyzes proteolysis, breaking down proteins into smaller polypeptides or single amino acids, and spurring the formation of new protein products. They do this by cleaving the peptide bonds within proteins by hydrolysis, a reaction where water breaks bonds. Proteases are involved in many biological functions, including digestion of ingested proteins, protein catabolism, and cell signaling.
Post-translational modification (PTM) is the covalent process of changing proteins following protein biosynthesis. PTMs may involve enzymes or occur spontaneously. Proteins are created by ribosomes translating mRNA into polypeptide chains, which may then change to form the mature protein product. PTMs are important components in cell signalling, as for example when prohormones are converted to hormones.
A metalloproteinase, or metalloprotease, is any protease enzyme whose catalytic mechanism involves a metal. An example is ADAM12 which plays a significant role in the fusion of muscle cells during embryo development, in a process known as myogenesis.
A calpain is a protein belonging to the family of calcium-dependent, non-lysosomal cysteine proteases expressed ubiquitously in mammals and many other organisms. Calpains constitute the C2 family of protease clan CA in the MEROPS database. The calpain proteolytic system includes the calpain proteases, the small regulatory subunit CAPNS1, also known as CAPN4, and the endogenous calpain-specific inhibitor, calpastatin.
Cysteine proteases, also known as thiol proteases, are hydrolase enzymes that degrade proteins. These proteases share a common catalytic mechanism that involves a nucleophilic cysteine thiol in a catalytic triad or dyad.
In molecular biology, the Signal Peptide Peptidase (SPP) is a type of protein that specifically cleaves parts of other proteins. It is an intramembrane aspartyl protease with the conserved active site motifs 'YD' and 'GxGD' in adjacent transmembrane domains (TMDs). Its sequence is highly conserved across vertebrate species. SPP cleaves remnant signal peptides left behind in the membrane by the action of signal peptidase, and also plays key roles in immune surveillance and the maturation of certain viral proteins.
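As a rough illustration of how motifs such as 'YD' and 'GxGD' can be located in a sequence, the Python sketch below scans a one-letter amino acid string with regular expressions, treating 'x' as any residue. The function name and the toy sequence are invented for the example; the sequence is not a real SPP sequence, and a match by pattern alone says nothing about whether the residues sit in a transmembrane domain.

```python
import re

# 'x' in GxGD stands for any residue, so it becomes '.' in the pattern.
# Lookaheads are used so that overlapping occurrences are all reported.
MOTIFS = {
    "YD": re.compile(r"(?=YD)"),
    "GxGD": re.compile(r"(?=G.GD)"),
}


def find_motifs(sequence):
    """Return {motif name: [1-based start positions]} for a protein sequence."""
    seq = sequence.upper()
    return {
        name: [match.start() + 1 for match in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


toy_sequence = "MKLYDAAVLGLGDIVP"  # made-up sequence for illustration only
print(find_motifs(toy_sequence))  # {'YD': [4], 'GxGD': [10]}
```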
Aspartic proteases are a catalytic type of protease enzymes that use an activated water molecule bound to one or more aspartate residues for catalysis of their peptide substrates. In general, they have two highly conserved aspartates in the active site and are optimally active at acidic pH. Nearly all known aspartyl proteases are inhibited by pepstatin.
Caspase 2, also known as CASP2, is an enzyme that, in humans, is encoded by the CASP2 gene. CASP2 orthologs have been identified in nearly all mammals for which complete genome data are available. Unique orthologs are also present in birds, lizards, lissamphibians, and teleosts.
Caspase 4 is an enzyme that proteolytically cleaves other proteins at an aspartic acid residue (LEVD-), and belongs to a family of cysteine proteases called caspases. The function of caspase 4 is not fully known, but it is believed to be an inflammatory caspase, along with caspase 1 and caspase 5, with a role in the immune system.
MEROPS is an online database for peptidases and their inhibitors. The classification scheme for peptidases was published by Rawlings & Barrett in 1993, and that for protein inhibitors by Rawlings et al. in 2004. The most recent version, MEROPS 12.4, was released in late October 2021.
Caspase-7, apoptosis-related cysteine peptidase, also known as CASP7, is a human protein encoded by the CASP7 gene. CASP7 orthologs have been identified in nearly all mammals for which complete genome data are available. Unique orthologs are also present in birds, lizards, lissamphibians, and teleosts.
ATP-dependent Clp protease proteolytic subunit (ClpP) is an enzyme that in humans is encoded by the CLPP gene. This protein is an essential component to form the protein complex of Clp protease.
The Proteolysis MAP (PMAP) was an integrated web resource focused on proteases. Its domain now links to a scam/spam browser extender.
In molecular biology short linear motifs (SLiMs), linear motifs or minimotifs are short stretches of protein sequence that mediate protein–protein interaction.
Threonine proteases are a family of proteolytic enzymes harbouring a threonine (Thr) residue within the active site. The prototype members of this class of enzymes are the catalytic subunits of the proteasome; however, the acyltransferases convergently evolved the same active site geometry and mechanism.
Terminal amine isotopic labeling of substrates (TAILS) is a method in quantitative proteomics that identifies the protein content of samples based on N-terminal fragments of each protein and detects differences in protein abundance among samples.
Fast parallel proteolysis (FASTpp) is a method to determine the thermostability of proteins by measuring which fraction of protein resists rapid proteolytic digestion.
Degradomics is a sub-discipline of biology encompassing all the genomic and proteomic approaches devoted to the study of proteases, their inhibitors, and their substrates on a system-wide scale. This includes the analysis of the protease and protease-substrate repertoires, also called "protease degradomes". The scope of these degradomes can range from cell, tissue, and organism-wide scales.
Asparagine peptide lyases are one of the seven groups into which proteases, also termed proteolytic enzymes, peptidases, or proteinases, are classified according to their catalytic residue. The catalytic mechanism of the asparagine peptide lyases involves an asparagine residue acting as a nucleophile to perform a nucleophilic elimination reaction, rather than hydrolysis, to catalyse the breaking of a peptide bond.