PROSITE

Last updated
PROSITE
Prosite.png
Content
DescriptionPROSITE, a protein domain database for functional characterization and annotation.
Contact
Research center Swiss Institute of Bioinformatics
LaboratoryStructural Biology and Bioinformatics Department
Primary citation PMID   19858104
Release date1988 (1988)
Access
Website prosite.expasy.org

PROSITE is a protein database. [1] [2] It consists of entries describing the protein families, domains and functional sites as well as amino acid patterns and profiles in them. These are manually curated by a team of the Swiss Institute of Bioinformatics and tightly integrated into Swiss-Prot protein annotation. PROSITE was created in 1988 by Amos Bairoch, who directed the group for more than 20 years. Since July 2018, the director of PROSITE and Swiss-Prot is Alan Bridge.

Contents

PROSITE's uses include identifying possible functions of newly discovered proteins and analysis of known proteins for previously undetermined activity. Properties from well-studied genes can be propagated to biologically related organisms, and for different or poorly known genes biochemical functions can be predicted from similarities. PROSITE offers tools for protein sequence analysis and motif detection (see sequence motif, PROSITE patterns). It is part of the ExPASy proteomics analysis servers.

The database ProRule builds on the domain descriptions of PROSITE. [3] It provides additional information about functionally or structurally critical amino acids. The rules contain information about biologically meaningful residues, like active sites, substrate- or co-factor-binding sites, posttranslational modification sites or disulfide bonds, to help function determination. These can automatically generate annotation based on PROSITE motifs.

Statistics

As of 26 February 2022, release 2022_01 has 1,902 documentation entries, 1,311 patterns, 1,336 profiles, and 1,352 ProRules.

See also

Related Research Articles

<span class="mw-page-title-main">Post-translational modification</span> Biological processes

Post-translational modification (PTM) is the covalent process of changing proteins following protein making which involves enzymes. This process often occurs in the rough ER and the golgi apparatus. Proteins are created by ribosomes translating mRNA into polypeptide chains, which may then change to form the mature protein product. PTMs are important components in cell signalling, as for example when prohormones are converted to hormones.

In biology, a sequence motif is a nucleotide or amino-acid sequence pattern that is widespread and usually assumed to be related to biological function of the macromolecule. For example, an N-glycosylation site motif can be defined as Asn, followed by anything but Pro, followed by either Ser or Thr, followed by anything but Pro residue.

<span class="mw-page-title-main">UniProt</span> Database of protein sequences and functional information

(See also: List of proteins in the human body)

The Protein Information Resource (PIR), located at Georgetown University Medical Center, is an integrated public bioinformatics resource to support genomic and proteomic research, and scientific studies. It contains protein sequences databases

<span class="mw-page-title-main">Amos Bairoch</span>

Amos Bairoch is a Swiss bioinformatician and Professor of Bioinformatics at the Department of Human Protein Sciences of the University of Geneva where he leads the CALIPHO group at the Swiss Institute of Bioinformatics (SIB) combining bioinformatics, curation, and experimental efforts to functionally characterize human proteins.

InterPro is a database of protein families, protein domains and functional sites in which identifiable features found in known proteins can be applied to new protein sequences in order to functionally characterise them.

Expasy is an online bioinformatics resource operated by the SIB Swiss Institute of Bioinformatics. It is an extensible and integrative portal which provides access to over 160 databases and software tools and supports a range of life science and clinical research areas, from genomics, proteomics and structural biology, to evolution and phylogeny, systems biology and medical chemistry. The individual resources are hosted in a decentralized way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions.

(See also: List of proteins in the human body)

Protein function prediction methods are techniques that bioinformatics researchers use to assign biological or biochemical roles to proteins. These proteins are usually ones that are poorly studied or predicted based on genomic sequence data. These predictions are often driven by data-intensive computational procedures. Information may come from nucleic acid sequence homology, gene expression profiles, protein domain structures, text mining of publications, phylogenetic profiles, phenotypic profiles, and protein-protein interaction. Protein function is a broad term: the roles of proteins range from catalysis of biochemical reactions to transport to signal transduction, and a single protein may play a role in multiple processes or cellular pathways.

<span class="mw-page-title-main">Rolf Apweiler</span>

Rolf Apweiler is a director of European Bioinformatics Institute (EBI) part of the European Molecular Biology Laboratory (EMBL) with Ewan Birney.

αr9 is a family of bacterial small non-coding RNAs with representatives in a broad group of α-proteobacteria from the order Hyphomicrobiales. The first member of this family (Smr9C) was found in a Sinorhizobium meliloti 1021 locus located in the chromosome (C). Further homology and structure conservation analysis have identified full-length Smr9C homologs in several nitrogen-fixing symbiotic rhizobia, in the plant pathogens belonging to Agrobacterium species as well as in a broad spectrum of Brucella species. αr9C RNA species are 144-158 nt long and share a well defined common secondary structure consisting of seven conserved regions. Most of the αr9 transcripts can be catalogued as trans-acting sRNAs expressed from well-defined promoter regions of independent transcription units within intergenic regions (IGRs) of the α-proteobacterial genomes.

<span class="mw-page-title-main">Terri Attwood</span> British bioinformatics researcher

Teresa K. Attwood is a Professor of Bioinformatics in the Department of Computer Science and School of Biological Sciences at the University of Manchester and a visiting fellow at the European Bioinformatics Institute (EMBL-EBI). She held a Royal Society University Research Fellowship at University College London (UCL) from 1993 to 1999 and at the University of Manchester from 1999 to 2002.

<span class="mw-page-title-main">RUFY2</span> Protein-coding gene in humans

RUN and FYVE domain containing 2 (RUFY2) is a protein that in humans is encoded by the RUFY2 gene. The RUFY2 gene is named for two of its domains, the RUN domain and FYVE domains. RUFY2 is a member of the RUFY family of proteins that include RUFY1, RUFY2, RUFY3, and RUFY4. RUFY2 protein has a dynamic role in endosomal membrane trafficking.

Julian John Thurstan Gough is a Group Leader in the Laboratory of Molecular Biology (LMB) of the Medical Research Council (MRC). He was previously a professor of bioinformatics at the University of Bristol.

SMIM23 or Small Integral Membrane Protein 23 is a protein which in humans is encoded by the SMIM23 or c5orf50 gene. The longer mRNA isoform is 519 nucleotides which translates to 172 amino acids of a protein. In recent advancements, researchers have identified this gene, along with a few others, could potentially play a role in how facial morphology arises in humans.

<span class="mw-page-title-main">LRRIQ3</span> Protein-coding gene in the species Homo sapiens

LRRIQ3, which is also known as LRRC44, is a protein that in humans is encoded by the LRRIQ3 gene. It is predominantly expressed in the testes, and is linked to a number of diseases.

<span class="mw-page-title-main">PROB1</span> Protein-coding gene in the species Homo sapiens

Proline-rich basic protein 1(PROB1) is a protein encoded by the PROB1 gene located on human chromosome 5, open reading frame 65. PROB1 is also known as C5orf65 and weakly similar to basic proline-rich protein.

<span class="mw-page-title-main">C2orf16</span> Protein-coding gene in the species Homo sapiens

C2orf16 is a protein that in humans is encoded by the C2orf16 gene. Isoform 2 of this protein is 1,984 amino acids long. The gene contains 1 exon and is located at 2p23.3. Aliases for C2orf16 include Open Reading Frame 16 on Chromosome 2 and P-S-E-R-S-H-H-S Repeats Containing Sequence.

<span class="mw-page-title-main">Transmembrane protein 179</span> Protein-coding gene in the species Homo sapiens

Transmembrane protein 179 is a protein that in humans is encoded by the TMEM179 gene. The function of transmembrane protein 179 is not yet well understood, but it is believed to have a function in the nervous system.

In molecular biology, MvirDB is a publicly available database that stores information on toxins, virulence factors and antibiotic resistance genes. Sources that this database uses for DNA and protein information include: Tox-Prot, SCORPION, the PRINTS Virulence Factors, VFDB, TVFac, Islander, ARGO and VIDA. The database provides a BLAST tool that allows the user to query their sequence against all DNA and protein sequences in MvirDB. Information on virulence factors can be obtained from the usage of the provided browser tool. Once the browser tool is used, the results are returned as a readable table that is organized by ascending E-Values, each of which are hyperlinked to their related page. MvirDB is implemented in an Oracle 10g relational database.

References

  1. De Castro E, Sigrist CJA, Gattiker A, Bulliard V, Langendijk-Genevaux PS, Gasteiger E, Bairoch A, Hulo N (2006). "ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins". Nucleic Acids Res. 34 (Web Server issue): W362–365. doi:10.1093/nar/gkl124. PMC   1538847 . PMID   16845026.
  2. Hulo N, Bairoch A, Bulliard V, Cerutti L, Cuche B, De Castro E, Lachaize C, Langendijk-Genevaux PS, Sigrist CJA (2007). "The 20 years of PROSITE". Nucleic Acids Res. 36 (Database issue): D245–9. doi:10.1093/nar/gkm977. PMC   2238851 . PMID   18003654.
  3. Sigrist CJ, De Castro E, Langendijk-Genevaux PS, Le Saux V, Bairoch A, Hulo N (2005). "ProRule: a new database containing functional and structural information on PROSITE profiles". Bioinformatics. 21 (21): 4060–4066. doi:10.1093/bioinformatics/bti614. PMID   16091411.