Expasy is an online bioinformatics resource operated by the SIB Swiss Institute of Bioinformatics. It is an extensible and integrative portal which provides access to over 160 databases and software tools and supports a range of life science and clinical research areas, from genomics, proteomics and structural biology, to evolution and phylogeny, systems biology and medical chemistry. The individual resources (databases, web-based and downloadable software tools) are hosted in a decentralised way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions.
Queries of Expasy allow:
Expasy provides up-to-date information from the most recent release of each resources.
The terms used in Expasy are based on the EDAM comprehensive ontology. [2]
Expasy was created in August 1993. Originally, it was called ExPASy (Expert Protein Analysis System) and acted as a proteomics server to analyze protein sequences and structures and two-dimensional gel electrophoresis (2-D Page electrophoresis). [3] Among others, ExPASy hosted the protein sequence knowledgebase, UniProtKB/Swiss-Prot, and its computer annotated supplement, UniProtKB/TrEMBL, before these moved to the UniProt website.[ citation needed ]
ExPASy was the first website of the life sciences and among the first 150 websites in the world. As of 5 April 2007 [update] , ExPASy had been consulted 1 billion times since its installation on 1 August 1993. [4]
In June 2011, it became the SIB ExPASy Bioformatics Resources Portal: a diverse catalogue of bioinformatics resources developed by SIB Groups. [5] The current version of Expasy was released in October 2020. [6] [7]
Biological databases are libraries of biological sciences, collected from scientific experiments, published literature, high-throughput experiment technology, and computational analysis. They contain information from research areas including genomics, proteomics, metabolomics, microarray gene expression, and phylogenetics. Information contained in biological databases includes gene function, structure, localization, clinical effects of mutations as well as similarities of biological sequences and structures.
The SIB Swiss Institute of Bioinformatics is an academic not-for-profit foundation which federates bioinformatics activities throughout Switzerland.
UniProt is a freely accessible database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the research literature. It is maintained by the UniProt consortium, which consists of several European bioinformatics organisations and a foundation from Washington, DC, United States.
The Protein Information Resource (PIR), located at Georgetown University Medical Center, is an integrated public bioinformatics resource to support genomic and proteomic research, and scientific studies. It contains protein sequences databases
The European Bioinformatics Institute (EMBL-EBI) is an Intergovernmental Organization (IGO) which, as part of the European Molecular Biology Laboratory (EMBL) family, focuses on research and services in bioinformatics. It is located on the Wellcome Genome Campus in Hinxton near Cambridge, and employs over 600 full-time equivalent (FTE) staff. Institute leaders such as Rolf Apweiler, Alex Bateman, Ewan Birney, and Guy Cochrane, an adviser on the National Genomics Data Center Scientific Advisory Board, serve as part of the international research network of the BIG Data Center at the Beijing Institute of Genomics.
Amos Bairoch is a Swiss bioinformatician and Professor of Bioinformatics at the Department of Human Protein Sciences of the University of Geneva where he leads the CALIPHO group at the Swiss Institute of Bioinformatics (SIB) combining bioinformatics, curation, and experimental efforts to functionally characterize human proteins.
InterPro is a database of protein families, protein domains and functional sites in which identifiable features found in known proteins can be applied to new protein sequences in order to functionally characterise them.
PROSITE is a protein database. It consists of entries describing the protein families, domains and functional sites as well as amino acid patterns and profiles in them. These are manually curated by a team of the Swiss Institute of Bioinformatics and tightly integrated into Swiss-Prot protein annotation. PROSITE was created in 1988 by Amos Bairoch, who directed the group for more than 20 years. Since July 2018, the director of PROSITE and Swiss-Prot is Alan Bridge.
SUPERFAMILY is a database and search platform of structural and functional annotation for all proteins and genomes. It classifies amino acid sequences into known structural domains, especially into SCOP superfamilies. Domains are functional, structural, and evolutionary units that form proteins. Domains of common Ancestry are grouped into superfamilies. The domains and domain superfamilies are defined and described in SCOP. Superfamilies are groups of proteins which have structural evidence to support a common evolutionary ancestor but may not have detectable sequence homology.
PDBsum is a database that provides an overview of the contents of each 3D macromolecular structure deposited in the Protein Data Bank. The original version of the database was developed around 1995 by Roman Laskowski and collaborators at University College London. As of 2014, PDBsum is maintained by Laskowski and collaborators in the laboratory of Janet Thornton at the European Bioinformatics Institute (EBI).
neXtProt is an on-line knowledge platform on human proteins. It strives to be a comprehensive resource that provides a variety of types of information on human proteins, such as their function, subcellular location, expression, interactions and role in diseases. The major part of the information in neXtProt is obtained from the UniProt Swiss-Prot database but it is complemented by data originating from high-throughput studies with an emphasis on proteomics. neXtProt offers also an advanced search capacity based on the SPARQL technology as well as an API that allows to programatically extract the data stored in the resource. It is developed by the CALIPHO group directed by Amos Bairoch and Lydie Lane of the Swiss Institute of Bioinformatics (SIB).
Rolf Apweiler is a director of European Bioinformatics Institute (EBI) part of the European Molecular Biology Laboratory (EMBL) with Ewan Birney.
SWISS-MODEL is a structural bioinformatics web-server dedicated to homology modeling of 3D protein structures. Homology modeling is currently the most accurate method to generate reliable three-dimensional protein structure models and is routinely used in many practical applications. Homology modelling methods make use of experimental protein structures ("templates") to build models for evolutionary related proteins ("targets").
In bioinformatics, the PANTHER classification system is a large curated biological database of gene/protein families and their functionally related subfamilies that can be used to classify and identify the function of gene products. PANTHER is part of the Gene Ontology Reference Genome Project designed to classify proteins and their genes for high-throughput analysis.
Cathy H. Wu is the Edward G. Jefferson Chair and professor and director of the Center for Bioinformatics & Computational Biology (CBCB) at the University of Delaware. She is also the director of the Protein Information Resource (PIR) and the North east Bioinformatics Collaborative Steering Committee, and the adjunct professor at the Georgetown University Medical Center.
Chromosome 9 open reading frame 152 is a protein that in humans is encoded by the C9orf152 gene. The exact function of the protein is not completely understood.
Serum amyloid A-like 1 is a protein in humans encoded by the SAAL1 gene.
Biocuration is the field of life sciences dedicated to organizing biomedical data, information and knowledge into structured formats, such as spreadsheets, tables and knowledge graphs. The biocuration of biomedical knowledge is made possible by the cooperative work of biocurators, software developers and bioinformaticians and is at the base of the work of biological databases.