Human Proteinpedia, which is closely associated with the Institute of Bioinformatics (IOB), Bangalore, and Johns Hopkins University, is a portal for sharing and integrating human proteomic data. [1] [2] It allows research laboratories to contribute and maintain protein annotations. The Human Protein Reference Database (HPRD) integrates data deposited in Human Proteinpedia with existing literature-curated information in the context of an individual protein. [3] [4] In essence, researchers can add new data to HPRD by registering with Human Proteinpedia. The data deposited in Human Proteinpedia are freely available for download. Emphasizing the importance of depositing proteomics data in public repositories, Nature Methods recommended Human Proteinpedia in an editorial. [5] More than 70 labs participate in this effort.
Data pertaining to post-translational modifications, protein–protein interactions, tissue expression, expression in cell lines, subcellular localization and enzyme-substrate relationships can be submitted to Human Proteinpedia.
Protein annotations present in Human Proteinpedia are derived from a number of platforms such as
The portal, which allows protein information to be added, was developed as a collaborative effort between the laboratory of Dr. Akhilesh Pandey at Johns Hopkins University and the Institute of Bioinformatics.
* What are the criteria for contributing data?
Any investigator who fulfills the following criteria can contribute data:
i) provides experimentally derived data,
ii) is willing to share data, and
iii) is willing to be listed as the 'contributor' of the data.
* Can I contribute data anonymously?
Anonymous contributions are not allowed. Contributor details must be clearly stated when data are contributed.
* Can bioinformatically predicted data be shared through Human Proteinpedia?
Predictions of any type are not allowed. Contributed data should be derived experimentally and should be accompanied by experimental evidence.
* Is the contributed data subjected to peer review?
The data are not subjected to peer review and the actual experimental data (raw or processed) should be provided.
* What will happen to conflicting results from different laboratories?
In cases where a given entry is documented as erroneous, we will consult with the contributing group(s) about deleting the entry.
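The contribution rules described in the answers above can be sketched as a simple eligibility check. This is only an illustration under stated assumptions: the `Submission` record, the `AnnotationType` enum and the `rejection_reasons` function are hypothetical names invented here and are not part of any Human Proteinpedia software or API. The annotation categories mirror the data types the article lists as accepted.

```python
from dataclasses import dataclass
from enum import Enum


class AnnotationType(Enum):
    """Data categories the article lists as accepted (names are illustrative)."""
    POST_TRANSLATIONAL_MODIFICATION = "post-translational modification"
    PROTEIN_PROTEIN_INTERACTION = "protein-protein interaction"
    TISSUE_EXPRESSION = "tissue expression"
    CELL_LINE_EXPRESSION = "expression in cell lines"
    SUBCELLULAR_LOCALIZATION = "subcellular localization"
    ENZYME_SUBSTRATE = "enzyme-substrate relationship"


@dataclass
class Submission:
    """Hypothetical record describing one contributed annotation."""
    contributor: str                  # name of the contributing investigator or lab
    annotation_type: AnnotationType
    experimentally_derived: bool      # False for bioinformatic predictions
    has_experimental_evidence: bool   # raw or processed data attached
    willing_to_share: bool


def rejection_reasons(sub: Submission) -> list[str]:
    """Return the stated criteria that a submission fails (empty if none)."""
    reasons = []
    if not sub.contributor.strip():
        reasons.append("anonymous contributions are not allowed")
    if not sub.experimentally_derived:
        reasons.append("predicted data are not allowed")
    if not sub.has_experimental_evidence:
        reasons.append("experimental evidence (raw or processed) is required")
    if not sub.willing_to_share:
        reasons.append("contributor must be willing to share the data")
    return reasons
```

A submission with a named contributor, experimentally derived data, attached evidence and consent to share yields an empty list; an anonymous, predicted entry yields one reason for each criterion it fails.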