LocDB

Content
Description: experimental annotations of localization
Organisms: Homo sapiens, Arabidopsis thaliana
Contact
Research center: Columbia University
Laboratory: Department of Biochemistry and Molecular Biophysics
Authors: Shruti Rastogi
Primary citation: Rastogi & al. (2011) [1]
Release date: 2010
Access
Data format: MySQL database
Website: http://www.rostlab.org/services/locDB
Web service URL: http://www.rostlab.org/services/locDB/search.php
Miscellaneous
Version: Release 1.0

LocDB [1] is an expert-curated database that collects experimental annotations for the subcellular localization of proteins in Homo sapiens (human) and Arabidopsis thaliana (thale cress). For every protein with experimental information, the database also contains predictions of subcellular localization from a variety of state-of-the-art prediction methods.

Homo sapiens Humans as a biological species

In taxonomy, Homo sapiens is the only extant human species. The name is Latin for "wise man" and was introduced in 1758 by Carl Linnaeus.

Arabidopsis thaliana A species of flowering plants belonging to the mustards, crucifers, and cabbage family, and used as a model organism in plant biology and genetics

Arabidopsis thaliana, the thale cress, mouse-ear cress or arabidopsis, is a small flowering plant native to Eurasia and Africa. A. thaliana is considered a weed; it is found by roadsides and in disturbed land.

Proteins are the fundamental functional components of cells. They are responsible for transforming genetic information into physical reality. These macromolecules mediate gene regulation, enzymatic catalysis, cellular metabolism, DNA replication, transport of nutrients, and the recognition and transmission of signals. Interpreting the wealth of available sequence data to elucidate protein function is a fundamental challenge of the post-genomic era. To date, even for the most well-studied organisms such as yeast, about one-fourth of the proteins remain uncharacterized. A major obstacle to determining protein function experimentally is that such studies require enormous resources. Hence, the gap between the number of sequences deposited in databases and the experimental characterization of the corresponding proteins is ever-growing. Bioinformatics plays a central role in bridging this sequence-function gap through the development of tools for faster and more effective prediction of protein function. LocDB helps close the gap between experimental annotations and predictions, and provides a larger and more reliable dataset for testing new prediction methods. [1]
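To illustrate how a dataset of experimental annotations can be used to test a prediction method, the following is a minimal sketch in Python. It does not use LocDB's actual schema or web service; the protein identifiers, compartment labels, and function names are hypothetical and chosen only for illustration. The sketch compares predicted localizations against experimental ones and reports overall accuracy and per-compartment recall.

```python
from collections import Counter

# Hypothetical records (not real LocDB entries):
# protein ID -> experimentally annotated compartment.
experimental = {
    "P1": "nucleus",
    "P2": "mitochondrion",
    "P3": "cytosol",
    "P4": "nucleus",
}

# Hypothetical output of some prediction method for the same proteins.
predicted = {
    "P1": "nucleus",
    "P2": "cytosol",
    "P3": "cytosol",
    "P4": "nucleus",
}


def accuracy(experimental, predicted):
    """Fraction of proteins present in both mappings whose predicted
    compartment equals the experimental one (0.0 if none are shared)."""
    shared = [p for p in experimental if p in predicted]
    if not shared:
        return 0.0
    hits = sum(experimental[p] == predicted[p] for p in shared)
    return hits / len(shared)


def per_compartment_recall(experimental, predicted):
    """For each experimental compartment, the fraction of its proteins
    that the method assigned to that same compartment."""
    totals = Counter()
    hits = Counter()
    for protein, location in experimental.items():
        if protein not in predicted:
            continue  # skip proteins the method did not predict
        totals[location] += 1
        if predicted[protein] == location:
            hits[location] += 1
    return {loc: hits[loc] / totals[loc] for loc in totals}


print(accuracy(experimental, predicted))
print(per_compartment_recall(experimental, predicted))
```

With real data, the two dictionaries would be filled from experimental annotations and from a method's output for the same proteins. Reporting recall per compartment alongside overall accuracy is a common design choice, because a method can score well overall while failing on rarely annotated compartments.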

See also

Protein targeting or protein sorting is the biological mechanism by which proteins are transported to their appropriate destinations inside or outside the cell. Proteins can be targeted to the inner space of an organelle, to different intracellular membranes, to the plasma membrane, or to the exterior of the cell via secretion. This delivery process is carried out based on information contained in the protein itself. Correct sorting is crucial for the cell; errors can lead to diseases.

Protein subcellular localization prediction involves the prediction of where a protein resides in a cell, its subcellular localization.

Related Research Articles

The Bioinformatic Harvester was a bioinformatic meta search engine for gene and protein-associated information, created by the European Molecular Biology Laboratory and subsequently hosted and further developed by the Karlsruhe Institute of Technology (KIT). Harvester worked with human, mouse, rat, zebrafish, Drosophila and Arabidopsis thaliana information, cross-linked more than 50 popular bioinformatic resources and allowed cross searches. It served tens of thousands of pages every day to scientists and physicians. The service has been down since 2014.

HomoloGene, a tool of the United States National Center for Biotechnology Information (NCBI), is a system for automated detection of homologs among the annotated genes of several completely sequenced eukaryotic genomes.

C16orf42 protein-coding gene in the species Homo sapiens

C16orf42, or chromosome 16 open reading frame 42, is a hypothetical human protein encoded on chromosome 16. The protein is 312 amino acids long, and its cDNA has 1214 base pairs.

MALSU1 protein-coding gene in the species Homo sapiens

MALSU1 is a gene on chromosome 7 in humans that encodes the protein MALSU1. This protein localizes to mitochondria and is probably involved in mitochondrial translation or the biogenesis of the large subunit of the mitochondrial ribosome.

COMBREX is a multifaceted project that includes a database of gene annotations, functional predictions and recommendations based on Active Learning principles associated with millions of genes in prokaryotic genomes.

PSORTdb

PSORTdb is a database of protein subcellular localization (SCL) for bacteria and archaea. It is a member of the PSORT family of bioinformatics tools. The database consists of two datasets, ePSORTdb and cPSORTdb, which contain information determined through experimental validation and computational prediction, respectively. The ePSORTdb dataset is the largest curated collection of experimentally verified SCL data.

PhylomeDB is a public biological database for complete catalogs of gene phylogenies (phylomes). It allows users to interactively explore the evolutionary history of genes through the visualization of phylogenetic trees and multiple sequence alignments. Moreover, PhylomeDB provides genome-wide orthology and paralogy predictions based on the analysis of the phylogenetic trees. The automated pipeline used to reconstruct trees aims to provide a high-quality phylogenetic analysis of different genomes, including Maximum Likelihood tree inference, alignment trimming and evolutionary model testing.

The Critical Assessment of Functional Annotation (CAFA) is an experiment designed to provide a large-scale assessment of computational methods dedicated to predicting protein function. Different algorithms are evaluated by their ability to predict the Gene Ontology (GO) terms in the categories of Molecular Function, Biological Process, and Cellular Component.

FAM203B protein-coding gene in the species Homo sapiens

Family with Sequence Similarity 203, Member B (FAM203B) is a protein encoded by the FAM203B gene (8q24.3) in humans. While FAM203B is only found in humans and possibly non-human primates, its paralog, FAM203A, is highly conserved. The FAM203B protein contains two conserved domains of unknown function, DUF383 and DUF384, and no transmembrane domains. This protein has no known function yet, although the homolog of FAM203A in Caenorhabditis elegans (Y54H5A.2) is thought to help regulate the actin cytoskeleton.

EVI5L protein-coding gene in the species Homo sapiens

EVI5L is a protein that in humans is encoded by the EVI5L gene. EVI5L is a member of the Ras superfamily of monomeric guanine nucleotide-binding (G) proteins, and functions as a GTPase-activating protein (GAP) with a broad specificity. Measurement of in vitro Rab-GAP activity has shown that EVI5L has significant Rab2A- and Rab10-GAP activity.

LOC105377021 is a protein which in humans is encoded by the LOC105377021 gene. LOC105377021 exhibits expressional pathology related to breast cancer, specifically triple negative breast cancer. LOC105377021 contains a serine rich region in addition to predicted alpha helix motifs.

C12orf60 protein-coding gene in the species Homo sapiens

Uncharacterized protein C12orf60 is a protein that in humans is encoded by the C12orf60 gene. The gene is also known as LOC144608 or MGC47869. The protein lacks transmembrane domains and transmembrane helices, but it is rich in alpha-helices. It is predicted to localize to the nucleus.

RTL6 mammalian protein found in Homo sapiens

Retrotransposon Gag Like 6 is a protein encoded by the RTL6 gene in humans. RTL6 is a member of the Mart family of genes, which are related to Sushi-like retrotransposons and were derived from fish and amphibians. The RTL6 protein is localized to the nucleus and has a predicted leucine zipper motif that is known to bind nucleic acids in similar proteins, such as LDOC1.

BEND2 is a protein that in humans is encoded by the BEND2 gene. It is also found in other vertebrates, including mammals, birds, and reptiles. The expression of BEND2 in Homo sapiens is regulated and occurs at high levels in the skeletal muscle tissue of the male testis and in the bone marrow. The presence of the BEN domains in the BEND2 protein indicates that this protein may be involved in chromatin modification and regulation.

C6orf62 protein-coding gene in the species Homo sapiens

Chromosome 6 open reading frame 62 (C6orf62), also known as X-trans-activated protein 12 (XTP12), is a gene that encodes a protein of the same name. The encoded protein is predicted to have a subcellular location within the cytosol.

Chromosome 19 open reading frame 18 (c19orf18) is a protein which in humans is encoded by the c19orf18 gene. The gene is exclusive to mammals and the protein is predicted to have a transmembrane domain and a coiled coil stretch. This protein has a function that is not yet fully understood by the scientific community.

LOC100287387 is a protein that in humans is encoded by the gene LOC100287387. The function of the protein is not yet understood in the scientific community. The gene is located on the q arm of chromosome 2.

The Membranome database provides structural and functional information about more than 6000 single-pass (bitopic) transmembrane proteins from Homo sapiens, Arabidopsis thaliana, Dictyostelium discoideum, Saccharomyces cerevisiae, Escherichia coli and Methanocaldococcus jannaschii. Bitopic membrane proteins consist of a single transmembrane alpha-helix connecting water-soluble domains of the protein situated on opposite sides of a biological membrane. These proteins are frequently involved in signal transduction and communication between cells in multicellular organisms.

References

  1. Rastogi, Shruti; Rost, Burkhard (Jan 2011). "LocDB: experimental annotations of localization for Homo sapiens and Arabidopsis thaliana". Nucleic Acids Res. England. 39 (Database issue): D230–4. doi:10.1093/nar/gkq927. PMC 3013784. PMID 21071420.