H-Invitational

The H-Invitational
Content
Description: annotation resource for human genes and transcripts
Organisms: Homo sapiens
Contact
Laboratory: Japan Biological Informatics Consortium
Primary citation: Yamasaki et al. (2008) [1]
Release date: 2007
Access
Website: http://www.h-invitational.jp

The H-Invitational Database (H-InvDB) is a comprehensive annotation resource for human genes and transcripts.[1]

Gene: basic physical and functional unit of heredity

In biology, a gene is a sequence of nucleotides in DNA or RNA that codes for a molecule that has a function. During gene expression, the DNA is first copied into RNA. The RNA can be directly functional or be the intermediate template for a protein that performs a function. The transmission of genes to an organism's offspring is the basis of the inheritance of phenotypic traits. These genes make up different DNA sequences called genotypes. Genotypes, along with environmental and developmental factors, determine what the phenotypes will be. Most biological traits are under the influence of polygenes as well as gene–environment interactions. Some genetic traits are instantly visible, such as eye color or number of limbs, and some are not, such as blood type, risk for specific diseases, or the thousands of basic biochemical processes that constitute life.

RNA: family of large biological molecules

Ribonucleic acid (RNA) is a polymeric molecule essential in various biological roles in coding, decoding, regulation and expression of genes. RNA and DNA are nucleic acids, and, along with lipids, proteins and carbohydrates, constitute the four major macromolecules essential for all known forms of life. Like DNA, RNA is assembled as a chain of nucleotides, but unlike DNA it is more often found in nature as a single-strand folded onto itself, rather than a paired double-strand. Cellular organisms use messenger RNA (mRNA) to convey genetic information that directs synthesis of specific proteins. Many viruses encode their genetic information using an RNA genome.

Related Research Articles

In genetics, an expressed sequence tag (EST) is a short sub-sequence of a cDNA sequence. ESTs may be used to identify gene transcripts, and are instrumental in gene discovery and in gene-sequence determination. The identification of ESTs has proceeded rapidly, with approximately 74.2 million ESTs now available in public databases.

Gene ontology (GO) is a major bioinformatics initiative to unify the representation of gene and gene product attributes across all species. More specifically, the project aims to: 1) maintain and develop its controlled vocabulary of gene and gene product attributes; 2) annotate genes and gene products, and assimilate and disseminate annotation data; and 3) provide tools for easy access to all aspects of the data provided by the project, and to enable functional interpretation of experimental data using the GO, for example via enrichment analysis. GO is part of a larger classification effort, the Open Biomedical Ontologies (OBO).

UniProt: database of protein sequence and functional information

UniProt is a freely accessible database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the research literature.

GNAS complex locus: protein-coding gene in the species Homo sapiens

GNAS complex locus is a gene locus in humans. Its main product is the stimulatory G-protein alpha subunit (Gs), a key component of many signal transduction pathways.

Mouse Genome Informatics (MGI) is a free, online database and bioinformatics resource hosted by The Jackson Laboratory, with funding by the National Human Genome Research Institute (NHGRI), the National Cancer Institute (NCI), and the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD). MGI provides access to data on the genetics, genomics and biology of the laboratory mouse to facilitate the study of human health and disease. The database integrates multiple projects, with the two largest contributions coming from the Mouse Genome Database and Gene Expression Database (GXD).

Small nucleolar RNA SNORD64

SNORD64 is a non-coding RNA (ncRNA) molecule which functions in the biogenesis (modification) of other small nuclear RNAs (snRNAs). This type of modifying RNA is located in the nucleolus of the eukaryotic cell which is a major site of snRNA biogenesis. It is known as a small nucleolar RNA (snoRNA) and also often referred to as a guide RNA.

GENCODE is a scientific project in genome research and part of the ENCODE scale-up project.

The UCSC Genome Browser is an on-line, and downloadable, genome browser hosted by the University of California, Santa Cruz (UCSC). It is an interactive website offering access to genome sequence data from a variety of vertebrate and invertebrate species and major model organisms, integrated with a large collection of aligned annotations. The Browser is a graphical viewer optimized to support fast interactive performance and is an open-source, web-based tool suite built on top of a MySQL database for rapid visualization, examination, and querying of the data at many levels. The Genome Browser Database, browsing tools, downloadable data files, and documentation can all be found on the UCSC Genome Bioinformatics website.

GeneCards is a database of human genes that provides genomic, proteomic, transcriptomic, genetic and functional information on all known and predicted human genes. It is developed and maintained by the Crown Human Genome Center at the Weizmann Institute of Science. The database aims to provide a quick overview of the currently available biomedical information about the searched gene, including the gene itself, the encoded proteins, and the relevant diseases. The GeneCards database provides access to free Web resources about more than 7000 known human genes, integrated from more than 90 data resources such as HGNC, Ensembl, and NCBI. The core gene list is based on approved gene symbols published by the HUGO Gene Nomenclature Committee (HGNC). The information is gathered and selected from these databases by the GeneCards search engine. If a search does not return any results, the database offers several suggestions, depending on the type of query, to help users complete their search, and provides direct links to the search engines of other databases. Over time, the GeneCards database has developed a suite of tools with more specialised capabilities. Since 1998, the GeneCards database has been widely used by the bioinformatics, genomics and medical communities.

A conjoined gene (CG) is defined as a gene that gives rise to transcripts by combining at least part of one exon from each of two or more distinct known (parent) genes that lie on the same chromosome, are in the same orientation, and often (95%) translate independently into different proteins. In some cases, the transcripts formed by CGs are translated to form chimeric or completely novel proteins.

A biological pathway is a series of interactions among molecules in a cell that leads to a certain product or a change in a cell. Such a pathway can trigger the assembly of new molecules, such as a fat or protein. Pathways can also turn genes on and off, or spur a cell to move. Some of the most common biological pathways are involved in metabolism, the regulation of gene expression and the transmission of signals. Pathways play a key role in advanced studies of genomics.

ASPicDB is a database of human protein variants generated by alternative splicing, a process by which the exons of the RNA produced by transcription of a gene are reconnected in multiple ways during RNA splicing.

The Consensus Coding Sequence (CCDS) Project is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies. The CCDS project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier, and ensures that they are consistently represented by the National Center for Biotechnology Information (NCBI), Ensembl, and UCSC Genome Browser. The integrity of the CCDS dataset is maintained through stringent quality assurance testing and on-going manual curation.

The Mammalian Promoter Database (MPromDb) is a curated database of gene promoters identified from ChIP-seq. The proximal promoter region contains the cis-regulatory elements of most of the transcription factors (TFs).

dcGO is a comprehensive ontology database for protein domains. As an ontology resource, dcGO integrates Open Biomedical Ontologies from a variety of contexts, ranging from functional information like Gene Ontology to others on enzymes and pathways, from phenotype information across major model organisms to information about human diseases and drugs. As a protein domain resource, dcGO includes annotations to both the individual domains and supra-domains.

WormBase is an online biological database about the biology and genome of the nematode model organism Caenorhabditis elegans and contains information about other related nematodes. WormBase is used by the C. elegans research community both as an information resource and as a place to publish and distribute their results. The database is regularly updated with new versions being released every two months. WormBase is one of the organizations participating in the Generic Model Organism Database (GMOD) project.

In bioinformatics, a gene disease database is a systematized collection of data, typically structured to model aspects of reality, that helps in comprehending the underlying mechanisms of complex diseases through the multiple composite interactions between phenotype-genotype relationships and gene-disease mechanisms. Gene disease databases integrate human gene-disease associations from various expert-curated databases and text-mining-derived associations, including Mendelian, complex and environmental diseases.

The Histone Database is a comprehensive database of histone protein sequences, including histone variants, classified by histone types and variants and maintained by the National Center for Biotechnology Information. The creation of the Histone Database was stimulated by the X-ray analysis of the structure of the nucleosomal core histone octamer, followed by the application of a novel motif-searching method to a group of proteins containing the histone fold motif in the early to mid-1990s. The first version of the Histone Database was released in 1995 and several updates have been released since then.

Mark Boguski: American physician

Mark Boguski is the Chief Medical Officer of Liberty BioSecurity, LLC and founded the Precision Medicine Network in 2014. He is a member of the U.S. National Academy of Medicine and a Fellow of both the College of American Pathologists and the American College of Medical Informatics. Dr. Boguski has served on the faculties of the U.S. National Institutes of Health, the Johns Hopkins University School of Medicine and Harvard Medical School, and as an executive in the biotechnology and pharmaceutical industries. He is a former Vice President and Global Head of Genome and Protein Sciences at Novartis and a graduate of the Medical Scientist Training Program at Washington University in St. Louis. He has also written a series of books on cancer for the general public under the series title Reimagining Cancer.

References

  1. "The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts". Nucleic Acids Res. England. 36 (Database issue): D793–9. Jan 2008. doi:10.1093/nar/gkm999. PMC 2238988. PMID 18089548.