HIVToolbox is an internet application that helps scientists develop new hypotheses and interpret experiments, with the goal of better understanding HIV, the virus that causes AIDS, and identifying potential new drug targets. [1]
The information about HIV includes the different parts of the HIV proteins, often referred to as domains. HIV proteins also contain smaller regions called minimotifs. These minimotifs, also called short linear motifs, encode different molecular functions in HIV proteins. [2] HIVToolbox runs on a MySQL database designed to integrate HIV data for protein sequences derived from many different isolates, protein structures, protein domains, protein-protein interactions, functional elements in proteins, and known and predicted minimotifs from the Minimotif Miner database. [3] [4] Other amino acids involved in molecular functions such as catalysis and dimerization are also included. The database also contains calculated amino acid conservation for all HIV proteins, which helps to identify regions of the virus that do not change and are therefore thought to be important. The second release, HIVToolbox2, also contains information about the locations on the proteins where different HIV drugs bind, and their spatial relationship to the mutations that arise in HIV and render the virus resistant to those drugs.
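The source does not state how HIVToolbox calculates amino acid conservation. As an illustration only, the following minimal sketch assumes a set of pre-aligned sequences of equal length and scores each position by the fraction of sequences that share the most common residue. The function name `column_conservation` and the example sequences are hypothetical and are not taken from HIVToolbox.

```python
from collections import Counter


def column_conservation(aligned_sequences):
    """Score each alignment column by the frequency of its most common residue.

    Assumption: sequences are already aligned and have equal length.
    Gap characters, if present, are counted like any other symbol.
    """
    if not aligned_sequences:
        return []
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")

    scores = []
    for column in zip(*aligned_sequences):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


# Hypothetical aligned fragments from four isolates (made-up data).
isolates = ["MGARA", "MGARS", "MGSRA", "MGARA"]
scores = column_conservation(isolates)

# 1-based positions where every isolate carries the same residue.
invariant_positions = [i + 1 for i, s in enumerate(scores) if s == 1.0]
print(scores)               # [1.0, 1.0, 0.75, 1.0, 0.75]
print(invariant_positions)  # [1, 2, 4]
```

Positions with a score of 1.0 are identical in every sequence examined. Other measures, such as Shannon entropy, are also commonly used for this purpose.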
Retroviral integrase (IN) is an enzyme produced by a retrovirus that integrates its genetic information into that of the host cell it infects. Retroviral INs are not to be confused with phage integrases (recombinases) used in biotechnology, such as λ phage integrase, as discussed in site-specific recombination.
A protease is an enzyme that catalyzes proteolysis, breaking down proteins into smaller polypeptides or single amino acids, and spurring the formation of new protein products. They do this by cleaving the peptide bonds within proteins by hydrolysis, a reaction where water breaks bonds. Proteases are involved in numerous biological pathways, including digestion of ingested proteins, protein catabolism, and cell signaling.
Glycoproteins are proteins which contain oligosaccharide chains covalently attached to amino acid side-chains. The carbohydrate is attached to the protein in a cotranslational or posttranslational modification. This process is known as glycosylation. Secreted extracellular proteins are often glycosylated.
Protein structure prediction is the inference of the three-dimensional structure of a protein from its amino acid sequence—that is, the prediction of its secondary and tertiary structure from primary structure. Structure prediction is different from the inverse problem of protein design. Protein structure prediction is one of the most important goals pursued by computational biology; it is important in medicine and biotechnology.
In biology, a sequence motif is a nucleotide or amino-acid sequence pattern that is widespread and usually assumed to be related to the biological function of the macromolecule. For example, an N-glycosylation site motif can be defined as Asn, followed by any residue but Pro, followed by either Ser or Thr, followed by any residue but Pro.
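The N-glycosylation motif described above maps directly onto a regular expression over one-letter amino acid codes. The sketch below is illustrative: the function name `find_n_glycosylation_sites` and the example sequence are hypothetical, and a pattern match only marks a candidate site, not a confirmed glycosylation event.

```python
import re

# Asn (N), then any residue except Pro (P), then Ser (S) or Thr (T),
# then any residue except Pro. The lookahead allows overlapping sites
# to be reported.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_n_glycosylation_sites(sequence):
    """Return (1-based start position, matched 4-residue window) pairs."""
    return [
        (match.start() + 1, match.group(1))
        for match in N_GLYCOSYLATION.finditer(sequence.upper())
    ]


# Made-up sequence: "NPSL" is rejected because Pro follows Asn.
print(find_n_glycosylation_sites("MNGTSANPSLNKTA"))
# [(2, 'NGTS'), (11, 'NKTA')]
```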
A protein family is a group of evolutionarily related proteins. In many cases, a protein family has a corresponding gene family, in which each gene encodes a corresponding protein with a 1:1 relationship. The term "protein family" should not be confused with family as it is used in taxonomy.
A frameshift mutation is a genetic mutation caused by indels (insertions or deletions) of a number of nucleotides that is not divisible by three. Because codons are read in triplets, the insertion or deletion can change the reading frame, resulting in a completely different translation from the original. The earlier in the sequence the deletion or insertion occurs, the more altered the protein. A frameshift mutation is not the same as a single-nucleotide polymorphism, in which a nucleotide is replaced rather than inserted or deleted. A frameshift mutation will in general cause the codons after the mutation to code for different amino acids. It will also alter the first stop codon encountered in the sequence. The polypeptide being created could be abnormally short or abnormally long, and will most likely not be functional.
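The effect of an indel on the reading frame can be shown with a short translation sketch. It uses the standard genetic code and a made-up 18-nucleotide coding sequence; the helper names `translate`, `insert_bases`, and `delete_bases` are hypothetical and chosen for illustration.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_bases(dna, position, bases):
    """Insert bases before the given 0-based position."""
    return dna[:position] + bases + dna[position:]


def delete_bases(dna, position, count):
    """Delete count bases starting at the given 0-based position."""
    return dna[:position] + dna[position + count:]


original = "ATGGCCAAGTTGACCTGA"  # made-up sequence: ATG GCC AAG TTG ACC TGA

print(translate(original))                            # MAKLT
print(translate(insert_bases(original, 3, "A")))      # MSQVDL (stop codon lost)
print(translate(delete_bases(original, 3, 1)))        # MPS (premature stop)
print(translate(insert_bases(original, 3, "AAA")))    # MKAKLT (frame preserved)
```

The single-base insertion removes the original stop codon from the reading frame, the single-base deletion introduces an earlier one, and the three-base insertion adds one amino acid without shifting the frame.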
A coiled coil is a structural motif in proteins in which 2–7 alpha-helices are coiled together like the strands of a rope. Coiled coils have been found in roughly 5–10% of proteins and have a variety of functions. They are one of the most widespread motifs found in protein-protein interactions. To aid protein study, several tools have been developed to predict coiled coils in protein structures. Many coiled coil-type proteins are involved in important biological functions, such as the regulation of gene expression (for example, transcription factors). Notable examples are the oncoproteins c-Fos and c-Jun, as well as the muscle protein tropomyosin.
In molecular biology, an intrinsically disordered protein (IDP) is a protein that lacks a fixed or ordered three-dimensional structure, typically in the absence of its macromolecular interaction partners, such as other proteins or RNA. IDPs range from fully unstructured to partially structured and include random coil, molten globule-like aggregates, or flexible linkers in large multi-domain proteins. They are sometimes considered as a separate class of proteins along with globular, fibrous and membrane proteins.
Molecular mimicry is the theoretical possibility that sequence similarities between foreign and self-peptides are enough to result in the cross-activation of autoreactive T or B cells by pathogen-derived peptides. Despite the prevalence of several peptide sequences which can be both foreign and self in nature, just a few crucial residues can activate a single antibody or TCR. This highlights the importance of structural homology in the theory of molecular mimicry. Upon activation, these "peptide mimic" specific T or B cells can cross-react with self-epitopes, thus leading to tissue pathology (autoimmunity). Molecular mimicry is one of several ways in which autoimmunity can be evoked. A molecular mimicking event is more than an epiphenomenon despite its low probability, and these events have serious implications in the onset of many human autoimmune disorders.
The omega loop is a non-regular protein structural motif, consisting of a loop of six or more amino acid residues and any amino acid sequence. The defining characteristic is that residues that make up the beginning and end of the loop are close together in space with no intervening lengths of regular secondary structural motifs. It is named after its shape, which resembles the upper-case Greek letter Omega (Ω).
HIV-1 protease or PR is a retroviral aspartyl protease (retropepsin), an enzyme involved with peptide bond hydrolysis in retroviruses, that is essential for the life-cycle of HIV, the retrovirus that causes AIDS. HIV-1 PR cleaves newly synthesized polyproteins at nine cleavage sites to create the mature protein components of an HIV virion, the infectious form of a virus outside of the host cell. Without effective HIV-1 PR, HIV virions remain uninfectious.
Ribosomal frameshifting, also known as translational frameshifting or translational recoding, is a biological phenomenon that occurs during translation that results in the production of multiple, unique proteins from a single mRNA. The process can be programmed by the nucleotide sequence of the mRNA and is sometimes affected by the secondary, 3-dimensional mRNA structure. It has been described mainly in viruses, retrotransposons and bacterial insertion elements, and also in some cellular genes.
TAF9 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 32kDa, also known as TAF9, is a protein that in humans is encoded by the TAF9 gene.
The p24 capsid protein is the most abundant HIV protein, with each virus containing approximately 1,500 to 3,000 p24 molecules. It is the major structural protein within the capsid, and it is involved in maintaining the structural integrity of the virus and facilitating various stages of the viral life cycle, including viral entry into host cells and the release of new virus particles. Detection of the p24 antigen can be used to identify the presence of HIV in a person's blood; however, more modern tests have taken its place. After approximately 50 days of infection, the p24 antigen is often cleared from the bloodstream entirely.
A nuclear export signal (NES) is a short target peptide containing 4 hydrophobic residues in a protein that targets it for export from the cell nucleus to the cytoplasm through the nuclear pore complex using nuclear transport. It has the opposite effect of a nuclear localization signal, which targets a protein located in the cytoplasm for import to the nucleus. The NES is recognized and bound by exportins.
Rev is a transactivating protein that is essential to the regulation of HIV-1 protein expression. A nuclear localization signal is encoded in the rev gene, which allows the Rev protein to be localized to the nucleus, where it is involved in the export of unspliced and incompletely spliced mRNAs. In the absence of Rev, mRNAs of the HIV-1 late (structural) genes are retained in the nucleus, preventing their translation.
Vpr is a human immunodeficiency virus (HIV) gene and protein product. Vpr stands for "Viral Protein R". Vpr, a 96-amino-acid, 14-kDa protein, plays an important role in regulating nuclear import of the HIV-1 pre-integration complex, and is required for virus replication and enhanced gene expression from provirus in dividing or non-dividing cells such as T cells or macrophages. Vpr also induces G2 cell cycle arrest and apoptosis in proliferating cells, which can result in immune dysfunction.
In molecular biology, short linear motifs (SLiMs), linear motifs, or minimotifs are short stretches of protein sequence that mediate protein–protein interaction.
Minimotif Miner is a program and database designed to identify minimotifs in any protein. Minimotifs are short, contiguous peptide sequences that are known to have a function in at least one protein. Minimotifs are also called sequence motifs or short linear motifs (SLiMs). They are generally restricted to one secondary structure element and are fewer than 15 amino acids in length.