Original author(s) | Andrej Sali |
---|---|
Developer(s) | University of California, San Francisco, Accelrys |
Initial release | 1989 |
Stable release | 10.3 / July 13, 2022 [1] |
Operating system | Unix, Linux, macOS, Windows |
Platform | x86, x86-64 |
Available in | English |
Type | homology modeling of proteins |
License | Proprietary: academic nonprofit freeware, commercial software |
Website | www |
Modeller, often stylized as MODELLER, is a computer program used for homology modeling to produce models of protein tertiary structures and, more rarely, quaternary structures. [2] [3] It implements a method inspired by nuclear magnetic resonance spectroscopy of proteins (protein NMR), termed satisfaction of spatial restraints, in which a set of geometrical criteria is used to create a probability density function for the location of each atom in the protein. The method relies on an input sequence alignment between the target amino acid sequence to be modeled and a template protein whose structure has been solved.
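The idea of satisfying spatial restraints can be illustrated with a toy example. The sketch below is not MODELLER's restraint set or optimizer: it assumes three point "atoms", three made-up Gaussian distance restraints, and plain gradient descent on the negative logarithm of the combined probability density. All names and numbers are hypothetical.

```python
import math

# Hypothetical restraints: (atom_i, atom_j, mean_distance, sigma).
# In real homology modeling such values would be derived from templates.
RESTRAINTS = [
    (0, 1, 3.8, 0.1),
    (1, 2, 3.8, 0.1),
    (0, 2, 6.0, 0.3),
]


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def objective(coords, restraints=RESTRAINTS):
    """Negative log of the product of Gaussian restraint densities
    (normalization constants dropped)."""
    total = 0.0
    for i, j, mean, sigma in restraints:
        d = distance(coords[i], coords[j])
        total += 0.5 * ((d - mean) / sigma) ** 2
    return total


def gradient(coords, restraints=RESTRAINTS):
    """Analytical gradient of the objective with respect to each coordinate."""
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for i, j, mean, sigma in restraints:
        d = distance(coords[i], coords[j])
        if d == 0.0:
            continue  # direction undefined for coincident atoms
        scale = (d - mean) / (sigma ** 2 * d)
        for k in range(3):
            g = scale * (coords[i][k] - coords[j][k])
            grad[i][k] += g
            grad[j][k] -= g
    return grad


def optimize(coords, steps=5000, rate=1e-3):
    """Plain gradient descent; returns a new list of coordinates."""
    coords = [list(c) for c in coords]
    for _ in range(steps):
        grad = gradient(coords)
        for atom, g in zip(coords, grad):
            for k in range(3):
                atom[k] -= rate * g[k]
    return coords


# A deliberately poor, non-collinear starting guess.
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
final = optimize(start)
print("objective before:", objective(start))
print("objective after: ", objective(final))
```

Because the three restraints are mutually compatible (they describe a valid triangle), the objective can be driven close to zero; with conflicting restraints the optimizer would instead settle on a compromise weighted by each restraint's sigma.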
The program also incorporates limited functions for ab initio structure prediction of loop regions of proteins, which are often highly variable even among homologous proteins and thus difficult to predict by homology modeling.
Modeller was originally written and is currently maintained by Andrej Sali at the University of California, San Francisco. [4] It runs on the operating systems Unix, Linux, macOS, and Windows. It is freeware for academic use. Graphical user interfaces (GUIs) and commercial versions are distributed by Accelrys. The ModWeb comparative protein structure modeling webserver is based on Modeller and other tools for automatic protein structure modeling, with an option to deposit the resulting models into ModBase. Due to Modeller's popularity, several third-party GUIs for MODELLER are also available.
Protein secondary structure is the local spatial conformation of the polypeptide backbone excluding the side chains. The two most common secondary structural elements are alpha helices and beta sheets, though beta turns and omega loops occur as well. Secondary structure elements typically spontaneously form as an intermediate before the protein folds into its three dimensional tertiary structure.
Protein structure prediction is the inference of the three-dimensional structure of a protein from its amino acid sequence—that is, the prediction of its secondary and tertiary structure from primary structure. Structure prediction is different from the inverse problem of protein design. Protein structure prediction is one of the most important goals pursued by computational biology; it is important in medicine and biotechnology.
Structural bioinformatics is the branch of bioinformatics that is related to the analysis and prediction of the three-dimensional structure of biological macromolecules such as proteins, RNA, and DNA. It deals with generalizations about macromolecular 3D structures such as comparisons of overall folds and local motifs, principles of molecular folding, evolution, binding interactions, and structure/function relationships, working both from experimentally solved structures and from computational models. The term structural has the same meaning as in structural biology, and structural bioinformatics can be seen as a part of computational structural biology. The main objective of structural bioinformatics is the creation of new methods of analysing and manipulating biological macromolecular data in order to solve problems in biology and generate new knowledge.
In computational biology, gene prediction or gene finding refers to the process of identifying the regions of genomic DNA that encode genes. This includes protein-coding genes as well as RNA genes, but may also include prediction of other functional elements such as regulatory regions. Gene finding is one of the first and most important steps in understanding the genome of a species once it has been sequenced.
PyMOL is a source-available molecular visualization system created by Warren Lyford DeLano. It was commercialized initially by DeLano Scientific LLC, a private software company dedicated to creating useful tools that become universally accessible to scientific and educational communities. It is currently commercialized by Schrödinger, Inc. Because the original software license was permissive, the company was able to replace it; new versions are no longer released under the Python license but under a custom license, and some of the source code is no longer released. PyMOL can produce high-quality 3D images of small molecules and biological macromolecules, such as proteins. According to the original author, by 2009, almost a quarter of all published images of 3D protein structures in the scientific literature were made using PyMOL.
In molecular biology, protein threading, also known as fold recognition, is a method of protein modeling which is used to model those proteins which have the same fold as proteins of known structures, but do not have homologous proteins with known structure. It differs from the homology modeling method of structure prediction as it is used for proteins which do not have their homologous protein structures deposited in the Protein Data Bank (PDB), whereas homology modeling is used for those proteins which do. Threading works by using statistical knowledge of the relationship between the structures deposited in the PDB and the sequence of the protein which one wishes to model.
UCSF Chimera is an extensible program for interactive visualization and analysis of molecular structures and related data, including density maps, supramolecular assemblies, sequence alignments, docking results, trajectories, and conformational ensembles. High-quality images and movies can be created. Chimera includes complete documentation and can be downloaded free of charge for noncommercial use.
Homology modeling, also known as comparative modeling of proteins, refers to constructing an atomic-resolution model of the "target" protein from its amino acid sequence and an experimental three-dimensional structure of a related homologous protein. Homology modeling relies on the identification of one or more known protein structures likely to resemble the structure of the query sequence, and on the production of an alignment that maps residues in the query sequence to residues in the template sequence. Protein structures are more conserved than protein sequences among homologues, but sequences with less than 20% sequence identity can have very different structures.
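The role of the alignment can be sketched in a few lines. The example below assumes a gapped pairwise alignment given as two equal-length strings with `-` for gaps. The sequences are made up, and percent identity is computed over aligned (non-gap) columns only, which is one of several conventions in use.

```python
def residue_mapping(target_aln, template_aln):
    """Map 0-based target residue indices to template residue indices
    for every alignment column where both sequences have a residue."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    t_idx = p_idx = 0
    for t, p in zip(target_aln, template_aln):
        if t != "-" and p != "-":
            mapping[t_idx] = p_idx
        if t != "-":
            t_idx += 1
        if p != "-":
            p_idx += 1
    return mapping


def percent_identity(target_aln, template_aln):
    """Identical residues as a percentage of aligned (non-gap) columns."""
    aligned = identical = 0
    for t, p in zip(target_aln, template_aln):
        if t != "-" and p != "-":
            aligned += 1
            if t == p:
                identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


# Hypothetical toy alignment (not real protein sequences).
target = "MKT-AYIA"
template = "MRTLAY-A"
print(residue_mapping(target, template))
print(round(percent_identity(target, template), 1))
```

Target residues that fall opposite a gap in the template (here the `I`) have no entry in the mapping; these are the positions for which no template coordinates are available.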
Loop modeling is a problem in protein structure prediction requiring the prediction of the conformations of loop regions in proteins with or without the use of a structural template. Computer programs that solve these problems have been used to research a broad range of scientific topics from ADP to breast cancer. Because protein function is determined by its shape and the physicochemical properties of its exposed surface, it is important to create an accurate model for protein/ligand interaction studies. The problem arises often in homology modeling, where the tertiary structure of an amino acid sequence is predicted based on a sequence alignment to a template, or a second sequence whose structure is known. Because loops have highly variable sequences even within a given structural motif or protein fold, they often correspond to unaligned regions in sequence alignments; they also tend to be located at the solvent-exposed surface of globular proteins and thus are more conformationally flexible. Consequently, they often cannot be modeled using standard homology modeling techniques. More constrained versions of loop modeling are also used in the data fitting stages of solving a protein structure by X-ray crystallography, because loops can correspond to regions of low electron density and are therefore difficult to resolve.
EVA was a continuously running benchmark project for assessing the quality and value of protein structure prediction and secondary structure prediction methods. Methods for predicting both secondary structure and tertiary structure, including homology modeling, protein threading, and contact order prediction, were compared to results from each week's newly solved protein structures deposited in the Protein Data Bank. The project aimed to determine the prediction accuracy that would be expected for non-expert users of common, publicly available prediction webservers; this is similar to the related LiveBench project and stands in contrast to the biennial benchmark CASP, which aims to identify the maximum accuracy achievable by prediction experts.
Avogadro is a molecule editor and visualizer designed for cross-platform use in computational chemistry, molecular modeling, bioinformatics, materials science, and related areas. It is extensible via a plugin architecture.
Developed by the Dunbrack group at Fox Chase Cancer Center, MolIDE is an open-source cross-platform program for comparative modelling of protein structures. MolIDE acts as a graphical user interface to the common tasks involved in predicting protein structures based on known homologous structures. It implements the most frequently used steps involved in modeling: secondary structure prediction, multiple-round PSI-BLAST alignments, assisted alignment editing, side chain replacement and loop building.
FoldX is a protein design algorithm that uses an empirical force field. It can determine the energetic effect of point mutations as well as the interaction energy of protein complexes. FoldX can mutate protein and DNA side chains using a probability-based rotamer library, while exploring alternative conformations of the surrounding side chains.
Phyre and Phyre2 are free web-based services for protein structure prediction. Phyre is among the most popular methods for protein structure prediction, having been cited over 1500 times. Like other remote homology recognition techniques, it can regularly generate reliable protein models when other widely used methods such as PSI-BLAST cannot. Phyre2 has been designed to provide a user-friendly interface for users inexpert in protein structure prediction methods. Its development is funded by the Biotechnology and Biological Sciences Research Council.
ModBase is a database of annotated comparative protein structure models, containing models for more than 3.8 million unique protein sequences. Models are created by the comparative modeling pipeline ModPipe which relies on the MODELLER program.
ProBiS is computer software that allows prediction of binding sites and their corresponding ligands for a given protein structure. ProBiS was initially developed as the ProBiS algorithm by Janez Konc and Dušanka Janežič in 2010 and is now available as ProBiS server, ProBiS CHARMMing server, ProBiS algorithm and ProBiS plugin. The name ProBiS originates from the purpose of the software itself, that is, to predict for a given Protein structure Binding Sites and their corresponding ligands.
Non-coding RNAs have been discovered using both experimental and bioinformatic approaches. Bioinformatic approaches can be divided into three main categories. The first involves homology search, although these techniques are by definition unable to find new classes of ncRNAs. The second category includes algorithms designed to discover specific types of ncRNAs that have similar properties. Finally, some discovery methods are based on very general properties of RNA, and are thus able to discover entirely new kinds of ncRNAs.
FlexAID is a molecular docking software that can use small molecules and peptides as ligands and proteins and nucleic acids as docking targets. As the name suggests, FlexAID supports full ligand flexibility as well as side-chain flexibility of the target. It does so using a soft scoring function based on the complementarity of the two surfaces.
In biochemistry, a backbone-dependent rotamer library provides the frequencies, mean dihedral angles, and standard deviations of the discrete conformations of the amino acid side chains in proteins as a function of the backbone dihedral angles φ and ψ of the Ramachandran map. By contrast, backbone-independent rotamer libraries express the frequencies and mean dihedral angles for all side chains in proteins, regardless of the backbone conformation of each residue type. Backbone-dependent rotamer libraries have been shown to have significant advantages over backbone-independent rotamer libraries, principally when used as an energy term, by speeding up search times of side-chain packing algorithms used in protein structure prediction and protein design.
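The difference between the two kinds of library can be sketched as a lookup keyed either by residue type alone or by residue type plus binned backbone angles. The example below is only an illustration: the 10° bin width, the table layout, and every probability and angle are made-up placeholders, not values from any published rotamer library.

```python
# Hypothetical placeholder entries; each rotamer is
# (name, probability, mean chi1 in degrees).
BACKBONE_DEPENDENT = {
    ("SER", -60, -40): [("g+", 0.50, 65.0), ("t", 0.20, 180.0), ("g-", 0.30, -65.0)],
}
BACKBONE_INDEPENDENT = {
    "SER": [("g+", 0.45, 64.0), ("t", 0.25, 178.0), ("g-", 0.30, -64.0)],
}


def bin_angle(angle, width=10):
    """Wrap an angle to [-180, 180) and snap it to the nearest bin centre."""
    wrapped = (angle + 180.0) % 360.0 - 180.0
    centre = round(wrapped / width) * width
    return -180 if centre == 180 else int(centre)


def lookup_rotamers(resname, phi, psi):
    """Prefer the backbone-dependent entry for the (phi, psi) bin;
    fall back to the backbone-independent entry for the residue type."""
    key = (resname, bin_angle(phi), bin_angle(psi))
    if key in BACKBONE_DEPENDENT:
        return BACKBONE_DEPENDENT[key]
    return BACKBONE_INDEPENDENT[resname]


def most_probable(rotamers):
    """Return the rotamer tuple with the highest probability."""
    return max(rotamers, key=lambda r: r[1])


print(most_probable(lookup_rotamers("SER", -62.0, -43.0)))
print(most_probable(lookup_rotamers("SER", 120.0, 130.0)))
```

A side-chain packing algorithm that consults the backbone-dependent table only has to consider rotamers that are likely for the local backbone conformation, which is one way such libraries can narrow the search.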