Structural bioinformatics

Three-dimensional structure of a protein

Structural bioinformatics is the branch of bioinformatics that is related to the analysis and prediction of the three-dimensional structure of biological macromolecules such as proteins, RNA, and DNA. It deals with generalizations about macromolecular 3D structures such as comparisons of overall folds and local motifs, principles of molecular folding, evolution, binding interactions, and structure/function relationships, working both from experimentally solved structures and from computational models. The term structural has the same meaning as in structural biology, and structural bioinformatics can be seen as a part of computational structural biology. The main objective of structural bioinformatics is the creation of new methods of analysing and manipulating biological macromolecular data in order to solve problems in biology and generate new knowledge. [1]

Introduction

Protein structure

The structure of a protein is directly related to its function. The presence of certain chemical groups in specific locations allows proteins to act as enzymes, catalyzing chemical reactions. [2] In general, protein structures are classified into four levels: primary (sequence), secondary (local conformation of the polypeptide chain), tertiary (three-dimensional structure of the protein fold), and quaternary (association of multiple polypeptide chains). Structural bioinformatics mainly addresses interactions among structures, taking their spatial coordinates into consideration. Thus, the primary structure is better analyzed in traditional branches of bioinformatics. However, the sequence imposes restrictions that allow the formation of conserved local conformations of the polypeptide chain, such as alpha-helices, beta-sheets, and loops (secondary structure). [3] Also, weak interactions (such as hydrogen bonds) stabilize the protein fold. Interactions can be intrachain, i.e., occurring between parts of the same protein monomer (tertiary structure), or interchain, i.e., occurring between different structures (quaternary structure). Finally, the topological arrangement of interactions, whether strong or weak, and of entanglements is also studied in structural bioinformatics, using frameworks such as circuit topology.

Structure visualization

Structural visualization of BACTERIOPHAGE T4 LYSOZYME (PDB ID: 2LZM). (A) Cartoon; (B) Lines; (C) Surface; (D) Sticks.

Protein structure visualization is an important issue for structural bioinformatics. [4] It allows users to observe static or dynamic representations of molecules and to detect interactions that may support inferences about molecular mechanisms. The most common types of visualization, shown in the figure above, are cartoon, lines, surface, and sticks.

DNA structure

The classic structure of the DNA duplex was initially described by Watson and Crick, with contributions from Rosalind Franklin. DNA is built from nucleotides, each composed of three components: a phosphate group, a pentose (deoxyribose), and a nitrogenous base (adenine, thymine, cytosine, or guanine). The DNA double helix is stabilized by hydrogen bonds formed between base pairs: adenine with thymine (A-T) and cytosine with guanine (C-G). Many structural bioinformatics studies have focused on understanding interactions between DNA and small molecules, which have been the subject of several drug design studies.

Interactions

Interactions are contacts established between parts of molecules at different levels. They are responsible for stabilizing protein structures and take part in a wide range of activities. In biochemistry, interactions are characterized by the proximity of atom groups or molecular regions that exert an effect upon one another, such as electrostatic forces, hydrogen bonding, and the hydrophobic effect. Proteins can take part in several types of interactions, such as protein-protein interactions (PPI), protein-peptide interactions [5], protein-ligand interactions (PLI) [6], and protein-DNA interactions.

Contacts between two amino acid residues: Q196-R200 (PDB ID: 2X1C)

Calculating contacts

Calculating contacts is a central task in structural bioinformatics. It is required for the prediction of protein structure and folding and of thermodynamic stability, for the study of protein-protein and protein-ligand interactions, and for docking and molecular dynamics analyses. [8]

Traditionally, computational methods have used a threshold distance between atoms (also called a cutoff) to detect possible interactions. [9] This detection is based on the Euclidean distance and the angles between atoms of given types. However, most methods based on simple Euclidean distance cannot detect occluded contacts. Hence, cutoff-free methods, such as Delaunay triangulation, have gained prominence in recent years. In addition, combinations of criteria, for example physicochemical properties, distance, geometry, and angles, have been used to improve contact determination. [8]

Distance criteria for contact definition [8]
Type | Max distance criteria
Hydrogen bond | 3.9 Å
Hydrophobic interaction | 5 Å
Ionic interaction | 6 Å
Aromatic stacking | 6 Å
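
As a rough illustration of the cutoff-based approach, the following Python sketch lists every pair of atoms whose Euclidean distance does not exceed a given maximum distance from the table above. It is a minimal example under stated assumptions: the function name, the dictionary keys, and the coordinates are hypothetical, and the atom-type and angle checks that real methods apply are left out.

```python
import math
from itertools import combinations

# Maximum distances (in Å) from the table above; the keys are illustrative
# names chosen for this sketch.
MAX_DISTANCE = {
    "hydrogen_bond": 3.9,
    "hydrophobic": 5.0,
    "ionic": 6.0,
    "aromatic_stacking": 6.0,
}


def find_contacts(atoms, cutoff):
    """Return (label_a, label_b, distance) for every atom pair within cutoff.

    `atoms` is a list of (label, (x, y, z)) tuples with coordinates in Å.
    Only distance is checked; atom types and angles are ignored here.
    """
    contacts = []
    for (label_a, xyz_a), (label_b, xyz_b) in combinations(atoms, 2):
        distance = math.dist(xyz_a, xyz_b)
        if distance <= cutoff:
            contacts.append((label_a, label_b, round(distance, 2)))
    return contacts


# Hypothetical coordinates, not taken from any PDB entry.
atoms = [
    ("A", (0.0, 0.0, 0.0)),
    ("B", (3.0, 0.0, 0.0)),
    ("C", (0.0, 5.5, 0.0)),
]

print(find_contacts(atoms, MAX_DISTANCE["hydrogen_bond"]))
print(find_contacts(atoms, MAX_DISTANCE["ionic"]))
```

The pairwise loop scales quadratically with the number of atoms, which is acceptable for a single small structure but would probably need a spatial index for large complexes.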

Protein Data Bank (PDB)

The number of structures in the PDB. (A) The overall growth of released structures in the Protein Data Bank per year. (B) Growth of structures deposited in the PDB from X-ray crystallography, NMR spectroscopy, and 3D electron microscopy experiments per year. Source: https://www.rcsb.org/stats/growth

The Protein Data Bank (PDB) is a database of 3D structure data for large biological molecules, such as proteins, DNA, and RNA. The PDB is managed by an international organization called the Worldwide Protein Data Bank (wwPDB), which is composed of several member organizations, such as PDBe, PDBj, RCSB, and BMRB. They are responsible for keeping copies of PDB data available on the internet at no charge. The number of structures available in the PDB has increased each year, with data typically obtained by X-ray crystallography, NMR spectroscopy, or cryo-electron microscopy.

Data format

The PDB format (.pdb) is the legacy text file format used by the Protein Data Bank to store the three-dimensional structures of macromolecules. Because of limitations in its fixed-width design, the PDB format cannot represent large structures containing more than 62 chains or 99999 atom records. [10]

The PDBx/mmCIF (macromolecular Crystallographic Information File) is a standard text file format for representing crystallographic information. [11] In 2014, the PDBx/mmCIF file format (.cif) replaced the PDB format as the standard distribution format of the PDB archive. While the PDB format consists of records identified by a keyword of up to six characters, the PDBx/mmCIF format is based on key-value pairs, where the key is a name that identifies some feature and the value is the variable information. [12]
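
The difference between the two formats can be sketched in Python. The first function reads the fixed-width columns of a legacy ATOM record, and the second reads simple key-value lines of the kind used in PDBx/mmCIF. Both function names and all sample values are hypothetical, and the mmCIF reader deliberately ignores loops, quoting, and multi-line values, so it is an illustration of the key-value idea and not a usable parser.

```python
def parse_atom_record(line):
    """Parse one fixed-width ATOM/HETATM record of the legacy PDB format."""
    return {
        "record": line[0:6].strip(),        # columns 1-6: record keyword
        "serial": int(line[6:11]),          # columns 7-11: atom serial number
        "atom_name": line[12:16].strip(),   # columns 13-16: atom name
        "res_name": line[17:20].strip(),    # columns 18-20: residue name
        "chain_id": line[21],               # column 22: chain identifier
        "res_seq": int(line[22:26]),        # columns 23-26: residue number
        "x": float(line[30:38]),            # columns 31-38: x coordinate (Å)
        "y": float(line[38:46]),            # columns 39-46: y coordinate (Å)
        "z": float(line[46:54]),            # columns 47-54: z coordinate (Å)
    }


def parse_mmcif_items(text):
    """Collect simple `_category.item value` pairs (no loops, no quoting)."""
    items = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("_"):
            parts = line.split(None, 1)
            if len(parts) == 2:
                items[parts[0]] = parts[1].strip()
    return items


# Hypothetical sample data, laid out to follow the column rules above.
SAMPLE_ATOM_LINE = (
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N"
)
SAMPLE_MMCIF = """
_entry.id EXAMPLE
_cell.length_a 61.200
_cell.length_b 61.200
"""

print(parse_atom_record(SAMPLE_ATOM_LINE))
print(parse_mmcif_items(SAMPLE_MMCIF))
```

Because the serial number occupies only five columns, values above 99999 do not fit, which is one way to see where the atom limit of the legacy format comes from.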

Other structural databases

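In addition to the Protein Data Bank (PDB), there are several databases of protein structures and other macromolecules. Examples include the Nucleic Acid Database (NDB) [14], the Structural Classification of Proteins (SCOP) [15], and the Uppsala Electron Density Server (EDS) [17].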

Structure comparison

Structural alignment

Structural alignment is a method for comparing 3D structures based on their shape and conformation. [23] It can be used to infer evolutionary relationships among a set of proteins even when sequence similarity is low. Structural alignment superimposes one 3D structure on another by rotating and translating it so that atoms in corresponding positions match (in general, the Cα atoms or the backbone heavy atoms C, N, O, and Cα). Usually, the alignment quality is evaluated by the root-mean-square deviation (RMSD) of atomic positions, i.e., the square root of the mean squared distance between atoms after superimposition:

$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\delta_i^{2}}$$

where δ_i is the distance between atom i and either the corresponding reference atom in the other structure or the mean coordinate of the N equivalent atoms. In general, the RMSD is expressed in ångströms (Å), where 1 Å is equal to 10⁻¹⁰ m. The nearer the RMSD is to zero, the more similar the structures are.
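
The RMSD described above can be computed in a few lines of Python. This sketch assumes that the two coordinate lists are already superimposed and paired atom by atom. The search for the optimal rotation and translation is not shown, and the function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Both lists must have the same, non-zero length and be in the same
    reference frame (i.e., already superimposed).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_distances = [
        math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b)
    ]
    return math.sqrt(sum(squared_distances) / len(squared_distances))


# Hypothetical Cα positions in Å: the second atom is displaced by 2 Å.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]

print(rmsd(structure_a, structure_b))
```

Here the per-atom distances are 0 Å and 2 Å, so the result is the square root of (0 + 4) / 2.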

Graph-based structural signatures

Structural signatures, also called fingerprints, are representations of macromolecular patterns that can be used to infer similarities and differences. Comparing a large set of proteins using RMSD is still a challenge because of the high computational cost of structural alignments. Structural signatures based on graph distance patterns among atom pairs have been used to build vectors that identify proteins and to detect non-trivial information. [24] Furthermore, linear algebra and machine learning can be used for clustering protein signatures, detecting protein-ligand interactions, predicting ΔΔG, and proposing mutations based on Euclidean distance. [25]

Structure prediction

A Ramachandran plot generated from human PCNA (PDB ID 1AXC). The red, brown, and yellow regions represent the favored, allowed, and "generously allowed" regions as defined by ProCheck. This plot can be used to identify incorrectly modeled amino acids.

The atomic structures of molecules can be obtained by several methods, such as X-ray crystallography (XRC), NMR spectroscopy, and 3D electron microscopy. However, these methods can be costly, and some structures, such as those of membrane proteins, are difficult to determine experimentally. Hence, computational approaches are needed to determine the 3D structures of macromolecules. Structure prediction methods are classified into comparative modeling and de novo modeling.

Comparative modeling

Comparative modeling, also known as homology modeling, is the construction of a three-dimensional structure from the amino acid sequence of a target protein and a template with a known structure. The literature reports that evolutionarily related proteins tend to have a conserved three-dimensional structure. [26] However, distantly related proteins whose sequence identity is lower than 20% can have different folds. [27]

De novo modeling

In structural bioinformatics, de novo modeling, also known as ab initio modeling, refers to approaches for obtaining three-dimensional structures from sequences without the need for a homologous known 3D structure. Despite the new algorithms and methods proposed in recent years, de novo protein structure prediction is still considered one of the remaining outstanding problems in modern science. [28]

Structure validation

After structure modeling, an additional step of structure validation is necessary, since many comparative and de novo modeling algorithms and tools use heuristics to assemble the 3D structure, which can introduce many errors. Some validation strategies consist of calculating energy scores and comparing them with those of experimentally determined structures. For example, the DOPE score is an energy score used by the MODELLER tool for determining the best model. [29]

Another validation strategy is calculating the φ and ψ backbone dihedral angles of all residues and constructing a Ramachandran plot. The side chains of amino acids and the nature of interactions in the backbone restrict these two angles, so the allowed conformations can be visualized on the Ramachandran plot. A large number of amino acids located in non-permitted regions of the chart is an indication of low-quality modeling.
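
The backbone angles used in a Ramachandran plot are ordinary dihedral angles between four consecutive atoms. The Python sketch below computes a signed dihedral angle from four points using a standard vector formula. The function names and test coordinates are hypothetical, and assigning residues to favored or allowed regions is not attempted here.

```python
import math


def _sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))


def _dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical points: the last point is rotated a quarter turn about the
# central bond relative to the first one.
print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```

For residue i, φ would be obtained by passing the coordinates of C(i-1), N(i), Cα(i), and C(i), and ψ by passing N(i), Cα(i), C(i), and N(i+1).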

Prediction tools

A list with commonly used software tools for protein structure prediction, including comparative modeling, protein threading, de novo protein structure prediction, and secondary structure prediction is available in the list of protein structure prediction software.

Molecular docking

Representation of docking a ligand (green) to a protein target (black).

Molecular docking (also referred to simply as docking) is a method used to predict the position and orientation of a molecule (ligand) when bound to another one (receptor or target). Binding occurs mostly through non-covalent interactions, although covalent binding can also be studied. Molecular docking aims to predict possible poses (binding modes) of the ligand when it interacts with specific regions of the receptor. Docking tools use force fields to compute a score that ranks the poses, favoring those with better interactions between the two molecules.

In general, docking protocols are used to predict the interactions between small molecules and proteins. However, docking also can be used to detect associations and binding modes among proteins, peptides, DNA or RNA molecules, carbohydrates, and other macromolecules.

Virtual screening

Virtual screening (VS) is a computational approach used for fast screening of large compound libraries for drug discovery. Usually, virtual screening uses docking algorithms to rank small molecules by their predicted affinity for a target receptor.

In recent years, several tools have been used to evaluate virtual screening in the process of discovering new drugs. However, problems such as missing information, inaccurate understanding of drug-like molecular properties, weak scoring functions, and insufficient docking strategies hinder the docking process. Hence, the literature still does not consider virtual screening a mature technology. [30] [31]

Molecular dynamics

Example: molecular dynamics of a glucose-tolerant β-Glucosidase

Molecular dynamics (MD) is a computational method for simulating interactions between molecules and their atoms over a given period of time. [33] This method allows the behavior of molecules and their interactions to be observed, considering the system as a whole. To calculate the behavior of the system and thus determine the trajectories, an MD simulation integrates Newton's equations of motion, using molecular mechanics methods (force fields) to estimate the forces acting between particles. [34]
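
The idea of integrating Newton's equations of motion can be illustrated with a toy Python example. The sketch below uses the velocity Verlet scheme, a common choice of integrator that the text above does not specifically name, and a single particle on a harmonic spring as a stand-in for a real force field. All names, units, and parameter values are hypothetical.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one-dimensional motion and return a list of (x, v) states."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass  # half-step velocity update
        x = x + dt * v_half               # full-step position update
        f = force(x)                      # force at the new position
        v = v_half + 0.5 * dt * f / mass  # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


SPRING_CONSTANT = 1.0  # hypothetical value, arbitrary units


def harmonic_force(x):
    """Toy force field: a spring pulling the particle back to x = 0."""
    return -SPRING_CONSTANT * x


trajectory = velocity_verlet(
    x=1.0, v=0.0, force=harmonic_force, mass=1.0, dt=0.01, n_steps=1000
)
final_x, final_v = trajectory[-1]

# For this toy system the exact solution is x(t) = cos(t), so the numerical
# result can be compared with the analytical one at t = 10.
print(final_x, math.cos(10.0))
```

In a real simulation the force function would be replaced by the gradient of a molecular mechanics force field evaluated over all atoms, and the stored trajectory would be analyzed afterwards.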

Applications

Informatics approaches used in structural bioinformatics are:

Tools

List of structural bioinformatics tools
Software | Description
I-TASSER | Predicting three-dimensional structure model of protein molecules from amino acid sequences.
MOE | Molecular Operating Environment (MOE) is an extensive platform including structural modeling for proteins, protein families and antibodies [35]
SBL | The Structural Bioinformatics Library: end-user applications and advanced algorithms
BALLView | Molecular modeling and visualization [36]
STING | Visualization and analysis
PyMOL | Viewer and modeling [37]
VMD | Viewer, molecular dynamics [38]
Gromacs | Protein folding, molecular dynamics, molecular model refinement, molecular model force field generation [39]
LAMMPS | Protein folding, molecular dynamics, molecular model refinement, quantum mechanical macro-molecular interactions [40]
GAMESS | Molecular force field, charge refinement, quantum molecular dynamics, protein-molecular chemical reaction simulations (electron transfer) [41]
KiNG | An open-source Java kinemage viewer
STRIDE | Determination of secondary structure from coordinates [42]
DSSP | Algorithm assigning a secondary structure to the amino acids of a protein
MolProbity | Structure-validation web server
PROCHECK | A structure-validation web service
CheShift | A protein structure-validation on-line application
3D-mol.js | A molecular viewer for web applications developed using JavaScript
PROPKA | Rapid prediction of protein pKa values based on empirical structure/function relationships
CARA | Computer Aided Resonance Assignment
Docking Server | A molecular docking web server
StarBiochem | A Java protein viewer, features direct search of the Protein Data Bank
SPADE | The structural proteomics application development environment
PocketSuite | A web portal for various web servers for binding site-level analysis. PocketSuite is divided into PocketDepth (binding site prediction), PocketMatch (binding site comparison), PocketAlign (binding site alignment), and PocketAnnotate (binding site annotation).
MSL | An open-source C++ molecular modeling software library for the implementation of structural analysis, prediction and design methods
PSSpred | Protein secondary structure prediction
Proteus | Webtool for suggesting mutation pairs
SDM | A server for predicting effects of mutations on protein stability

References

  1. Gu J, Bourne PE (2011). Structural Bioinformatics (2nd ed.). Hoboken: John Wiley & Sons. ISBN 978-1-118-21056-7. OCLC 778339075.
  2. Gu J, Bourne PE (2009-03-16). Structural Bioinformatics. John Wiley & Sons. ISBN 978-0-470-18105-8.
  3. Kocincová L, Jarešová M, Byška J, Parulek J, Hauser H, Kozlíková B (February 2017). "Comparative visualization of protein secondary structures". BMC Bioinformatics. 18 (Suppl 2): 23. doi:10.1186/s12859-016-1449-z. PMC 5333176. PMID 28251875.
  4. Shi M, Gao J, Zhang MQ (July 2017). "Web3DMol: interactive protein structure visualization based on WebGL". Nucleic Acids Research. 45 (W1): W523–W527. doi:10.1093/nar/gkx383. PMC 5570197. PMID 28482028.
  5. Stanfield RL, Wilson IA (February 1995). "Protein-peptide interactions". Current Opinion in Structural Biology. 5 (1): 103–13. doi:10.1016/0959-440X(95)80015-S. PMID 7773739.
  6. Klebe G (2015). "Protein–Ligand Interactions as the Basis for Drug Action". In Scapin G, Patel D, Arnold E (eds.). Drug Design. NATO Science for Peace and Security Series A: Chemistry and Biology. Dordrecht: Springer. pp. 83–92. doi:10.1007/978-3-642-17907-5_4. ISBN 978-3-642-17906-8.
  7. "Proteus | PROTein Engineering Supporter |". proteus.dcc.ufmg.br. Retrieved 2020-02-26.
  8. Martins PM, Mayrink VD, de Silveira S, da Silveira CH, de Lima LH, de Melo-Minardi RC (2018). "How to compute protein residue contacts more accurately?". Proceedings of the 33rd Annual ACM Symposium on Applied Computing. Pau, France: ACM Press. pp. 60–67. doi:10.1145/3167132.3167136. ISBN 978-1-4503-5191-1. S2CID 49562347.
  9. da Silveira CH, Pires DE, Minardi RC, Ribeiro C, Veloso CJ, Lopes JC, et al. (February 2009). "Protein cutoff scanning: A comparative analysis of cutoff dependent and cutoff free methods for prospecting contacts in proteins" (PDF). Proteins. 74 (3): 727–43. doi:10.1002/prot.22187. PMID 18704933. S2CID 1208256.
  10. "PDBx/mmCIF General FAQ". mmcif.wwpdb.org. Retrieved 2020-02-26.
  11. wwPDB.org. "wwPDB: File Formats and the PDB". www.wwpdb.org. Retrieved 2020-02-26.
  12. "PDBx/mmCIF Dictionary Resources". mmcif.wwpdb.org. Retrieved 2020-02-26.
  13. "Macromolecular Structures Resource Group". www.ncbi.nlm.nih.gov. Retrieved 2020-04-13.
  14. "Nucleic Acid Database (NDB)". ndbserver.rutgers.edu. Retrieved 2020-04-13.
  15. "SCOP: Structural Classification of Proteins". 2007-09-11. Archived from the original on 2007-09-11. Retrieved 2020-04-13.
  16. Ilyin VA, Abyzov A, Leslin CM (July 2004). "Structural alignment of proteins by a novel TOPOFIT method, as a superimposition of common volumes at a topomax point". Protein Science. 13 (7): 1865–74. doi:10.1110/ps.04672604. PMC 2279929. PMID 15215530.
  17. "EDS - Uppsala Electron Density Server". eds.bmc.uu.se. Retrieved 2020-04-13.
  18. "Home - Prediction Center". www.predictioncenter.org. Retrieved 2020-04-13.
  19. ":: Dunbrack Lab". dunbrack.fccc.edu. Retrieved 2020-04-13.
  20. "Structural Biology Knowledgebase (SBKB)". sbkb.org. Retrieved 2020-04-13.
  21. "Protein Common Interface Database". dunbrack2.fccc.edu. Retrieved 2020-04-13.
  22. "AlphaFold".
  23. "Structural alignment (genomics)". ScienceDaily. Retrieved 2020-02-26.
  24. Pires DE, de Melo-Minardi RC, dos Santos MA, da Silveira CH, Santoro MM, Meira W (December 2011). "Cutoff Scanning Matrix (CSM): structural classification and function prediction by protein inter-residue distance patterns". BMC Genomics. 12 Suppl 4 (S4): S12. doi:10.1186/1471-2164-12-S4-S12. PMC 3287581. PMID 22369665.
  25. Mariano DC, Santos LH, Machado KD, Werhli AV, de Lima LH, de Melo-Minardi RC (January 2019). "A Computational Method to Propose Mutations in Enzymes Based on Structural Signature Variation (SSV)". International Journal of Molecular Sciences. 20 (2): 333. doi:10.3390/ijms20020333. PMC 6359350. PMID 30650542.
  26. Kaczanowski S, Zielenkiewicz P (March 2010). "Why similar protein sequences encode similar three-dimensional structures?" (PDF). Theoretical Chemistry Accounts. 125 (3–6): 643–650. doi:10.1007/s00214-009-0656-3. ISSN 1432-881X. S2CID 95593331.
  27. Chothia C, Lesk AM (April 1986). "The relation between the divergence of sequence and structure in proteins". The EMBO Journal. 5 (4): 823–6. doi:10.1002/j.1460-2075.1986.tb04288.x. PMC 1166865. PMID 3709526.
  28. "So much more to know". Science. 309 (5731): 78–102. July 2005. doi:10.1126/science.309.5731.78b. PMID 15994524.
  29. Webb B, Sali A (September 2014). "Comparative Protein Structure Modeling Using MODELLER". Current Protocols in Bioinformatics. 47 (1): 5.6.1–32. doi:10.1002/0471250953.bi0506s47. PMC 4186674. PMID 25199792.
  30. Dhasmana A, Raza S, Jahan R, Lohani M, Arif JM (2019-01-01). "Chapter 19 - High-Throughput Virtual Screening (HTVS) of Natural Compounds and Exploration of Their Biomolecular Mechanisms: An In Silico Approach". In Ahmad Khan MS, Ahmad I, Chattopadhyay D (eds.). New Look to Phytomedicine. Academic Press. pp. 523–548. doi:10.1016/b978-0-12-814619-4.00020-3. ISBN 978-0-12-814619-4. S2CID 69534557.
  31. Wermuth CG, Villoutreix B, Grisoni S, Olivier A, Rocher JP (January 2015). "Strategies in the search for new lead compounds or original working hypotheses". In Wermuth CG, Aldous D, Raboisson P, Rognan D (eds.). The Practice of Medicinal Chemistry. Academic Press. pp. 73–99. doi:10.1016/B978-0-12-417205-0.00004-3. ISBN 978-0-12-417205-0.
  32. Costa LS, Mariano DC, Rocha RE, Kraml J, Silveira CH, Liedl KR, et al. (September 2019). "Molecular Dynamics Gives New Insights into the Glucose Tolerance and Inhibition Mechanisms on β-Glucosidases". Molecules. 24 (18): 3215. doi:10.3390/molecules24183215. PMC 6766793. PMID 31487855.
  33. Alder BJ, Wainwright TE (August 1959). "Studies in Molecular Dynamics. I. General Method". The Journal of Chemical Physics. 31 (2): 459–466. Bibcode:1959JChPh..31..459A. doi:10.1063/1.1730376. ISSN 0021-9606.
  34. Yousif RH (2020). "Exploring the Molecular Interactions between Neoculin and the Human Sweet Taste Receptors through Computational Approaches" (PDF). Sains Malaysiana. 49 (3): 517–525. doi:10.17576/jsm-2020-4903-06.
  35. MOE
  36. BALLView
  37. PyMOL
  38. VMD
  39. Gromacs
  40. LAMMPS
  41. GAMESS
  42. STRIDE

Further reading