Crystal structure prediction

Crystal structure prediction (CSP) is the calculation of the crystal structures of solids from first principles. Reliable methods of predicting the crystal structure of a compound, based only on its composition, have been a goal of the physical sciences since the 1950s. [1] Computational methods employed include simulated annealing, evolutionary algorithms, distributed multipole analysis, random sampling, basin-hopping, data mining, density functional theory and molecular mechanics. [2]

History

The crystal structures of simple ionic solids have long been rationalised in terms of Pauling's rules, first set out in 1929 by Linus Pauling. [3] For metals and semiconductors, different rules apply, involving valence electron concentration. However, prediction and rationalization are rather different things. Most commonly, the term crystal structure prediction means a search for the minimum-energy arrangement in space of a compound's constituent atoms (or, for molecular crystals, of its molecules). The problem has two facets: combinatorics (the "search phase space", in practice most acute for inorganic crystals) and energetics (or "stability ranking", most acute for molecular organic crystals). For complex non-molecular crystals (where the "search problem" is most acute), major recent advances have been the development of the Martonak version of metadynamics, [4] [5] the Oganov-Glass evolutionary algorithm USPEX, [6] and first principles random search. [7] The latter two are capable of solving the global optimization problem with up to around a hundred degrees of freedom, while the approach of metadynamics is to reduce all structural variables to a handful of "slow" collective variables (which often works).
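The random-search idea can be illustrated on a toy problem. The sketch below is not any of the cited codes: it assumes a small Lennard-Jones cluster in reduced units instead of a periodic crystal treated from first principles, and the function names (`lj_energy_and_forces`, `relax`, `random_search`) and all parameters are hypothetical. Each trial draws random atomic positions, relaxes them to the nearest local minimum, and the lowest-energy result is kept.

```python
import random


def lj_energy_and_forces(positions):
    """Lennard-Jones energy and forces in reduced units (epsilon = sigma = 1)."""
    n = len(positions)
    energy = 0.0
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2
            inv6 = 1.0 / r2 ** 3
            energy += 4.0 * (inv6 * inv6 - inv6)
            # Force on atom i is f_over_r2 * d; atom j feels the opposite force.
            f_over_r2 = 24.0 * (2.0 * inv6 * inv6 - inv6) / r2
            for k in range(3):
                forces[i][k] += f_over_r2 * d[k]
                forces[j][k] -= f_over_r2 * d[k]
    return energy, forces


def relax(positions, max_steps=10000, force_tol=1e-5):
    """Local minimisation: steepest descent with an adaptive step size."""
    pos = [p[:] for p in positions]
    energy, forces = lj_energy_and_forces(pos)
    step = 0.01
    for _ in range(max_steps):
        fmax = max(abs(f) for row in forces for f in row)
        if fmax < force_tol:
            break
        # Cap the move so no coordinate changes by more than 0.1 per step.
        scale = min(step, 0.1 / fmax)
        trial = [[p[k] + scale * f[k] for k in range(3)]
                 for p, f in zip(pos, forces)]
        e_trial, f_trial = lj_energy_and_forces(trial)
        if e_trial < energy:
            pos, energy, forces = trial, e_trial, f_trial
            step *= 1.2   # accepted: try a longer step next time
        else:
            step *= 0.5   # rejected: shorten the step and retry
    return energy, pos


def random_search(n_atoms, n_trials=20, box=1.5, seed=0):
    """Relax random starting structures and keep the lowest-energy minimum."""
    rng = random.Random(seed)
    best_energy, best_pos = float("inf"), None
    for _ in range(n_trials):
        start = [[rng.uniform(0.0, box) for _ in range(3)]
                 for _ in range(n_atoms)]
        energy, pos = relax(start)
        if energy < best_energy:
            best_energy, best_pos = energy, pos
    return best_energy, best_pos
```

For three atoms the only minimum is the equilateral triangle, so `random_search(3)` should return an energy close to -3 in reduced units. Realistic searches differ mainly in scale: the number of local minima grows rapidly with the number of degrees of freedom, which is why the search itself becomes the hard part.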

Molecular crystals

Predicting organic crystal structures is important in academic and industrial science, particularly for pharmaceuticals and pigments, where understanding polymorphism is beneficial. [8] The crystal structures of molecular substances, particularly organic compounds, are very hard to predict and rank in order of stability. Intermolecular interactions are relatively weak, non-directional and long-range. [9] As a result, lattice and free energy differences between polymorphs are typically only a few kJ/mol and very rarely exceed 10 kJ/mol. [10] Crystal structure prediction methods often locate many possible structures within this small energy range, and such small energy differences are challenging to predict reliably without excessive computational effort.
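To give a sense of scale, the sketch below compares a made-up set of relative lattice energies with the thermal energy RT at room temperature and counts how many candidates fall inside a chosen energy window. The structure labels and energies are invented for illustration only, and the helper names are hypothetical.

```python
R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)


def thermal_energy(temperature_k):
    """Thermal energy RT in kJ/mol at the given temperature."""
    return R_KJ_PER_MOL_K * temperature_k


def within_window(relative_energies, window):
    """Names of structures lying within `window` kJ/mol of the lowest one."""
    lowest = min(relative_energies.values())
    return sorted(name for name, energy in relative_energies.items()
                  if energy - lowest <= window)


# Invented relative lattice energies in kJ/mol (illustrative, not real data).
candidates = {"A": 0.0, "B": 0.8, "C": 1.9, "D": 3.5, "E": 7.2, "F": 11.0}

rt = thermal_energy(298.15)                 # about 2.48 kJ/mol
ambiguous = within_window(candidates, 2.0)  # ["A", "B", "C"]
```

In this invented landscape three structures lie within 2 kJ/mol of the lowest one, which is less than RT at room temperature. A method whose energy error is of that size cannot rank them reliably.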

Since 2007, significant progress has been made in the CSP of small organic molecules, with several different methods proving effective. [11] [12] The most widely discussed method first ranks the energies of all possible crystal structures using a customised molecular mechanics (MM) force field, and then uses a dispersion-corrected density functional theory (DFT) step to estimate the lattice energy and stability of each short-listed candidate structure. [13] More recent efforts have focused on estimating the free energy of organic crystals by including the effects of temperature and entropy, using vibrational analysis or molecular dynamics. [14] [15]
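The two-stage ranking described above can be sketched generically. This is an illustration under stated assumptions, not the published method: `cheap_energy` stands in for a force-field evaluation and `accurate_energy` for a dispersion-corrected DFT calculation, both supplied by the caller. The function name, the structure labels and all energy values are hypothetical.

```python
def hierarchical_rank(candidates, cheap_energy, accurate_energy, shortlist_size):
    """Pre-screen with a cheap model, then re-rank a shortlist accurately.

    Returns (candidate, accurate energy) pairs, lowest energy first.
    """
    # Stage 1: rank every candidate with the inexpensive energy model.
    shortlist = sorted(candidates, key=cheap_energy)[:shortlist_size]
    # Stage 2: evaluate the expensive model only for the shortlist.
    scored = [(c, accurate_energy(c)) for c in shortlist]
    return sorted(scored, key=lambda pair: pair[1])


# Invented energies in kJ/mol for four hypothetical structures.
force_field = {"s1": -100.0, "s2": -101.5, "s3": -99.0, "s4": -101.0}
dft = {"s1": -98.2, "s2": -99.1, "s3": -97.0, "s4": -99.6}

ranking = hierarchical_rank(force_field, force_field.get, dft.get, shortlist_size=3)
# ranking == [("s4", -99.6), ("s2", -99.1), ("s1", -98.2)]
```

In this toy case the accurate model reverses the order of the two lowest structures, while the expensive evaluation is spent on only three of the four candidates. The approach relies on the cheap model being good enough that the true minimum survives into the shortlist.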

Crystal structure prediction software

The following codes can predict stable and metastable structures given chemical composition and external conditions (pressure, temperature):

References

  1. G. R. Desiraju (2002). "Cryptic crystallography". Nature Materials. 1 (2): 77–79. doi:10.1038/nmat726. PMID 12618812. S2CID 6056119.
  2. S. M. Woodley, R. Catlow (2008). "Crystal structure prediction from first principles". Nature Materials. 7 (12): 937–946. Bibcode:2008NatMa...7..937W. doi:10.1038/nmat2321. PMID 19029928.
  3. L. Pauling (1929). "The principles determining the structure of complex ionic crystals". Journal of the American Chemical Society. 51 (4): 1010–1026. doi:10.1021/ja01379a006.
  4. Martonak R., Laio A., Parrinello M. (2003). "Predicting crystal structures: The Parrinello-Rahman method revisited". Physical Review Letters. 90 (7): 075503. arXiv:cond-mat/0211551. Bibcode:2003PhRvL..90g5503M. doi:10.1103/physrevlett.90.075503. PMID 12633242. S2CID 25238210.
  5. Martonak R., Donadio D., Oganov A. R., Parrinello M. (2006). "Crystal structure transformations in SiO2 from classical and ab initio metadynamics". Nature Materials. 5 (8): 623–626. Bibcode:2006NatMa...5..623M. doi:10.1038/nmat1696. PMID 16845414. S2CID 30791206.
  6. Oganov, A. R.; Glass, C. W. (2006). "Crystal structure prediction using ab initio evolutionary techniques: principles and applications". Journal of Chemical Physics. 124 (24): 244704. arXiv:0911.3186. Bibcode:2006JChPh.124x4704O. doi:10.1063/1.2210932. PMID 16821993. S2CID 9688132.
  7. Pickard, C. J.; Needs, R. J. (2006). "High-Pressure Phases of Silane". Physical Review Letters. 97 (4): 045504. arXiv:cond-mat/0604454. Bibcode:2006PhRvL..97d5504P. doi:10.1103/PhysRevLett.97.045504. PMID 16907590. S2CID 36278251.
  8. Price, Sarah L. (2014-03-10). "Predicting crystal structures of organic compounds". Chemical Society Reviews. 43 (7): 2098–2111. doi:10.1039/C3CS60279F. ISSN 1460-4744. PMID 24263977.
  9. Stone, Anthony (2013). The Theory of Intermolecular Forces. Oxford University Press.
  10. Nyman, Jonas; Day, Graeme M. (2015). "Static and lattice vibrational energy differences between polymorphs". CrystEngComm. 17 (28): 5154–5165. doi:10.1039/C5CE00045A.
  11. K. Sanderson (2007). "Model predicts structure of crystals". Nature. 450 (7171): 771. Bibcode:2007Natur.450..771S. doi:10.1038/450771a. PMID 18063962.
  12. Day, Graeme M.; Cooper, Timothy G.; Cruz-Cabeza, Aurora J.; Hejczyk, Katarzyna E.; Ammon, Herman L.; Boerrigter, Stephan X. M.; Tan, Jeffrey S.; Della Valle, Raffaele G.; Venuti, Elisabetta; Jose, Jovan; Gadre, Shridhar R.; Desiraju, Gautam R.; Thakur, Tejender S.; Van Eijck, Bouke P.; Facelli, Julio C.; Bazterra, Victor E.; Ferraro, Marta B.; Hofmann, Detlef W. M.; Neumann, Marcus A.; Leusen, Frank J. J.; Kendrick, John; Price, Sarah L.; Misquitta, Alston J.; Karamertzanis, Panagiotis G.; Welch, Gareth W. A.; Scheraga, Harold A.; Arnautova, Yelena A.; Schmidt, Martin U.; Van De Streek, Jacco; et al. (2009). "Significant progress in predicting the crystal structures of small organic molecules – a report on the fourth blind test". Acta Crystallographica B. 65 (Pt 2): 107–125. doi:10.1107/S0108768109004066. PMID 19299868.
  13. M. A. Neumann, F. J. J. Leusen, J. Kendrick (2008). "A Major Advance in Crystal Structure Prediction". Angewandte Chemie International Edition. 47 (13): 2427–2430. arXiv:1506.05421. doi:10.1002/anie.200704247. PMID 18288660.
  14. Reilly, Anthony M.; Cooper, Richard I.; Adjiman, Claire S.; Bhattacharya, Saswata; Boese, A. Daniel; Brandenburg, Jan Gerit; Bygrave, Peter J.; Bylsma, Rita; Campbell, Josh E.; Car, Roberto; Case, David H.; Chadha, Renu; Cole, Jason C.; Cosburn, Katherine; Cuppen, Herma M.; Curtis, Farren; Day, Graeme M.; DiStasio, Robert A.; Dzyabchenko, Alexander; Van Eijck, Bouke P.; Elking, Dennis M.; Van Den Ende, Joost A.; Facelli, Julio C.; Ferraro, Marta B.; Fusti-Molnar, Laszlo; Gatsiou, Christina Anna; Gee, Thomas S.; De Gelder, Rene; Ghiringhelli, Luca M.; et al. (2016). "Report on the sixth blind test of organic crystal structure prediction methods". Acta Crystallographica B. 72 (4): 439–459. doi:10.1107/S2052520616007447. PMC 4971545. PMID 27484368.
  15. Dybeck, Eric C.; Abraham, Nathan S.; Schieber, Natalie P.; Shirts, Michael R. (2017). "Capturing Entropic Contributions to Temperature-Mediated Polymorphic Transformations Through Molecular Modeling". Crystal Growth & Design. 17 (4): 1775–1787. doi:10.1021/acs.cgd.6b01762.