Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into computer programs, to calculate the structures and properties of molecules, groups of molecules, and solids. It is essential because, apart from relatively recent results concerning the hydrogen molecular ion (the dihydrogen cation), the quantum many-body problem cannot be solved analytically, much less in closed form. While computational results normally complement the information obtained by chemical experiments, they can in some cases predict hitherto unobserved chemical phenomena. Computational chemistry is widely used in the design of new drugs and materials.
Examples of such properties are structure (i.e., the expected positions of the constituent atoms), absolute and relative (interaction) energies, electronic charge density distributions, dipoles and higher multipole moments, vibrational frequencies, reactivity, or other spectroscopic quantities, and cross sections for collision with other particles.
The methods used cover both static and dynamic situations. In all cases, the computer time and other resources (such as memory and disk space) increase quickly with the size of the system being studied. That system can be a molecule, a group of molecules, or a solid. Computational chemistry methods range from very approximate to highly accurate; the latter is usually feasible for small systems only. Ab initio methods are based entirely on quantum mechanics and basic physical constants. Other methods are called empirical or semi-empirical because they use additional empirical parameters.
Both ab initio and semi-empirical approaches involve approximations. These range from simplified forms of the first-principles equations that are easier or faster to solve, to approximations limiting the size of the system (for example, periodic boundary conditions), to fundamental approximations to the underlying equations that are required to achieve any solution to them at all. For example, most ab initio calculations make the Born–Oppenheimer approximation, which greatly simplifies the underlying Schrödinger equation by assuming that the nuclei remain in place during the calculation. In principle, ab initio methods eventually converge to the exact solution of the underlying equations as the number of approximations is reduced. In practice, however, it is impossible to eliminate all approximations, and residual error inevitably remains. The goal of computational chemistry is to minimize this residual error while keeping the calculations tractable.
In some cases, the details of electronic structure are less important than the long-time phase space behavior of molecules. This is the case in conformational studies of proteins and protein-ligand binding thermodynamics. Classical approximations to the potential energy surface are used, typically with molecular mechanics force fields, as they are computationally less intensive than electronic calculations, to enable longer simulations of molecular dynamics. Furthermore, cheminformatics uses even more empirical (and computationally cheaper) methods like machine learning based on physicochemical properties. One typical problem in cheminformatics is to predict the binding affinity of drug molecules to a given target. Other problems include predicting binding specificity, off-target effects, toxicity, and pharmacokinetic properties.
Building on the founding discoveries and theories in the history of quantum mechanics, the first theoretical calculations in chemistry were those of Walter Heitler and Fritz London in 1927, using valence bond theory. The books that were influential in the early development of computational quantum chemistry include Linus Pauling and E. Bright Wilson's 1935 Introduction to Quantum Mechanics – with Applications to Chemistry, Eyring, Walter and Kimball's 1944 Quantum Chemistry, Heitler's 1945 Elementary Wave Mechanics – with Applications to Quantum Chemistry, and later Coulson's 1952 textbook Valence, each of which served as primary references for chemists in the decades to follow.
With the development of efficient computer technology in the 1940s, the solutions of elaborate wave equations for complex atomic systems began to be a realizable objective. In the early 1950s, the first semi-empirical atomic orbital calculations were performed. Theoretical chemists became extensive users of the early digital computers. One major advance came with the 1951 paper in Reviews of Modern Physics by Clemens C. J. Roothaan, largely on the "LCAO MO" approach (Linear Combination of Atomic Orbitals Molecular Orbitals), for many years the second-most cited paper in that journal. A very detailed account of such use in the United Kingdom is given by Smith and Sutcliffe. The first ab initio Hartree–Fock method calculations on diatomic molecules were performed in 1956 at MIT, using a basis set of Slater orbitals. For diatomic molecules, a systematic study using a minimum basis set and the first calculation with a larger basis set were published by Ransil and Nesbet respectively in 1960. The first polyatomic calculations using Gaussian orbitals were performed in the late 1950s. The first configuration interaction calculations were performed in Cambridge on the EDSAC computer in the 1950s using Gaussian orbitals by Boys and coworkers. By 1971, when a bibliography of ab initio calculations was published, the largest molecules included were naphthalene and azulene. Abstracts of many earlier developments in ab initio theory have been published by Schaefer.
In 1964, Hückel method calculations of molecules ranging in complexity from butadiene and benzene to ovalene were generated on computers at Berkeley and Oxford. The Hückel method uses a simple linear combination of atomic orbitals (LCAO) to determine the energies of the molecular orbitals of π electrons in conjugated hydrocarbon systems. These empirical methods were replaced in the 1960s by semi-empirical methods such as CNDO.
In the early 1970s, efficient ab initio computer programs such as ATMOL, Gaussian, IBMOL, and POLYATOM began to be used to speed ab initio calculations of molecular orbitals. Of these four programs, only Gaussian, now vastly expanded, is still in use, but many other programs are now in use. At the same time, the methods of molecular mechanics, such as the MM2 force field, were developed, primarily by Norman Allinger.
One of the first mentions of the term computational chemistry can be found in the 1970 book Computers and Their Role in the Physical Sciences by Sidney Fernbach and Abraham Haskell Taub, where they state "It seems, therefore, that 'computational chemistry' can finally be more and more of a reality." During the 1970s, widely different methods began to be seen as part of a new emerging discipline of computational chemistry. The Journal of Computational Chemistry was first published in 1980.
Computational chemistry has featured in several Nobel Prize awards, most notably in 1998 and 2013. Walter Kohn, "for his development of the density-functional theory", and John Pople, "for his development of computational methods in quantum chemistry", received the 1998 Nobel Prize in Chemistry. Martin Karplus, Michael Levitt and Arieh Warshel received the 2013 Nobel Prize in Chemistry for "the development of multiscale models for complex chemical systems".
The term theoretical chemistry may be defined as a mathematical description of chemistry, whereas computational chemistry is usually used when a mathematical method is sufficiently well developed that it can be automated for implementation on a computer. In theoretical chemistry, chemists, physicists, and mathematicians develop algorithms and computer programs to predict atomic and molecular properties and reaction paths for chemical reactions. Computational chemists, in contrast, may simply apply existing computer programs and methodologies to specific chemical questions.
Computational chemistry has two different aspects:
Thus, computational chemistry can assist the experimental chemist or it can challenge the experimental chemist to find entirely new chemical objects.
Several major areas may be distinguished within computational chemistry:
Computational chemistry is not an exact description of real-life chemistry, as our mathematical models of the physical laws of nature can only provide us with an approximation. However, the majority of chemical phenomena can be described to a certain degree in a qualitative or approximate quantitative computational scheme.
Molecules consist of nuclei and electrons, so the methods of quantum mechanics apply. Computational chemists often attempt to solve the non-relativistic Schrödinger equation, with relativistic corrections added, although some progress has been made in solving the fully relativistic Dirac equation. In principle, it is possible to solve the Schrödinger equation in either its time-dependent or time-independent form, as appropriate for the problem in hand; in practice, this is not possible except for very small systems. Therefore, a great number of approximate methods strive to achieve the best trade-off between accuracy and computational cost.
Accuracy can generally be improved at greater computational cost. Significant errors can arise in ab initio models comprising many electrons, because fully relativistic methods are computationally expensive. This complicates the study of molecules that contain heavy atoms, such as transition metals, and of their catalytic properties. Present algorithms in computational chemistry can routinely calculate the properties of small molecules that contain up to about 40 electrons with errors for energies of less than a few kJ/mol. For geometries, bond lengths can be predicted within a few picometers and bond angles within 0.5 degrees. The treatment of larger molecules that contain a few dozen atoms is computationally tractable by more approximate methods such as density functional theory (DFT).
There is some dispute within the field whether or not the latter methods are sufficient to describe complex chemical reactions, such as those in biochemistry. Large molecules can be studied by semi-empirical approximate methods. Even larger molecules are treated by classical mechanics methods that use what are called molecular mechanics (MM). In QM-MM methods, small parts of large complexes are treated quantum mechanically (QM), and the remainder is treated approximately (MM).
One molecular formula can represent more than one molecular isomer: a set of isomers. Each isomer is a local minimum on the energy surface (called the potential energy surface) created from the total energy (i.e., the electronic energy, plus the repulsion energy between the nuclei) as a function of the coordinates of all the nuclei. A stationary point is a geometry such that the derivative of the energy with respect to all displacements of the nuclei is zero. A local (energy) minimum is a stationary point where all such displacements lead to an increase in energy. The local minimum that is lowest is called the global minimum and corresponds to the most stable isomer. If there is one particular coordinate change that leads to a decrease in the total energy in both directions, the stationary point is a transition structure and the coordinate is the reaction coordinate. This process of determining stationary points is called geometry optimization.
The determination of molecular structure by geometry optimization became routine only after efficient methods for calculating the first derivatives of the energy with respect to all atomic coordinates became available. Evaluation of the related second derivatives allows the prediction of vibrational frequencies if harmonic motion is assumed. More importantly, it allows for the characterization of stationary points. The frequencies are related to the eigenvalues of the Hessian matrix, which contains second derivatives. If the eigenvalues are all positive, then the frequencies are all real and the stationary point is a local minimum. If one eigenvalue is negative (i.e., there is one imaginary frequency), then the stationary point is a transition structure. If more than one eigenvalue is negative, then the stationary point is a more complex one and is usually of little interest. When one of these is found, it is necessary to move the search away from it if the experimenter is looking solely for local minima and transition structures.
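The eigenvalue test described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical two-dimensional model surface E(x, y) = (x² − 1)² + y² whose Hessian is known analytically; the function names and the tolerance are illustrative choices, not part of any particular program. In a real molecular calculation, the zero eigenvalues belonging to overall translation and rotation would have to be projected out first, which this sketch does not attempt.

```python
import numpy as np

def hessian_model(x, y):
    """Analytic Hessian of the model surface E(x, y) = (x**2 - 1)**2 + y**2."""
    return np.array([[12.0 * x**2 - 4.0, 0.0],
                     [0.0, 2.0]])

def classify_stationary_point(hessian, tol=1e-8):
    """Label a stationary point by the number of negative Hessian eigenvalues.

    Eigenvalues within `tol` of zero are not counted as negative; a
    production code would treat them separately.
    """
    eigenvalues = np.linalg.eigvalsh(hessian)  # Hessian is symmetric
    n_negative = int(np.sum(eigenvalues < -tol))
    if n_negative == 0:
        return "local minimum"
    if n_negative == 1:
        return "transition structure"
    return "higher-order saddle point"

# (1, 0) and (0, 0) are stationary points of the model surface.
minimum_label = classify_stationary_point(hessian_model(1.0, 0.0))
saddle_label = classify_stationary_point(hessian_model(0.0, 0.0))
print(minimum_label, "|", saddle_label)
```

On this model surface the point (0, 0) connects the two equivalent minima at (−1, 0) and (1, 0), so the x coordinate plays the role of the reaction coordinate.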
The total energy is determined by approximate solutions of the time-independent Schrödinger equation, usually with no relativistic terms included, and by making use of the Born–Oppenheimer approximation, which allows for the separation of electronic and nuclear motions, thereby simplifying the Schrödinger equation. This leads to the evaluation of the total energy as a sum of the electronic energy at fixed nuclei positions and the repulsion energy of the nuclei. A notable exception is certain approaches called direct quantum chemistry, which treat electrons and nuclei on a common footing. Density functional methods and semi-empirical methods are variants of the major theme. For very large systems, the relative total energies can be compared using molecular mechanics. The ways of determining the total energy to predict molecular structures are:
The programs used in computational chemistry are based on many different quantum-chemical methods that solve the molecular Schrödinger equation associated with the molecular Hamiltonian. Methods that do not include any empirical or semi-empirical parameters in their equations, being derived directly from theoretical principles with no inclusion of experimental data, are called ab initio methods. This does not imply that the solution is an exact one; they are all approximate quantum mechanical calculations. It means that a particular approximation is rigorously defined on first principles (quantum theory) and then solved within an error margin that is qualitatively known beforehand. If numerical iterative methods must be used, the aim is to iterate until full machine accuracy is obtained (the best that is possible with a finite word length on the computer, and within the mathematical and/or physical approximations made).
The simplest type of ab initio electronic structure calculation is the Hartree–Fock method (HF), an extension of molecular orbital theory, in which the correlated electron-electron repulsion is not specifically taken into account; only its average effect is included in the calculation. As the basis set size is increased, the energy and wave function tend towards a limit called the Hartree–Fock limit. Many types of calculations (termed post-Hartree–Fock methods) begin with a Hartree–Fock calculation and subsequently correct for electron-electron repulsion, referred to also as electronic correlation. As these methods are pushed to the limit, they approach the exact solution of the non-relativistic Schrödinger equation. To obtain exact agreement with the experiment, it is necessary to include relativistic and spin orbit terms, both of which are far more important for heavy atoms. In all of these approaches, along with a choice of method, it is necessary to choose a basis set. This is a set of functions, usually centered on the different atoms in the molecule, which are used to expand the molecular orbitals with the linear combination of atomic orbitals (LCAO) molecular orbital method ansatz. Ab initio methods need to define a level of theory (the method) and a basis set.
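The role of the basis set can be made concrete with the standard LCAO expansion. The notation here is an assumption introduced for illustration: ψ_i is a molecular orbital, χ_μ are the K basis functions, c_{μi} are the expansion coefficients, F is the Fock matrix, S is the overlap matrix of the basis functions, and ε is the diagonal matrix of orbital energies. Substituting the expansion into the Hartree–Fock equations gives the matrix form usually called the Roothaan–Hall equations:

```latex
\psi_i(\mathbf{r}) = \sum_{\mu=1}^{K} c_{\mu i}\,\chi_\mu(\mathbf{r}),
\qquad
\mathbf{F}\,\mathbf{C} = \mathbf{S}\,\mathbf{C}\,\boldsymbol{\varepsilon}
```

Because F itself depends on the coefficients C, the equations are solved iteratively until self-consistency, and enlarging K moves the result toward the Hartree–Fock limit.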
The Hartree–Fock wave function is a single configuration or determinant. In some cases, particularly for bond-breaking processes, this is inadequate, and several configurations must be used. Here, the coefficients of the configurations, and of the basis functions, are optimized together.
The total molecular energy can be evaluated as a function of the molecular geometry; in other words, the potential energy surface. Such a surface can be used for reaction dynamics. The stationary points of the surface lead to predictions of different isomers and the transition structures for conversion between isomers, but these can be determined without full knowledge of the complete surface.
A particularly important objective, called computational thermochemistry, is to calculate thermochemical quantities such as the enthalpy of formation to chemical accuracy. Chemical accuracy is the accuracy required to make realistic chemical predictions and is generally considered to be 1 kcal/mol or 4 kJ/mol. To reach that accuracy in an economic way it is necessary to use a series of post-Hartree–Fock methods and combine the results. These methods are called quantum chemistry composite methods.
Density functional theory (DFT) methods are often considered to be ab initio methods for determining the molecular electronic structure, even though many of the most common functionals use parameters derived from empirical data, or from more complex calculations. In DFT, the total energy is expressed in terms of the total one-electron density rather than the wave function. In this type of calculation, there is an approximate Hamiltonian and an approximate expression for the total electron density. DFT methods can be very accurate for little computational cost. Some methods combine the density functional exchange functional with the Hartree–Fock exchange term and are termed hybrid functional methods.
Semi-empirical quantum chemistry methods are based on the Hartree–Fock method formalism, but make many approximations and obtain some parameters from empirical data. They were very important in computational chemistry from the 1960s to the 1990s, especially for treating large molecules where the full Hartree–Fock method without the approximations was too costly. The use of empirical parameters appears to allow some inclusion of correlation effects into the methods.
More primitive semi-empirical methods, in which the two-electron part of the Hamiltonian is not explicitly included, were designed even earlier. For π-electron systems, this was the Hückel method proposed by Erich Hückel, and for all valence electron systems, the extended Hückel method proposed by Roald Hoffmann. Sometimes, Hückel methods are referred to as "completely empirical" because they do not derive from a Hamiltonian. Yet the term "empirical methods", or "empirical force fields", is usually used to describe molecular mechanics.
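A simple Hückel calculation amounts to diagonalizing a small matrix built from the connectivity of the conjugated carbon atoms. The sketch below assumes the usual textbook parameterization, with a Coulomb integral α on the diagonal and a resonance integral β between bonded neighbours; the function name is hypothetical, and energies are reported with α = 0 and β = −1, i.e. in units of |β| relative to α.

```python
import numpy as np

def huckel_energies(adjacency, alpha=0.0, beta=-1.0):
    """Return Hückel pi-orbital energies in ascending order.

    adjacency[i, j] is 1 if carbon atoms i and j are bonded, else 0.
    """
    a = np.asarray(adjacency, dtype=float)
    h = alpha * np.eye(len(a)) + beta * a  # Hückel matrix
    return np.linalg.eigvalsh(h)

# Butadiene: linear chain of four carbon atoms.
butadiene = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])

# Benzene: ring of six carbon atoms.
benzene = np.zeros((6, 6))
for i in range(6):
    j = (i + 1) % 6
    benzene[i, j] = benzene[j, i] = 1.0

butadiene_levels = huckel_energies(butadiene)
benzene_levels = huckel_energies(benzene)

# Total pi energy of benzene: two electrons in each of the three lowest orbitals.
benzene_pi_energy = 2.0 * benzene_levels[:3].sum()
print(butadiene_levels, benzene_levels, benzene_pi_energy)
```

With β negative, the lowest eigenvalues are the bonding orbitals; the benzene result corresponds to the familiar total π energy of 6α + 8β.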
In many cases, large molecular systems can be modeled successfully while avoiding quantum mechanical calculations entirely. Molecular mechanics simulations, for example, use one classical expression for the energy of a compound, for instance, the harmonic oscillator. All constants appearing in the equations must be obtained beforehand from experimental data or ab initio calculations.
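As an illustration of such a classical energy expression, the sketch below evaluates only a harmonic bond-stretch term, E = ½ k (r − r₀)², summed over bonds. The force constants, reference lengths, coordinates, and function name are invented for the example and do not come from any published force field; real force fields add angle, torsion, and non-bonded terms.

```python
import numpy as np

def harmonic_bond_energy(coords, bonds):
    """Sum 0.5 * k * (r - r0)**2 over all bonds.

    coords: (n_atoms, 3) array of Cartesian positions.
    bonds:  iterable of (i, j, k, r0) with atom indices i and j,
            force constant k, and reference bond length r0.
    """
    energy = 0.0
    for i, j, k, r0 in bonds:
        r = np.linalg.norm(coords[i] - coords[j])
        energy += 0.5 * k * (r - r0) ** 2
    return energy

# Hypothetical three-atom chain: one bond stretched by 0.1, one at rest.
coords = np.array([[0.0, 0.0, 0.0],
                   [1.1, 0.0, 0.0],
                   [1.1, 1.0, 0.0]])
bonds = [(0, 1, 300.0, 1.0),
         (1, 2, 300.0, 1.0)]

bond_energy = harmonic_bond_energy(coords, bonds)
print(bond_energy)
```

Only the stretched bond contributes here, since the second bond sits exactly at its reference length.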
The database of compounds used for parameterization is crucial to the success of molecular mechanics calculations; the resulting set of parameters and functions is called the force field. A force field parameterized against a specific class of molecules, for instance proteins, would be expected to be relevant only when describing other molecules of the same class.
These methods can be applied to proteins and other large biological molecules, and allow studies of the approach and interaction (docking) of potential drug molecules.
Computational chemical methods can be applied to solid-state physics problems. The electronic structure of a crystal is in general described by a band structure, which defines the energies of electron orbitals for each point in the Brillouin zone. Ab initio and semi-empirical calculations yield orbital energies; therefore, they can be applied to band structure calculations. Because calculating the energy of a single molecule is already time-consuming, calculating energies for the entire list of points in the Brillouin zone is even more so.
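The idea of an orbital energy for each point in the Brillouin zone can be illustrated with the simplest possible model: an infinite one-dimensional chain of identical atoms with one orbital per site and nearest-neighbour coupling. This is a textbook tight-binding sketch, not an ab initio band structure; α, β, the lattice constant a, and the function name are assumptions chosen for the example. The single band is E(k) = α + 2β cos(ka).

```python
import numpy as np

def band_energy(k, alpha=0.0, beta=-1.0, a=1.0):
    """Band energy E(k) = alpha + 2*beta*cos(k*a) of a 1D nearest-neighbour chain."""
    return alpha + 2.0 * beta * np.cos(k * a)

# Sample the first Brillouin zone, -pi/a <= k <= pi/a, on a grid of k-points.
k_points = np.linspace(-np.pi, np.pi, 101)
band = band_energy(k_points)

# The band spans alpha + 2*beta (zone centre) to alpha - 2*beta (zone edge).
bandwidth = band.max() - band.min()
print(bandwidth)
```

Each k-point here costs one cosine evaluation; in a real calculation each k-point requires its own matrix diagonalization, which is why dense sampling of the zone is expensive.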
Once the electronic and nuclear variables are separated (within the Born–Oppenheimer representation), in the time-dependent approach, the wave packet corresponding to the nuclear degrees of freedom is propagated via the time evolution operator (physics) associated to the time-dependent Schrödinger equation (for the full molecular Hamiltonian). In the complementary energy-dependent approach, the time-independent Schrödinger equation is solved using the scattering theory formalism. The potential representing the interatomic interaction is given by the potential energy surfaces. In general, the potential energy surfaces are coupled via the vibronic coupling terms.
The most popular methods for propagating the wave packet associated to the molecular geometry are:
Molecular dynamics (MD) uses either quantum mechanics, molecular mechanics, or a mixture of both to calculate forces, which are then used to solve Newton's laws of motion to examine the time-dependent behavior of systems. The result of a molecular dynamics simulation is a trajectory that describes how the positions and velocities of the particles vary with time. The phase point of a system, described by the positions and momenta of all its particles at one time point, determines the next phase point in time through integration of Newton's equations of motion.
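The step from one phase point to the next can be sketched with the velocity Verlet scheme, one common integrator for Newton's equations of motion (named here as an illustrative choice; the text above does not prescribe an integrator). The example propagates a single particle in a one-dimensional harmonic potential; the mass, force constant, time step, and function names are arbitrary assumptions in reduced units.

```python
import numpy as np

def harmonic_force(x, k=1.0):
    """Force from the potential V(x) = 0.5 * k * x**2."""
    return -k * x

def velocity_verlet(x0, v0, mass, dt, n_steps, force_fn):
    """Propagate (x, v) with velocity Verlet; return an (n_steps + 1, 2) trajectory."""
    x, v = float(x0), float(v0)
    f = force_fn(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force_fn(x)                    # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return np.array(trajectory)

traj = velocity_verlet(x0=1.0, v0=0.0, mass=1.0, dt=0.01,
                       n_steps=1000, force_fn=harmonic_force)

# Total energy (kinetic + potential) along the trajectory, with m = k = 1.
total_energy = 0.5 * traj[:, 1] ** 2 + 0.5 * traj[:, 0] ** 2
print(traj[-1], total_energy.max() - total_energy.min())
```

For this system the exact solution is x(t) = cos(t), so the trajectory and the near-constant total energy give a quick check on the time step.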
Monte Carlo (MC) generates configurations of a system by making random changes to the positions of its particles, together with their orientations and conformations where appropriate. It is a random sampling method, which makes use of the so-called importance sampling. Importance sampling methods are able to generate low energy states, as this enables properties to be calculated accurately. The potential energy of each configuration of the system can be calculated, together with the values of other properties, from the positions of the atoms.
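A minimal sketch of this kind of sampling is the Metropolis algorithm, shown here for one particle in a one-dimensional harmonic potential. The potential, temperature, maximum step size, and function names are assumptions made for illustration in reduced units. Trial moves that lower the energy are always accepted, and moves that raise it are accepted with probability exp(−ΔE/kT), so low-energy configurations are visited in proportion to their Boltzmann weight.

```python
import numpy as np

def potential(x, k=1.0):
    """Model potential energy V(x) = 0.5 * k * x**2."""
    return 0.5 * k * x ** 2

def metropolis(n_steps, kT=1.0, max_step=1.5, seed=0):
    """Sample positions from the Boltzmann distribution of `potential`."""
    rng = np.random.default_rng(seed)
    x = 0.0
    energy = potential(x)
    samples = np.empty(n_steps)
    n_accepted = 0
    for step in range(n_steps):
        x_trial = x + rng.uniform(-max_step, max_step)  # random displacement
        e_trial = potential(x_trial)
        # Metropolis criterion: accept downhill moves, and uphill moves
        # with probability exp(-dE / kT).
        if e_trial <= energy or rng.random() < np.exp(-(e_trial - energy) / kT):
            x, energy = x_trial, e_trial
            n_accepted += 1
        samples[step] = x  # a rejected move counts the old configuration again
    return samples, n_accepted / n_steps

samples, acceptance = metropolis(n_steps=50000)
mean_square_x = np.mean(samples ** 2)
print(mean_square_x, acceptance)
```

For a harmonic potential the exact average of x² is kT/k, which equals 1 in these units, so the sampled value gives a simple check on convergence.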
QM/MM is a hybrid method that attempts to combine the accuracy of quantum mechanics with the speed of molecular mechanics. It is useful for simulating very large molecules such as enzymes.
The atoms in molecules (QTAIM) model of Richard Bader was developed to effectively link the quantum mechanical model of a molecule, as an electronic wavefunction, to chemically useful concepts such as atoms in molecules, functional groups, bonding, the theory of Lewis pairs, and the valence bond model. Bader has demonstrated that these empirically useful chemistry concepts can be related to the topology of the observable charge density distribution, whether measured or calculated from a quantum mechanical wavefunction. QTAIM analysis of molecular wavefunctions is implemented, for example, in the AIMAll software package.
Many self-sufficient computational chemistry software packages exist. Some include many methods covering a wide range, while others concentrate on a very specific range or even on one method. Details of most of them can be found in:
Quantum chemistry, also called molecular quantum mechanics, is a branch of physical chemistry focused on the application of quantum mechanics to chemical systems, particularly towards the quantum-mechanical calculation of electronic contributions to physical and chemical properties of molecules, materials, and solutions at the atomic level. These calculations include systematically applied approximations intended to make calculations computationally feasible while still capturing as much information about important contributions to the computed wave functions as well as to observable properties such as structures, spectra, and thermodynamic properties. Quantum chemistry is also concerned with the computation of quantum effects on molecular dynamics and chemical kinetics.
Theoretical chemistry is the branch of chemistry which develops theoretical generalizations that are part of the theoretical arsenal of modern chemistry: for example, the concepts of chemical bonding, chemical reaction, valence, the surface of potential energy, molecular orbitals, orbital interactions, and molecule activation.
In quantum chemistry, electronic structure is the state of motion of electrons in an electrostatic field created by stationary nuclei. The term encompasses both the wave functions of the electrons and the energies associated with them. Electronic structure is obtained by solving quantum mechanical equations for the aforementioned clamped-nuclei problem.
In computational physics and chemistry, the Hartree–Fock (HF) method is a method of approximation for the determination of the wave function and the energy of a quantum many-body system in a stationary state.
In chemistry, molecular orbital theory is a method for describing the electronic structure of molecules using quantum mechanics. It was proposed early in the 20th century.
Psi is an ab initio computational chemistry package originally written by the research group of Henry F. Schaefer, III. Utilizing Psi, one can perform a calculation on a molecular system with various kinds of methods such as Hartree–Fock, post-Hartree–Fock electron correlation methods, and density functional theory. The program can compute energies, optimize molecular geometries, and compute vibrational frequencies. The major part of the program is written in C++, while a Python API is also available, which allows users to perform complex computations or automate tasks easily.
In molecular physics, the Pariser–Parr–Pople method applies semi-empirical quantum mechanical methods to the quantitative prediction of electronic structures and spectra, in molecules of interest in the field of organic chemistry. Previous methods existed—such as the Hückel method which led to Hückel's rule—but were limited in their scope, application and complexity, as is the Extended Hückel method.
Møller–Plesset perturbation theory (MP) is one of several quantum chemistry post–Hartree–Fock ab initio methods in the field of computational chemistry. It improves on the Hartree–Fock method by adding electron correlation effects by means of Rayleigh–Schrödinger perturbation theory (RS-PT), usually to second (MP2), third (MP3) or fourth (MP4) order. Its main idea was published as early as 1934 by Christian Møller and Milton S. Plesset.
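For reference, the second-order (MP2) correction can be written compactly in spin-orbital form. The notation is an assumption introduced here: i, j label occupied and a, b label virtual Hartree–Fock spin orbitals, ε are their orbital energies, and ⟨ij‖ab⟩ is an antisymmetrized two-electron integral.

```latex
E^{(2)}_{\mathrm{MP2}}
  = \frac{1}{4} \sum_{i,j}^{\mathrm{occ}} \sum_{a,b}^{\mathrm{virt}}
    \frac{\left| \langle ij \,\|\, ab \rangle \right|^{2}}
         {\varepsilon_i + \varepsilon_j - \varepsilon_a - \varepsilon_b}
```

Since occupied orbital energies lie below virtual ones, every denominator is negative, and the correction lowers the energy relative to Hartree–Fock.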
Electronic correlation is the interaction between electrons in the electronic structure of a quantum system. The correlation energy is a measure of how much the movement of one electron is influenced by the presence of all other electrons.
In theoretical and computational chemistry, a basis set is a set of functions that is used to represent the electronic wave function in the Hartree–Fock method or density-functional theory in order to turn the partial differential equations of the model into algebraic equations suitable for efficient implementation on a computer.
Koopmans' theorem states that in closed-shell Hartree–Fock theory (HF), the first ionization energy of a molecular system is equal to the negative of the orbital energy of the highest occupied molecular orbital (HOMO). This theorem is named after Tjalling Koopmans, who published this result in 1934.
CNDO is the abbreviation for Complete Neglect of Differential Overlap, one of the first semi empirical methods in quantum chemistry. It uses two approximations:
PQS is a general-purpose quantum chemistry program. Its roots go back to the first ab initio gradient program developed in Professor Peter Pulay's group, but it is now developed and distributed commercially by Parallel Quantum Solutions. There is a reduction in cost for academic users and a site license. Its strong points are geometry optimization, NMR chemical shift calculations, large MP2 calculations, and high parallel efficiency on computing clusters. It includes many other capabilities, including density functional theory; the semiempirical methods MINDO/3, MNDO, AM1 and PM3; molecular mechanics using the SYBYL 5.0 force field; the quantum mechanics/molecular mechanics mixed method using the ONIOM method; natural bond orbital (NBO) analysis; and COSMO solvation models. Recently, a highly efficient parallel CCSD(T) code for closed-shell systems has been developed. This code includes many other post-Hartree–Fock methods: MP2, MP3, MP4, CISD, CEPA, QCISD and so on.
Jaguar is a computer software package used for ab initio quantum chemistry calculations for both gas and solution phases. It is commercial software marketed by the company Schrödinger. The program originated in the research groups of Richard Friesner and William Goddard and was initially called PS-GVB.
Spartan is a molecular modelling and computational chemistry application from Wavefunction. It contains code for molecular mechanics, semi-empirical methods, ab initio models, density functional models, post-Hartree–Fock models, and thermochemical recipes including G3(MP2) and T1. Quantum chemistry calculations in Spartan are powered by Q-Chem.
Computational chemical methods in solid-state physics follow the same approach as they do for molecules, but with two differences. First, the translational symmetry of the solid has to be utilised, and second, it is possible to use completely delocalised basis functions such as plane waves as an alternative to the molecular atom-centered basis functions. The electronic structure of a crystal is in general described by a band structure, which defines the energies of electron orbitals for each point in the Brillouin zone. Ab initio and semi-empirical calculations yield orbital energies; therefore, they can be applied to band structure calculations. Because calculating the energy of a single molecule is already time-consuming, calculating energies for the entire list of points in the Brillouin zone is even more so.
Semi-empirical quantum chemistry methods are based on the Hartree–Fock formalism, but make many approximations and obtain some parameters from empirical data. They are very important in computational chemistry for treating large molecules where the full Hartree–Fock method without the approximations is too expensive. The use of empirical parameters appears to allow some inclusion of electron correlation effects into the methods.
Ab initio quantum chemistry methods are computational chemistry methods based on quantum chemistry. The term ab initio was first used in quantum chemistry by Robert Parr and coworkers, including David Craig in a semiempirical study on the excited states of benzene. The background is described by Parr. Ab initio means "from first principles" or "from the beginning", implying that the only inputs into an ab initio calculation are physical constants. Ab initio quantum chemistry methods attempt to solve the electronic Schrödinger equation given the positions of the nuclei and the number of electrons in order to yield useful information such as electron densities, energies and other properties of the system. The ability to run these calculations has enabled theoretical chemists to solve a range of problems and their importance is highlighted by the awarding of the Nobel prize to John Pople and Walter Kohn.
CP2K is a freely available (GPL) quantum chemistry and solid state physics program package, written in Fortran 2008, to perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems. It provides a general framework for different methods: density functional theory (DFT) using a mixed Gaussian and plane waves approach (GPW) via LDA, GGA, MP2, or RPA levels of theory, classical pair and many-body potentials, semi-empirical and tight-binding Hamiltonians, as well as Quantum Mechanics/Molecular Mechanics (QM/MM) hybrid schemes relying on the Gaussian Expansion of the Electrostatic Potential (GEEP). The Gaussian and Augmented Plane Waves method (GAPW) as an extension of the GPW method allows for all-electron calculations. CP2K can do simulations of molecular dynamics, metadynamics, Monte Carlo, Ehrenfest dynamics, vibrational analysis, core level spectroscopy, energy minimization, and transition state optimization using NEB or dimer method.