Energy minimization

Last updated

In the field of computational chemistry, energy minimization (also called energy optimization, geometry minimization, or geometry optimization) is the process of finding an arrangement in space of a collection of atoms where, according to some computational model of chemical bonding, the net inter-atomic force on each atom is acceptably close to zero and the position on the potential energy surface (PES) is a stationary point (described later). The collection of atoms might be a single molecule, an ion, a condensed phase, a transition state or even a collection of any of these. The computational model of chemical bonding might, for example, be quantum mechanics.

Contents

As an example, when optimizing the geometry of a water molecule, one aims to obtain the hydrogen-oxygen bond lengths and the hydrogen-oxygen-hydrogen bond angle which minimize the forces that would otherwise be pulling atoms together or pushing them apart.

The motivation for performing a geometry optimization is the physical significance of the obtained structure: optimized structures often correspond to a substance as it is found in nature and the geometry of such a structure can be used in a variety of experimental and theoretical investigations in the fields of chemical structure, thermodynamics, chemical kinetics, spectroscopy and others.

Typically, but not always, the process seeks to find the geometry of a particular arrangement of the atoms that represents a local or global energy minimum. Instead of searching for global energy minimum, it might be desirable to optimize to a transition state, that is, a saddle point on the potential energy surface. [1] Additionally, certain coordinates (such as a chemical bond length) might be fixed during the optimization.

Molecular geometry and mathematical interpretation

The geometry of a set of atoms can be described by a vector of the atoms' positions. This could be the set of the Cartesian coordinates of the atoms or, when considering molecules, might be so called internal coordinates formed from a set of bond lengths, bond angles and dihedral angles.

Given a set of atoms and a vector, r, describing the atoms' positions, one can introduce the concept of the energy as a function of the positions, E(r). Geometry optimization is then a mathematical optimization problem, in which it is desired to find the value of r for which E(r) is at a local minimum, that is, the derivative of the energy with respect to the position of the atoms, E/r, is the zero vector and the second derivative matrix of the system, , also known as the Hessian matrix, which describes the curvature of the PES at r, has all positive eigenvalues (is positive definite).

A special case of a geometry optimization is a search for the geometry of a transition state; this is discussed below.

The computational model that provides an approximate E(r) could be based on quantum mechanics (using either density functional theory or semi-empirical methods), force fields, or a combination of those in case of QM/MM. Using this computational model and an initial guess (or ansatz) of the correct geometry, an iterative optimization procedure is followed, for example:

  1. calculate the force on each atom (that is, -E/r)
  2. if the force is less than some threshold, finish
  3. otherwise, move the atoms by some computed step r that is predicted to reduce the force
  4. repeat from the start

Practical aspects of optimization

As described above, some method such as quantum mechanics can be used to calculate the energy, E(r) , the gradient of the PES, that is, the derivative of the energy with respect to the position of the atoms, E/r and the second derivative matrix of the system, E/rirj, also known as the Hessian matrix, which describes the curvature of the PES at r.

An optimization algorithm can use some or all of E(r) , E/r and E/rirj to try to minimize the forces and this could in theory be any method such as gradient descent, conjugate gradient or Newton's method, but in practice, algorithms which use knowledge of the PES curvature, that is the Hessian matrix, are found to be superior. For most systems of practical interest, however, it may be prohibitively expensive to compute the second derivative matrix, and it is estimated from successive values of the gradient, as is typical in a Quasi-Newton optimization.

The choice of the coordinate system can be crucial for performing a successful optimization. Cartesian coordinates, for example, are redundant since a non-linear molecule with N atoms has 3N–6 vibrational degrees of freedom whereas the set of Cartesian coordinates has 3N dimensions. Additionally, Cartesian coordinates are highly correlated, that is, the Hessian matrix has many non-diagonal terms that are not close to zero. This can lead to numerical problems in the optimization, because, for example, it is difficult to obtain a good approximation to the Hessian matrix and calculating it precisely is too computationally expensive. However, in case that energy is expressed with standard force fields, computationally efficient methods have been developed [2] able to derive analytically the Hessian matrix in Cartesian coordinates while preserving a computational complexity of the same order to that of gradient computations. Internal coordinates tend to be less correlated but are more difficult to set-up and it can be difficult to describe some systems, such as ones with symmetry or large condensed phases. [3] Many modern computational chemistry software packages contain automatic procedures for the automatic generation of reasonable coordinate systems for optimization. [4]

Degree of freedom restriction

Some degrees of freedom can be eliminated from an optimization, for example, positions of atoms or bond lengths and angles can be given fixed values. Sometimes these are referred to as being frozen degrees of freedom.

Figure 1 depicts a geometry optimization of the atoms in a carbon nanotube in the presence of an external electrostatic field. In this optimization, the atoms on the left have their positions frozen. Their interaction with the other atoms in the system are still calculated, but alteration the atoms' position during the optimization is prevented.

Figure 1: Electrostatic deflections of a carbon nanotube in an electric field. Electric deflection CNT.JPG
Figure 1: Electrostatic deflections of a carbon nanotube in an electric field.

Transition state optimization

Transition state structures can be determined by searching for saddle points on the PES of the chemical species of interest. [5] A first-order saddle point is a position on the PES corresponding to a minimum in all directions except one; a second-order saddle point is a minimum in all directions except two, and so on. Defined mathematically, an nth order saddle point is characterized by the following: E/r = 0 and the Hessian matrix, E/rirj, has exactly n negative eigenvalues.

Algorithms to locate transition state geometries fall into two main categories: local methods and semi-global methods. Local methods are suitable when the starting point for the optimization is very close to the true transition state (very close will be defined shortly) and semi-global methods find application when it is sought to locate the transition state with very little a priori knowledge of its geometry. Some methods, such as the Dimer method (see below), fall into both categories.

Local searches

A so-called local optimization requires an initial guess of the transition state that is very close to the true transition state. Very close typically means that the initial guess must have a corresponding Hessian matrix with one negative eigenvalue, or, the negative eigenvalue corresponding to the reaction coordinate must be greater in magnitude than the other negative eigenvalues. Further, the eigenvector with the most negative eigenvalue must correspond to the reaction coordinate, that is, it must represent the geometric transformation relating to the process whose transition state is sought.

Given the above pre-requisites, a local optimization algorithm can then move "uphill" along the eigenvector with the most negative eigenvalue and "downhill" along all other degrees of freedom, using something similar to a quasi-Newton method.

Dimer method

The dimer method [6] can be used to find possible transition states without knowledge of the final structure or to refine a good guess of a transition structure. The “dimer” is formed by two images very close to each other on the PES. The method works by moving the dimer uphill from the starting position whilst rotating the dimer to find the direction of lowest curvature (ultimately negative).

Activation Relaxation Technique (ART)

The Activation Relaxation Technique (ART) [7] [8] [9] is also an open-ended method to find new transition states or to refine known saddle points on the PES. The method follows the direction of lowest negative curvature (computed using the Lanczos algorithm) on the PES to reach the saddle point, relaxing in the perpendicular hyperplane between each "jump" (activation) in this direction.

Chain-of-state methods

Chain-of-state [10] methods can be used to find the approximate geometry of the transition state based on the geometries of the reactant and product. The generated approximate geometry can then serve as a starting point for refinement via a local search, which was described above.

Chain-of-state methods use a series of vectors, that is points on the PES, connecting the reactant and product of the reaction of interest, rreactant and rproduct, thus discretizing the reaction pathway. Very commonly, these points are referred to as beads due to an analogy of a set of beads connected by strings or springs, which connect the reactant and products. The series of beads is often initially created by interpolating between rreactant and rproduct, for example, for a series of N + 1 beads, bead i might be given by

where i 0, 1, ..., N. Each of the beads ri has an energy, E(ri), and forces, -E/ri and these are treated with a constrained optimization process that seeks to get an as accurate as possible representation of the reaction pathway. For this to be achieved, spacing constraints must be applied so that each bead ri does not simply get optimized to the reactant and product geometry.

Often this constraint is achieved by projecting out components of the force on each bead ri, or alternatively the movement of each bead during optimization, that are tangential to the reaction path. For example, if for convenience, it is defined that gi = E/ri, then the energy gradient at each bead minus the component of the energy gradient that is tangential to the reaction pathway is given by

where I is the identity matrix and τi is a unit vector representing the reaction path tangent at ri. By projecting out components of the energy gradient or the optimization step that are parallel to the reaction path, an optimization algorithm significantly reduces the tendency of each of the beads to be optimized directly to a minimum.

Synchronous transit

The simplest chain-of-state method is the linear synchronous transit (LST) method. It operates by taking interpolated points between the reactant and product geometries and choosing the one with the highest energy for subsequent refinement via a local search. The quadratic synchronous transit (QST) method extends LST by allowing a parabolic reaction path, with optimization of the highest energy point orthogonally to the parabola.

Nudged elastic band

In Nudged elastic band (NEB) [11] method, the beads along the reaction pathway have simulated spring forces in addition to the chemical forces, -E/ri, to cause the optimizer to maintain the spacing constraint. Specifically, the force fi on each point i is given by

where

is the spring force parallel to the pathway at each point ri (k is a spring constant and τi, as before, is a unit vector representing the reaction path tangent at ri).

In a traditional implementation, the point with the highest energy is used for subsequent refinement in a local search. There are many variations on the NEB method, [12] such including the climbing image NEB, in which the point with the highest energy is pushed upwards during the optimization procedure so as to (hopefully) give a geometry which is even closer to that of the transition state. There have also been extensions [13] to include Gaussian process regression for reducing the number of evaluations. For systems with non-Euclidean (R^2) geometry, like magnetic systems, the method is modified to the geodesic nudged elastic band approach. [14]

String method

The string method [15] [16] [17] uses splines connecting the points, ri, to measure and enforce distance constraints between the points and to calculate the tangent at each point. In each step of an optimization procedure, the points might be moved according to the force acting on them perpendicular to the path, and then, if the equidistance constraint between the points is no-longer satisfied, the points can be redistributed, using the spline representation of the path to generate new vectors with the required spacing.

Variations on the string method include the growing string method, [18] in which the guess of the pathway is grown in from the end points (that is the reactant and products) as the optimization progresses.

Comparison with other techniques

Geometry optimization is fundamentally different from a molecular dynamics simulation. The latter simulates the motion of molecules with respect to time, subject to temperature, chemical forces, initial velocities, Brownian motion of a solvent, and so on, via the application of Newton's laws of Motion. This means that the trajectories of the atoms which get computed, have some physical meaning. Geometry optimization, by contrast, does not produce a "trajectory" with any physical meaning – it is concerned with minimization of the forces acting on each atom in a collection of atoms, and the pathway via which it achieves this lacks meaning. Different optimization algorithms could give the same result for the minimum energy structure, but arrive at it via a different pathway.

See also

Related Research Articles

Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into computer programs, to calculate the structures and properties of molecules, groups of molecules, and solids. It is essential because, apart from relatively recent results concerning the hydrogen molecular ion, the quantum many-body problem cannot be solved analytically, much less in closed form. While computational results normally complement the information obtained by chemical experiments, it can in some cases predict hitherto unobserved chemical phenomena. It is widely used in the design of new drugs and materials.

Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize a multivariate quadratic function subject to linear constraints on the variables. Quadratic programming is a type of nonlinear programming.

<span class="mw-page-title-main">Mathematical optimization</span> Study of mathematical algorithms for optimization problems

Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. Optimization problems of sorts arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.

In mathematical optimization, the method of Lagrange multipliers is a strategy for finding the local maxima and minima of a function subject to equality constraints. It is named after the mathematician Joseph-Louis Lagrange. The basic idea is to convert a constrained problem into a form such that the derivative test of an unconstrained problem can still be applied. The relationship between the gradient of the function and gradients of the constraints rather naturally leads to a reformulation of the original problem, known as the Lagrangian function.

<span class="mw-page-title-main">Gradient descent</span> Optimization algorithm

In mathematics, gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. The idea is to take repeated steps in the opposite direction of the gradient of the function at the current point, because this is the direction of steepest descent. Conversely, stepping in the direction of the gradient will lead to a local maximum of that function; the procedure is then known as gradient ascent.

Density-functional theory (DFT) is a computational quantum mechanical modelling method used in physics, chemistry and materials science to investigate the electronic structure of many-body systems, in particular atoms, molecules, and the condensed phases. Using this theory, the properties of a many-electron system can be determined by using functionals, i.e. functions of another function. In the case of DFT, these are functionals of the spatially dependent electron density. DFT is among the most popular and versatile methods available in condensed-matter physics, computational physics, and computational chemistry.

In mathematics, the Hessian matrix or Hessian is a square matrix of second-order partial derivatives of a scalar-valued function, or scalar field. It describes the local curvature of a function of many variables. The Hessian matrix was developed in the 19th century by the German mathematician Ludwig Otto Hesse and later named after him. Hesse originally used the term "functional determinants".

<span class="mw-page-title-main">Molecular modelling</span> Discovering chemical properties by physical simulations

Molecular modelling encompasses all methods, theoretical and computational, used to model or mimic the behaviour of molecules. The methods are used in the fields of computational chemistry, drug design, computational biology and materials science to study molecular systems ranging from small chemical systems to large biological molecules and material assemblies. The simplest calculations can be performed by hand, but inevitably computers are required to perform molecular modelling of any reasonably sized system. The common feature of molecular modelling methods is the atomistic level description of the molecular systems. This may include treating atoms as the smallest individual unit, or explicitly modelling protons and neutrons with its quarks, anti-quarks and gluons and electrons with its photons.

Transition state Configuration of a chemical reaction when potential energy is greatest

In chemistry, the transition state of a chemical reaction is a particular configuration along the reaction coordinate. It is defined as the state corresponding to the highest potential energy along this reaction coordinate. It is often marked with the double dagger ‡ symbol.

<span class="mw-page-title-main">Gauss–Newton algorithm</span>

The Gauss–Newton algorithm is used to solve non-linear least squares problems, which is equivalent to minimizing a sum of squared function values. It is an extension of Newton's method for finding a minimum of a non-linear function. Since a sum of squares must be nonnegative, the algorithm can be viewed as using Newton's method to iteratively approximate zeroes of the sum, and thus minimizing the sum. It has the advantage that second derivatives, which can be challenging to compute, are not required.

Newtons method in optimization Method for finding stationary points of a function

In calculus, Newton's method is an iterative method for finding the roots of a differentiable function F, which are solutions to the equation F (x) = 0. As such, Newton's method can be applied to the derivative f of a twice-differentiable function f to find the roots of the derivative, also known as the critical points of f. These solutions may be minima, maxima, or saddle points; see section "Several variables" in Critical point (mathematics) and also section "Geometric interpretation" in this article. This is relevant in optimization, which aims to find (global) minima of the function f.

Potential energy surface Function describing the energy of a physical system in terms of certain parameters

A potential energy surface (PES) describes the energy of a system, especially a collection of atoms, in terms of certain parameters, normally the positions of the atoms. The surface might define the energy as a function of one or more coordinates; if there is only one coordinate, the surface is called a potential energy curve or energy profile. An example is the Morse/Long-range potential.

In numerical optimization, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is an iterative method for solving unconstrained nonlinear optimization problems. Like the related Davidon–Fletcher–Powell method, BFGS determines the descent direction by preconditioning the gradient with curvature information. It does so by gradually improving an approximation to the Hessian matrix of the loss function, obtained only from gradient evaluations via a generalized secant method.

Critical point (mathematics) Point where the derivative of a function is zero

Critical point is a wide term used in many branches of mathematics.

PQS (software)

PQS is a general purpose quantum chemistry program. Its roots go back to the first ab initio gradient program developed in Professor Peter Pulay's group but now it is developed and distributed commercially by Parallel Quantum Solutions. There is a reduction in cost for academic users and a site license. Its strong points are geometry optimization, NMR chemical shift calculations, and large MP2 calculations, and high parallel efficiency on computing clusters. It includes many other capabilities including Density functional theory, the semiempirical methods, MINDO/3, MNDO, AM1 and PM3, Molecular mechanics using the SYBYL 5.0 Force Field, the quantum mechanics/molecular mechanics mixed method using the ONIOM method, natural bond orbital (NBO) analysis and COSMO solvation models. Recently, a highly efficient parallel CCSD(T) code for closed shell systems has been developed. This code includes many other post Hartree–Fock methods: MP2, MP3, MP4, CISD, CEPA, QCISD and so on.

In mathematics, preconditioning is the application of a transformation, called the preconditioner, that conditions a given problem into a form that is more suitable for numerical solving methods. Preconditioning is typically related to reducing a condition number of the problem. The preconditioned problem is then usually solved by an iterative method.

Limited-memory BFGS is an optimization algorithm in the family of quasi-Newton methods that approximates the Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS) using a limited amount of computer memory. It is a popular algorithm for parameter estimation in machine learning. The algorithm's target problem is to minimize over unconstrained values of the real-vector where is a differentiable scalar function.

The GF method, sometimes referred to as FG method, is a classical mechanical method introduced by Edgar Bright Wilson to obtain certain internal coordinates for a vibrating semi-rigid molecule, the so-called normal coordinatesQk. Normal coordinates decouple the classical vibrational motions of the molecule and thus give an easy route to obtaining vibrational amplitudes of the atoms as a function of time. In Wilson's GF method it is assumed that the molecular kinetic energy consists only of harmonic vibrations of the atoms, i.e., overall rotational and translational energy is ignored. Normal coordinates appear also in a quantum mechanical description of the vibrational motions of the molecule and the Coriolis coupling between rotations and vibrations.

Energy profile (chemistry)

For a chemical reaction or process an energy profile is a theoretical representation of a single energetic pathway, along the reaction coordinate, as the reactants are transformed into products. Reaction coordinate diagrams are derived from the corresponding potential energy surface (PES), which are used in computational chemistry to model chemical reactions by relating the energy of a molecule(s) to its structure. The reaction coordinate is a parametric curve that follows the pathway of a reaction and indicates the progress of a reaction.

Non-linear least squares is the form of least squares analysis used to fit a set of m observations with a model that is non-linear in n unknown parameters (m ≥ n). It is used in some forms of nonlinear regression. The basis of the method is to approximate the model by a linear one and to refine the parameters by successive iterations. There are many similarities to linear least squares, but also some significant differences. In economic theory, the non-linear least squares method is applied in (i) the probit regression, (ii) threshold regression, (iii) smooth regression, (iv) logistic link regression, (v) Box-Cox transformed regressors.

References

  1. "Input reference of CP2K version trunk, Section GEO_OPT, Keyword TYPE". CP2K. Retrieved 30 April 2015.
  2. Chatzieleftheriou, S.; Adendorff, M. R.; Lagaros, N. D. (2016). "Generalized Potential Energy Finite Elements for Modeling Molecular Nanostructures". J. Chem. Inf. Model. 56 (10): 1963–1978. doi:10.1021/acs.jcim.6b00356. PMID   27653992.
  3. Peng, C.; Ayala, P. Y.; Schlegel, H. B. (1996). "Using Redundant Internal Coordinates to Optimize Equilibrium Geometries and Transition States". Journal of Computational Chemistry. 17 (1): 49–56. doi:10.1002/(sici)1096-987x(19960115)17:1<49::aid-jcc5>3.3.co;2-#.
  4. "Home". gaussian.com.
  5. Frank Jensen (1999). Introduction to Computational Chemistry. England: John Wiley and Sons Ltd.
  6. Graeme Henkelman; Hannes Jónsson (1999). "A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives". J. Chem. Phys. 111 (15): 7010–7022. Bibcode:1999JChPh.111.7010H. doi:10.1063/1.480097.
  7. G.T. Barkema; Normand Mousseau (1996). "Event-Based Relaxation of Continuous Disordered Systems". Phys. Rev. Lett. 77 (21): 4358–4361. arXiv: cond-mat/9607156 . Bibcode:1996PhRvL..77.4358B. doi:10.1103/PhysRevLett.77.4358. PMID   10062518. S2CID   27932059.
  8. Rachid Malek; Normand Mousseau (2011). "Optimized energy landscape exploration using the ab initio based activation-relaxation technique". Physical Review E. 135 (6): 7723–7728. arXiv: cond-mat/0006042 . Bibcode:2000PhRvE..62.7723M. doi:10.1103/PhysRevE.62.7723. PMID   11138044. S2CID   119453527.
  9. Eduardo Machado-Charry; Laurent Karim Béland; Damien Caliste; Luigi Genovese; Thierry Deutsch; Normand Mousseau; Pascal Pochet (2011). "Optimized energy landscape exploration using the ab initio based activation-relaxation technique". J. Chem. Phys. 62 (3): 034102–034112. Bibcode:2011JChPh.135c4102M. doi:10.1063/1.3609924. PMID   21786982.
  10. Jensen, F. Introduction to Computational Chemistry; Wiley: 2 ed.; 2006
  11. (a) G. Mills and H. Jónsson, Phys. Rev. Lett. 72, 1124 (1994) (b) Graeme Henkelman and Hannes Jónsson, Improved tangent estimate in the nudged elastic band method for finding minimum energy paths and saddle points, J. Chem. Phys. 113, 9978 - 9985 (2000)
  12. "Nudged Elastic Band". UT Austin. Archived from the original on 2014-02-03.
  13. Koistinen, Olli-Pekka; Dagbjartsdóttir, Freyja B.; Ásgeirsson, Vilhjálmur; Vehtari, Aki; Jónsson, Hannes (2017-10-21). "Nudged elastic band calculations accelerated with Gaussian process regression". The Journal of Chemical Physics. 147 (15): 152720. arXiv: 1706.04606 . Bibcode:2017JChPh.147o2720K. doi:10.1063/1.4986787. ISSN   0021-9606. PMID   29055305. S2CID   21822734.
  14. Ivanov, A V; Dagbartsson, D; Tranchida, J; Uzdin, V M; Jónsson, H (2020-08-12). "Efficient optimization method for finding minimum energy paths of magnetic transitions". Journal of Physics: Condensed Matter. 32 (34): 345901. arXiv: 2001.10372 . Bibcode:2020JPCM...32H5901I. doi:10.1088/1361-648X/ab8b9c. ISSN   0953-8984. PMID   32316000. S2CID   210932577.
  15. "Rare Events, Transition Pathways and Reaction Rates". and "The string method page".
  16. Weinan E, Weiqing Ren, Eric Vanden-Eijnden (2002). "String method for the study of rare Events". Phys. Rev. B. 66 (5): 052301. arXiv: cond-mat/0205527 . Bibcode:2002PhRvB..66e2301E. doi:10.1103/PhysRevB.66.052301. S2CID   119326534.{{cite journal}}: CS1 maint: multiple names: authors list (link)
  17. Amit Samanta; Weinan E (2010). "Modified string method for finding minimum energy path". arXiv: 1009.5612 [physics.comp-ph].
  18. Baron Peters; Andreas Heyden; Alexis T. Bell; Arup Chakraborty (2004). "A growing string method for determining transition states: Comparison to the nudged elastic band and string methods". J. Chem. Phys. 120 (17): 7877–7886. Bibcode:2004JChPh.120.7877P. doi:10.1063/1.1691018. PMID   15267702.

Additional references