Metadynamics (MTD; also abbreviated as METAD or MetaD) is a computer simulation method in computational physics, chemistry and biology. It is used to estimate the free energy and other state functions of a system in which ergodicity is hindered by the form of the system's energy landscape. It was first suggested by Alessandro Laio and Michele Parrinello in 2002 [1] and is usually applied within molecular dynamics simulations. MTD closely resembles a number of newer methods such as adaptively biased molecular dynamics, [2] adaptive reaction coordinate forces [3] and local elevation umbrella sampling. [4] More recently, both the original and the well-tempered metadynamics [5] were derived in the context of importance sampling and shown to be special cases of the adaptive biasing potential setting. [6] MTD is related to Wang–Landau sampling. [7]
The technique builds on a large number of related methods including (in chronological order) the deflation, [8] tunneling, [9] tabu search, [10] local elevation, [11] conformational flooding, [12] Engkvist-Karlström [13] and Adaptive Biasing Force methods. [14]
Metadynamics has been informally described as "filling the free energy wells with computational sand". [15] The algorithm assumes that the system can be described by a few collective variables (CVs). During the simulation, the location of the system in the space spanned by the collective variables is calculated, and a positive Gaussian potential is added to the true energy landscape of the system. In this way the system is discouraged from returning to the previous point. As the simulation evolves, more and more Gaussians accumulate, further discouraging the system from revisiting previously sampled states, until the system explores the full energy landscape. At this point the modified free energy becomes constant as a function of the collective variables, which is why the collective variables begin to fluctuate strongly. The energy landscape can then be recovered as the negative of the sum of all Gaussians.
The time interval between the addition of two Gaussian functions, as well as the Gaussian height and Gaussian width, are tuned to optimize the ratio between accuracy and computational cost. By simply changing the size of the Gaussians, metadynamics can be tuned either to yield a rough map of the energy landscape quickly, using large Gaussians, or to resolve it more finely, using smaller Gaussians. [1] Well-tempered metadynamics [5] is usually employed to scale the Gaussian height adaptively as the bias accumulates, and the Gaussian width can be adapted with adaptive Gaussian metadynamics. [16]
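As an illustration, the following is a minimal Python sketch of the well-tempered height-scaling rule; the function name well_tempered_height and the reduced units (kB = 1 by default) are assumptions made here for clarity, not part of any particular implementation.

```python
import numpy as np

def well_tempered_height(W0, V_bias_here, delta_T, kB=1.0):
    """Height of the next Gaussian under the well-tempered rule:
    W = W0 * exp(-V_bias(s, t) / (kB * delta_T)),
    where delta_T sets the bias factor gamma = (T + delta_T) / T.

    W0          -- initial Gaussian height
    V_bias_here -- bias already accumulated at the current CV value
    delta_T     -- fictitious temperature boost
    """
    return W0 * np.exp(-V_bias_here / (kB * delta_T))
```

Because new Gaussians shrink wherever bias has already accumulated, the well-tempered bias converges smoothly to -ΔT/(T + ΔT) · F(s) instead of -F(s).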
Metadynamics has the advantage over methods such as adaptive umbrella sampling of not requiring an initial estimate of the energy landscape to explore. [1] However, choosing proper collective variables for a complex simulation is not trivial. Typically it takes several trials to find a good set of collective variables, although several automatic procedures have been proposed: essential coordinates, [17] Sketch-Map, [18] and non-linear data-driven collective variables. [19]
Independent metadynamics simulations (replicas) can be coupled together to improve usability and parallel performance. Several such schemes have been proposed: multiple-walker MTD, [20] parallel-tempering MTD, [21] bias-exchange MTD, [22] and collective-variable-tempering MTD. [23] The last three are similar to the parallel tempering method and use replica exchanges to improve sampling. Typically, the Metropolis–Hastings algorithm is used for replica exchanges (a sketch of the standard exchange rule is given below), but the infinite swapping [24] and Suwa–Todo [25] algorithms give better replica exchange rates. [26]
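For concreteness, here is a minimal sketch of the standard Metropolis acceptance rule for swapping two replicas in parallel tempering; the function name attempt_swap is a label chosen here for illustration, not the API of any MTD code.

```python
import math
import random

def attempt_swap(beta_i, E_i, beta_j, E_j):
    """Metropolis acceptance for exchanging configurations between two
    replicas at inverse temperatures beta_i and beta_j whose current
    potential energies are E_i and E_j:
        p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]).
    Returns True if the swap is accepted."""
    delta = (beta_i - beta_j) * (E_i - E_j)
    return delta >= 0.0 or random.random() < math.exp(delta)
```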
Typical (single-replica) MTD simulations can include up to 3 CVs; even with the multi-replica approach, it is hard to exceed 8 CVs in practice. This limitation comes from the bias potential, which is constructed by adding Gaussian functions (kernels) and is a special case of a kernel density estimator (KDE). The number of kernels required for a constant KDE accuracy increases exponentially with the number of dimensions, so the MTD simulation length has to increase exponentially with the number of CVs to maintain the same accuracy of the bias potential. Moreover, for fast evaluation the bias potential is typically tabulated on a regular grid, [27] and the memory required to store the grid also increases exponentially with the number of dimensions (CVs), as the short calculation below illustrates.
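A back-of-the-envelope calculation in Python, assuming (purely for illustration) 100 grid bins per CV and one 8-byte float per grid point:

```python
# Memory needed to tabulate the bias potential on a regular grid,
# assuming 100 bins per CV and 8 bytes per grid point.
for n_cv in range(1, 9):
    points = 100 ** n_cv
    print(f"{n_cv} CV(s): {points:.1e} grid points, {points * 8 / 1e9:.1e} GB")
```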
A high-dimensional generalization of metadynamics is NN2B. [28] It is based on two machine learning algorithms: the nearest-neighbor density estimator (NNDE) and the artificial neural network (ANN). NNDE replaces KDE to estimate the updates of the bias potential from short biased simulations, while the ANN is used to approximate the resulting bias potential. An ANN is a memory-efficient representation of high-dimensional functions, whose derivatives (biasing forces) are efficiently computed with the backpropagation algorithm. [28] [29]
An alternative method that exploits an ANN for the adaptive bias potential estimates it from mean potential forces. [30] This method is also a high-dimensional generalization of the Adaptive Biasing Force (ABF) method. [31] Additionally, the training of the ANN is improved using Bayesian regularization, [32] and the error of approximation can be inferred by training an ensemble of ANNs. [30]
In 2015, White, Dama, and Voth introduced experiment-directed metadynamics, a method that allows shaping molecular dynamics simulations to match a desired free energy surface. This technique guides the simulation towards conformations that align with experimental data, improving the understanding of complex molecular systems and their behavior. [33]
In 2020, an evolution of metadynamics was proposed, the on-the-fly probability enhanced sampling method (OPES), [34] [35] [36] which is now the method of choice of Michele Parrinello's research group. [37] The OPES method has only a few robust parameters, converges faster than metadynamics, and has a straightforward reweighting scheme. [38] In 2024, a replica-exchange variant of OPES was developed, named OneOPES, [39] designed to exploit a thermal gradient and multiple CVs to sample large biochemical systems whose numerous degrees of freedom are difficult to capture with only a few CVs. OPES has been implemented in the PLUMED library since version 2.7. [40]
Assume we have a classical $N$-particle system with positions $\{\mathbf{r}_1, \dots, \mathbf{r}_N\}$ in Cartesian coordinates, $\mathbf{r}_i \in \mathbb{R}^3$. The particle interactions are described by a potential function $V(\mathbf{r})$. The form of the potential function (e.g. two local minima separated by a high-energy barrier) prevents ergodic sampling with molecular dynamics or Monte Carlo methods.
The general idea of MTD is to enhance sampling by discouraging the system from revisiting states that have already been sampled. This is achieved by augmenting the system Hamiltonian $H$ with a bias potential $V_\text{bias}$:

$$H = T + V(\mathbf{r}) + V_\text{bias}\left(\mathbf{s}(\mathbf{r})\right),$$

where $T$ is the kinetic energy.
The bias potential is a function of the collective variables, $V_\text{bias} = V_\text{bias}(\mathbf{s})$. A collective variable is a function of the particle positions, $\mathbf{s} = \mathbf{s}(\mathbf{r})$. The bias potential is continuously updated by adding bias at a rate $\omega$, where $\mathbf{s}_t$ is the instantaneous value of the collective variable at time $t$:

$$\frac{\partial V_\text{bias}(\mathbf{s})}{\partial t} = \omega\, \delta\left(\mathbf{s} - \mathbf{s}_t\right)$$
In the limit of infinitely long simulation time $t \to \infty$, the accumulated bias potential converges to the free energy $F(\mathbf{s})$ with opposite sign (up to an irrelevant constant $C$):

$$V_\text{bias}\left(\mathbf{s}, t \to \infty\right) = -F(\mathbf{s}) + C$$
For a computationally efficient implementation, the update process is discretised into time intervals of length $\tau$ ($\lfloor \cdot \rfloor$ below denotes the floor function), and the $\delta$-function is replaced with a localized positive kernel function $K$. The bias potential becomes a sum of kernel functions centred at the instantaneous collective variable values $\mathbf{s}_{j\tau} = \mathbf{s}(j\tau)$:

$$V_\text{bias}\left(\mathbf{s}, t\right) = \sum_{j=1}^{\lfloor t/\tau \rfloor} K\left(\mathbf{s} - \mathbf{s}_{j\tau}\right)$$
Typically, the kernel is a multi-dimensional Gaussian function whose covariance matrix has non-zero elements only on the diagonal:

$$K\left(\mathbf{s} - \mathbf{s}_{j\tau}\right) = W \exp\left(-\sum_{i=1}^{d} \frac{\left(s_i - s_{i,j\tau}\right)^2}{2\sigma_i^2}\right),$$

where $d$ is the number of collective variables, $W$ is the kernel height, and $\sigma_i$ are the kernel widths.
The parameters $\tau$, $W$, and $\sigma_i$ are determined a priori and kept constant during the simulation.
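A minimal Python sketch of evaluating this Gaussian-sum bias and the corresponding biasing force $-\partial V_\text{bias}/\partial \mathbf{s}$; the function name and array layout are assumptions made here for illustration.

```python
import numpy as np

def bias_and_force(s, centers, W, sigma):
    """V_bias(s) = sum_j W * exp(-sum_i (s_i - s_ij)^2 / (2 sigma_i^2))
    and the biasing force -dV_bias/ds, for a diagonal covariance.

    s       -- current CV values, shape (d,)
    centers -- deposited kernel centres s_j, shape (n, d)
    W       -- kernel height
    sigma   -- kernel widths, shape (d,)
    """
    diff = s - centers                                           # (n, d)
    g = W * np.exp(-0.5 * np.sum((diff / sigma) ** 2, axis=1))   # kernel values
    V = g.sum()
    force = np.sum(g[:, None] * diff / sigma**2, axis=0)         # -dV_bias/ds
    return V, force
```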
Below is pseudocode for MTD based on molecular dynamics (MD), where $\mathbf{r}$ and $\mathbf{v}$ are the $N$-particle system positions and velocities, respectively. The bias is updated every $n = \tau/\Delta t$ MD steps, where $\Delta t$ is the MD time step, and the bias contribution to the system forces $\mathbf{F}$ is $-\frac{\partial V_\text{bias}}{\partial \mathbf{s}} \frac{\partial \mathbf{s}}{\partial \mathbf{r}}$.
set initial $\mathbf{r}$ and $\mathbf{v}$
set $V_\text{bias} \leftarrow 0$
every MD step:
    compute CV values: $\mathbf{s} \leftarrow \mathbf{s}(\mathbf{r})$
    every $n$ MD steps:
        update bias potential: $V_\text{bias} \leftarrow V_\text{bias} + K(\cdot - \mathbf{s})$
    compute atomic forces: $\mathbf{F} \leftarrow -\nabla_\mathbf{r} V - \frac{\partial V_\text{bias}}{\partial \mathbf{s}} \frac{\partial \mathbf{s}}{\partial \mathbf{r}}$
    propagate $\mathbf{r}$ and $\mathbf{v}$ by $\Delta t$
The finite size of the kernel makes the bias potential fluctuate around a mean value. A converged free energy (up to an additive constant) can be obtained by averaging the bias potential. The averaging is started at $t_\text{diff}$, the time when the motion along the collective variable becomes diffusive:

$$\bar{F}(\mathbf{s}) = -\frac{1}{t_\text{total} - t_\text{diff}} \int_{t_\text{diff}}^{t_\text{total}} V_\text{bias}\left(\mathbf{s}, t\right)\, dt$$
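To make the scheme above concrete, here is a self-contained toy example in Python: plain metadynamics for an overdamped Langevin particle in a one-dimensional double well, with the particle position itself as the CV. All parameter values are illustrative choices, not recommendations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Double-well potential V(x) = (x^2 - 1)^2; the CV is x itself.
def grad_V(x):
    return 4.0 * x * (x**2 - 1.0)

kT, dt = 0.5, 1e-3        # thermal energy and time step (reduced units)
W, sigma = 0.1, 0.1       # Gaussian height and width
n_dep = 200               # deposit a Gaussian every n_dep MD steps

centers = []              # deposited Gaussian centres s_j
x = -1.0                  # start in the left well

def bias_force(x):
    """Biasing force -dV_bias/dx for the accumulated sum of Gaussians."""
    if not centers:
        return 0.0
    c = np.asarray(centers)
    g = W * np.exp(-0.5 * ((x - c) / sigma) ** 2)
    return float(np.sum(g * (x - c)) / sigma**2)

for step in range(100_000):
    if step % n_dep == 0:
        centers.append(x)                      # deposit a new Gaussian at s(t)
    drift = -grad_V(x) + bias_force(x)         # biased force, unit mobility
    x += drift * dt + np.sqrt(2.0 * kT * dt) * rng.standard_normal()

# Free-energy estimate from the final bias: F(s) ~ -V_bias(s) + C.
grid = np.linspace(-2.0, 2.0, 201)
c = np.asarray(centers)
v_bias = (W * np.exp(-0.5 * ((grid[:, None] - c) / sigma) ** 2)).sum(axis=1)
F_est = -v_bias + v_bias.max()                 # shift so the wells sit near 0
```

For simplicity this sketch reads the free energy off the final bias; a production calculation would instead average the bias over times after $t_\text{diff}$, as in the formula above, or use well-tempered metadynamics.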
Metadynamics has been used to study a wide range of processes in computational physics, chemistry, and biology.
PLUMED [47] is an open-source library implementing many MTD algorithms and collective variables. It has a flexible object-oriented design [48] [49] and can be interfaced with several MD programs (AMBER, GROMACS, LAMMPS, NAMD, Quantum ESPRESSO, DL_POLY_4, CP2K, and OpenMM). [50] [51]
Other MTD implementations exist in the Collective Variables Module [52] (for LAMMPS, NAMD, and GROMACS), ORAC, CP2K, [53] EDM, [54] and Desmond.
Computational chemistry is a branch of chemistry that uses computer simulations to assist in solving chemical problems. It uses methods of theoretical chemistry incorporated into computer programs to calculate the structures and properties of molecules, groups of molecules, and solids. The importance of this subject stems from the fact that, with the exception of some relatively recent findings related to the hydrogen molecular ion, achieving an accurate quantum mechanical depiction of chemical systems analytically, or in a closed form, is not feasible. The complexity inherent in the many-body problem exacerbates the challenge of providing detailed descriptions of quantum mechanical systems. While computational results normally complement information obtained by chemical experiments, they can occasionally predict unobserved chemical phenomena.
Molecular dynamics (MD) is a computer simulation method for analyzing the physical movements of atoms and molecules. The atoms and molecules are allowed to interact for a fixed period of time, giving a view of the dynamic "evolution" of the system. In the most common version, the trajectories of atoms and molecules are determined by numerically solving Newton's equations of motion for a system of interacting particles, where forces between the particles and their potential energies are often calculated using interatomic potentials or molecular mechanical force fields. The method is applied mostly in chemical physics, materials science, and biophysics.
In computational chemistry, molecular physics, and physical chemistry, the Lennard-Jones potential is an intermolecular pair potential. Out of all the intermolecular potentials, the Lennard-Jones potential is probably the one that has been the most extensively studied. It is considered an archetype model for simple yet realistic intermolecular interactions. The Lennard-Jones potential is often used as a building block in molecular models for more complex substances. Many studies of the idealized "Lennard-Jones substance" use the potential to understand the physical nature of matter.
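The standard 12-6 form of the potential, as a minimal Python sketch (reduced units assumed):

```python
def lj_potential(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair potential:
    V(r) = 4 * epsilon * ((sigma / r)**12 - (sigma / r)**6).
    Minimum of depth -epsilon at r = 2**(1/6) * sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)
```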
Molecular mechanics uses classical mechanics to model molecular systems. The Born–Oppenheimer approximation is assumed valid and the potential energy of all systems is calculated as a function of the nuclear coordinates using force fields. Molecular mechanics can be used to study molecule systems ranging in size and complexity from small to large biological systems or material assemblies with many thousands to millions of atoms.
In the context of chemistry, molecular physics, physical chemistry, and molecular modelling, a force field is a computational model that is used to describe the forces between atoms within molecules or between molecules as well as in crystals. Force fields are a variety of interatomic potentials. More precisely, the force field refers to the functional form and parameter sets used to calculate the potential energy of a system on the atomistic level. Force fields are usually used in molecular dynamics or Monte Carlo simulations. The parameters for a chosen energy function may be derived from classical laboratory experiment data, calculations in quantum mechanics, or both. Force fields utilize the same concept as force fields in classical physics, with the main difference being that the force field parameters in chemistry describe the energy landscape on the atomistic level. From a force field, the acting forces on every particle are derived as a gradient of the potential energy with respect to the particle coordinates.
Car–Parrinello molecular dynamics or CPMD refers to either a method used in molecular dynamics or the computational chemistry software package used to implement this method.
Implicit solvation is a method to represent solvent as a continuous medium instead of individual “explicit” solvent molecules, most often used in molecular dynamics simulations and in other applications of molecular mechanics. The method is often applied to estimate free energy of solute-solvent interactions in structural and chemical processes, such as folding or conformational transitions of proteins, DNA, RNA, and polysaccharides, association of biological macromolecules with ligands, or transport of drugs across biological membranes.
In computational chemistry, a water model is used to simulate and thermodynamically calculate water clusters, liquid water, and aqueous solutions with explicit solvent. The models are determined from quantum mechanics, molecular mechanics, experimental results, and combinations thereof. To imitate specific properties of water molecules, many types of models have been developed. In general, these can be classified by the following three points: (i) the number of interaction points, called sites; (ii) whether the model is rigid or flexible; (iii) whether the model includes polarization effects.
The Berendsen thermostat is an algorithm to re-scale the velocities of particles in molecular dynamics simulations to control the simulation temperature.
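The rescaling factor has a simple closed form; here is a minimal Python sketch (the function name is chosen here for illustration):

```python
import numpy as np

def berendsen_scale(v, T_current, T_target, dt, tau):
    """Berendsen velocity-rescaling factor:
    lambda^2 = 1 + (dt / tau) * (T_target / T_current - 1).
    v may be an array of particle velocities; tau is the coupling time."""
    lam = np.sqrt(1.0 + (dt / tau) * (T_target / T_current - 1.0))
    return lam * v
```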
Umbrella sampling is a technique in computational physics and chemistry, used to improve sampling of a system where ergodicity is hindered by the form of the system's energy landscape. It was first suggested by Torrie and Valleau in 1977. It is a particular physical application of the more general importance sampling in statistics.
The Wang and Landau algorithm, proposed by Fugao Wang and David P. Landau, is a Monte Carlo method designed to estimate the density of states of a system. The method performs a non-Markovian random walk that builds the density of states by quickly visiting the entire available energy spectrum. The Wang and Landau algorithm is an important method to obtain the density of states required to perform a multicanonical simulation.
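As a concrete illustration, here is a minimal Wang–Landau sketch in Python for a toy system: an L-bit string whose energy is the number of set bits, so the exact density of states is the binomial coefficient C(L, E). The 80% flatness threshold and the f → √f schedule are conventional choices, not requirements of the algorithm.

```python
import math
import random

L = 12                       # toy system: L-bit string, energy = number of set bits
state = [0] * L
E = 0
ln_g = [0.0] * (L + 1)       # running estimate of ln g(E)
hist = [0] * (L + 1)         # visit histogram for the flatness check
ln_f = 1.0                   # ln of the modification factor f

while ln_f > 1e-6:
    for _ in range(10_000):
        i = random.randrange(L)
        E_new = E + (1 - 2 * state[i])        # flipping bit i changes E by +/- 1
        # accept with probability min(1, g(E) / g(E_new))
        if random.random() < math.exp(min(0.0, ln_g[E] - ln_g[E_new])):
            state[i] ^= 1
            E = E_new
        ln_g[E] += ln_f                       # modification-factor update
        hist[E] += 1
    if min(hist) > 0.8 * sum(hist) / len(hist):   # histogram "flat enough"?
        hist = [0] * (L + 1)
        ln_f /= 2.0                               # f -> sqrt(f)

# ln_g[E] - ln_g[0] should approximate ln C(L, E)
for E_val in range(L + 1):
    print(E_val, round(ln_g[E_val] - ln_g[0], 2),
          round(math.log(math.comb(L, E_val)), 2))
```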
CP2K is a freely available (GPL) quantum chemistry and solid state physics program package, written in Fortran 2008, to perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems. It provides a general framework for different methods: density functional theory (DFT) using a mixed Gaussian and plane waves approach (GPW) via LDA, GGA, MP2, or RPA levels of theory, classical pair and many-body potentials, semi-empirical and tight-binding Hamiltonians, as well as Quantum Mechanics/Molecular Mechanics (QM/MM) hybrid schemes relying on the Gaussian Expansion of the Electrostatic Potential (GEEP). The Gaussian and Augmented Plane Waves method (GAPW) as an extension of the GPW method allows for all-electron calculations. CP2K can do simulations of molecular dynamics, metadynamics, Monte Carlo, Ehrenfest dynamics, vibrational analysis, core level spectroscopy, energy minimization, and transition state optimization using NEB or dimer method.
Crystal structure prediction (CSP) is the calculation of the crystal structures of solids from first principles. Reliable methods of predicting the crystal structure of a compound, based only on its composition, have been a goal of the physical sciences since the 1950s. Computational methods employed include simulated annealing, evolutionary algorithms, distributed multipole analysis, random sampling, basin-hopping, data mining, density functional theory and molecular mechanics.
Local elevation is a technique used in computational chemistry or physics, mainly in the field of molecular simulation. It was developed in 1994 by Huber, Torda and van Gunsteren to enhance the searching of conformational space in molecular dynamics simulations and is available in the GROMOS software for molecular dynamics simulation. The method was, together with the conformational flooding method, the first to introduce memory dependence into molecular simulations. Many recent methods build on the principles of the local elevation technique, including the Engkvist-Karlström, adaptive biasing force, Wang–Landau, metadynamics, adaptively biased molecular dynamics, adaptive reaction coordinate forces, and local elevation umbrella sampling methods. The basic principle of the method is to add a memory-dependent potential energy term in the simulation so as to prevent the simulation from revisiting already sampled configurations, thereby increasing the probability of discovering new configurations. The method can be seen as a continuous variant of the Tabu search method.
Michele Parrinello is an Italian physicist particularly known for his work in molecular dynamics. Parrinello and Roberto Car were awarded the Dirac Medal of the International Centre for Theoretical Physics (ICTP) and the Sidney Fernbach Award in 2009 for their continuing development of the Car–Parrinello method, first proposed in their seminal 1985 paper, "Unified Approach for Molecular Dynamics and Density-Functional Theory". They have continued to receive awards for this breakthrough, most recently the Dreyfus Prize in the Chemical Sciences and the 2021 Benjamin Franklin Medal in Chemistry.
Interatomic potentials are mathematical functions to calculate the potential energy of a system of atoms with given positions in space. Interatomic potentials are widely used as the physical basis of molecular mechanics and molecular dynamics simulations in computational chemistry, computational physics and computational materials science to explain and predict materials properties. Examples of quantitative properties and qualitative phenomena that are explored with interatomic potentials include lattice parameters, surface energies, interfacial energies, adsorption, cohesion, thermal expansion, and elastic and plastic material behavior, as well as chemical reactions.
PLUMED is an open-source library implementing enhanced-sampling algorithms, various free-energy methods, and analysis tools for molecular dynamics simulations. It is designed to be used together with ACEMD, AMBER, DL_POLY, GROMACS, LAMMPS, NAMD, OpenMM, ABIN, CP2K, i-PI, PINY-MD, and Quantum ESPRESSO, and it can also be used together with the analysis and visualization tools VMD, HTMD, and OpenPathSampling.
Qbox is an open-source software package for atomic-scale simulations of molecules, liquids and solids. It implements first principles molecular dynamics, a simulation method in which inter-atomic forces are derived from quantum mechanics. Qbox is released under a GNU General Public License (GPL) with documentation provided at http://qboxcode.org. It is available as a FreeBSD port.