Monte Carlo method in statistical mechanics

Last updated October 18, 2023

Monte Carlo in statistical physics refers to the application of the Monte Carlo method to problems in statistical physics, or statistical mechanics.

Overview

The general motivation for using the Monte Carlo method in statistical physics is to evaluate a multivariable integral. The typical problem begins with a system whose Hamiltonian is known, which is held at a given temperature and follows Boltzmann statistics. To obtain the mean value of some macroscopic variable, say A, the general approach is to average A over the whole phase space (abbreviated PS below), weighted by the Boltzmann distribution:

\langle A\rangle =\int _{PS}A_{\vec {r}}{\frac {e^{-\beta E_{\vec {r}}}}{Z}}d{\vec {r}},

where $E({\vec {r}})=E_{\vec {r}}$ is the energy of the system for a given state defined by ${\vec {r}}$ - a vector with all the degrees of freedom (for instance, for a mechanical system, ${\vec {r}}=\left({\vec {q}},{\vec {p}}\right)$ ), $\beta \equiv 1/k_{b}T$ and

Z=\int _{PS}e^{-\beta E_{\vec {r}}}d{\vec {r}}

is the partition function.

One possible approach to solve this multivariable integral is to exactly enumerate all possible configurations of the system, and calculate averages at will. This is done in exactly solvable systems, and in simulations of simple systems with few particles. In realistic systems, on the other hand, an exact enumeration can be difficult or impossible to implement.
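As an illustration of exact enumeration, the sketch below averages an observable over every configuration of a small system. The system is a hypothetical toy chosen only for this example (an open chain of two-state spins with energy $E=-\sum _{i}\sigma _{i}\sigma _{i+1}$), and the function names are assumptions, not part of any library.

```python
import itertools
import math


def exact_average(observable, energy, n_spins, beta):
    """Boltzmann average of `observable` over all 2**n_spins configurations."""
    partition_function = 0.0
    weighted_sum = 0.0
    for state in itertools.product((-1, 1), repeat=n_spins):
        weight = math.exp(-beta * energy(state))
        partition_function += weight
        weighted_sum += observable(state) * weight
    return weighted_sum / partition_function


def chain_energy(state):
    """Toy energy (assumption): open chain, E = -sum_i s_i * s_{i+1}."""
    return -sum(a * b for a, b in zip(state, state[1:]))


def mean_bond(state):
    """Average nearest-neighbour product s_i * s_{i+1} along the chain."""
    return sum(a * b for a, b in zip(state, state[1:])) / (len(state) - 1)


if __name__ == "__main__":
    # For this toy chain the result should match tanh(beta).
    print(exact_average(mean_bond, chain_energy, n_spins=8, beta=0.7))
    print(math.tanh(0.7))
```

The loop visits $2^{N}$ states, so the cost doubles with every added spin, which is why this approach is limited to small systems.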

For those systems, Monte Carlo integration (not to be confused with the broader Monte Carlo method, which is also used, for example, to simulate molecular chains) is generally employed. The main motivation for its use is that the statistical error of Monte Carlo integration decreases as $1/{\sqrt {N}}$ with the number of samples N, independently of the dimension of the integral. Another important concept related to Monte Carlo integration is importance sampling, a technique that reduces the computational time of the simulation.
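The $1/{\sqrt {N}}$ behaviour can be checked numerically with a minimal sketch such as the one below. It integrates a hypothetical test function (the mean of the squared coordinates, whose integral over the unit hypercube is 1/3 in any dimension); the function names and the choice of integrand are assumptions made for illustration.

```python
import random


def mc_integrate(f, dim, n_samples, rng):
    """Estimate the integral of f over the unit hypercube [0, 1]^dim."""
    total = 0.0
    for _ in range(n_samples):
        point = [rng.random() for _ in range(dim)]
        total += f(point)
    return total / n_samples  # the hypercube has volume 1


def mean_square(point):
    """Test integrand (assumption): its exact integral is 1/3 for any dim."""
    return sum(x * x for x in point) / len(point)


if __name__ == "__main__":
    rng = random.Random(0)
    for dim in (1, 10):
        for n_samples in (100, 10_000):
            estimate = mc_integrate(mean_square, dim, n_samples, rng)
            print(dim, n_samples, abs(estimate - 1.0 / 3.0))
```

Increasing the number of samples by a factor of 100 should reduce the typical error by roughly a factor of 10, in one dimension as well as in ten.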

In the following sections, the general implementation of Monte Carlo integration for solving this kind of problem is discussed.

Importance sampling

An estimation, under Monte Carlo integration, of an integral defined as

\langle A\rangle =\int _{PS}A_{\vec {r}}e^{-\beta E_{\vec {r}}}d{\vec {r}}/Z

is

\langle A\rangle \simeq {\frac {1}{N}}\sum _{i=1}^{N}A_{{\vec {r}}_{i}}e^{-\beta E_{{\vec {r}}_{i}}}/Z

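where ${\vec {r}}_{i}$ are drawn uniformly from the whole phase space (PS) and N is the number of sampling points (or function evaluations). The constant factor given by the phase-space volume, which multiplies the sum for uniform sampling, is omitted here.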

Some regions of the phase space are generally more important to the mean of the variable $A$ than others. In particular, those where $e^{-\beta E_{{\vec {r}}_{i}}}$ is large compared to its value over the rest of the energy spectrum are the most relevant for the integral. Using this fact, the natural question to ask is: is it possible to choose, with more frequency, the states that are known to be more relevant to the integral? The answer is yes, using the importance sampling technique.

Let us assume $p({\vec {r}})$ is a distribution that preferentially chooses the states that are known to be more relevant to the integral.

The mean value of $A$ can be rewritten as

\langle A\rangle =\int _{PS}p({\vec {r}}){\frac {A_{\vec {r}}e^{-\beta E_{\vec {r}}}}{Z\,p({\vec {r}})}}d{\vec {r}},

that is, as the average, over states distributed according to $p({\vec {r}})$ , of $A$ reweighted by $p^{-1}({\vec {r}})e^{-\beta E_{\vec {r}}}/Z$ . This integral can be estimated by

\langle A\rangle \simeq {\frac {1}{N}}\sum _{i=1}^{N}p^{-1}({\vec {r}}_{i})A_{{\vec {r}}_{i}}^{*}e^{-\beta E_{{\vec {r}}_{i}}}/Z

where ${\vec {r}}_{i}$ are now randomly generated using the $p({\vec {r}})$ distribution and $A_{{\vec {r}}_{i}}^{*}$ denotes the value of $A$ at a state sampled in this way. Since it is often not easy to generate states directly from a given distribution, the Metropolis algorithm is typically used.
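The difference between uniform and importance sampling can be sketched on a hypothetical one-dimensional system with $E(x)=x^{2}/2$ and $\beta =4$, for which $Z={\sqrt {2\pi /\beta }}$ and $\langle x^{2}\rangle =1/\beta$ are known exactly. The system, the Gaussian choice of $p$ and all names below are assumptions made for illustration; here $p$ can be sampled directly, so no Metropolis step is needed.

```python
import math
import random

BETA = 4.0
# Exact partition function of E(x) = x^2 / 2 on the whole real line.
Z = math.sqrt(2.0 * math.pi / BETA)


def boltzmann_density(x):
    """exp(-beta * E(x)) / Z for the toy energy E(x) = x^2 / 2."""
    return math.exp(-BETA * 0.5 * x * x) / Z


def uniform_estimate(n_samples, half_width, rng):
    """Estimate <x^2> with x uniform in [-half_width, half_width].

    The sum is multiplied by the sampled volume 2 * half_width; the box is
    assumed wide enough that the truncated tails are negligible.
    """
    total = 0.0
    for _ in range(n_samples):
        x = rng.uniform(-half_width, half_width)
        total += x * x * boltzmann_density(x)
    return 2.0 * half_width * total / n_samples


def importance_estimate(n_samples, width, rng):
    """Estimate <x^2> with x drawn from a Gaussian p(x) of standard deviation width."""
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, width)
        p = math.exp(-0.5 * (x / width) ** 2) / (width * math.sqrt(2.0 * math.pi))
        total += x * x * boltzmann_density(x) / p  # A * exp(-beta E) / (Z p)
    return total / n_samples


if __name__ == "__main__":
    rng = random.Random(0)
    print("exact     ", 1.0 / BETA)
    print("uniform   ", uniform_estimate(20_000, 10.0, rng))
    print("importance", importance_estimate(20_000, 0.7, rng))
```

With the same number of samples, the importance-sampled estimate should scatter less around the exact value, because few uniformly drawn points land where the Boltzmann weight is appreciable.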

Canonical

Because the most likely states are known to be those with the largest Boltzmann weight, a good distribution $p({\vec {r}})$ to choose for the importance sampling is the Boltzmann distribution, or canonical distribution. Let

p({\vec {r}})={\frac {e^{-\beta E_{\vec {r}}}}{Z}}

be the distribution to use. Substituting on the previous sum,

\langle A\rangle \simeq {\frac {1}{N}}\sum _{i=1}^{N}A_{{\vec {r}}_{i}}^{*}.

So, the procedure to obtain the mean value of a given variable with the canonical distribution is to use the Metropolis algorithm to generate states distributed according to $p({\vec {r}})$ and to average $A_{\vec {r}}^{*}$ over them.

One important issue must be considered when using the Metropolis algorithm with the canonical distribution: when performing a given measurement, i.e. a realization of ${\vec {r}}_{i}$ , one must ensure that the realization is not correlated with the previous state of the system (otherwise the states are not being "randomly" generated). In systems with relevant energy gaps, this is the major drawback of the canonical distribution, because the time the system needs to decorrelate from the previous state can tend to infinity.
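The canonical procedure can be sketched for a hypothetical one-dimensional system with $E(x)=x^{2}/2$, for which $\langle x^{2}\rangle =1/\beta$ exactly. The function below is a minimal illustration, not a reference implementation: the trial-move width `step` and the number of `decorrelation_steps` between measurements are assumed tuning parameters.

```python
import math
import random


def metropolis_mean(observable, energy, beta, n_samples, step,
                    decorrelation_steps, rng):
    """Average `observable` over states generated by the Metropolis algorithm."""
    x = 0.0
    total = 0.0
    for _ in range(n_samples):
        # Several updates between measurements reduce the correlation
        # between successive samples.
        for _ in range(decorrelation_steps):
            trial = x + rng.uniform(-step, step)
            delta_e = energy(trial) - energy(x)
            # Accept with probability min(1, exp(-beta * delta_e)).
            if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                x = trial
        total += observable(x)  # plain mean: the Boltzmann weight is in the sampling
    return total / n_samples


def harmonic_energy(x):
    """Toy energy (assumption): E(x) = x^2 / 2."""
    return 0.5 * x * x


if __name__ == "__main__":
    rng = random.Random(0)
    estimate = metropolis_mean(lambda x: x * x, harmonic_energy, beta=4.0,
                               n_samples=5_000, step=1.0,
                               decorrelation_steps=10, rng=rng)
    print(estimate, "exact:", 1.0 / 4.0)
```

Note that $Z$ never appears: the acceptance rule only needs energy differences, which is what makes the canonical choice practical.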

Multi-canonical

As stated before, the canonical approach has a major drawback, which becomes relevant in most of the systems that use Monte Carlo integration. For those systems with "rough energy landscapes", the multicanonical approach can be used.

The multicanonic approach uses a different choice for importance sampling:

p({\vec {r}})={\frac {1}{\Omega (E_{\vec {r}})}}

where $\Omega (E)$ is the density of states of the system. The major advantage of this choice is that the energy histogram is flat, i.e. the generated states are uniformly distributed in energy: an energy $E$ is realized by $\Omega (E)$ states, each sampled with weight $1/\Omega (E)$ , so every energy is visited equally often. This means that, when using the Metropolis algorithm, the simulation does not see the "rough energy landscape", because every energy is treated equally.

The major drawback of this choice is that, in most systems, $\Omega (E)$ is unknown. To overcome this, the Wang and Landau algorithm is normally used to obtain the density of states (DOS) during the simulation. Note that once the DOS is known, the mean value of every variable can be calculated for every temperature, since the generation of states does not depend on $\beta$ .
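The last remark can be illustrated with a short sketch that computes a canonical average at any $\beta$ from a known DOS, using $\langle A\rangle =\sum _{E}\Omega (E)A(E)e^{-\beta E}/\sum _{E}\Omega (E)e^{-\beta E}$ for an observable that depends on the state only through its energy. The toy system (20 independent two-level units with $\Omega (k)={\tbinom {20}{k}}$) and the function name are assumptions; in a real simulation the logarithm of the DOS would come from an algorithm such as Wang and Landau's.

```python
import math


def canonical_average_from_dos(log_dos, observable, beta):
    """Canonical average of observable(E) from a table {E: ln Omega(E)}."""
    exponents = {e: log_omega - beta * e for e, log_omega in log_dos.items()}
    shift = max(exponents.values())  # subtracted to avoid overflow in exp
    partition = sum(math.exp(x - shift) for x in exponents.values())
    weighted = sum(observable(e) * math.exp(x - shift)
                   for e, x in exponents.items())
    return weighted / partition


# Toy DOS (assumption): n independent two-level units, energy E = k when
# k units are excited, so Omega(k) is the binomial coefficient C(n, k).
N_UNITS = 20
LOG_DOS = {k: math.log(math.comb(N_UNITS, k)) for k in range(N_UNITS + 1)}

if __name__ == "__main__":
    for beta in (0.1, 1.0, 3.0):
        mean_energy = canonical_average_from_dos(LOG_DOS, lambda e: e, beta)
        print(beta, mean_energy, N_UNITS / (1.0 + math.exp(beta)))
```

The same table `LOG_DOS` serves every temperature; only the reweighting factor $e^{-\beta E}$ changes.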

Implementation

In this section, the implementation focuses on the Ising model. Consider a two-dimensional spin network with L spins (lattice sites) on each side. There are $N=L^{2}$ spins, so the phase space is discrete and is characterized by N spins, ${\vec {r}}=(\sigma _{1},\sigma _{2},...,\sigma _{N})$ where $\sigma _{i}\in \{-1,1\}$ is the spin of each lattice site. The system's energy is given by $E({\vec {r}})=\sum _{i=1}^{N}\sum _{j\in viz_{i}}(1-J_{ij}\sigma _{i}\sigma _{j})$ , where $viz_{i}$ is the set of nearest-neighbor spins of i and J is the interaction matrix (for a ferromagnetic Ising model with uniform coupling, $J_{ij}=1$ for every pair of neighboring sites). This completes the statement of the problem.

In this example, the objective is to obtain $\langle M\rangle$ and $\langle M^{2}\rangle$ (for instance, to obtain the magnetic susceptibility of the system), since the generalization to other observables is straightforward. By definition, $M({\vec {r}})=\sum _{i=1}^{N}\sigma _{i}$ .
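A configuration and its two macroscopic variables can be represented as in the sketch below. Several choices here are assumptions rather than part of the text: spins are stored row by row in a flat list, the lattice has periodic boundaries, and the energy is taken as $E=-\sum \sigma _{i}\sigma _{j}$ with each nearest-neighbor bond counted once, which is the convention consistent with the energy change used in the single-spin-flip step and differs from the expression above by a constant and an overall factor.

```python
def lattice_energy(spins, L):
    """E = -sum over bonds of s_i * s_j on an L x L periodic lattice.

    Pairing every site with its right and lower neighbour counts each
    bond exactly once (for L > 2).
    """
    total = 0
    for i, s in enumerate(spins):
        row, col = divmod(i, L)
        right = row * L + (col + 1) % L
        down = ((row + 1) % L) * L + col
        total -= s * (spins[right] + spins[down])
    return total


def lattice_magnetization(spins):
    """M = sum of all spins."""
    return sum(spins)


if __name__ == "__main__":
    L = 4
    all_up = [1] * (L * L)
    checkerboard = [1 if (i // L + i % L) % 2 == 0 else -1 for i in range(L * L)]
    print(lattice_energy(all_up, L), lattice_magnetization(all_up))
    print(lattice_energy(checkerboard, L), lattice_magnetization(checkerboard))
```

The fully aligned state gives the lowest energy and the largest magnetization, while the checkerboard state gives the highest energy and zero magnetization.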

Canonical

First, the system must be initialized: let $\beta =1/k_{b}T$ be the system's inverse temperature, and initialize the system with an initial state (which can be anything, since the final result should not depend on it).

With the canonical choice, the Metropolis method must be employed. Because there is no single right way of choosing which trial state to propose, one can particularize and try to flip one spin at a time. This choice is usually called single spin flip. The following steps perform a single measurement.

step 1: generate a state that follows the $p({\vec {r}})$ distribution:

step 1.1: Perform TT times the following iteration:

step 1.1.1: pick a lattice site at random (with probability 1/N), which will be called i, with spin $\sigma _{i}$ .

step 1.1.2: pick a random number $\alpha \in [0,1]$ .

step 1.1.3: calculate the energy change of trying to flip the spin i:

\Delta E=2\sigma _{i}\sum _{j\in viz_{i}}\sigma _{j}

and its magnetization change: $\Delta M=-2\sigma _{i}$

step 1.1.4: if $\alpha <\min(1,e^{-\beta \Delta E})$ , flip the spin ( $\sigma _{i}=-\sigma _{i}$ ), otherwise, don't.

step 1.1.5: update the several macroscopic variables in case the spin flipped: $E=E+\Delta E$ , $M=M+\Delta M$

After TT iterations, the system is considered to be uncorrelated with its previous state, which means that, at this moment, the probability of finding the system in a given state follows the Boltzmann distribution, which is the objective of this method.

step 2: perform the measurement:

step 2.1: save, on a histogram, the values of M and M².
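Steps 1 and 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the lattice is periodic, the energy is counted as $E=-\sum \sigma _{i}\sigma _{j}$ over bonds so that it matches the $\Delta E$ of step 1.1.3, the measurements are stored in a plain list rather than a histogram, and all function names are hypothetical.

```python
import math
import random


def neighbour_sum(spins, i, L):
    """Sum of the four nearest-neighbour spins of site i (periodic lattice)."""
    row, col = divmod(i, L)
    return (spins[((row - 1) % L) * L + col] + spins[((row + 1) % L) * L + col]
            + spins[row * L + (col - 1) % L] + spins[row * L + (col + 1) % L])


def total_energy(spins, L):
    """E = -sum over bonds of s_i * s_j; halving undoes the double count."""
    return -sum(s * neighbour_sum(spins, i, L) for i, s in enumerate(spins)) // 2


def metropolis_measurements(L, beta, n_measurements, tt, rng):
    """Return (spins, E, M, list of measured M) after n_measurements."""
    n = L * L
    spins = [rng.choice((-1, 1)) for _ in range(n)]  # arbitrary initial state
    energy = total_energy(spins, L)
    magnetization = sum(spins)
    m_samples = []
    for _ in range(n_measurements):
        for _ in range(tt):                                      # step 1.1
            i = rng.randrange(n)                                 # step 1.1.1
            alpha = rng.random()                                 # step 1.1.2
            delta_e = 2 * spins[i] * neighbour_sum(spins, i, L)  # step 1.1.3
            delta_m = -2 * spins[i]
            # Step 1.1.4: same test as alpha < min(1, exp(-beta * delta_e)),
            # written so that exp is never evaluated for delta_e <= 0.
            if delta_e <= 0 or alpha < math.exp(-beta * delta_e):
                spins[i] = -spins[i]
                energy += delta_e                                # step 1.1.5
                magnetization += delta_m
        m_samples.append(magnetization)                          # step 2
    return spins, energy, magnetization, m_samples


if __name__ == "__main__":
    rng = random.Random(0)
    _, _, _, samples = metropolis_measurements(L=8, beta=0.3,
                                               n_measurements=200, tt=640, rng=rng)
    mean_m = sum(samples) / len(samples)
    mean_m2 = sum(m * m for m in samples) / len(samples)
    print(mean_m, mean_m2)
```

Updating `energy` and `magnetization` incrementally avoids recomputing them over the whole lattice after every flip; the value of `tt` used here is an arbitrary choice, not a measured decorrelation time.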

As a final note, TT is not easy to estimate, because it is not easy to tell when the system has decorrelated from the previous state. To get around this, one generally does not use a fixed TT, but instead takes TT to be one tunneling time. One tunneling time is defined as the number of iterations of step 1.1 that the system needs to go from the minimum of its energy to the maximum of its energy and back.

A major drawback of this method with the single-spin-flip choice in systems like the Ising model is that the tunneling time scales as a power law, $N^{2+z}$ with z greater than 0.5, a phenomenon known as critical slowing down.

Applicability

The method neglects dynamics, which can be a major drawback or a great advantage. Indeed, the method can only be applied to static quantities, but the freedom to choose moves makes it very flexible. An additional advantage is that some systems, such as the Ising model, lack a dynamical description and are defined only by an energy prescription; for these, the Monte Carlo approach is the only feasible one.

Generalizations

The great success of this method in statistical mechanics has led to various generalizations such as the method of simulated annealing for optimization, in which a fictitious temperature is introduced and then gradually lowered.
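A minimal sketch of simulated annealing, assuming a one-dimensional cost function and a geometric cooling schedule, is given below. The schedule, the step size and the function names are illustrative assumptions; practical applications usually need problem-specific moves and tuning.

```python
import math
import random


def simulated_annealing(cost, x0, t_start, t_end, cooling, step, rng):
    """Minimise `cost` with Metropolis moves at a gradually lowered temperature."""
    x = x0
    best = x0
    temperature = t_start
    while temperature > t_end:
        trial = x + rng.uniform(-step, step)
        delta = cost(trial) - cost(x)
        # Metropolis rule with the fictitious temperature in place of 1/beta.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            x = trial
            if cost(x) < cost(best):
                best = x
        temperature *= cooling  # geometric cooling (assumption)
    return best


def example_cost(x):
    """Toy cost (assumption) with its minimum at x = 2."""
    return (x - 2.0) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    print(simulated_annealing(example_cost, x0=-5.0, t_start=1.0, t_end=1e-3,
                              cooling=0.999, step=0.5, rng=rng))
```

At high temperature uphill moves are accepted often, which lets the search leave poor regions; as the temperature falls the walk settles near a minimum.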


