Multilevel Monte Carlo method

Multilevel Monte Carlo (MLMC) methods in numerical analysis are algorithms for computing expectations that arise in stochastic simulations. Like standard Monte Carlo methods, they rely on repeated random sampling, but the samples are taken on different levels of accuracy. MLMC methods can greatly reduce the computational cost of standard Monte Carlo methods by taking most samples at a low accuracy and correspondingly low cost, and only very few samples at a high accuracy and correspondingly high cost.

Goal

The goal of a multilevel Monte Carlo method is to approximate the expected value $\operatorname{E}[X]$ of a random variable $X$ that is the output of a stochastic simulation. Suppose this random variable cannot be simulated exactly, but there is a sequence of approximations $X_0, X_1, \ldots, X_L$ with increasing accuracy, but also increasing cost, that converges to $X$ as $L \to \infty$. The basis of the multilevel method is the telescoping sum identity [1]

$$\operatorname{E}[X_L] = \operatorname{E}[X_0] + \sum_{\ell=1}^{L} \operatorname{E}[X_\ell - X_{\ell-1}],$$

which is trivially satisfied because of the linearity of the expectation operator. Each of the expectations $\operatorname{E}[X_\ell - X_{\ell-1}]$ is then approximated by a Monte Carlo method, resulting in the multilevel Monte Carlo method. Note that taking a sample of the difference $X_\ell - X_{\ell-1}$ at level $\ell$ requires a simulation of both $X_\ell$ and $X_{\ell-1}$.

The MLMC method works if the variances $\operatorname{V}[X_\ell - X_{\ell-1}] \to 0$ as $\ell \to \infty$, which will be the case if both $X_\ell$ and $X_{\ell-1}$ approximate the same random variable $X$. By the Central Limit Theorem, this implies that fewer and fewer samples are needed to accurately approximate the expectation of the difference $X_\ell - X_{\ell-1}$ as $\ell \to \infty$. Hence, most samples will be taken on level $\ell = 0$, where samples are cheap, and only very few samples will be required at the finest level $L$. In this sense, MLMC can be considered as a recursive control variate strategy.
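
Once per-level samplers are available, the telescoping estimator is straightforward to assemble. Below is a minimal Python sketch, assuming a user-supplied callable sample_diff(level, n) that returns n independent samples of $X_\ell - X_{\ell-1}$ (and of $X_0$ itself when level is 0); the name and interface are illustrative, not taken from the references.

```python
import numpy as np

def mlmc_estimate(sample_diff, n_samples):
    """Multilevel Monte Carlo estimate of E[X_L] via the telescoping sum.

    sample_diff -- callable (level, n) -> array of n i.i.d. samples of
                   X_level - X_{level-1}, or of X_0 itself when level == 0
                   (a user-supplied assumption, not a library function)
    n_samples   -- sequence with the number of samples N_l for each level
    """
    estimate = 0.0
    for level, n in enumerate(n_samples):
        # Independent Monte Carlo estimate of E[X_level - X_{level-1}]
        estimate += sample_diff(level, int(n)).mean()
    return estimate
```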

Applications

Figure: Approximation of a sample path of an SDE on different levels.

The first application of MLMC is attributed to Mike Giles, [2] in the context of stochastic differential equations (SDEs) for option pricing; however, earlier traces are found in the work of Heinrich in the context of parametric integration. [3] Here, the random variable $X$ is known as the payoff, and the sequence of approximations $X_\ell$, $\ell = 0, \ldots, L$, uses approximations to the sample path with a decreasing time step $h_\ell$ (for example, $h_\ell = 2^{-\ell} T$ on the interval $[0, T]$).
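
To make the level sampler concrete, the following sketch couples a fine and a coarse Euler–Maruyama path of a geometric Brownian motion by driving both with the same Brownian increments, so that the variance of the payoff difference decays with the level. The model, the discounted European call payoff, and all parameter values are illustrative assumptions for the sketch, not taken from [2].

```python
import numpy as np

rng = np.random.default_rng(0)

def gbm_payoff_diff(level, n, T=1.0, r=0.05, sigma=0.2, S0=100.0, K=100.0):
    """n samples of P_level - P_{level-1} for a discounted European call
    on geometric Brownian motion dS = r S dt + sigma S dW, discretized
    by Euler-Maruyama with 2**level time steps on the fine level.
    All model and payoff parameters are illustrative assumptions."""
    nf = 2 ** level                       # number of fine time steps
    hf = T / nf                           # fine step size
    dW = rng.normal(0.0, np.sqrt(hf), size=(n, nf))   # Brownian increments

    Sf = np.full(n, S0)                   # fine Euler-Maruyama path
    for i in range(nf):
        Sf = Sf + r * Sf * hf + sigma * Sf * dW[:, i]
    Pf = np.exp(-r * T) * np.maximum(Sf - K, 0.0)

    if level == 0:                        # coarsest level: no difference
        return Pf

    # Coarse path: same Brownian motion, increments summed in pairs.
    hc = 2.0 * hf
    dWc = dW[:, 0::2] + dW[:, 1::2]
    Sc = np.full(n, S0)
    for i in range(nf // 2):
        Sc = Sc + r * Sc * hc + sigma * Sc * dWc[:, i]
    Pc = np.exp(-r * T) * np.maximum(Sc - K, 0.0)
    return Pf - Pc
```

The pairwise summation of the fine increments is the essential coupling step: driving the coarse path with independent increments would leave the variance of the difference of order one and destroy the multilevel gain.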

The application of MLMC to problems in uncertainty quantification (UQ) is an active area of research. [4] [5] An important prototypical example of these problems is partial differential equations (PDEs) with random coefficients. In this context, the random variable $X$ is known as the quantity of interest, and the sequence of approximations $X_\ell$, $\ell = 0, \ldots, L$, corresponds to discretizations of the PDE with different mesh sizes.
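
The sketch below illustrates this setting on a toy one-dimensional elliptic problem $-(a(x, \omega)\, u')' = 1$ on $(0,1)$ with homogeneous Dirichlet boundary conditions; the two-term log-normal coefficient, the finite-difference discretization, and the point-value quantity of interest are all illustrative assumptions, not taken from [4] or [5]. The fine and coarse solves share the same coefficient realization, which is the coupling MLMC requires.

```python
import numpy as np

rng = np.random.default_rng(1)

def solve_fd(z, m):
    """Finite-difference solve of -(a u')' = 1 on (0,1), u(0) = u(1) = 0,
    on a uniform mesh with m intervals; the log-normal coefficient
    a(x) = exp(z1 sin(pi x) + z2 sin(2 pi x)) is a toy assumption."""
    h = 1.0 / m
    xm = (np.arange(m) + 0.5) * h                  # cell midpoints
    a = np.exp(z[0] * np.sin(np.pi * xm) + z[1] * np.sin(2 * np.pi * xm))
    main = (a[:-1] + a[1:]) / h**2                 # tridiagonal stencil
    off = -a[1:-1] / h**2
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    u = np.linalg.solve(A, np.ones(m - 1))
    return u[m // 2 - 1]                           # QoI: u(0.5), m even

def pde_qoi_diff(level, n):
    """Samples of X_level - X_{level-1}, where X_level is the QoI on a
    mesh with 2**(level+2) intervals.  Fine and coarse solves reuse the
    same coefficient realization z."""
    m = 2 ** (level + 2)
    out = np.empty(n)
    for i in range(n):
        z = rng.normal(size=2)
        qoi = solve_fd(z, m)
        if level > 0:
            qoi -= solve_fd(z, m // 2)
        out[i] = qoi
    return out
```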

An algorithm for MLMC simulation

A simple level-adaptive algorithm for MLMC simulation is given below in pseudo-code.

repeat
    Take warm-up samples at level L
    Compute the sample variance on all levels ℓ = 0, ..., L
    Define the optimal number of samples N_ℓ on all levels ℓ = 0, ..., L
    Take additional samples on each level ℓ according to N_ℓ
    if L ≥ 2 then
        Test for convergence
    end
    if not converged then
        L := L + 1
    end
until converged
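
A rough Python transcription of this loop is sketched below, reusing the sample_diff interface from the earlier sketches. The cost model $C_\ell = 2^\ell$ per sample is an assumption, the allocation $N_\ell \propto \sqrt{V_\ell / C_\ell}$ follows the analysis in [1], and the convergence test is a deliberately simplified stand-in for the bias estimate used in practice.

```python
import numpy as np

def mlmc_adaptive(sample_diff, eps, n_warmup=100, max_level=10):
    """Level-adaptive MLMC driver: a simplified sketch of the pseudo-code
    above, not a production implementation.  sample_diff(level, n) is the
    user-supplied level sampler from the earlier sketches; the cost model
    C_l = 2**l per sample is an assumption."""
    L = 0
    samples = []                                   # one array per level
    while True:
        samples.append(sample_diff(L, n_warmup))   # warm-up at new level
        V = np.array([s.var(ddof=1) for s in samples])  # sample variances
        C = 2.0 ** np.arange(L + 1)                     # assumed costs
        # Optimal allocation N_l ~ sqrt(V_l / C_l), scaled so the total
        # estimator variance is at most eps**2 / 2 (see Giles [1]).
        N = np.ceil(2.0 / eps**2 * np.sqrt(V / C) * np.sum(np.sqrt(V * C)))
        for l in range(L + 1):                     # top levels up to N_l
            extra = int(N[l]) - len(samples[l])
            if extra > 0:
                samples[l] = np.concatenate([samples[l], sample_diff(l, extra)])
        # Simplified convergence test: the last correction term should be
        # small; practical codes extrapolate the bias instead (Giles [1]).
        converged = L >= 2 and abs(samples[-1].mean()) < eps / np.sqrt(2.0)
        if converged or L == max_level:
            break
        L += 1                                     # add a finer level
    return sum(s.mean() for s in samples)
```

With the SDE sampler above, this would be called as mlmc_adaptive(gbm_payoff_diff, eps=0.1), for instance.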

Extensions of MLMC

Recent extensions of the multilevel Monte Carlo method include multi-index Monte Carlo, [6] where more than one direction of refinement is considered, and the combination of MLMC with the Quasi-Monte Carlo method. [7] [8]
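
As a small illustration of the MLMC/quasi-Monte Carlo combination, the sketch below generates the normal increments for a level sampler from scrambled Sobol' points instead of pseudorandom numbers, using scipy.stats.qmc. The wiring into a particular level sampler, and the error estimation over independent scramblings that a practical multilevel QMC method needs, [7] [8] are omitted.

```python
import numpy as np
from scipy.stats import norm, qmc

def sobol_normals(n, dim, seed=0):
    """n quasi-random standard-normal vectors of dimension dim, obtained
    by mapping scrambled Sobol' points through the inverse normal CDF.
    Substituting these for the pseudorandom Brownian increments of a
    level sampler yields a randomized multilevel QMC estimator."""
    sampler = qmc.Sobol(d=dim, scramble=True, seed=seed)
    u = sampler.random(n)       # n points in the unit cube [0, 1)^dim
    return norm.ppf(u)          # transform to independent N(0, 1) marginals
```

Sobol' sequences are balanced in blocks of powers of two, so n is best chosen as a power of two; SciPy emits a warning otherwise.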

See also

Central limit theorem
Convergence of random variables
Monte Carlo method
Maximum likelihood estimation
Law of large numbers
Markov chain Monte Carlo
Gaussian process
Low-discrepancy sequence
Quasi-Monte Carlo method
Importance sampling
Monte Carlo integration
Monte Carlo methods in finance
Donsker's theorem
Variance reduction
Cross-entropy method
Pivotal quantity
Stochastic approximation
Quantile function
Quasi-Monte Carlo methods in finance
Markov chain central limit theorem

References

  1. Giles, M. B. (2015). "Multilevel Monte Carlo Methods". Acta Numerica. 24: 259–328. arXiv:1304.5472. doi:10.1017/s096249291500001x.
  2. Giles, M. B. (2008). "Multilevel Monte Carlo Path Simulation". Operations Research. 56 (3): 607–617. CiteSeerX 10.1.1.121.713. doi:10.1287/opre.1070.0496.
  3. Heinrich, S. (2001). "Multilevel Monte Carlo Methods". Multigrid Methods. Lecture Notes in Computer Science. Vol. 2179. Springer: 58–67. doi:10.1007/3-540-45346-6_5. ISBN 978-3-540-43043-8.
  4. Cliffe, A.; Giles, M. B.; Scheichl, R.; Teckentrup, A. (2011). "Multilevel Monte Carlo Methods and Applications to Elliptic PDEs with Random Coefficients" (PDF). Computing and Visualization in Science. 14 (1): 3–15. doi:10.1007/s00791-011-0160-x.
  5. Pisaroni, M.; Nobile, F.; Leyland, P. (2017). "A Continuation Multi Level Monte Carlo Method for Uncertainty Quantification in Compressible Inviscid Aerodynamics" (PDF). Computer Methods in Applied Mechanics and Engineering. 326 (C): 20–50. doi:10.1016/j.cma.2017.07.030. Archived from the original (PDF) on 2018-02-14.
  6. Haji-Ali, A. L.; Nobile, F.; Tempone, R. (2016). "Multi-Index Monte Carlo: When Sparsity Meets Sampling". Numerische Mathematik. 132 (4): 767–806. arXiv:1405.3757. doi:10.1007/s00211-015-0734-5.
  7. Giles, M. B.; Waterhouse, B. (2009). "Multilevel Quasi-Monte Carlo Path Simulation" (PDF). Advanced Financial Modelling, Radon Series on Computational and Applied Mathematics. De Gruyter: 165–181.
  8. Robbe, P.; Nuyens, D.; Vandewalle, S. (2017). "A Multi-Index Quasi-Monte Carlo Algorithm for Lognormal Diffusion Problems". SIAM Journal on Scientific Computing. 39 (5): A1811–C392. arXiv:1608.03157. doi:10.1137/16M1082561.