Latent Gaussian models
Let $y = (y_1, \ldots, y_n)$ denote the response variable (that is, the observations), which belongs to an exponential family, with the mean $\mu_i$ (of $y_i$) being linked to a linear predictor $\eta_i$ via an appropriate link function. The linear predictor can take the form of a (Bayesian) additive model. All latent effects (the linear predictor, the intercept, coefficients of possible covariates, and so on) are collectively denoted by the vector $x$. The hyperparameters of the model are denoted by $\theta$. As per Bayesian statistics, $x$ and $\theta$ are random variables with prior distributions.
The observations are assumed to be conditionally independent given $x$ and $\theta$:

$$\pi(y \mid x, \theta) = \prod_{i \in \mathcal{I}} \pi(y_i \mid \eta_i, \theta),$$

where $\mathcal{I}$ is the set of indices for observed elements of $y$ (some elements may be unobserved, and for these INLA computes a posterior predictive distribution). Note that the linear predictor $\eta$ is part of $x$.
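To make the factorization concrete, here is a minimal sketch in Python (all names hypothetical; a Poisson likelihood with a log link is assumed purely for illustration, and is only one of the observation models INLA supports) that evaluates the conditional log-likelihood as a sum over the observed indices only:

```python
import numpy as np
from scipy.special import gammaln

def poisson_loglik(y, eta, obs_idx):
    """Conditional log-likelihood log pi(y | x, theta) for a Poisson
    observation model with log link, y_i ~ Poisson(exp(eta_i)).
    Only the observed indices contribute; unobserved elements of y
    are handled separately via the posterior predictive."""
    y_o, eta_o = y[obs_idx], eta[obs_idx]
    # log pi(y_i | eta_i) = y_i * eta_i - exp(eta_i) - log(y_i!)
    return np.sum(y_o * eta_o - np.exp(eta_o) - gammaln(y_o + 1.0))

# Example: five responses, the third one unobserved (NaN).
y = np.array([2.0, 0.0, np.nan, 3.0, 1.0])
eta = np.array([0.5, -0.2, 0.1, 1.0, 0.3])   # the linear predictor, part of x
obs_idx = np.where(~np.isnan(y))[0]          # the index set I
print(poisson_loglik(y, eta, obs_idx))
```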
For the model to be a latent Gaussian model, it is assumed that $x$ is a Gaussian Markov Random Field (GMRF) [1] (that is, a multivariate Gaussian with additional conditional independence properties) with probability density

$$\pi(x \mid \theta) \propto |Q(\theta)|^{1/2} \exp\!\left(-\tfrac{1}{2}\, x^{\top} Q(\theta)\, x\right),$$

where $Q(\theta)$ is a $\theta$-dependent sparse precision matrix and $|Q(\theta)|$ is its determinant. The precision matrix is sparse due to the GMRF assumption. The prior distribution $\pi(\theta)$ for the hyperparameters need not be Gaussian. However, the number of hyperparameters, $m = \dim(\theta)$, is assumed to be small (say, less than 15).
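The GMRF log-density is cheap to evaluate once a Cholesky factor of $Q(\theta)$ is available. The following sketch (hypothetical names; a tridiagonal, AR(1)-like precision is chosen purely as an example of a $\theta$-dependent sparse $Q$) uses a dense factorization for simplicity, whereas practical INLA implementations use sparse Cholesky factorizations to exploit the GMRF structure:

```python
import numpy as np

def gmrf_logpdf(x, Q):
    """Log-density of a zero-mean GMRF with (symmetric positive definite)
    precision matrix Q:
        log pi(x) = 0.5*log|Q| - 0.5 * x^T Q x - (n/2)*log(2*pi).
    Dense Cholesky for simplicity; sparse factorizations are used in practice."""
    n = Q.shape[0]
    L = np.linalg.cholesky(Q)                     # Q = L L^T
    logdet = 2.0 * np.sum(np.log(np.diag(L)))     # log|Q|
    return 0.5 * logdet - 0.5 * x @ (Q @ x) - 0.5 * n * np.log(2.0 * np.pi)

def make_Q(tau, kappa, n):
    """A theta-dependent tridiagonal precision, theta = (tau, kappa);
    kappa > 0 keeps Q positive definite.  Chosen only for illustration."""
    Q = np.diag(np.full(n, 2.0 + kappa))
    Q -= np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return tau * Q

Q = make_Q(tau=2.0, kappa=0.1, n=50)
print(gmrf_logpdf(np.zeros(50), Q))
```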
Approximate Bayesian inference with INLA
In Bayesian inference, one wants to solve for the posterior distribution of the latent variables $x$ and $\theta$. Applying Bayes' theorem,

$$\pi(x, \theta \mid y) = \frac{\pi(y \mid x, \theta)\, \pi(x \mid \theta)\, \pi(\theta)}{\pi(y)},$$

the joint posterior distribution of $x$ and $\theta$ is given by

$$\pi(x, \theta \mid y) \propto \pi(\theta)\, \pi(x \mid \theta) \prod_{i \in \mathcal{I}} \pi(y_i \mid \eta_i, \theta) \propto \pi(\theta)\, |Q(\theta)|^{1/2} \exp\!\left(-\tfrac{1}{2}\, x^{\top} Q(\theta)\, x + \sum_{i \in \mathcal{I}} \log \pi(y_i \mid \eta_i, \theta)\right).$$
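Given the two ingredients above, the unnormalized log joint posterior is a three-term sum. A brief sketch, reusing the hypothetical poisson_loglik, gmrf_logpdf, and make_Q helpers from the previous examples (and again taking $\eta = x$ for simplicity):

```python
def log_joint_unnorm(x, theta, y, obs_idx, log_prior_theta):
    """Unnormalized log pi(x, theta | y) = log pi(theta)
    + log pi(x | theta) + log pi(y | x, theta), for the illustrative
    model above in which the linear predictor eta equals x."""
    tau, kappa = theta
    return (log_prior_theta(theta)
            + gmrf_logpdf(x, make_Q(tau, kappa, x.size))
            + poisson_loglik(y, x, obs_idx))
```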
Obtaining the exact posterior is generally a very difficult problem. In INLA, the main aim is to approximate the posterior marginals

$$\pi(x_i \mid y), \quad i = 1, \ldots, n, \qquad\qquad \pi(\theta_j \mid y), \quad j = 1, \ldots, m,$$

where $m = \dim(\theta)$.
A key idea of INLA is to construct nested approximations given by

$$\widetilde{\pi}(x_i \mid y) = \int \widetilde{\pi}(x_i \mid \theta, y)\, \widetilde{\pi}(\theta \mid y)\, d\theta, \qquad \widetilde{\pi}(\theta_j \mid y) = \int \widetilde{\pi}(\theta \mid y)\, d\theta_{-j},$$

where $\widetilde{\pi}(\cdot \mid \cdot)$ is an approximated posterior density. The approximation to the marginal density $\pi(x_i \mid y)$ is obtained in a nested fashion by first approximating $\widetilde{\pi}(\theta \mid y)$ and $\widetilde{\pi}(x_i \mid \theta, y)$, and then numerically integrating out $\theta$ as

$$\widetilde{\pi}(x_i \mid y) = \sum_k \widetilde{\pi}(x_i \mid \theta_k, y)\, \widetilde{\pi}(\theta_k \mid y)\, \Delta_k,$$

where the summation is over a set of values $\{\theta_k\}$ of the hyperparameters, with integration weights given by $\Delta_k$. The approximation of $\pi(\theta_j \mid y)$ is computed by numerically integrating $\theta_{-j}$ out from $\widetilde{\pi}(\theta \mid y)$.
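The outer integration is just a weighted mixture over the hyperparameter points. A minimal sketch (hypothetical names; the conditional marginals are assumed to be tabulated on a common grid of values for $x_i$):

```python
import numpy as np

def marginal_x(cond_dens, post_theta, weights, dx):
    """Numerically integrate theta out:
        pi~(x_i | y) = sum_k pi~(x_i | theta_k, y) * pi~(theta_k | y) * Delta_k.
    cond_dens:  (K, G) array of pi~(x_i | theta_k, y) on a grid of G x-values
    post_theta: (K,) array of pi~(theta_k | y) (possibly unnormalized)
    weights:    (K,) array of integration weights Delta_k
    dx:         spacing of the x-grid, used to renormalize the result"""
    dens = (cond_dens * (post_theta * weights)[:, None]).sum(axis=0)
    return dens / (dens.sum() * dx)       # normalize to integrate to one

# Example with K = 3 hyperparameter points and G = 101 grid values.
xg = np.linspace(-4.0, 4.0, 101)
cond = np.stack([np.exp(-0.5 * (xg - m) ** 2) / np.sqrt(2.0 * np.pi)
                 for m in (-0.5, 0.0, 0.5)])    # pi~(x_i | theta_k, y)
post = np.array([0.2, 1.0, 0.3])                # pi~(theta_k | y)
dk = np.ones(3)                                 # equal weights Delta_k
dens = marginal_x(cond, post, dk, xg[1] - xg[0])
```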
To get the approximate distribution $\widetilde{\pi}(\theta \mid y)$, one can use the relation

$$\pi(\theta \mid y) = \frac{\pi(x, \theta \mid y)}{\pi(x \mid \theta, y)}$$

as the starting point. Then $\widetilde{\pi}(\theta \mid y)$ is obtained at a specific value of the hyperparameters $\theta = \theta_k$ with Laplace's approximation [1]

$$\widetilde{\pi}(\theta_k \mid y) \propto \left.\frac{\pi(x, \theta_k \mid y)}{\widetilde{\pi}_G(x \mid \theta_k, y)}\right|_{x = x^{*}(\theta_k)},$$

where $\widetilde{\pi}_G(x \mid \theta_k, y)$ is the Gaussian approximation to $\pi(x \mid \theta_k, y)$ whose mode at a given $\theta_k$ is $x^{*}(\theta_k)$. The mode can be found numerically, for example with the Newton–Raphson method.
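In code, this step amounts to a Newton–Raphson search for the mode of the full conditional, followed by evaluating the ratio above at $x = x^{*}(\theta_k)$. A sketch under the same illustrative Poisson model as before (hypothetical names; terms not depending on $\theta$ are dropped, since $\widetilde{\pi}(\theta_k \mid y)$ is only needed up to proportionality):

```python
import numpy as np

def newton_mode(Q, y, n_iter=50, tol=1e-10):
    """Newton-Raphson for the mode x* of pi(x | theta, y) with
    log pi(x | theta, y) = -0.5 x^T Q x + sum_i (y_i x_i - exp(x_i)) + const.
    Returns the mode and the precision Q + diag(exp(x*)) of the
    Gaussian approximation pi~_G(x | theta, y)."""
    x = np.zeros(Q.shape[0])
    for _ in range(n_iter):
        c = np.exp(x)
        grad = -Q @ x + (y - c)      # gradient of the log full conditional
        prec = Q + np.diag(c)        # negative Hessian
        step = np.linalg.solve(prec, grad)
        x += step
        if np.max(np.abs(step)) < tol:
            break
    return x, prec

def log_post_theta(Q, y, log_prior_theta):
    """Unnormalized log pi~(theta_k | y) = log pi(x*, theta_k | y)
    - log pi~_G(x* | theta_k, y), with the Gaussian approximation
    evaluated at its own mode x*."""
    x_star, prec = newton_mode(Q, y)
    _, logdet_Q = np.linalg.slogdet(Q)
    _, logdet_P = np.linalg.slogdet(prec)
    log_joint = (log_prior_theta
                 + 0.5 * logdet_Q - 0.5 * x_star @ Q @ x_star
                 + np.sum(y * x_star - np.exp(x_star)))
    # The (2*pi)^{-n/2} factors of the GMRF prior and of the Gaussian
    # approximation cancel; the -log(y_i!) terms are theta-independent
    # and omitted.
    return log_joint - 0.5 * logdet_P
```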
The trick in the Laplace approximation above is that the Gaussian approximation is applied to the full conditional of $x$ in the denominator, which is usually close to a Gaussian due to the GMRF property of $x$. Applying the approximation here improves the accuracy of the method, since the posterior $\pi(\theta \mid y)$ itself need not be close to a Gaussian, and so the Gaussian approximation is not applied directly to $\pi(\theta \mid y)$. The second important property of a GMRF, the sparsity of the precision matrix $Q(\theta_k)$, is required for efficient computation of $\widetilde{\pi}(\theta_k \mid y)$ for each value $\theta_k$. [1]
Obtaining the approximate distribution $\widetilde{\pi}(x_i \mid \theta, y)$ is more involved, and the INLA method provides three options for this: the Gaussian approximation, the Laplace approximation, or the simplified Laplace approximation. [1] For the numerical integration needed to obtain $\widetilde{\pi}(x_i \mid y)$, three options are likewise available: grid search, the central composite design, or empirical Bayes. [1]
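As a rough illustration of the grid-search option in one dimension (hypothetical names; real implementations work in several dimensions and on a standardized scale around the mode), one can step outwards from the mode of $\log \widetilde{\pi}(\theta \mid y)$ until the log-density has dropped by a fixed amount, and use the visited points as the integration set $\{\theta_k\}$ with equal weights $\Delta_k$:

```python
import numpy as np

def explore_theta_grid(log_post, theta_mode, step=0.25, drop=5.0, max_steps=200):
    """One-dimensional grid search around the mode of log pi~(theta | y):
    walk outwards in increments of `step` until the log-density falls
    more than `drop` below its value at the mode.  Returns the grid
    points theta_k and equal integration weights Delta_k = step."""
    lp_mode = log_post(theta_mode)
    thetas = [theta_mode]
    for direction in (+1.0, -1.0):
        for k in range(1, max_steps):
            t = theta_mode + direction * k * step
            if log_post(t) < lp_mode - drop:
                break
            thetas.append(t)
    thetas = np.sort(np.array(thetas))
    return thetas, np.full(thetas.size, step)

# Toy example: a Gaussian-shaped log-density with mode at theta = 1.
thetas, dk = explore_theta_grid(lambda t: -0.5 * (t - 1.0) ** 2, theta_mode=1.0)
```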