Comparison of Gaussian process software

This is a comparison of statistical analysis software that allows performing inference with Gaussian processes, often using approximations.

This article is written from the point of view of Bayesian statistics, which may use terminology different from that commonly used in kriging. The next section clarifies the mathematical/computational meaning of the information provided in the table, independently of contextual terminology.

Description of columns

This section details the meaning of the columns in the table below.

Solvers

These columns are about the algorithms used to solve the linear system defined by the prior covariance matrix, i.e., the matrix built by evaluating the kernel.
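
As an illustration of what the "Exact" column refers to (not tied to any package in the table), the following NumPy/SciPy sketch performs an exact GP solve by Cholesky-factoring the kernel matrix; the squared-exponential kernel, noise level, and toy data are assumptions made for the example:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf_kernel(x1, x2, scale=1.0):
    """Squared-exponential (RBF) kernel evaluated on 1D input grids."""
    return np.exp(-0.5 * ((x1[:, None] - x2[None, :]) / scale) ** 2)

# Toy data: noisy observations of a sine wave.
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 30)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

# Prior covariance matrix built by evaluating the kernel, plus noise on the diagonal.
K = rbf_kernel(x, x) + 0.1**2 * np.eye(x.size)

# "Exact" solve: Cholesky factorization of K, then posterior mean at test points.
c, low = cho_factor(K)
alpha = cho_solve((c, low), y)          # alpha = K^{-1} y
x_test = np.linspace(0, 5, 100)
mean = rbf_kernel(x_test, x) @ alpha    # posterior mean E[f(x_test) | y]
```

Specialized and approximate solvers replace this O(n³) factorization with structure-exploiting or inexact alternatives (Kronecker, Toeplitz, sparse/inducing-point methods, etc.) listed in the corresponding columns.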

Input

These columns are about the points on which the Gaussian process is evaluated, i.e., the values x for a process f(x).

Output

These columns are about the values yielded by the process, and how they are connected to the data used in the fit.

Hyperparameters

These columns are about finding values of variables which enter the definition of the specific problem but cannot be inferred by the Gaussian process fit, for example parameters in the formula of the kernel.

If both the "Prior" and "Posterior" cells contain "Manually", the software provides an interface for computing the marginal likelihood and its gradient w.r.t. hyperparameters, which can be fed into an optimization/sampling algorithm, e.g., gradient descent or Markov chain Monte Carlo.
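
As a concrete illustration of such a "Manually" workflow (not tied to any particular package), the following NumPy/SciPy sketch computes the negative log marginal likelihood of a GP and feeds it to a generic optimizer; the RBF kernel, fixed noise level, and toy data are assumptions made for the example:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
x = np.linspace(0, 5, 40)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

def neg_log_marginal_likelihood(log_scale):
    """-log p(y | scale): the objective a 'Manually' interface exposes."""
    scale = np.exp(log_scale)
    K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / scale) ** 2)
    K += 0.1**2 * np.eye(x.size)
    c, low = cho_factor(K)
    alpha = cho_solve((c, low), y)
    log_det = 2.0 * np.sum(np.log(np.diag(c)))
    return 0.5 * (y @ alpha + log_det + x.size * np.log(2 * np.pi))

# Feed the objective into a generic optimizer (gradient-free here for brevity;
# packages exposing the gradient allow gradient descent or HMC instead).
res = minimize_scalar(neg_log_marginal_likelihood, bounds=(-3, 3), method="bounded")
best_scale = np.exp(res.x)
```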

Linear transformations

These columns are about the possibility of simultaneously fitting data points to a process and to linear transformations of it.
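
This works because any linear transformation of a Gaussian process is jointly Gaussian with the process itself, so such data can be folded into an ordinary GP regression by extending the covariance matrix. A minimal NumPy sketch, conditioning on both direct observations and one finite difference (the kernel, the functional, and the data are illustrative assumptions):

```python
import numpy as np

def rbf(x1, x2):
    """Squared-exponential kernel on 1D grids (unit length scale)."""
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2)

x_obs = np.array([0.0, 2.0, 4.0])        # direct observations of f
y_obs = np.sin(x_obs)
L = np.array([[-1.0, 1.0]])              # linear functional: f(3) - f(1)
x_fun = np.array([1.0, 3.0])
z_obs = np.array([np.sin(3.0) - np.sin(1.0)])

# Joint covariance of [f(x_obs), L f(x_fun)] under the GP prior:
K_oo = rbf(x_obs, x_obs)
K_of = rbf(x_obs, x_fun) @ L.T           # cross-covariance with the functional
K_ff = L @ rbf(x_fun, x_fun) @ L.T
K = np.block([[K_oo, K_of], [K_of.T, K_ff]]) + 1e-9 * np.eye(4)

# Posterior mean of f at test points, conditioning on both kinds of data:
x_test = np.linspace(0, 4, 9)
K_to = np.hstack([rbf(x_test, x_obs), rbf(x_test, x_fun) @ L.T])
mean = K_to @ np.linalg.solve(K, np.concatenate([y_obs, z_obs]))
```

Derivatives and sums of processes (the "Deriv." and "Sum" columns) follow the same pattern, with the appropriate cross-covariances in place of L.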

Comparison table

Columns are grouped as follows: Solvers (Exact, Specialized, Approximate), Input (ND, Non-real), Output (Likelihood, Errors), Hyperparameters (Prior, Posterior), and Linear transformations (Deriv., Finite, Sum).

| Name | License | Language | Exact | Specialized | Approximate | ND | Non-real | Likelihood | Errors | Prior | Posterior | Deriv. | Finite | Sum |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PyMC | Apache | Python | Yes | Kronecker | Sparse | ND | No | Any | Correlated | Yes | Yes | No | Yes | Yes |
| Stan | BSD, GPL | custom | Yes | No | No | ND | No | Any | Correlated | Yes | Yes | No | Yes | Yes |
| scikit-learn | BSD | Python | Yes | No | No | ND | Yes | Bernoulli | Uncorrelated | Manually | Manually | No | No | No |
| fbm [7] | Free | C | Yes | No | No | ND | No | Bernoulli, Poisson | Uncorrelated, Stationary | Many | Yes | No | No | Yes |
| GPML [8] [7] | BSD | MATLAB | Yes | No | Sparse | ND | No | Many | i.i.d. | Manually | Manually | No | No | No |
| GPstuff [7] | GNU GPL | MATLAB, R | Yes | Markov | Sparse | ND | No | Many | Correlated | Many | Yes | First RBF | No | Yes |
| GPy [9] | BSD | Python | Yes | No | Sparse | ND | No | Many | Uncorrelated | Yes | Yes | No | No | No |
| GPflow [9] | Apache | Python | Yes | No | Sparse | ND | No | Many | Uncorrelated | Yes | Yes | No | No | No |
| GPyTorch [10] | MIT | Python | Yes | Toeplitz, Kronecker | Sparse | ND | No | Many | Uncorrelated | Yes | Yes | First RBF | Manually | Manually |
| GPvecchia [11] | GNU GPL | R | Yes | No | Sparse, Hierarchical | ND | No | Exponential family | Uncorrelated | No | No | No | No | No |
| pyGPs [12] | BSD | Python | Yes | No | Sparse | ND | Graphs, Manually | Bernoulli | i.i.d. | Manually | Manually | No | No | No |
| gptk [13] | BSD | R | Yes | Block? | Sparse | ND | No | Gaussian | No | Manually | Manually | No | No | No |
| celerite [3] | MIT | Python, Julia, C++ | No | Semisep. [a] | No | 1D | No | Gaussian | Uncorrelated | Manually | Manually | No | No | No |
| george [6] | MIT | Python, C++ | Yes | No | Hierarchical | ND | No | Gaussian | Uncorrelated | Manually | Manually | No | No | Manually |
| neural-tangents [14] [b] | Apache | Python | Yes | Block, Kronecker | No | ND | No | Gaussian | No | No | No | No | No | No |
| DiceKriging [15] | GNU GPL | R | Yes | No | No | ND | No? | Gaussian | Uncorrelated | SCAD RBF | MAP | No | No | No |
| OpenTURNS [16] | GNU LGPL | Python, C++ | Yes | No | No | ND | No | Gaussian | Uncorrelated | Manually (no grad.) | MAP | No | No | No |
| UQLab [17] | Proprietary | MATLAB | Yes | No | No | ND | No | Gaussian | Correlated | No | MAP | No | No | No |
| ooDACE [18] | Proprietary | MATLAB | Yes | No | No | ND | No | Gaussian | Correlated | No | MAP | No | No | No |
| DACE | Proprietary | MATLAB | Yes | No | No | ND | No | Gaussian | No | No | MAP | No | No | No |
| GpGp | MIT | R | No | No | Sparse | ND | No | Gaussian | i.i.d. | Manually | Manually | No | No | No |
| SuperGauss | GNU GPL | R, C++ | No | Toeplitz [c] | No | 1D | No | Gaussian | No | Manually | Manually | No | No | No |
| STK | GNU GPL | MATLAB | Yes | No | No | ND | No | Gaussian | Uncorrelated | Manually | Manually | No | No | Manually |
| GSTools | GNU LGPL | Python | Yes | No | No | ND | No | Gaussian | Yes | Yes | Yes | Yes | No | No |
| PyKrige | BSD | Python | Yes | No | No | 2D, 3D | No | Gaussian | i.i.d. | No | No | No | No | No |
| GPR | Apache | C++ | Yes | No | Sparse | ND | No | Gaussian | i.i.d. | Some, Manually | Manually | First | No | No |
| celerite2 | MIT | Python | No | Semisep. [a] | No | 1D | No | Gaussian | Uncorrelated | Manually [d] | Manually | No | No | Yes |
| SMT [19] [20] | BSD | Python | Yes | No | Sparse, PODI [e], other | ND | No | Gaussian | i.i.d. | Some | Some | First | No | No |
| GPJax | Apache | Python | Yes | No | Sparse | ND | Graphs | Bernoulli | No | Yes | Yes | No | No | No |
| Stheno | MIT | Python | Yes | Low rank | Sparse | ND | No | Gaussian | i.i.d. | Manually | Manually | Approximate | No | Yes |
| Egobox-gp [21] | Apache | Rust | Yes | No | Sparse | ND | No | Gaussian | i.i.d. | No | MAP | First | No | No |

Notes

  1. celerite implements only a specific subalgebra of kernels which can be solved in O(n).
  2. neural-tangents is a specialized package for infinitely wide neural networks.
  3. SuperGauss implements a superfast Toeplitz solver with computational complexity O(n log² n).
  4. celerite2 has a PyMC3 interface.
  5. PODI (Proper Orthogonal Decomposition + Interpolation) is an approximation for high-dimensional multioutput regressions. The regression function is lower-dimensional than the outcomes, and the subspace is chosen with the PCA of the (outcome, dependent variable) data. Each principal component is modeled with an a priori independent Gaussian process.
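
The PODI construction described in note 5 can be sketched in a few lines of NumPy; the toy data, the RBF kernel, and the number of retained components are assumptions made for illustration, not SMT's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy high-dimensional-output regression: for each scalar input t we observe
# a 50-dimensional curve y(t). PODI: PCA-compress the outputs, then fit one
# a priori independent GP per retained principal component.
t_train = np.linspace(0, 1, 20)
grid = np.linspace(0, 1, 50)
Y = np.sin(2 * np.pi * (grid[None, :] + t_train[:, None]))   # (20, 50) outcomes

# PCA of the outcome data via SVD.
Y_mean = Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Y - Y_mean, full_matrices=False)
k = 2                                     # retained principal components
scores = U[:, :k] * s[:k]                 # per-sample coordinates, shape (20, k)

def rbf(a, b, scale=0.2):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / scale) ** 2)

# One independent GP per principal-component score (shared kernel here).
K = rbf(t_train, t_train) + 1e-8 * np.eye(t_train.size)
t_test = np.array([0.33])
K_star = rbf(t_test, t_train)
pred_scores = K_star @ np.linalg.solve(K, scores)   # (1, k)

# Map the predicted scores back to the high-dimensional output space.
y_pred = Y_mean + pred_scores @ Vt[:k]
```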

References