Bayes space


Bayes space is a function space whose elements are equivalence classes of measures with the same null-sets, where two measures are equivalent if they are proportional. The basic ideas of Bayes spaces have their roots in compositional data analysis and the Aitchison geometry. [1] Theoretical applications are mainly in statistics, specifically functional data analysis of density functions, also known as density data analysis. [2] [3] [4] [5] Practical applications include geochemistry [6], COVID-19 modelling [7], sediment analysis [8] and developmental research [9]. Alternative approaches to density analysis based on the Wasserstein metric, often termed Wasserstein regression, have also been applied in medicine [10].


The basic structure of the Bayes space is that of a vector space, with addition and scalar multiplication defined by perturbation and powering. [11] The space is formed over a $\sigma$-finite reference (base) measure, denoted $\lambda$ or $P$ depending on whether it is infinite or finite. Densities are considered as Radon-Nikodym derivatives of the measures with the same null-sets as the base measure, and are equivalent if they are proportional. For finite base measures, a Hilbert space structure can be obtained by defining a centered log-ratio transformation on the measures, mapping them to the subset of $L^2(P)$ consisting of functions integrating to 0. [12]

Definitions and main results

Consider a finite base measure $P$ (not necessarily a probability measure) on a domain $\Omega$. This may be a uniform distribution on a bounded interval, or a measure that has a Radon-Nikodym density with respect to the Lebesgue measure (the Gaussian distribution, for example). Two densities $f, g$ with respect to $P$ are said to be B-equivalent if there exists a constant $c > 0$ such that $f = c\,g$, denoted $f =_B g$ (the convention $c \cdot (+\infty) = +\infty$ is used in cases where a measure is infinite). It can be shown that $=_B$ is an equivalence relation. The Bayes space $B(P)$ is defined as the quotient space of all measures with the same null-sets as $P$ under the equivalence relation $=_B$.

The first challenge in analysing density functions is that $B(P)$ is not a linear space under ordinary addition and scalar multiplication, since the ordinary difference of two densities need not be non-negative everywhere. As in the Aitchison geometry for finite-dimensional data, perturbation and powering are defined for densities:

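Perturbation

$$(f \oplus g)(x) =_B f(x)\, g(x)$$

Powering

$$(\alpha \odot f)(x) =_B f(x)^{\alpha}$$

where $f, g$ are densities in $B(P)$ and $\alpha$ is a real number. It can be shown, using the properties of multiplication and powering of real numbers, that $(B(P), \oplus, \odot)$ forms a vector space over the real numbers.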
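The vector-space properties can be checked numerically on discretised densities. The following is a minimal sketch, assuming a uniform reference measure on $[0, 1]$, a midpoint grid, and the usual pointwise definitions of perturbation (product) and powering (power); the helper names `normalize`, `perturb`, `power` and `b_equivalent` are hypothetical and not taken from any library.

```python
import math

def normalize(f, dx):
    """Pick the representative of the equivalence class that integrates to 1."""
    total = sum(f) * dx
    return [v / total for v in f]

def perturb(f, g):
    """Perturbation: pointwise product, defined up to a positive constant."""
    return [a * b for a, b in zip(f, g)]

def power(alpha, f):
    """Powering: pointwise power, defined up to a positive constant."""
    return [v ** alpha for v in f]

def b_equivalent(f, g, tol=1e-9):
    """True if f = c * g for one constant c > 0 at every grid point."""
    ratios = [a / b for a, b in zip(f, g)]
    return all(r > 0 and math.isclose(r, ratios[0], rel_tol=tol) for r in ratios)

# Midpoint grid on [0, 1]; the reference measure is assumed uniform.
n = 200
dx = 1.0 / n
xs = [(i + 0.5) * dx for i in range(n)]

f = normalize([math.exp(-x) for x in xs], dx)
g = normalize([1.0 + x for x in xs], dx)
uniform = [1.0] * n  # density of the reference measure itself

# Rescaling a density does not change the element of the Bayes space.
assert b_equivalent(f, [3.7 * v for v in f])

# The constant density is the neutral element of perturbation.
assert b_equivalent(perturb(f, uniform), f)

# Powering by -1 gives the additive inverse.
assert b_equivalent(perturb(f, power(-1.0, f)), uniform)

# Powering distributes over perturbation.
alpha = 2.5
lhs = power(alpha, perturb(f, g))
rhs = perturb(power(alpha, f), power(alpha, g))
assert b_equivalent(lhs, rhs)
```

Equality is always tested up to a positive constant, which is why `b_equivalent` compares pointwise ratios rather than values.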

The definition of Bayes space does not strictly require a finite reference measure $P$. If Bayes space is defined over an infinite reference measure $\lambda$, it must be $\sigma$-finite (like the Lebesgue measure). A finite reference measure is, however, necessary for adding a Hilbert space structure to a subset of $B(P)$. Consider the subspace

$$B^p(P) = \left\{ f \in B(P) : \int_\Omega |\log f|^p \, dP < \infty \right\}.$$

For $p = 2$, this is a linear subspace and is isometrically isomorphic to the Hilbert space $L^2_0(P) = \{ h \in L^2(P) : \int_\Omega h \, dP = 0 \}$ via the centered log-ratio (clr) transformation

$$\operatorname{clr}(f) = \log f - \frac{1}{P(\Omega)} \int_\Omega \log f \, dP.$$

The subspace $B^2(P)$ of log-square-integrable functions is termed the Bayes Hilbert space. It can be shown that the clr transformation is a linear isomorphism between the two spaces. Defining the inner product on $B^2(P)$ as the $L^2(P)$ inner product of the clr transformations,

$$\langle f, g \rangle_{B^2(P)} = \int_\Omega \operatorname{clr}(f)\, \operatorname{clr}(g) \, dP,$$

provides the Hilbert space structure for $B^2(P)$ and makes the centered log-ratio transformation a linear isometry.
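The clr transformation and the induced inner product can be illustrated on a grid. This is a minimal sketch, assuming the Lebesgue measure restricted to $[0, 2]$ as the finite reference measure (so its total mass is 2, not 1) and the standard form of the clr as the log-density minus its weighted mean; the function names `clr` and `inner` are hypothetical.

```python
import math

def clr(f, w):
    """Centered log-ratio: log f minus its mean under the reference measure.
    w[i] is the reference-measure mass of grid cell i (need not sum to 1)."""
    logs = [math.log(v) for v in f]
    mean = sum(l * wi for l, wi in zip(logs, w)) / sum(w)
    return [l - mean for l in logs]

def inner(f, g, w):
    """Inner product of two densities = L2 inner product of their clr images."""
    return sum(a * b * wi for a, b, wi in zip(clr(f, w), clr(g, w), w))

# Midpoint grid on [0, 2] with equal cell masses (assumed reference measure).
n = 400
lo, hi = 0.0, 2.0
dx = (hi - lo) / n
xs = [lo + (i + 0.5) * dx for i in range(n)]
w = [dx] * n

f = [math.exp(-x) for x in xs]
g = [1.0 + x * x for x in xs]
clr_f, clr_g = clr(f, w), clr(g, w)

# The clr image integrates to zero under the reference measure.
assert abs(sum(h * wi for h, wi in zip(clr_f, w))) < 1e-9

# Proportional densities have the same clr image.
rescaled = clr([5.0 * v for v in f], w)
assert all(abs(a - b) < 1e-9 for a, b in zip(clr_f, rescaled))

# Linearity: perturbation maps to addition, powering to scalar multiplication.
alpha = -1.5
combo = [a * (b ** alpha) for a, b in zip(f, g)]
expected = [a + alpha * b for a, b in zip(clr_f, clr_g)]
assert all(abs(a - b) < 1e-9 for a, b in zip(clr(combo, w), expected))

# For f(x) = exp(-x) on [0, 2], clr(f)(x) = 1 - x, so the squared norm is
# close to the integral of (1 - x)^2 over [0, 2], which is 2/3.
norm_sq = inner(f, f, w)
```

Because the clr image is unchanged by rescaling, the inner product is well defined on equivalence classes and no normalisation of the densities is needed beforehand.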

Multivariate densities

The measure $P$ does not have to be univariate (one-dimensional), but can also be defined as a product measure on Cartesian products, characterising bivariate (two-dimensional) or multivariate densities. The geometric structure of Hilbert spaces can be used to decompose multivariate densities orthogonally into independent and interaction parts using the concept of "geometric marginals". [13] [14] [9] This decomposition is related to copula theory. [14] The geometry in $B^2(P)$ defines norms on densities that can be used to quantify the "relative simplicial deviance", which is a measure of how much of a bivariate distribution can be explained by the interaction part; [13] in the multivariate case the relative simplicial deviance can be generalised to the "information composition". [14]
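The orthogonal decomposition can be sketched in clr coordinates for a bivariate density on a regular grid. The sketch below rests on several assumptions that are not stated in the text above: a uniform product reference measure on the unit square, geometric marginals computed as averages of the clr image over the other variable, and the relative simplicial deviance taken as the ratio of the squared norm of the interaction part to the squared norm of the whole density. The function names are hypothetical.

```python
import math

def clr2d(f):
    """clr of a gridded bivariate density under a uniform reference measure
    (equal cell masses, so the weighted mean is the plain mean)."""
    logs = [[math.log(v) for v in row] for row in f]
    mean = sum(sum(row) for row in logs) / (len(logs) * len(logs[0]))
    return [[v - mean for v in row] for row in logs]

def decompose(f):
    """Split clr(f) into an additive (independent) part and an interaction part."""
    h = clr2d(f)
    nx, ny = len(h), len(h[0])
    hx = [sum(h[i]) / ny for i in range(nx)]                        # average over y
    hy = [sum(h[i][j] for i in range(nx)) / nx for j in range(ny)]  # average over x
    ind = [[hx[i] + hy[j] for j in range(ny)] for i in range(nx)]
    inter = [[h[i][j] - ind[i][j] for j in range(ny)] for i in range(nx)]
    return h, ind, inter

def dot(u, v):
    """Grid inner product; the constant cell mass is omitted because it cancels
    in the ratio and does not affect orthogonality."""
    return sum(a * b for ru, rv in zip(u, v) for a, b in zip(ru, rv))

def relative_deviance(f):
    """Assumed definition: share of the squared norm carried by the interaction part."""
    h, _, inter = decompose(f)
    return dot(inter, inter) / dot(h, h)

n = 40
grid = [(i + 0.5) / n for i in range(n)]
independent = [[math.exp(-x) * math.exp(-2.0 * y) for y in grid] for x in grid]
dependent = [[math.exp(-x - 2.0 * y + 3.0 * x * y) for y in grid] for x in grid]

h, ind, inter = decompose(dependent)

# The two parts are orthogonal, so their squared norms add up (Pythagoras).
assert abs(dot(ind, inter)) < 1e-8 * dot(h, h)
assert math.isclose(dot(h, h), dot(ind, ind) + dot(inter, inter), rel_tol=1e-9)

# A product density has no interaction part; a dependent one has some.
assert relative_deviance(independent) < 1e-12
assert 0.0 < relative_deviance(dependent) < 1.0
```

In clr coordinates this split has the same form as a two-way analysis-of-variance decomposition into main effects and interaction, which is why the two parts are orthogonal by construction.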


References

    1. Egozcue, J. J.; Díaz-Barrero, J. L.; Pawlowsky-Glahn, V. (2006). "Hilbert space of probability density functions based on Aitchison geometry". Acta Mathematica Sinica. 22 (4): 1175–1183. doi:10.1007/s10114-005-0678-2.
    2. Menafoglio, A.; Guadagnini, A.; Secchi, P. (2014). "A kriging approach based on Aitchison geometry for the characterization of particle-size curves in heterogeneous aquifers". Stochastic Environmental Research and Risk Assessment. 28 (7): 1835–1851. Bibcode:2014SERRA..28.1835M. doi:10.1007/s00477-014-0849-8.
    3. Hron, K.; Menafoglio, A.; Templ, M.; Hrůzová, K.; Filzmoser, P. (2016). "Simplicial principal component analysis for density functions in Bayes spaces". Computational Statistics & Data Analysis. 94: 330–350. doi:10.1016/j.csda.2015.07.007. hdl:11311/973194.
    4. Talská, R.; Hron, K.; Matys Grygar, T. (2021). "Compositional Scalar-on-function regression with application to sediment particle size distributions". Mathematical Geosciences. 53 (7): 1667–1695. Bibcode:2021MatGe..53.1667T. doi:10.1007/s11004-021-09941-1.
    5. Maier, E.-M.; Stöcker, A.; Fitzenberger, B.; Greven, S. (2025). "Additive density-on-scalar regression in Bayes Hilbert spaces with an application to gender economics". The Annals of Applied Statistics. 19: 680–700. doi:10.1214/24-AOAS1979.
    6. Matys Grygar, T.; Hošek, M.; Pacina, J.; Štojdl, J.; Bábek, O.; Sedláček, J.; Hron, K.; Talská, R.; Kříženecká, S.; Tolaszová, J. (2018-11-01). "Changes in the geochemistry of fluvial sediments after dam construction (the Chrudimka River, the Czech Republic)". Applied Geochemistry. 98: 94–108. doi:10.1016/j.apgeochem.2018.09.012. ISSN 0883-2927.
    7. Eckardt, Matthias; Mateu, Jorge; Greven, Sonja (2024-08-14). "Generalized functional additive mixed models with (functional) compositional covariates for areal Covid-19 incidence curves". Journal of the Royal Statistical Society Series C: Applied Statistics. 73 (4): 880–901. doi:10.1093/jrsssc/qlae016. ISSN 0035-9254.
    8. Martínez-Hernández, Israel; Euán, Carolina; Burr, Wesley S; Meis, Melanie; Blangiardo, Marta; Pirani, Monica (2025-01-13). "Modelling particle number size distribution: a continuous approach". Journal of the Royal Statistical Society Series C: Applied Statistics. 74 (1): 229–248. doi:10.1093/jrsssc/qlae053. ISSN 0035-9254.
    9. Škorňa, S.; Machalová, J.; Burkotová, J.; Hron, K.; Greven, S. (2024). "Approximation of bivariate densities with compositional splines". arXiv:2405.11615 [stat.ME].
    10. Matabuena, Marcos; Petersen, Alexander; Vidal, Juan C; Gude, Francisco (2021-06-01). "Glucodensities: A new representation of glucose profiles using distributional data analysis". Statistical Methods in Medical Research. 30 (6): 1445–1464. doi:10.1177/0962280221998064. ISSN 0962-2802. PMC 8189016. PMID 33760665.
    11. van den Boogaart, K. G.; Egozcue, J. J.; Pawlowsky-Glahn, V. (2010). "Bayes linear spaces". SORT: Statistics and Operations Research Transactions. 34: 201–222.
    12. van den Boogaart, K. G.; Egozcue, J. J.; Pawlowsky-Glahn, V. (2014). "Bayes Hilbert spaces". Australian & New Zealand Journal of Statistics. 56 (2): 171–194. doi:10.1111/anzs.12074.
    13. Hron, K.; Machalová, J.; Menafoglio, A. (2023). "Bivariate densities in Bayes spaces: orthogonal decomposition and spline representation". Statistical Papers. 64 (5): 1629–1667. doi:10.1007/s00362-022-01359-z.
    14. Genest, C.; Hron, K.; Nešlehová, J. G. (2023). "Orthogonal decomposition of multivariate densities in Bayes spaces and relation with their copula-based representation". Journal of Multivariate Analysis. 198 (5): 105228. doi:10.1016/j.jmva.2023.105228.