Functional additive models

In statistics, functional additive models (FAM) can be viewed as extensions of generalized functional linear models where the linearity assumption between the response (scalar or functional) and the functional linear predictor is replaced by an additivity assumption.

Overview

Functional Additive Model

In these models, functional predictors $X$ are paired with responses $Y$ that can be either scalar or functional. The response can follow a continuous or discrete distribution, and this distribution may belong to the exponential family; in that case, a canonical link function connects predictors and responses. Functional predictors (and functional responses) can be viewed as random trajectories generated by a square-integrable stochastic process. Using functional principal component analysis and the Karhunen–Loève expansion, these processes can be equivalently expressed as a countable sequence of their functional principal component (FPC) scores and eigenfunctions. In the FAM,[1] the response (scalar or functional), conditional on the predictor function, is modeled as a function of the FPC scores of the predictor function in an additive structure. This model can be categorized as a Frequency Additive Model, since it is additive in the predictor FPC scores.
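For densely observed trajectories on a common grid, the FPC scores that the FAM is additive in can be estimated by an eigendecomposition of the sample covariance. The following minimal sketch discretizes the Karhunen–Loève expansion with numpy; the function name and the simulated data are illustrative, not from [1]:

```python
import numpy as np

def fpca_scores(X, t, n_components=3):
    """Estimate FPC scores for trajectories X (n_samples x n_grid)
    observed on a common uniform grid t, via eigendecomposition of the
    discretized sample covariance operator."""
    dt = t[1] - t[0]                        # grid spacing (uniform grid assumed)
    mu = X.mean(axis=0)                     # estimated mean function
    Xc = X - mu                             # centered trajectories
    C = (Xc.T @ Xc) / len(X)                # sample covariance on the grid
    evals, evecs = np.linalg.eigh(C * dt)   # discretized covariance operator
    order = np.argsort(evals)[::-1]         # sort eigenpairs, largest first
    evals, evecs = evals[order], evecs[:, order]
    phi = evecs[:, :n_components] / np.sqrt(dt)  # grid values of eigenfunctions
    xi = Xc @ phi * dt                      # FPC scores: integrals of Xc * phi
    return xi, phi, evals[:n_components], mu

# simulated smooth trajectories with two known modes of variation
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 101)
n = 200
xi1 = rng.normal(0, 2, n)
xi2 = rng.normal(0, 1, n)
X = (xi1[:, None] * np.sqrt(2) * np.sin(2 * np.pi * t)
     + xi2[:, None] * np.sqrt(2) * np.cos(2 * np.pi * t))
scores, phi, lam, mu = fpca_scores(X, t, n_components=2)
```

Note that estimated eigenfunctions are only identified up to sign, so estimated scores may differ from the true ones by a sign flip.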

Continuously Additive Model

The Continuously Additive Model (CAM)[2] assumes additivity in the time domain. Since the time points contained in an interval domain form an uncountable set, an unrestricted time-additive model is not feasible; the functional predictors are therefore assumed to be smooth across the time domain. This motivates approximating sums of additive functions by integrals, so that the traditional vector additive model is replaced by a smooth additive surface. The CAM can handle generalized responses paired with multiple functional predictors.

Functional Generalized Additive Model

The Functional Generalized Additive Model (FGAM)[3] is an extension of the generalized additive model with a scalar response and a functional predictor. This model can also accommodate multiple functional predictors. The CAM and the FGAM are essentially equivalent apart from implementation details, and therefore can be covered under one description. They can be categorized as Time-Additive Models.

Functional Additive Model

Model

The functional additive models for scalar and functional responses, respectively, are given by

$$E(Y \mid X) = E(Y) + \sum_{k=1}^{\infty} f_k(\xi_k)$$

and

$$E(Y(t) \mid X) = \mu_Y(t) + \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} f_{kj}(\xi_j)\,\psi_k(t),$$

where $\xi_j = \int (X(t) - \mu_X(t))\,\phi_j(t)\,dt$ and $\zeta_k = \int (Y(t) - \mu_Y(t))\,\psi_k(t)\,dt$ are the FPC scores of the processes $X$ and $Y$ respectively, $\phi_j$ and $\psi_k$ are the eigenfunctions of the processes $X$ and $Y$ respectively, and $f_k$ and $f_{kj}$ are arbitrary smooth functions. The functional-response model is equivalent to $E(\zeta_k \mid X) = \sum_{j=1}^{\infty} f_{kj}(\xi_j)$ for each $k$.

To ensure identifiability one may require

$$E[f_k(\xi_k)] = 0 \quad \text{and} \quad E[f_{kj}(\xi_j)] = 0 \quad \text{for all } k, j.$$

Implementation

The above model is considered under the assumption that the true FPC scores of the predictor processes are known. In general, estimation in a generalized additive model requires a backfitting algorithm or smooth backfitting to account for the dependencies between predictors. FPC scores, however, are always uncorrelated, and if the predictor processes are assumed to be Gaussian, the FPC scores are independent. Then

$$f_k(\xi_k) = E[Y - E(Y) \mid \xi_k],$$

and similarly for functional responses,

$$f_{kj}(\xi_j) = E[\zeta_k \mid \xi_j],$$

where $\xi_j$ are the FPC scores of the predictor process and $\zeta_k$ those of the response process. This simplifies the estimation: only one-dimensional smoothing of responses against individual predictor scores is required, and it yields consistent estimates of the component functions. In data analysis one needs to estimate the FPC scores before proceeding to infer the functions $f_k$ and $f_{kj}$, so there are errors in the predictors. Functional principal component analysis generates score estimates $\hat{\xi}_j$ for individual predictor trajectories, along with estimates of the eigenfunctions, eigenvalues, mean functions and covariance functions. Different smoothing methods can be applied to the data $(\hat{\xi}_{ik}, Y_i)$ and $(\hat{\xi}_{ij}, \hat{\zeta}_{ik})$ to estimate $f_k$ and $f_{kj}$ respectively.
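Since under independence each component function is a one-dimensional conditional expectation, any scatterplot smoother can recover it. A minimal sketch using a Nadaraya–Watson kernel smoother; the bandwidth, sample size, and true component function are illustrative choices:

```python
import numpy as np

def nw_smooth(x, y, x_eval, h):
    """Nadaraya-Watson kernel estimate of E[y | x] at the points x_eval,
    using a Gaussian kernel with bandwidth h."""
    W = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / h) ** 2)
    return (W @ y) / W.sum(axis=1)

rng = np.random.default_rng(1)
n = 500
xi = rng.normal(0, 1, n)                  # FPC scores of the predictor
Y = np.sin(xi) + rng.normal(0, 0.2, n)    # scalar responses; true f_k = sin

# under independence of the scores, f_k(x) = E[Y - E(Y) | xi_k = x],
# so smoothing the centered responses against the scores recovers f_k
grid = np.linspace(-1.5, 1.5, 7)
f_hat = nw_smooth(xi, Y - Y.mean(), grid, h=0.3)
```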

The fitted functional additive model for a scalar response is given by

$$\hat{E}(Y \mid X) = \bar{Y} + \sum_{k=1}^{K} \hat{f}_k(\hat{\xi}_k),$$

and the fitted functional additive model for a functional response is given by

$$\hat{E}(Y(t) \mid X) = \hat{\mu}_Y(t) + \sum_{k=1}^{K} \sum_{j=1}^{J} \hat{f}_{kj}(\hat{\xi}_j)\,\hat{\psi}_k(t).$$

Note: The truncation points $J$ and $K$ need to be chosen data-adaptively. Possible methods include pseudo-AIC, the fraction of variance explained, minimization of prediction error, and cross-validation.[ citation needed ]
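The fraction-of-variance-explained criterion is the simplest of these to state: keep the smallest number of components whose leading eigenvalues pass a chosen threshold of the total variance. A minimal sketch; the 0.95 threshold and the eigenvalues are hypothetical:

```python
import numpy as np

def choose_truncation(eigenvalues, threshold=0.95):
    """Smallest K such that the first K eigenvalues explain at least
    `threshold` of the total variance of the process."""
    lam = np.asarray(eigenvalues, dtype=float)
    fve = np.cumsum(lam) / lam.sum()       # cumulative fraction of variance
    return int(np.searchsorted(fve, threshold) + 1)

lam = [4.0, 2.0, 1.0, 0.5, 0.25]   # hypothetical estimated eigenvalues
# cumulative FVE is roughly 0.52, 0.77, 0.90, 0.97, 1.00, so K = 4
K = choose_truncation(lam)
```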

Extensions

For the case of multiple functional predictors $X_1, \ldots, X_p$ with a scalar response, the functional additive model can be extended by fitting a functional regression that is additive in the FPC scores $\xi_{jk}$ of each of the predictor processes $X_j$. The model considered here is the Additive Functional Score Model (AFSM), given by

$$E(Y \mid X_1, \ldots, X_p) = E(Y) + \sum_{j=1}^{p} \sum_{k=1}^{\infty} f_{jk}(\xi_{jk}).$$

In the case of multiple predictors the FPC scores of different predictors are in general correlated, and a smooth backfitting technique [4] has been developed to obtain consistent estimates of the component functions when the predictors are observed with errors having unknown distribution.
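The need for backfitting when scores are correlated can be illustrated with classical backfitting, a simpler relative of the smooth backfitting of [4]: each component is re-estimated by smoothing the partial residuals against its own score until the fits stabilize. The scores, smoother, bandwidths and component functions below are simulated for illustration:

```python
import numpy as np

def nw(x, y, x_eval, h):
    # Nadaraya-Watson smoother with a Gaussian kernel of bandwidth h
    W = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / h) ** 2)
    return (W @ y) / W.sum(axis=1)

rng = np.random.default_rng(2)
n = 800
# correlated FPC scores coming from two different predictor processes
z = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=n)
x1, x2 = z[:, 0], z[:, 1]
Y = np.sin(x1) + 0.5 * x2 + rng.normal(0, 0.2, n)

# classical backfitting: cycle through the components, smoothing the
# partial residuals against each score, re-centering for identifiability
f1 = np.zeros(n)
f2 = np.zeros(n)
for _ in range(20):
    f1 = nw(x1, Y - Y.mean() - f2, x1, h=0.3)
    f1 -= f1.mean()
    f2 = nw(x2, Y - Y.mean() - f1, x2, h=0.3)
    f2 -= f2.mean()
```

A single pass of one-dimensional smoothing (as in the independent-score case) would here absorb part of each component into the other; the iteration removes that leakage.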

Continuously Additive Model

Model

Since the number of time points on an interval domain $\mathcal{T}$ is uncountable, an unrestricted time-additive model is not feasible. Thus a sequence of time-additive models is considered on an increasingly dense finite time grid $t_1, \ldots, t_m$ in $\mathcal{T}$, leading to

$$E(Y \mid X) = E(Y) + \sum_{j=1}^{m} f_j(X(t_j)),$$

where $f_j(x) = m^{-1} g(t_j, x)$ for a smooth bivariate function $g$ with $E[g(t, X(t))] = 0$ for all $t \in \mathcal{T}$ (to ensure identifiability). In the limit $m \to \infty$ this becomes the continuously additive model

$$E(Y \mid X) = E(Y) + \int_{\mathcal{T}} g(t, X(t))\,dt.$$
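In practice the integral is approximated by a quadrature sum on the observation grid, mirroring the finite-grid construction. A minimal sketch of evaluating the additive predictor for a given surface $g$; the surface and the trajectories are illustrative:

```python
import numpy as np

def cam_predictor(X, t, g):
    """Riemann-sum approximation of the CAM term  integral g(t, X(t)) dt
    for trajectories X (n_samples x n_grid) on a common uniform grid t."""
    dt = t[1] - t[0]
    T = np.broadcast_to(t, X.shape)   # time value matched to each observation
    return (g(T, X) * dt).sum(axis=1)

t = np.linspace(0.0, 1.0, 1001)
X = np.vstack([np.sin(np.pi * t), t])   # two example trajectories
g = lambda s, x: s * x**2               # an illustrative smooth surface
eta = cam_predictor(X, t, g)
# both integrals happen to equal 1/4:
#   int_0^1 t sin(pi t)^2 dt = 1/4  and  int_0^1 t * t^2 dt = 1/4
```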

Special Cases

Generalized Functional Linear Model

For $g(t, x) = \beta(t)\,x$ the model reduces to the generalized functional linear model

$$E(Y \mid X) = E(Y) + \int_{\mathcal{T}} \beta(t)\,X(t)\,dt.$$

Functional Transformation Model

For a non-Gaussian predictor process, choosing $g(t, x) = \beta(t)\,h(x)$, where $h$ is a smooth transformation of $x$, reduces the CAM to a Functional Transformation model.

Extensions

This model has also been introduced, with a different notation, under the name Functional Generalized Additive Model (FGAM). Adding a link function $g$ to the mean response $\mu = E(Y \mid X)$ and applying a probability transformation to $X(t)$ yields the FGAM, given by

$$g(\mu) = \theta_0 + \int_{\mathcal{T}} F\big(G_t(X(t)), t\big)\,dt,$$

where $\theta_0$ is the intercept, $F$ is a smooth bivariate function, and $G_t$ denotes the distribution function of $X(t)$.
Note: For estimation and implementation see [2] [3].


References

  1. Müller and Yao (2008). "Functional Additive Models". Journal of the American Statistical Association. 103 (484): 1534–1544. doi:10.1198/016214508000000751.
  2. Müller, Wu and Yao (2013). "Continuously additive models for nonlinear functional regression". Biometrika. 100 (3): 607–622. doi:10.1093/biomet/ast004.
  3. McLean et al. (2014). "Functional generalized additive models". Journal of Computational and Graphical Statistics. 23 (1): 249–269. doi:10.1080/10618600.2012.729985. PMC 3982924. PMID 24729671.
  4. Han, Müller and Park (2017). "Smooth backfitting for additive modeling with small errors-in-variables, with an application to additive functional regression for multiple predictor functions". Bernoulli.