Interval predictor model

In regression analysis, an interval predictor model (IPM) is an approach to regression where bounds on the function to be approximated are obtained. This differs from other techniques in machine learning, where one usually wishes to estimate point values or an entire probability distribution. Interval predictor models are sometimes referred to as a nonparametric regression technique, because a potentially infinite set of functions is contained by the IPM, and no specific distribution is implied for the regressed variables.

Multiple-input multiple-output IPMs for the multi-point data commonly used to represent functions have recently been developed. [1] These IPMs prescribe the parameters of the model as a path-connected, semi-algebraic set using sliced-normal [2] or sliced-exponential distributions. [3] A key advantage of this approach is its ability to characterize complex parameter dependencies to varying fidelity levels, which enables the analyst to adjust the desired level of conservatism in the prediction.

As a consequence of the theory of scenario optimization, in many cases rigorous predictions can be made regarding the performance of the model at test time. [4] Hence an interval predictor model can be seen as a guaranteed bound on quantile regression. Interval predictor models can also be seen as a way to prescribe the support of random predictor models, of which a Gaussian process is a specific case. [5]
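The following is a minimal sketch of the kind of guarantee the scenario approach provides; it is not taken from the cited references, and the function name scenario_epsilon is purely illustrative. Given the number of training data points and an upper bound on the number of support constraints, it computes the fraction of future data that may fall outside the predicted interval, valid with a chosen confidence level.

```python
# Minimal sketch of the standard scenario-approach reliability bound for a
# convex program (illustrative, not taken from the cited works): given N
# training scenarios and at most d support constraints, find the violation
# level epsilon that holds with confidence 1 - beta.
from scipy.optimize import brentq
from scipy.stats import binom

def scenario_epsilon(N, d, beta=1e-6):
    """Smallest epsilon with sum_{i<d} C(N,i) eps^i (1-eps)^(N-i) <= beta."""
    return brentq(lambda eps: binom.cdf(d - 1, N, eps) - beta, 1e-12, 1 - 1e-12)

# Example: 1000 data points and 10 trainable parameters.
print(f"epsilon = {scenario_epsilon(1000, 10):.3f}")
```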

Convex interval predictor models

Typically the interval predictor model is created by specifying a parametric function, which is usually chosen to be the product of a parameter vector and a basis; the basis is commonly made up of polynomial features, although a radial basis is sometimes used. Then a convex set is assigned to the parameter vector, and the size of the convex set is minimized such that every training data point can be predicted by at least one possible value of the parameters. Ellipsoidal parameter sets were used by Campi (2009), which yield a convex optimization program to train the IPM. [4] Crespo (2016) proposed the use of a hyperrectangular parameter set, which results in a convenient, linear form for the bounds of the IPM. [6] Hence the IPM can be trained with a linear optimization program:

$$\min_{p} \; \mathbb{E}_x\!\left[\,\bar{y}(x,p) - \underline{y}(x,p)\,\right] \quad \text{subject to} \quad \underline{y}(x_i,p) \le y_i \le \bar{y}(x_i,p), \quad i = 1, \ldots, N,$$

where the training data examples are $x_i$ and $y_i$, and the Interval Predictor Model bounds $\underline{y}(x,p)$ and $\bar{y}(x,p)$ are parameterised by the parameter vector $p$. The reliability of such an IPM is obtained by noting that, for a convex IPM, the number of support constraints is at most the dimensionality of the trainable parameters, and hence the scenario approach can be applied.
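A minimal sketch of this construction is given below, assuming a hyperrectangular parameter set described by a centre vector and a non-negative half-width vector (so that the interval bounds are linear in the parameters) and a polynomial basis; the helper names fit_ipm and predict_interval are illustrative and not taken from any of the cited implementations.

```python
# Sketch of a convex IPM with a hyperrectangular parameter set, trained by
# linear programming (illustrative, in the spirit of the linear formulation).
import numpy as np
from scipy.optimize import linprog

def phi(x, degree=3):
    """Polynomial basis phi(x) = [1, x, x^2, ..., x^degree]."""
    return np.vander(np.atleast_1d(x), degree + 1, increasing=True)

def fit_ipm(x, y, degree=3):
    """Find a centre c and non-negative half-width r of the parameter box so
    that every data point lies inside the predicted interval, while the
    average interval width is minimised (a linear program)."""
    P = phi(x, degree)                 # N x d basis matrix
    A = np.abs(P)                      # |phi(x_i)| appears in the width terms
    N, d = P.shape
    # Decision variables z = [c, r]; objective = mean interval width.
    cost = np.concatenate([np.zeros(d), 2.0 * A.mean(axis=0)])
    # Containment: c.phi_i - r.|phi_i| <= y_i  and  -c.phi_i - r.|phi_i| <= -y_i
    A_ub = np.vstack([np.hstack([P, -A]), np.hstack([-P, -A])])
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * d + [(0, None)] * d   # r >= 0
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    c, r = res.x[:d], res.x[d:]
    return c, r, degree

def predict_interval(x, c, r, degree):
    """Lower and upper bounds of the IPM at new inputs x."""
    P = phi(x, degree)
    centre, half_width = P @ c, np.abs(P) @ r
    return centre - half_width, centre + half_width

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 200)
    y = np.sin(3 * x) + 0.2 * (1 + x) * rng.normal(size=x.size)  # heteroscedastic noise
    c, r, deg = fit_ipm(x, y)
    lo, hi = predict_interval(np.linspace(-1, 1, 5), c, r, deg)
    print(np.round(lo, 2), np.round(hi, 2))
```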

Lacerda (2017) demonstrated that this approach can be extended to situations where the training data is interval valued rather than point valued. [7]

Non-convex interval predictor models

In Campi (2015) a non-convex theory of scenario optimization was proposed. [8] This involves measuring the number of support constraints, $s^*$, for the Interval Predictor Model after training and hence making predictions about the reliability of the model. This enables non-convex IPMs to be created, such as a single-layer neural network. Campi (2015) demonstrates an algorithm in which the scenario optimization program needs to be solved only a limited number of times to determine the reliability of the model at test time, without a prior evaluation on a validation set. [8] This is achieved by solving the optimisation program

$$\min_{h,\,u} \; u \quad \text{subject to} \quad \left| y_i - \hat{y}(x_i, h) \right| \le u, \quad i = 1, \ldots, N,$$

where $\hat{y}(x,h)$ is the interval predictor model center line and $2u$ is the model width. This results in an IPM which makes predictions with homoscedastic uncertainty.
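A minimal sketch of this idea follows, fitting a single-hidden-layer network as the center line by subgradient descent on the max-error objective and taking the resulting maximum residual as the constant half-width; the training loop and function names are illustrative and not taken from Campi (2015).

```python
# Sketch of a non-convex IPM with homoscedastic width: minimise the maximum
# absolute residual of a small neural network (illustrative only).
import numpy as np

def net(x, params):
    """Single hidden layer network: tanh features, linear output."""
    W1, b1, w2, b2 = params
    return np.tanh(np.outer(x, W1) + b1) @ w2 + b2

def fit_max_error(x, y, hidden=10, steps=5000, lr=0.02, seed=0):
    """Subgradient descent on the max-error loss max_i |y_i - net(x_i)|."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=hidden)
    b1 = rng.normal(size=hidden)
    w2 = 0.1 * rng.normal(size=hidden)
    b2 = 0.0
    for _ in range(steps):
        pred = np.tanh(np.outer(x, W1) + b1) @ w2 + b2
        res = y - pred
        i = int(np.argmax(np.abs(res)))      # worst-fit (support-like) data point
        s = -np.sign(res[i])                 # d|res_i| / d pred_i
        h = np.tanh(x[i] * W1 + b1)          # hidden activations at x_i
        # Backpropagate through that single point only (a subgradient of the max).
        d_w2 = s * h
        d_b2 = s
        d_pre = s * w2 * (1.0 - h ** 2)
        W1 -= lr * d_pre * x[i]
        b1 -= lr * d_pre
        w2 -= lr * d_w2
        b2 -= lr * d_b2
    params = (W1, b1, w2, b2)
    u = np.max(np.abs(y - net(x, params)))   # constant interval half-width
    return params, u

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.uniform(-1.0, 1.0, 100)
    y = np.sin(3.0 * x) + 0.1 * rng.normal(size=x.size)
    params, u = fit_max_error(x, y)
    lower, upper = net(x, params) - u, net(x, params) + u   # homoscedastic IPM
    print(f"half-width u = {u:.3f}")
```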

Sadeghi (2019) demonstrates that the non-convex scenario approach from Campi (2015) can be extended to train deeper neural networks which predict intervals with heteroscedastic uncertainty on datasets with imprecision. [9] This is achieved by proposing generalizations to the max-error loss function given by

$$\max_{i} \left| y_i - \hat{y}(x_i, h) \right|,$$

whose minimization over $h$ is equivalent to solving the optimisation program proposed by Campi (2015).

Applications

Initially, scenario optimization was applied to robust control problems. [10]

Crespo (2016) and Crespo (2021) applied Interval Predictor Models to the design of space radiation shielding [11] and to system identification, [12] respectively.

In Patelli (2017), Crespo (2018), and Faes (2019), Interval Predictor Models were applied to the structural reliability analysis problem. [13] [5] [14] Brandt (2017) applies interval predictor models to the fatigue damage estimation of offshore wind turbine jacket substructures. [15]

Garatti (2019) proved that Chebyshev layers (i.e., the minimax layers around functions fitted by linear $\ell_\infty$-regression) belong to a particular class of Interval Predictor Models, for which the reliability is invariant with respect to the distribution of the data. [16]


Software implementations

PyIPM provides an open-source Python implementation of the work of Crespo (2015). [17] OpenCOSSAN provides a Matlab implementation of the work of Crespo (2015). [13]

References

  1. Crespo, Luis G.; Kenny, Sean P.; Colbert, Brendon K.; Slagel, Tanner (2021). "Interval Predictor Models for Robust System Identification". 2021 60th IEEE Conference on Decision and Control (CDC). pp. 872–879. doi:10.1109/CDC45484.2021.9683582. ISBN   978-1-6654-3659-5. S2CID   246479771.
  2. Crespo, Luis; Colbert, Brendon; Kenny, Sean; Giesy, Daniel (2019). "On the quantification of aleatory and epistemic uncertainty using Sliced-Normal distributions". Systems and Control Letters. 34: 104560. doi:10.1016/j.sysconle.2019.104560. S2CID   209339118.
  3. Crespo, Luis G.; Colbert, Brendon K.; Slagel, Tanner; Kenny, Sean P. (2021). "Robust Estimation of Sliced-Exponential Distributions". 2021 60th IEEE Conference on Decision and Control (CDC). pp. 6742–6748. doi:10.1109/CDC45484.2021.9683584. ISBN   978-1-6654-3659-5. S2CID   246476974.
  4. Campi, M.C.; Calafiore, G.; Garatti, S. (2009). "Interval predictor models: Identification and reliability". Automatica. 45 (2): 382–392. doi:10.1016/j.automatica.2008.09.004. ISSN   0005-1098.
  5. Crespo, Luis G.; Kenny, Sean P.; Giesy, Daniel P. (2018). "Staircase predictor models for reliability and risk analysis". Structural Safety. 75: 35–44. doi:10.1016/j.strusafe.2018.05.002. ISSN   0167-4730. S2CID   126167977.
  6. Crespo, Luis G.; Kenny, Sean P.; Giesy, Daniel P. (2016). "Interval Predictor Models With a Linear Parameter Dependency". Journal of Verification, Validation and Uncertainty Quantification. 1 (2): 021007. doi:10.1115/1.4032070. ISSN   2377-2158.
  7. Lacerda, Marcio J.; Crespo, Luis G. (2017). "Interval predictor models for data with measurement uncertainty". 2017 American Control Conference (ACC). pp. 1487–1492. doi:10.23919/ACC.2017.7963163. hdl:2060/20170005690. ISBN   978-1-5090-5992-8. S2CID   3713493.
  8. Campi, Marco C.; Garatti, Simone; Ramponi, Federico A. (2015). "Non-convex scenario optimization with application to system identification". 2015 54th IEEE Conference on Decision and Control (CDC). pp. 4023–4028. doi:10.1109/CDC.2015.7402845. ISBN   978-1-4799-7886-1. S2CID   127406.
  9. Sadeghi, Jonathan C.; De Angelis, Marco; Patelli, Edoardo (2019). "Efficient Training of Interval Neural Networks for Imprecise Training Data". Neural Networks. 118: 338–351. doi:10.1016/j.neunet.2019.07.005. PMID   31369950. S2CID   199383010.
  10. Campi, Marco C.; Garatti, Simone; Prandini, Maria (2009). "The scenario approach for systems and control design". Annual Reviews in Control. 33 (2): 149–157. doi:10.1016/j.arcontrol.2009.07.001. ISSN   1367-5788.
  11. Crespo, Luis G.; Kenny, Sean P.; Giesy, Daniel P.; Norman, Ryan B.; Blattnig, Steve (2016). "Application of Interval Predictor Models to Space Radiation Shielding". 18th AIAA Non-Deterministic Approaches Conference. doi:10.2514/6.2016-0431. hdl:2060/20160007750. ISBN   978-1-62410-397-1. S2CID   124192684.
  12. Crespo, Luis G.; Kenny, Sean P.; Colbert, Brendon K.; Slagel, Tanner (2021). "Interval Predictor Models for Robust System Identification". 2021 60th IEEE Conference on Decision and Control (CDC). pp. 872–879. doi:10.1109/CDC45484.2021.9683582. ISBN   978-1-6654-3659-5. S2CID   246479771.
  13. Patelli, Edoardo; Broggi, Matteo; Tolo, Silvia; Sadeghi, Jonathan (2017). "Cossan Software: A Multidisciplinary and Collaborative Software for Uncertainty Quantification". Proceedings of the 2nd International Conference on Uncertainty Quantification in Computational Sciences and Engineering (UNCECOMP 2017). pp. 212–224. doi:10.7712/120217.5364.16982. ISBN   978-618-82844-4-9.
  14. Faes, Matthias; Sadeghi, Jonathan; Broggi, Matteo; De Angelis, Marco; Patelli, Edoardo; Beer, Michael; Moens, David (2019). "On the robust estimation of small failure probabilities for strong non-linear models". ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part B: Mechanical Engineering. 5 (4). doi:10.1115/1.4044044. ISSN   2332-9017. S2CID   197472507.
  15. Brandt, Sebastian; Broggi, Matteo; Hafele, Jan; Guillermo Gebhardt, Cristian; Rolfes, Raimund; Beer, Michael (2017). "Meta-models for fatigue damage estimation of offshore wind turbines jacket substructures". Procedia Engineering. 199: 1158–1163. doi:10.1016/j.proeng.2017.09.292. ISSN   1877-7058.
  16. Garatti, S.; Campi, M.C.; Carè, A. (2019). "On a class of Interval Predictor Models with universal reliability". Automatica. 110: 108542. doi:10.1016/j.automatica.2019.108542. hdl:11311/1121161. ISSN   0005-1098. S2CID   204188183.
  17. Sadeghi, Jonathan (2019). "PyIPM". doi:10.5281/zenodo.2784750.