Grey box model

In mathematics, statistics, and computational modelling, a grey box model [1] [2] [3] [4] combines a partial theoretical structure with data to complete the model. The theoretical structure may vary from information on the smoothness of results to models that need only parameter values from data or existing literature. [5] Thus, almost all models are grey box models, as opposed to black box models, where no model form is assumed, or white box models, which are purely theoretical. Some models assume a special form, such as a linear regression [6] [7] or neural network. [8] [9] These have special analysis methods; in particular, linear regression techniques [10] are much more efficient than most non-linear techniques. [11] [12] The model can be deterministic or stochastic (i.e. containing random components) depending on its planned use.

Model form

The general case is a non-linear model with a partial theoretical structure and some unknown parts derived from data. Models with differing theoretical structures need to be evaluated individually, [1] [13] [14] possibly using simulated annealing or genetic algorithms.

Within a particular model structure, parameters [14] [15] or variable parameter relations [5] [16] may need to be found. For a particular structure it is assumed, somewhat arbitrarily, that the data consist of sets of feed vectors f, product vectors p, and operating condition vectors c. [5] Typically c will contain values extracted from f, as well as other values. In many cases a model can be converted to a function of the form: [5] [17] [18]

m(f,p,q)

where the vector function m gives the errors between the data p and the model predictions, and the vector q contains the variable parameters that form the model's unknown parts.

The parameters q vary with the operating conditions c in a manner to be determined. [5] [17] This relation can be specified as q = Ac, where A is a matrix of unknown coefficients and c, as in linear regression, [6] [7] includes a constant term and possibly transformed values of the original operating conditions to obtain non-linear relations [19] [20] between the original operating conditions and q. It is then a matter of selecting which terms in A are non-zero and assigning their values. The model completion becomes an optimization problem to determine the non-zero values in A that minimize the error terms m(f,p,Ac) over the data. [1] [16] [21] [22] [23]
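
As a concrete illustration, the sketch below sets up this optimization for a hypothetical grey-box model: the theoretical part is a simple first-order recovery expression p = f(1 − exp(−q)), the unknown rate q is written as q = Ac, and the coefficients of A are fitted by minimizing the stacked error terms m(f,p,Ac) over synthetic data. The model equation, variable names, and data are assumptions invented for the example, not taken from the cited references.

    # Minimal sketch of grey-box model completion (illustrative assumptions only).
    # Theoretical part: predicted product p_hat = f * (1 - exp(-q)), with the
    # unknown rate q tied to the operating conditions c through q = A c.
    import numpy as np
    from scipy.optimize import least_squares

    rng = np.random.default_rng(0)

    # Synthetic data sets: feed f, operating conditions c (with constant term), product p.
    n_sets = 50
    f = rng.uniform(1.0, 10.0, n_sets)
    temp = rng.uniform(300.0, 400.0, n_sets)               # one operating condition
    c = np.column_stack([np.ones(n_sets), temp / 100.0])   # constant term plus scaled condition
    q_true = c @ np.array([0.2, 0.5])                      # "unknown" relation used to fake data
    p = f * (1.0 - np.exp(-q_true)) + rng.normal(0.0, 0.05, n_sets)

    def model_errors(A_flat, f, p, c):
        """Error terms m(f, p, A c): data minus model prediction for each data set."""
        q = c @ A_flat                      # q = A c (a single q, so A is a coefficient vector)
        p_hat = f * (1.0 - np.exp(-q))      # partial theoretical structure
        return p - p_hat

    # Determine the (assumed non-zero) coefficients of A by non-linear least squares.
    A0 = np.zeros(c.shape[1])
    fit = least_squares(model_errors, A0, args=(f, p, c))
    print("estimated A:", fit.x)            # close to the [0.2, 0.5] used to generate the data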

Model completion

Once the non-zero terms in A have been selected, the corresponding coefficients can be determined by minimizing m(f,p,Ac) over the data with respect to those non-zero values, typically by non-linear least squares. Selection of the non-zero terms can be done by optimization methods such as simulated annealing and evolutionary algorithms. The non-linear least squares fit can also provide accuracy estimates [11] [15] for the elements of A that can be used to determine whether they are significantly different from zero, thus providing a method of term selection. [24] [25]
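
A hedged sketch of this accuracy-estimate route to term selection follows: after a non-linear least squares fit, approximate standard errors for the elements of A are taken from the fit Jacobian, and coefficients that are small relative to their standard errors become candidates for removal. The helper name and the rough |ratio| > 2 rule of thumb are assumptions for illustration, not prescriptions from the cited references.

    # Sketch: approximate standard errors for the fitted elements of A from the
    # Jacobian returned by a non-linear least squares fit (illustrative only).
    import numpy as np
    from scipy.optimize import least_squares

    def term_significance(residual_fun, A0, *data):
        """Fit A, then report each coefficient with an approximate t-like ratio."""
        fit = least_squares(residual_fun, A0, args=data)
        r, J = fit.fun, fit.jac
        dof = max(r.size - A0.size, 1)
        s2 = (r @ r) / dof                      # residual variance estimate
        cov = s2 * np.linalg.inv(J.T @ J)       # approximate covariance of the coefficients
        se = np.sqrt(np.diag(cov))
        return fit.x, se, fit.x / se            # estimates, standard errors, ratios

    # Usage with the illustrative model_errors(A, f, p, c) defined earlier:
    # A_hat, se, ratio = term_significance(model_errors, np.zeros(c.shape[1]), f, p, c)
    # Coefficients with |ratio| well below about 2 are candidates to be set to zero.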

It is sometimes possible to calculate values of q for each data set, directly or by non-linear least squares. The more efficient linear regression can then be used to predict q from c, thus selecting the non-zero values in A and estimating their values. Once the non-zero values are located, non-linear least squares can be applied to the original model m(f,p,Ac) to refine these values. [16] [21] [22]
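
A minimal sketch of this two-stage route, under the same invented model p = f(1 − exp(−q)): q is first solved for each data set by inverting the model directly, and the resulting values are regressed on the operating conditions c by ordinary linear least squares to locate and estimate the non-zero entries of A. Both helper names are assumptions made for the example.

    # Sketch: estimate q for each data set, then linear regression of q on c
    # to locate and estimate the non-zero coefficients of A (illustrative model).
    import numpy as np

    def estimate_q_per_set(f, p):
        """Invert the assumed model p = f*(1 - exp(-q)) to get one q per data set."""
        ratio = np.clip(1.0 - p / f, 1e-9, None)   # guard against log of non-positive values
        return -np.log(ratio)

    def fit_A_linear(q, c):
        """Ordinary least squares for q = A c; near-zero coefficients suggest terms to drop."""
        A_hat, *_ = np.linalg.lstsq(c, q, rcond=None)
        return A_hat

    # With the synthetic f, p, c from the earlier sketch:
    # q_est = estimate_q_per_set(f, p)
    # A_hat = fit_A_linear(q_est, c)   # then refine with non-linear least squares on m(f, p, A c)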

A third method is model inversion, [5] [17] [18] which converts the non-linear m(f,p,Ac) into an approximate linear form in the elements of A that can be examined using efficient term selection [24] [25] and evaluation of the linear regression. [10] Consider the simple case of a single q value (q = aᵀc) and an estimate q* of q. Putting dq = aᵀc − q* gives

m(f,p,aᵀc) = m(f,p,q* + dq) ≈ m(f,p,q*) + dq m′(f,p,q*) = m(f,p,q*) + (aᵀc − q*) m′(f,p,q*)

so that aᵀ is now in a linear position with all other terms known, and thus can be analyzed by linear regression techniques. For more than one parameter the method extends in a direct manner. [5] [17] [18] After checking that the model has been improved, this process can be repeated until convergence. This approach has the advantages that it does not require the parameters q to be determined from an individual data set and that the linear regression operates on the original error terms. [5]
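
The sketch below spells out one iteration of this inversion for the same invented model: starting from an estimate q* for each data set, the error terms are expanded to first order in dq = aᵀc − q*, which leaves the coefficients a in a linear position that ordinary linear least squares can estimate; the step is repeated until the coefficients stop changing. The numerical derivative, step size, and iteration count are assumptions made for illustration.

    # Sketch: one model-inversion step that puts a (in q = a^T c) into a linear position.
    import numpy as np

    def m(f, p, q):
        """Error terms of the assumed model p = f*(1 - exp(-q))."""
        return p - f * (1.0 - np.exp(-q))

    def inversion_step(f, p, c, q_star, eps=1e-6):
        """Linearise m about q_star and solve for a by linear least squares."""
        m0 = m(f, p, q_star)
        m_prime = (m(f, p, q_star + eps) - m0) / eps     # numerical dm/dq for each data set
        X = m_prime[:, None] * c                          # each row is m'_i * c_i
        y = q_star * m_prime - m0                         # known terms moved to the right-hand side
        a, *_ = np.linalg.lstsq(X, y, rcond=None)
        return a

    # Repeated use with the synthetic f, p, c from the earlier sketches:
    # a = np.array([0.1, 0.1])                 # initial guess
    # for _ in range(10):
    #     a = inversion_step(f, p, c, c @ a)   # q* = a^T c from the previous iterate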

Model validation

Where sufficient data is available, division of the data into a separate model construction set and one or two evaluation sets is recommended. This can be repeated using multiple selections of the construction set and the resulting models averaged or used to evaluate prediction differences.
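
A short sketch of the recommended data division, assuming the data sets are indexed 0..n−1: each repeat draws a random construction set and an evaluation set, the model is completed on the former, and the latter gives out-of-sample errors. The held-out fraction and the number of repeats are arbitrary choices made for the example.

    # Sketch: repeated construction/evaluation splits for model validation.
    import numpy as np

    def repeated_splits(n_data, n_repeats=5, eval_fraction=0.3, seed=0):
        """Yield (construction, evaluation) index arrays for several random splits."""
        rng = np.random.default_rng(seed)
        n_eval = int(round(eval_fraction * n_data))
        for _ in range(n_repeats):
            order = rng.permutation(n_data)
            yield order[n_eval:], order[:n_eval]

    # Each construction set is used to fit A; the corresponding evaluation set gives
    # out-of-sample errors, and the resulting models can be averaged or compared.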

A statistical test such as chi-squared on the residuals is not particularly useful. [26] The chi-squared test requires known standard deviations, which are seldom available, and failed tests give no indication of how to improve the model. [11] A range of methods exists for comparing both nested and non-nested models; these include comparison of model predictions with repeated data.

An attempt to predict the residuals m(f,p,Ac) from the operating conditions c using linear regression will show whether the residuals can be predicted. [21] [22] Residuals that cannot be predicted offer little prospect of improving the model using the current operating conditions. [5] Terms that do predict the residuals are prospective terms to incorporate into the model to improve its performance. [21]
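
A minimal sketch of this residual check, assuming the fitted model's error terms and the operating conditions c are available as arrays: the residuals are regressed on c, and the fraction of their variance explained is reported; condition terms that predict the residuals are candidates for inclusion in the model. The function name and the R² reading are assumptions made for the example.

    # Sketch: can the operating conditions c predict the residuals of the fitted model?
    import numpy as np

    def residual_check(residuals, c):
        """Linear regression of the residuals on c; returns coefficients and R^2."""
        coef, *_ = np.linalg.lstsq(c, residuals, rcond=None)
        predicted = c @ coef
        ss_res = np.sum((residuals - predicted) ** 2)
        ss_tot = np.sum((residuals - residuals.mean()) ** 2)
        return coef, 1.0 - ss_res / ss_tot

    # Usage with the earlier sketches:
    # coef, r2 = residual_check(model_errors(fit.x, f, p, c), c)
    # An R^2 near zero suggests c offers little scope for improving the current model.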

The model inversion technique above can be used as a method of determining whether a model can be improved. In this case, selection of non-zero terms is less important, and linear prediction can be done using the significant eigenvectors of the regression matrix. The values in A determined in this manner need to be substituted into the non-linear model to assess improvements in the model errors. The absence of a significant improvement indicates that the available data are not able to improve the current model form using the defined parameters. [5] Extra parameters can be inserted into the model to make this test more comprehensive.
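
One way to read the "significant eigenvectors" step is as a truncated singular value decomposition of the linearised regression matrix, which avoids committing to a particular set of non-zero terms; the sketch below shows that reading, and the resulting coefficients would still need to be substituted back into the non-linear model to check that the errors actually improve. The SVD interpretation and the relative tolerance are assumptions, not taken from the cited references.

    # Sketch: solve the linearised regression using only its significant singular directions.
    import numpy as np

    def truncated_solve(X, y, rel_tol=1e-3):
        """Least squares for X a = y, keeping singular values above rel_tol * largest."""
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        keep = s > rel_tol * s[0]
        a = Vt[keep].T @ ((U[:, keep].T @ y) / s[keep])
        return a, int(keep.sum())

    # Used in place of np.linalg.lstsq inside the inversion step above; the returned a
    # must still be substituted into m(f, p, a^T c) to check that the model errors improve.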

Related Research Articles

Least squares: Approximation method in statistics

The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems by minimizing the sum of the squares of the residuals made in the results of every single equation.

In the statistical analysis of time series, autoregressive–moving-average (ARMA) models provide a parsimonious description of a (weakly) stationary stochastic process in terms of two polynomials, one for the autoregression (AR) and the second for the moving average (MA). The general ARMA model was described in the 1951 thesis of Peter Whittle, Hypothesis testing in time series analysis, and it was popularized in the 1970 book by George E. P. Box and Gwilym Jenkins.

Regression analysis: Set of statistical processes for estimating the relationships among variables

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable and one or more independent variables. The most common form of regression analysis is linear regression, in which one finds the line that most closely fits the data according to a specific mathematical criterion. For example, the method of ordinary least squares computes the unique line that minimizes the sum of squared differences between the true data and that line. For specific mathematical reasons, this allows the researcher to estimate the conditional expectation of the dependent variable when the independent variables take on a given set of values. Less common forms of regression use slightly different procedures to estimate alternative location parameters or estimate the conditional expectation across a broader collection of non-linear models.

Tikhonov regularization: Regularization technique for ill-posed problems

Tikhonov regularization, named for Andrey Tikhonov, is a method of regularization of ill-posed problems. Ridge regression is a special case of Tikhonov regularization in which all parameters are regularized equally. Ridge regression is particularly useful to mitigate the problem of multicollinearity in linear regression, which commonly occurs in models with large numbers of parameters. In general, the method provides improved efficiency in parameter estimation problems in exchange for a tolerable amount of bias.

Nonlinear regression

In statistics, nonlinear regression is a form of regression analysis in which observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables. The data are fitted by a method of successive approximations.

Coefficient of determination

In statistics, the coefficient of determination, denoted R2 or r2 and pronounced "R squared", is the proportion of the variance in the dependent variable that is predictable from the independent variable(s).

In statistics and econometrics, and in particular in time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. Both of these models are fitted to time series data either to better understand the data or to predict future points in the series (forecasting). ARIMA models are applied in some cases where data show evidence of non-stationarity in the sense of mean, where an initial differencing step can be applied one or more times to eliminate the non-stationarity of the mean function. When the seasonality shows in a time series, the seasonal-differencing could be applied to eliminate the seasonal component. Since the ARMA model, according to the Wold's decomposition theorem, is theoretically sufficient to describe a regular wide-sense stationary time series, we are motivated to make stationary a non-stationary time series, e.g., by using differencing, before we can use the ARMA model. Note that if the time series contains a predictable sub-process, the predictable component is treated as a non-zero-mean but periodic component in the ARIMA framework so that it is eliminated by the seasonal differencing.

Ordinary least squares: Method for estimating the unknown parameters in a linear regression model

In statistics, ordinary least squares (OLS) is a type of linear least squares method for estimating the unknown parameters in a linear regression model. OLS chooses the parameters of a linear function of a set of explanatory variables by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable in the given dataset and those predicted by the linear function of the independent variable.

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.

In time series analysis, the Box–Jenkins method, named after the statisticians George Box and Gwilym Jenkins, applies autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA) models to find the best fit of a time-series model to past values of a time series.

In statistics, the Bayesian information criterion (BIC) or Schwarz information criterion is a criterion for model selection among a finite set of models; the model with the lowest BIC is preferred. It is based, in part, on the likelihood function and it is closely related to the Akaike information criterion (AIC).

Poisson regression

In statistics, Poisson regression is a generalized linear model form of regression analysis used to model count data and contingency tables. Poisson regression assumes the response variable Y has a Poisson distribution, and assumes the logarithm of its expected value can be modeled by a linear combination of unknown parameters. A Poisson regression model is sometimes known as a log-linear model, especially when used to model contingency tables.

In statistics, a generalized additive model (GAM) is a generalized linear model in which the linear response variable depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions. GAMs were originally developed by Trevor Hastie and Robert Tibshirani to blend properties of generalized linear models with additive models.

Spike-triggered average

The spike-triggered average (STA) is a tool for characterizing the response properties of a neuron using the spikes emitted in response to a time-varying stimulus. The STA provides an estimate of a neuron's linear receptive field. It is a useful technique for the analysis of electrophysiological data.

Non-linear least squares

Non-linear least squares is the form of least squares analysis used to fit a set of m observations with a model that is non-linear in n unknown parameters (m ≥ n). It is used in some forms of nonlinear regression. The basis of the method is to approximate the model by a linear one and to refine the parameters by successive iterations. There are many similarities to linear least squares, but also some significant differences. In economic theory, the non-linear least squares method is applied in (i) the probit regression, (ii) threshold regression, (iii) smooth regression, (iv) logistic link regression, (v) Box-Cox transformed regressors.

In statistics, multivariate adaptive regression splines (MARS) is a form of regression analysis introduced by Jerome H. Friedman in 1991. It is a non-parametric regression technique and can be seen as an extension of linear models that automatically models nonlinearities and interactions between variables.

Polynomial regression

In statistics, polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modelled as an nth degree polynomial in x. Polynomial regression fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y |x). Although polynomial regression fits a nonlinear model to the data, as a statistical estimation problem it is linear, in the sense that the regression function E(y | x) is linear in the unknown parameters that are estimated from the data. For this reason, polynomial regression is considered to be a special case of multiple linear regression.

Linear least squares

Linear least squares (LLS) is the least squares approximation of linear functions to data. It is a set of formulations for solving statistical problems involved in linear regression, including variants for ordinary (unweighted), weighted, and generalized (correlated) residuals. Numerical methods for linear least squares include inverting the matrix of the normal equations and orthogonal decomposition methods.

Linear seismic inversion: Interpretation of seismic data using linear model

Inverse modeling is a mathematical technique where the objective is to determine the physical properties of the subsurface of an earth region that has produced a given seismogram. Cooke and Schneider (1983) defined it as calculation of the earth's structure and physical parameters from some set of observed seismic data. The underlying assumption in this method is that the collected seismic data are from an earth structure that matches the cross-section computed from the inversion algorithm. Some common earth properties that are inverted for include acoustic velocity, formation and fluid densities, acoustic impedance, Poisson's ratio, formation compressibility, shear rigidity, porosity, and fluid saturation.

Linear regression: Statistical modeling method which shows linear correlation between variables

In statistics, linear regression is a linear approach to modelling the relationship between a scalar response and one or more explanatory variables. The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.

References

  1. Bohlin, Torsten P. (7 September 2006). Practical Grey-box Process Identification: Theory and Applications. Springer Science & Business Media. ISBN 978-1-84628-403-8.
  2. "Grey-box model estimation". Mathworks 2. 2012.
  3. Kroll, Andreas (2000). Grey-box models: Concepts and application. In: New Frontiers in Computational Intelligence and its Applications, vol. 57 of Frontiers in Artificial Intelligence and Applications, pp. 42–51. IOS Press, Amsterdam.
  4. Sohlberg, B., and Jacobsen, E.W., 2008. Grey box modelling - branches and experiences, Proc. 17th World Congress, Int. Federation of Automatic Control, Seoul. pp. 11415–11420.
  5. Whiten, B., 2013. Model completion and validation using inversion of grey box models, ANZIAM J., 54 (CTAC 2012) pp. C187–C199.
  6. Draper, Norman R.; Smith, Harry (25 August 2014). Applied Regression Analysis. John Wiley & Sons. pp. 657–. ISBN 978-1-118-62568-2.
  7. Weisberg, Sanford (25 November 2013). Applied Linear Regression. Wiley. ISBN 978-1-118-59485-8.
  8. Heaton, J., 2012. Introduction to the math of neural networks, Heaton Research Inc. (Chesterfield, MO). ISBN 978-1475190878.
  9. Stergiou, C.; Siganos, D. (2013). "Neural networks". Archived from the original on 2009-12-16. Retrieved 2013-07-03.
  10. Lawson, Charles L.; Hanson, Richard J. (1 December 1995). Solving Least Squares Problems. SIAM. ISBN 978-0-89871-356-5.
  11. Press, W.H.; Teukolsky, S.A.; Vetterling, W.T.; Flannery, B.P. (2007). Numerical Recipes (3rd ed.). Cambridge University Press. ISBN 978-0-521-88068-8.
  12. Gelman, Andrew; Carlin, John B.; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; Rubin, Donald B. (1 November 2013). Bayesian Data Analysis, Third Edition. CRC Press. ISBN 978-1-4398-4095-5.
  13. Mathworks, 2013. Supported grey box models.
  14. Hauth, J. (2008). Grey Box Modelling for Nonlinear Systems (PDF) (dissertation, Kaiserslautern University of Technology).
  15. Nash, J.C. and Walker-Smith, M., 1987. Nonlinear parameter estimation, Marcel Dekker, Inc. (New York).
  16. Whiten, W.J., 1971. Model building techniques applied to mineral treatment processes, Symp. on Automatic Control Systems in Mineral Processing Plants, (Australas. Inst. Min. Metall., S. Queensland Branch, Brisbane), 129–148.
  17. Whiten, W.J., 1994. Determination of parameter relations within non-linear models, SIGNUM Newsletter, 29(3–4), 2–5. doi:10.1145/192527.192535.
  18. Whiten, B., 2014. Determining the form of ordinary differential equations using model inversion, ANZIAM J., 55 (EMAC2013) pp. C329–C347.
  19. Polynomial
  20. Spline (mathematics)
  21. Kojovic, T., and Whiten, W.J., 1994. Evaluation of the quality of simulation models, Innovations in Mineral Processing, (Laurentian University, Sudbury) pp. 437–446. ISBN 088667025X.
  22. Kojovic, T., 1989. The development and application of Model - an automated model builder for mineral processing, PhD thesis, The University of Queensland.
  23. Xiao, J., 1998. Extensions of model building techniques and their applications in mineral processing, PhD thesis, The University of Queensland.
  24. Linhart, H.; Zucchini, W. (1986). Model Selection. Wiley. ISBN 978-0-471-83722-0.
  25. Miller, Alan (15 April 2002). Subset Selection in Regression. CRC Press. ISBN 978-1-4200-3593-3.
  26. Deming, William Edwards (2000). Out of the Crisis, p. 272. MIT Press. ISBN 978-0-262-54115-2.