Quantile Regression Averaging (QRA) is a forecast combination approach to the computation of prediction intervals. It involves applying quantile regression to the point forecasts of a small number of individual forecasting models or experts. It was introduced in 2014 by Jakub Nowotarski and Rafał Weron [1] and originally used for probabilistic forecasting of electricity prices [2][3] and loads.[4][5] Despite its simplicity, it has been found to perform extremely well in practice: the top two performing teams in the price track of the Global Energy Forecasting Competition (GEFCom2014) used variants of QRA.[6][7]
The individual point forecasts are used as independent variables and the corresponding observed target variable as the dependent variable in a standard quantile regression setting.[8] The Quantile Regression Averaging method yields an interval forecast of the target variable, but does not use the prediction intervals of the individual methods. One of the reasons for using point forecasts (and not interval forecasts) is their availability. For years, forecasters have focused on obtaining accurate point predictions. Computing probabilistic forecasts, on the other hand, is generally a much more complex task and has received far less attention in the literature and among practitioners. QRA is therefore particularly attractive from a practical point of view, as it allows forecasters to leverage existing expertise in point forecasting.
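The setting above amounts to stacking the point forecasts into a design matrix whose rows are time periods and whose columns are models. A minimal sketch with made-up numbers (the two models and all values are hypothetical, purely for illustration):

```python
import numpy as np

# Hypothetical example: 5 days of observed prices and point forecasts
# from two individual models (all numbers are invented for illustration).
y = np.array([42.1, 45.0, 39.8, 47.3, 44.2])    # observed target variable
f1 = np.array([41.5, 44.2, 40.9, 46.0, 43.8])   # model 1 point forecasts
f2 = np.array([43.0, 46.1, 38.7, 48.5, 45.0])   # model 2 point forecasts

# In QRA the point forecasts become the regressors and the observations
# the dependent variable; a column of ones adds an intercept.
X = np.column_stack([np.ones_like(y), f1, f2])
print(X.shape)  # (5, 3): one row per period, one column per regressor
```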
The quantile regression problem can be written as follows:

$$Q_{y_t}(q \mid X_t) = X_t \beta_q,$$

where $Q_{y_t}(q \mid X_t)$ is the conditional $q$-th quantile of the dependent variable ($y_t$), $X_t = [1, \hat{y}_{1,t}, \ldots, \hat{y}_{m,t}]$ is a vector of point forecasts of the $m$ individual models (i.e. the independent variables) and $\beta_q$ is a vector of parameters (for quantile $q$). The parameters are estimated by minimizing the loss function for a particular $q$-th quantile:

$$\hat{\beta}_q = \arg\min_{\beta_q} \sum_{t} \left( q - \mathbf{1}_{\{y_t < X_t \beta_q\}} \right) \left( y_t - X_t \beta_q \right).$$
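This minimization of the pinball loss can be cast as a linear program and solved exactly. The sketch below does this with SciPy on simulated data; the data, the `qra_fit` helper name, and the two-model setup are all illustrative assumptions, not the original authors' implementation:

```python
import numpy as np
from scipy.optimize import linprog

def qra_fit(X, y, q):
    """Estimate beta_q by minimizing the pinball loss, cast as the LP
    min q*sum(u) + (1-q)*sum(v)  s.t.  X b + u - v = y,  u, v >= 0."""
    n, m = X.shape
    c = np.concatenate([np.zeros(m), q * np.ones(n), (1 - q) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * m + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:m]                 # the fitted beta_q

rng = np.random.default_rng(0)
truth = rng.normal(50, 5, size=200)          # simulated observed prices
fc1 = truth + rng.normal(0, 2, size=200)     # simulated model 1 forecasts
fc2 = truth + rng.normal(1, 3, size=200)     # simulated model 2 (biased)
X = np.column_stack([np.ones(200), fc1, fc2])

lo = X @ qra_fit(X, truth, 0.05)   # 5th-percentile forecast
hi = X @ qra_fit(X, truth, 0.95)   # 95th-percentile forecast
coverage = np.mean((truth >= lo) & (truth <= hi))   # roughly 0.90 expected
```

Together the two quantile forecasts form a 90% prediction interval, which is exactly the interval forecast QRA delivers.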
QRA assigns weights to individual forecasting methods and combines them to yield forecasts of chosen quantiles. Although QRA is based on quantile regression rather than least squares, it suffers from the same problems: the exogenous variables should not be strongly correlated, and the number of variables included in the model has to be relatively small for the method to be computationally efficient.
The main difficulty associated with applying QRA comes from the fact that only individual models that perform well and (preferably) are distinct should be used. However, there may be many well performing models or many different specifications of each model (with or without exogenous variables, with all or only selected lags, etc.) and it may not be optimal to include all of them in Quantile Regression Averaging.
In Factor Quantile Regression Averaging (FQRA),[3] instead of selecting individual models a priori, the relevant information contained in all forecasting models at hand is extracted using principal component analysis (PCA). The prediction intervals are then constructed using the common factors ($\hat{f}_t$) obtained from the panel of point forecasts as independent variables in a quantile regression. More precisely, in the FQRA method $X_t$ is a vector of factors extracted from a panel of point forecasts of individual models, not a vector of the point forecasts themselves. A similar principal component-type approach was proposed in the context of obtaining point forecasts from the Survey of Professional Forecasters data.[9]
Instead of considering a (large) panel of forecasts of the individual models, FQRA concentrates on a small number of common factors, which, by construction, are orthogonal to each other and hence contemporaneously uncorrelated. FQRA can also be interpreted as a forecast averaging approach. The factors estimated within PCA are linear combinations of the individual forecast series in the panel, so FQRA can be used to assign weights to the forecasting models directly.
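The factor-extraction step can be sketched with an SVD-based PCA on a simulated panel of forecasts. The panel, the choice of two factors, and all numbers below are illustrative assumptions; the point is that the extracted factors come out uncorrelated by construction:

```python
import numpy as np

rng = np.random.default_rng(1)
truth = rng.normal(50, 5, size=300)
# Hypothetical panel of 8 correlated point forecasts (columns = models),
# each equal to the truth plus its own noise.
panel = truth[:, None] + rng.normal(0, 2, size=(300, 8))

# PCA via SVD of the centred panel: the leading columns of U*S are the
# common factors used as quantile regression inputs in FQRA.
centred = panel - panel.mean(axis=0)
U, S, Vt = np.linalg.svd(centred, full_matrices=False)
factors = U[:, :2] * S[:2]          # first two common factors

# Orthogonal columns of U make the factors uncorrelated.
corr = np.corrcoef(factors, rowvar=False)[0, 1]
print(abs(corr) < 1e-10)  # True
```

Because each factor is a fixed linear combination of the panel columns (given by the rows of `Vt`), the quantile regression coefficients on the factors translate directly into weights on the individual models.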
QRA may be viewed as an extension of combining point forecasts. The well-known ordinary least squares (OLS) averaging [10] uses linear regression to estimate weights of the point forecasts of individual models. Replacing the quadratic loss function with the absolute loss function leads to quantile regression for the median, or in other words, least absolute deviation (LAD) regression. [11]
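OLS averaging can be sketched in a few lines: regress the observations on the point forecasts and read the coefficients as combination weights. The simulated data and model names below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
truth = rng.normal(0, 1, size=500)
fc1 = truth + rng.normal(0, 0.5, size=500)   # accurate model
fc2 = truth + rng.normal(0, 2.0, size=500)   # noisy model

# OLS averaging: the least-squares coefficients act as forecast weights.
X = np.column_stack([np.ones(500), fc1, fc2])
weights, *_ = np.linalg.lstsq(X, truth, rcond=None)
print(weights[1] > weights[2])  # True: the accurate model gets more weight
```

Replacing the squared residuals in this fit with absolute residuals yields the LAD (median) regression mentioned above; applying it at other quantiles yields QRA.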