Developer(s) | U.S. Census Bureau |
---|---|
Stable release | 3.0 (Windows) / June 15, 2020 |
Repository | |
Operating system | Windows, Linux/Unix |
Type | Statistical software |
License | Public domain [1] (in the US; and copyright granted elsewhere) [2] |
Website | www |
X-13ARIMA-SEATS, successor to X-12-ARIMA and X-11, is a set of statistical methods for seasonal adjustment and other descriptive analysis of time series data that are implemented in the U.S. Census Bureau's software package. [3] These methods are or have been used by Statistics Canada, Australian Bureau of Statistics, and the statistical offices of many other countries. [4] [5]
X-12-ARIMA can be used together with many statistical packages, such as SAS through its Econometrics and Time Series (ETS) package, R through its seasonal package, [6] Gretl or EViews, which provide a graphical user interface for X-12-ARIMA, and NumXL, which makes X-12-ARIMA functionality available in Microsoft Excel. [7] There is also a version for Matlab. [8]
Notable statistical agencies presently using X-12-ARIMA for seasonal adjustment include Statistics Canada, [9] the U.S. Bureau of Labor Statistics [10] and the Census and Statistics Department (Hong Kong). [11] The Brazilian Institute of Geography and Statistics uses X-13-ARIMA. [12]
X-12-ARIMA was the successor to X-11-ARIMA; the current version is X-13ARIMA-SEATS. [13]
The source code of X-13ARIMA-SEATS can be found on the Census Bureau's website. [1]
The default method for seasonal adjustment is based on the X-11 algorithm. It is assumed that the observations in a time series, $Y_t$, can be decomposed additively,

$$Y_t = T_t + S_t + I_t$$

or multiplicatively,

$$Y_t = T_t \times S_t \times I_t$$

In this decomposition, $T_t$ is the trend (or the "trend cycle" because it also includes cyclical movements such as business cycles) component, $S_t$ is the seasonal component, and $I_t$ is the irregular (or random) component. The goal is to estimate each of the three components and then remove the seasonal component from the time series, producing a seasonally adjusted time series. [14]
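As a short worked step relating the two forms: writing the observation as $Y_t$ and the trend, seasonal and irregular components as $T_t$, $S_t$ and $I_t$, and assuming all of them are strictly positive, taking logarithms turns the multiplicative decomposition into an additive one. This is only an illustration of how the forms are connected, not a description of how the software processes multiplicative models internally.

```latex
Y_t = T_t \times S_t \times I_t
\quad\Longrightarrow\quad
\log Y_t = \log T_t + \log S_t + \log I_t
```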
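The decomposition is accomplished through the iterative application of centered moving averages. For an additive decomposition of a monthly time series, for example, each iteration estimates the trend with a moving average, subtracts it from the data to leave the seasonal and irregular components, and then smooths the result with further moving averages to estimate the seasonal component.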
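One pass of an additive decomposition built on centered moving averages can be sketched as follows. This is a simplified, classical-style illustration and not the X-11 algorithm itself, which iterates and uses more refined seasonal and trend filters. The function names, the synthetic series and the use of plain per-month averages for the seasonal component are assumptions made for the example.

```python
from statistics import mean


def centered_moving_average(y, period=12):
    """Centered 2 x period moving average for an even period.

    Returns a list the same length as y, with None where the window
    does not fit (the first and last period // 2 points).
    """
    half = period // 2
    trend = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half : t + half + 1]  # period + 1 observations
        # Half weight on the two end points, full weight on the interior.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


def additive_decompose(y, period=12):
    """One pass: estimate trend, remove it, average by season, adjust."""
    trend = centered_moving_average(y, period)
    detrended = [None if tr is None else obs - tr for obs, tr in zip(y, trend)]

    # Average the detrended values for each position in the seasonal cycle.
    raw = []
    for m in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == m]
        raw.append(mean(vals))

    # Center the seasonal pattern so that it sums to zero over one cycle.
    centre = mean(raw)
    pattern_hat = [s - centre for s in raw]

    seasonal = [pattern_hat[i % period] for i in range(len(y))]
    adjusted = [obs - s for obs, s in zip(y, seasonal)]
    return trend, seasonal, adjusted


# Hypothetical monthly series: linear trend plus a fixed seasonal pattern.
pattern = [5, 3, 0, -2, -4, -6, -5, -1, 1, 2, 3, 4]  # sums to zero
series = [100 + 0.5 * t + pattern[t % 12] for t in range(48)]

trend, seasonal, adjusted = additive_decompose(series)
print(adjusted[:3])
```

On this noise-free synthetic series the seasonally adjusted values recover the underlying linear trend. Real data also contain an irregular component, which is why the production method iterates and applies additional smoothing.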
The method also includes a number of tests, diagnostics and other statistics for evaluating the quality of the seasonal adjustments.
The software is a work of the U.S. government and is therefore in the public domain in the United States; for other countries, copyright has also been granted for this software, with the condition that the "User agrees to make a good faith effort to use the Software in a way that does not cause damage, harm, or embarrassment to the United States/Commerce." [2]
Principal component analysis (PCA) is a popular technique for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of data while preserving the maximum amount of information, and enabling the visualization of multidimensional data. Formally, PCA is a statistical technique for reducing the dimensionality of a dataset. This is accomplished by linearly transforming the data into a new coordinate system where the variation in the data can be described with fewer dimensions than the initial data. Many studies use the first two principal components in order to plot the data in two dimensions and to visually identify clusters of closely related data points. Principal component analysis has applications in many fields such as population genetics, microbiome studies, and atmospheric science.
A depth buffer, also known as a z-buffer, is a type of data buffer used in computer graphics to represent depth information of objects in 3D space from a particular perspective. Depth buffers are an aid to rendering a scene to ensure that the correct polygons properly occlude other polygons. Z-buffering was first described in 1974 by Wolfgang Straßer in his PhD thesis on fast algorithms for rendering occluded objects. A similar solution to determining overlapping polygons is the painter's algorithm, which is capable of handling non-opaque scene elements, though at the cost of efficiency and of sometimes incorrect results.
In statistics, Spearman's rank correlation coefficient or Spearman's ρ, named after Charles Spearman and often denoted by the Greek letter $\rho$ (rho) or as $r_s$, is a nonparametric measure of rank correlation. It assesses how well the relationship between two variables can be described using a monotonic function.
In mathematics, a time series is a series of data points indexed in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.
In the statistical analysis of time series, autoregressive–moving-average (ARMA) models provide a parsimonious description of a (weakly) stationary stochastic process in terms of two polynomials, one for the autoregression (AR) and the second for the moving average (MA). The general ARMA model was described in the 1951 thesis of Peter Whittle, Hypothesis testing in time series analysis, and it was popularized in the 1970 book by George E. P. Box and Gwilym Jenkins.
In statistics, a moving average is a calculation to analyze data points by creating a series of averages of different selections of the full data set. It is also called a moving mean (MM) or rolling mean and is a type of finite impulse response filter. Variations include: simple, cumulative, or weighted forms.
The Hodrick–Prescott filter is a mathematical tool used in macroeconomics, especially in real business cycle theory, to remove the cyclical component of a time series from raw data. It is used to obtain a smoothed-curve representation of a time series, one that is more sensitive to long-term than to short-term fluctuations. The adjustment of the sensitivity of the trend to short-term fluctuations is achieved by modifying a multiplier $\lambda$.
In statistics and econometrics, and in particular in time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. Both models are fitted to time series data, either to better understand the data or to forecast future points in the series. ARIMA models are applied in some cases where the data show evidence of non-stationarity in the mean, where an initial differencing step can be applied one or more times to eliminate the non-stationarity of the mean function. When a time series shows seasonality, seasonal differencing can be applied to eliminate the seasonal component. Since the ARMA model, according to Wold's decomposition theorem, is theoretically sufficient to describe a regular wide-sense stationary time series, there is motivation to make a non-stationary time series stationary, e.g., by differencing, before applying the ARMA model. Note that if the time series contains a predictable sub-process, the predictable component is treated as a non-zero-mean but periodic component in the ARIMA framework so that it is eliminated by the seasonal differencing.
Exponential smoothing is a rule of thumb technique for smoothing time series data using the exponential window function. Whereas in the simple moving average the past observations are weighted equally, exponential functions are used to assign exponentially decreasing weights over time. It is an easily learned and easily applied procedure for making some determination based on prior assumptions by the user, such as seasonality. Exponential smoothing is often used for analysis of time-series data.
In time series analysis, the Box–Jenkins method, named after the statisticians George Box and Gwilym Jenkins, applies autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA) models to find the best fit of a time-series model to past values of a time series.
In statistics, the Kendall rank correlation coefficient, commonly referred to as Kendall's τ coefficient, is a statistic used to measure the ordinal association between two measured quantities. A τ test is a non-parametric hypothesis test for statistical dependence based on the τ coefficient. It is a measure of rank correlation: the similarity of the orderings of the data when ranked by each of the quantities. It is named after Maurice Kendall, who developed it in 1938, though Gustav Fechner had proposed a similar measure in the context of time series in 1897.
The Berlin procedure (BV) is a mathematical procedure for time series decomposition and seasonal adjustment of monthly and quarterly economic time series. The mathematical foundations of the procedure were developed in the 1960s at the Technical University of Berlin and the German Institute for Economic Research (DIW). The most important user of the procedure is the Federal Statistical Office of Germany.
Seasonal adjustment or deseasonalization is a statistical method for removing the seasonal component of a time series. It is usually done in order to analyse the trend, and cyclical deviations from trend, of a time series independently of the seasonal components. Many economic phenomena have seasonal cycles, such as agricultural production and consumer consumption. It is necessary to adjust for this component in order to understand underlying trends in the economy, so official statistics are often adjusted to remove seasonal components. Typically, seasonally adjusted data is reported for unemployment rates to reveal the underlying trends and cycles in labor markets.
The decomposition of time series is a statistical task that deconstructs a time series into several components, each representing one of the underlying categories of patterns. There are two principal types of decomposition, which are outlined below.
DEVS, short for Discrete Event System Specification, is a modular and hierarchical formalism for modeling and analyzing general systems. These can be discrete event systems, which might be described by state transition tables; continuous state systems, which might be described by differential equations; or hybrid continuous state and discrete event systems. DEVS is a timed event system.
In statistics, multivariate adaptive regression splines (MARS) is a form of regression analysis introduced by Jerome H. Friedman in 1991. It is a non-parametric regression technique and can be seen as an extension of linear models that automatically models nonlinearities and interactions between variables.
In time series data, seasonality is the presence of variations that occur at specific regular intervals less than a year, such as weekly, monthly, or quarterly. Seasonality may be caused by various factors, such as weather, vacation, and holidays and consists of periodic, repetitive, and generally regular and predictable patterns in the levels of a time series.
An error correction model (ECM) belongs to a category of multiple time series models most commonly used for data where the underlying variables have a long-run common stochastic trend, also known as cointegration. ECMs are a theoretically-driven approach useful for estimating both short-term and long-term effects of one time series on another. The term error-correction relates to the fact that last-period's deviation from a long-run equilibrium, the error, influences its short-run dynamics. Thus ECMs directly estimate the speed at which a dependent variable returns to equilibrium after a change in other variables.
The Lee–Carter model is a numerical algorithm used in mortality forecasting and life expectancy forecasting. The input to the model is a matrix of age specific mortality rates ordered monotonically by time, usually with ages in columns and years in rows. The output is a forecasted matrix of mortality rates in the same format as the input.
Agustín Maravall Herrero is a Spanish economist. He is known for his contributions to the analysis of statistics and econometrics, particularly in seasonal adjustment and the estimation of signals in economic time series. He created a methodology and several computer programs for such analysis that are used throughout the world by analysts, researchers, and data producers. Maravall retired in December 2014 from the Bank of Spain.