Directional component analysis

Directional component analysis (DCA) [1] [2] [3] is a statistical method used in climate science for identifying representative patterns of variability in space-time data-sets such as historical climate observations, [1] weather prediction ensembles [2] or climate ensembles. [3]

The first DCA pattern is a pattern of weather or climate variability that is both likely to occur (measured using likelihood) and has a large impact (for a specified linear impact function, and given certain mathematical conditions: see below).

The first DCA pattern contrasts with the first PCA pattern, which is likely to occur, but may not have a large impact, and with a pattern derived from the gradient of the impact function, which has a large impact, but may not be likely to occur.

DCA differs from other pattern identification methods used in climate research, such as EOFs, [4] rotated EOFs [5] and extended EOFs [6] in that it takes into account an external vector, the gradient of the impact.

DCA provides a way to reduce large ensembles from weather forecasts [2] or climate models [3] to just two patterns. The first pattern is the ensemble mean, and the second pattern is the DCA pattern, which represents variability around the ensemble mean in a way that takes impact into account. DCA contrasts with other methods that have been proposed for the reduction of ensembles [7] [8] in that it takes impact into account in addition to the structure of the ensemble.

Overview

Inputs

DCA is calculated from two inputs: a space-time data set, from which the spatial covariance matrix is estimated, and a linear impact function, specified by a vector of spatial weights. [1] [2] [3]

Formula

Consider a space-time data set $X$, containing individual spatial pattern vectors $x$, where the individual patterns are each considered as single samples from a multivariate normal distribution with mean zero and covariance matrix $C$.

We define a linear impact function of a spatial pattern $x$ as $r = w^T x$, where $w$ is a vector of spatial weights.

The first DCA pattern $d$ is given in terms of the covariance matrix $C$ and the weights $w$ by the proportional expression $d \propto Cw$. [1] [2] [3]

The pattern can then be normalized to any length as required. [1]
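
As an illustration, here is a minimal Python sketch of this calculation. The synthetic data set, the all-ones weight vector (corresponding to a "total anomaly" impact), and the use of the sample covariance to estimate $C$ are all illustrative assumptions, not choices taken from the cited papers.

```python
# Minimal sketch: first DCA pattern d ∝ Cw, with synthetic, illustrative inputs.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))  # 500 time steps at 10 grid points (synthetic)
w = np.ones(10)                     # spatial weights: impact r = w @ x (total anomaly)

C = np.cov(X, rowvar=False)         # estimated spatial covariance matrix
d = C @ w                           # first DCA pattern, proportional to C w
d = d / np.linalg.norm(d)           # normalise to unit length
```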

Properties

If the weather or climate data is elliptically distributed (e.g., is distributed as a multivariate normal distribution or a multivariate t-distribution) then the first DCA pattern (DCA1) is defined as the spatial pattern with the following mathematical properties: of all patterns that achieve a given value of the linear impact function, it has the highest probability density, and of all patterns with a given probability density, it achieves the largest value of the linear impact function.

Rainfall Example

For instance, in a rainfall anomaly dataset, using an impact metric defined as the total rainfall anomaly, the first DCA pattern is the spatial pattern that has the highest probability density for a given total rainfall anomaly. If the given total rainfall anomaly is chosen to have a large value, then this pattern combines being extreme in terms of the metric (i.e., representing a large total rainfall anomaly) with being likely in terms of the spatial pattern, and so is well suited as a representative extreme pattern.

Comparison with PCA

The main differences between principal component analysis (PCA) and DCA are that PCA patterns are calculated from the covariance matrix alone, whereas DCA patterns are calculated from both the covariance matrix and the gradient of the impact function: the first PCA pattern is the leading eigenvector of the covariance matrix, while the first DCA pattern is proportional to the covariance matrix multiplied by the weight vector. [1]

As a result, for unit vector spatial patterns, the first PCA pattern explains at least as much variance as the first DCA pattern, and the first DCA pattern produces at least as large an impact as the first PCA pattern.

The degenerate cases occur when the PCA and DCA patterns are equal.

Also, given the first PCA pattern, the DCA pattern can be scaled so that both patterns have the same probability density; the scaled DCA pattern then has the larger impact.
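
These relations can be checked numerically. The sketch below compares the leading eigenvector of a covariance matrix (the first PCA pattern) with the unit-length DCA pattern; the covariance matrix and weight vector are illustrative assumptions.

```python
# Numerical check of the PCA/DCA comparison for unit vector patterns.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 10))
C = A @ A.T                          # synthetic covariance matrix (illustrative)
w = np.ones(10)                      # weights of the linear impact function

_, eigvecs = np.linalg.eigh(C)
p = eigvecs[:, -1]                   # first PCA pattern: leading eigenvector
d = C @ w
d = d / np.linalg.norm(d)            # first DCA pattern, unit length

print(p @ C @ p >= d @ C @ d)        # PCA1 explains at least as much variance: True
print(abs(w @ d) >= abs(w @ p))      # DCA1 has at least as large an impact: True
```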

Two Dimensional Example

Source: [1]

Figure 1: PCA (blue) and DCA (red) vectors in a two dimensional example.

Figure 1 gives an example, which can be understood as follows: the ellipse is a contour of constant probability density for rainfall anomalies at two locations; the diagonal line is a contour of constant total rainfall anomaly; the blue vector is the first PCA pattern, which points along the major axis of the ellipse; and the red vector is the first DCA pattern, which points to the point on the ellipse with the largest total rainfall anomaly.

From this diagram, the DCA pattern can be seen to possess the following properties: of all patterns on the ellipse (patterns with equal probability density), it has the largest total rainfall anomaly; and of all patterns on the diagonal line (patterns with equal total rainfall anomaly), it has the highest probability density.

In this case the total rainfall anomaly of the PCA pattern is quite small, because of anticorrelations between the rainfall anomalies at the two locations. As a result, the first PCA pattern is not a good representative example of a pattern with large total rainfall anomaly, while the first DCA pattern is.

In $n$ dimensions the ellipse becomes an ellipsoid, the diagonal line becomes an $(n-1)$-dimensional plane, and the PCA and DCA patterns are vectors in $n$ dimensions.
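
A numerical version of this two-dimensional example is sketched below; the anticorrelated covariance values are illustrative assumptions, not taken from [1].

```python
# Two-dimensional example: anticorrelated rainfall anomalies at two locations.
import numpy as np

C = np.array([[1.0, -0.8],
              [-0.8, 1.0]])  # anticorrelation between the two locations
w = np.ones(2)               # impact = total rainfall anomaly

_, eigvecs = np.linalg.eigh(C)
p = eigvecs[:, -1]           # PCA pattern: along the major axis, ~[0.71, -0.71] up to sign
d = C @ w
d = d / np.linalg.norm(d)    # DCA pattern: ~[0.71, 0.71]

print("PCA total anomaly:", w @ p)   # ~0: the two anomalies cancel
print("DCA total anomaly:", w @ d)   # ~1.41: large total anomaly
```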

Applications

Application to Climate Variability

DCA has been applied to the CRU data-set of historical rainfall variability [9] in order to understand the most likely patterns of rainfall extremes in the US and China. [1]

Application to Ensemble Weather Forecasts

DCA has been applied to ECMWF medium-range weather forecast ensembles in order to identify the most likely patterns of extreme temperatures in the ensemble forecast. [2]

Application to Ensemble Climate Model Projections

DCA has been applied to ensemble climate model projections in order to identify the most likely patterns of extreme future rainfall. [3]

Derivation of the First DCA Pattern

Source: [1]

Consider a space-time data-set $X$, containing individual spatial pattern vectors $x$, where the individual patterns are each considered as single samples from a multivariate normal distribution with mean zero and covariance matrix $C$.

As a function of $x$, the log probability density is proportional to $-\tfrac{1}{2} x^T C^{-1} x$.

We define a linear impact function of a spatial pattern $x$ as $r = w^T x$, where $w$ is a vector of spatial weights.

We then seek to find the spatial pattern that maximises the probability density for a given value of the linear impact function. This is equivalent to finding the spatial pattern that maximises the log probability density for a given value of the linear impact function, which is slightly easier to solve.

This is a constrained maximisation problem, and can be solved using the method of Lagrange multipliers.

The Lagrangian function is given by

$L(x, \lambda) = -\tfrac{1}{2} x^T C^{-1} x + \lambda (w^T x - r),$

where $r$ is the given value of the impact function and $\lambda$ is the Lagrange multiplier.

Differentiating with respect to $x$ and setting the result to zero gives $-C^{-1} x + \lambda w = 0$, and hence the solution $x = \lambda C w \propto Cw$.

Normalising so that $x$ is a unit vector gives $x = Cw / \lVert Cw \rVert$.

This is the first DCA pattern.
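
The constrained maximisation can also be verified numerically. The following sketch (an illustration under assumed synthetic inputs, not code from the cited papers) checks that, among all patterns with the same impact, the pattern proportional to $Cw$ has the highest Gaussian log density:

```python
# Sanity check: x* ∝ Cw maximises the Gaussian density subject to w @ x = r.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
C = A @ A.T                              # synthetic covariance matrix
C_inv = np.linalg.inv(C)
w = rng.standard_normal(5)               # illustrative impact weights
r = 1.0                                  # given value of the impact function

x_star = (C @ w) * (r / (w @ C @ w))     # candidate maximiser, scaled so w @ x_star = r

def log_density(x):
    return -0.5 * x @ C_inv @ x          # Gaussian log density, up to a constant

# Any perturbation that stays on the constraint surface (w @ z = 0) lowers the density.
for _ in range(1000):
    z = rng.standard_normal(5)
    z -= ((w @ z) / (w @ w)) * w         # project out the w component
    assert log_density(x_star) >= log_density(x_star + 0.1 * z)
```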

Subsequent patterns, orthogonal to the first, can be derived so as to form an orthonormal set, giving a method for matrix factorisation.


References

  1. Jewson, S. (2020). "An Alternative to PCA for Estimating Dominant Patterns of Climate Variability and Extremes, with Application to U.S. and China Seasonal Rainfall". Atmosphere. 11 (4): 354. Bibcode:2020Atmos..11..354J. doi:10.3390/atmos11040354.
  2. Scher, S.; Jewson, S.; Messori, G. (2021). "Robust Worst-Case Scenarios from Ensemble Forecasts". Weather and Forecasting. 36 (4): 1357–1373. Bibcode:2021WtFor..36.1357S. doi:10.1175/WAF-D-20-0219.1. S2CID 236300040.
  3. Jewson, S.; Messori, G.; Barbato, G.; Mercogliano, P.; Mysiak, J.; Sassi, M. (2022). "Developing Representative Impact Scenarios From Climate Projection Ensembles, With Application to UKCP18 and EURO-CORDEX Precipitation". Journal of Advances in Modeling Earth Systems. 15 (1). doi:10.1029/2022MS003038. S2CID 254965361.
  4. Hannachi, A.; Jolliffe, I.; Stephenson, D. (2007). "Empirical orthogonal functions and related techniques in atmospheric science: A review". International Journal of Climatology. 27 (9): 1119. Bibcode:2007IJCli..27.1119H. doi:10.1002/joc.1499. S2CID 52232574.
  5. Mestas-Nunez, A. (2000). "Orthogonality properties of rotated empirical modes". International Journal of Climatology. 20 (12): 1509–1516. doi:10.1002/1097-0088(200010)20:12<1509::AID-JOC553>3.0.CO;2-Q.
  6. Fraedrich, K.; McBride, J.; Frank, W.; Wang, R. (1997). "Extended EOF Analysis of Tropical Disturbances: TOGA COARE". Journal of the Atmospheric Sciences. 54 (19): 2363. Bibcode:1997JAtS...54.2363F. doi:10.1175/1520-0469(1997)054<2363:EEAOTD>2.0.CO;2.
  7. Evans, J.; Ji, F.; Abramowitz, G.; Ekstrom, M. (2013). "Optimally choosing small ensemble members to produce robust climate simulations". Environmental Research Letters. 8 (4): 044050. Bibcode:2013ERL.....8d4050E. doi:10.1088/1748-9326/8/4/044050. S2CID 155021417.
  8. Herger, N.; Abramowitz, G.; Knutti, R.; Angelil, O.; Lehmann, K.; Sanderson, B. (2017). "Selecting a climate model subset to optimise key ensemble properties". Earth System Dynamics. 9: 135–151. doi:10.5194/esd-9-135-2018. hdl:20.500.11850/246202.
  9. Harris, I.; Jones, P.; Osborn, T.; Lister, D. (2013). "Updated high-resolution grids of monthly climatic observations – the CRU TS3.10 Dataset". International Journal of Climatology. 34 (3): 623. Bibcode:2014IJCli..34..623H. doi:10.1002/joc.3711. S2CID 54866679.