Directional component analysis

Directional component analysis (DCA) [1] [2] [3] is a statistical method used in climate science for identifying representative patterns of variability in space-time data-sets such as historical climate observations, [1] weather prediction ensembles [2] or climate ensembles. [3]

The first DCA pattern is a pattern of weather or climate variability that is both likely to occur (measured using likelihood) and has a large impact (for a specified linear impact function, and given certain mathematical conditions: see below).

The first DCA pattern contrasts with the first PCA pattern, which is likely to occur, but may not have a large impact, and with a pattern derived from the gradient of the impact function, which has a large impact, but may not be likely to occur.

DCA differs from other pattern identification methods used in climate research, such as EOFs, [4] rotated EOFs [5] and extended EOFs [6] in that it takes into account an external vector, the gradient of the impact.

DCA provides a way to reduce large ensembles from weather forecasts [2] or climate models [3] to just two patterns. The first pattern is the ensemble mean, and the second pattern is the DCA pattern, which represents variability around the ensemble mean in a way that takes impact into account. DCA contrasts with other methods that have been proposed for the reduction of ensembles [7] [8] in that it takes impact into account in addition to the structure of the ensemble.

Overview

Inputs

DCA is calculated from two inputs: a space-time data set, from which the spatial covariance matrix is estimated, and a linear impact function, specified by a vector of spatial weights. [1] [2] [3]

Formula

Consider a space-time data set $X$, containing individual spatial pattern vectors $x$, where the individual patterns are each considered as single samples from a multivariate normal distribution with mean zero and covariance matrix $C$.

We define a linear impact function of a spatial pattern $x$ as $r = w^T x$, where $w$ is a vector of spatial weights.

The first DCA pattern $d$ is given in terms of the covariance matrix $C$ and the weights $w$ by the proportional expression $d \propto Cw$. [1] [2] [3]

The pattern can then be normalized to any length as required. [1]
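
As an illustration, here is a minimal Python sketch of this calculation. The synthetic data set, the all-ones weight vector (corresponding to a "total anomaly" impact), and the use of the sample covariance to estimate $C$ are all illustrative assumptions, not choices taken from the cited papers.

```python
# Minimal sketch: first DCA pattern d ∝ Cw, with synthetic, illustrative inputs.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))  # 500 time steps at 10 grid points (synthetic)
w = np.ones(10)                     # spatial weights: impact r = w @ x (total anomaly)

C = np.cov(X, rowvar=False)         # estimated spatial covariance matrix
d = C @ w                           # first DCA pattern, proportional to C w
d = d / np.linalg.norm(d)           # normalise to unit length
```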

Properties

If the weather or climate data is elliptically distributed (e.g., is distributed as a multivariate normal distribution or a multivariate t-distribution) then the first DCA pattern (DCA1) is defined as the spatial pattern with the following mathematical properties: of all patterns that achieve a given value of the linear impact function, it has the highest probability density, and of all patterns with a given probability density, it achieves the largest value of the linear impact function.

Rainfall Example

For instance, in a rainfall anomaly dataset, using an impact metric defined as the total rainfall anomaly, the first DCA pattern is the spatial pattern that has the highest probability density for a given total rainfall anomaly. If the given total rainfall anomaly is chosen to have a large value, then this pattern combines being extreme in terms of the metric (i.e., representing a large total rainfall anomaly) with being likely in terms of the spatial pattern, and so is well suited as a representative extreme pattern.

Comparison with PCA

The main differences between principal component analysis (PCA) and DCA are that PCA patterns are calculated from the covariance matrix alone, whereas DCA patterns are calculated from both the covariance matrix and the gradient of the impact function: the first PCA pattern is the leading eigenvector of the covariance matrix, while the first DCA pattern is proportional to the covariance matrix multiplied by the weight vector. [1]

As a result, for unit vector spatial patterns, the first PCA pattern explains at least as much variance as the first DCA pattern, and the first DCA pattern produces at least as large an impact as the first PCA pattern.

The degenerate cases occur when the PCA and DCA patterns are equal.

Also, given the first PCA pattern, the DCA pattern can be scaled so that both patterns have the same probability density; the scaled DCA pattern then has the larger impact.
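
These relations can be checked numerically. The sketch below compares the leading eigenvector of a covariance matrix (the first PCA pattern) with the unit-length DCA pattern; the covariance matrix and weight vector are illustrative assumptions.

```python
# Numerical check of the PCA/DCA comparison for unit vector patterns.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 10))
C = A @ A.T                          # synthetic covariance matrix (illustrative)
w = np.ones(10)                      # weights of the linear impact function

_, eigvecs = np.linalg.eigh(C)
p = eigvecs[:, -1]                   # first PCA pattern: leading eigenvector
d = C @ w
d = d / np.linalg.norm(d)            # first DCA pattern, unit length

print(p @ C @ p >= d @ C @ d)        # PCA1 explains at least as much variance: True
print(abs(w @ d) >= abs(w @ p))      # DCA1 has at least as large an impact: True
```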

Two Dimensional Example

Source: [1]

Figure 1: PCA (blue) and DCA (red) vectors in a two dimensional example.

Figure 1 gives an example, which can be understood as follows: the ellipse is a contour of constant probability density for rainfall anomalies at two locations; the diagonal line is a contour of constant total rainfall anomaly; the blue vector is the first PCA pattern, which points along the major axis of the ellipse; and the red vector is the first DCA pattern, which points to the point on the ellipse with the largest total rainfall anomaly.

From this diagram, the DCA pattern can be seen to possess the following properties: of all patterns on the ellipse (patterns with equal probability density), it has the largest total rainfall anomaly; and of all patterns on the diagonal line (patterns with equal total rainfall anomaly), it has the highest probability density.

In this case the total rainfall anomaly of the PCA pattern is quite small, because of anticorrelations between the rainfall anomalies at the two locations. As a result, the first PCA pattern is not a good representative example of a pattern with large total rainfall anomaly, while the first DCA pattern is.

In $n$ dimensions the ellipse becomes an ellipsoid, the diagonal line becomes an $(n-1)$-dimensional plane, and the PCA and DCA patterns are vectors in $n$ dimensions.
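
A numerical version of this two-dimensional example is sketched below; the anticorrelated covariance values are illustrative assumptions, not taken from [1].

```python
# Two-dimensional example: anticorrelated rainfall anomalies at two locations.
import numpy as np

C = np.array([[1.0, -0.8],
              [-0.8, 1.0]])  # anticorrelation between the two locations
w = np.ones(2)               # impact = total rainfall anomaly

_, eigvecs = np.linalg.eigh(C)
p = eigvecs[:, -1]           # PCA pattern: along the major axis, ~[0.71, -0.71] up to sign
d = C @ w
d = d / np.linalg.norm(d)    # DCA pattern: ~[0.71, 0.71]

print("PCA total anomaly:", w @ p)   # ~0: the two anomalies cancel
print("DCA total anomaly:", w @ d)   # ~1.41: large total anomaly
```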

Applications

Application to Climate Variability

DCA has been applied to the CRU data-set of historical rainfall variability [9] in order to understand the most likely patterns of rainfall extremes in the US and China. [1]

Application to Ensemble Weather Forecasts

DCA has been applied to ECMWF medium-range weather forecast ensembles in order to identify the most likely patterns of extreme temperatures in the ensemble forecast. [2]

Application to Ensemble Climate Model Projections

DCA has been applied to ensemble climate model projections in order to identify the most likely patterns of extreme future rainfall. [3]

Derivation of the First DCA Pattern

Source: [1]

Consider a space-time data-set $X$, containing individual spatial pattern vectors $x$, where the individual patterns are each considered as single samples from a multivariate normal distribution with mean zero and covariance matrix $C$.

As a function of $x$, the log probability density is proportional to $-\tfrac{1}{2} x^T C^{-1} x$.

We define a linear impact function of a spatial pattern $x$ as $r = w^T x$, where $w$ is a vector of spatial weights.

We then seek to find the spatial pattern that maximises the probability density for a given value of the linear impact function. This is equivalent to finding the spatial pattern that maximises the log probability density for a given value of the linear impact function, which is slightly easier to solve.

This is a constrained maximisation problem, and can be solved using the method of Lagrange multipliers.

The Lagrangian function is given by

$L(x, \lambda) = -\tfrac{1}{2} x^T C^{-1} x + \lambda (w^T x - r),$

where $r$ is the given value of the impact function and $\lambda$ is the Lagrange multiplier.

Differentiating with respect to $x$ and setting the result to zero gives $-C^{-1} x + \lambda w = 0$, and hence the solution $x = \lambda C w \propto Cw$.

Normalising so that $x$ is a unit vector gives $x = Cw / \lVert Cw \rVert$.

This is the first DCA pattern.
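
The constrained maximisation can also be verified numerically. The following sketch (an illustration under assumed synthetic inputs, not code from the cited papers) checks that, among all patterns with the same impact, the pattern proportional to $Cw$ has the highest Gaussian log density:

```python
# Sanity check: x* ∝ Cw maximises the Gaussian density subject to w @ x = r.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
C = A @ A.T                              # synthetic covariance matrix
C_inv = np.linalg.inv(C)
w = rng.standard_normal(5)               # illustrative impact weights
r = 1.0                                  # given value of the impact function

x_star = (C @ w) * (r / (w @ C @ w))     # candidate maximiser, scaled so w @ x_star = r

def log_density(x):
    return -0.5 * x @ C_inv @ x          # Gaussian log density, up to a constant

# Any perturbation that stays on the constraint surface (w @ z = 0) lowers the density.
for _ in range(1000):
    z = rng.standard_normal(5)
    z -= ((w @ z) / (w @ w)) * w         # project out the w component
    assert log_density(x_star) >= log_density(x_star + 0.1 * z)
```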

Subsequent patterns, orthogonal to the first, can be derived so as to form an orthonormal set, giving a method for matrix factorisation.


References

  1. Jewson, S. (2020). "An Alternative to PCA for Estimating Dominant Patterns of Climate Variability and Extremes, with Application to U.S. and China Seasonal Rainfall". Atmosphere. 11 (4): 354. Bibcode:2020Atmos..11..354J. doi:10.3390/atmos11040354.
  2. Scher, S.; Jewson, S.; Messori, G. (2021). "Robust Worst-Case Scenarios from Ensemble Forecasts". Weather and Forecasting. 36 (4): 1357–1373. Bibcode:2021WtFor..36.1357S. doi:10.1175/WAF-D-20-0219.1. S2CID 236300040.
  3. Jewson, S.; Messori, G.; Barbato, G.; Mercogliano, P.; Mysiak, J.; Sassi, M. (2022). "Developing Representative Impact Scenarios From Climate Projection Ensembles, With Application to UKCP18 and EURO-CORDEX Precipitation". Journal of Advances in Modeling Earth Systems. 15 (1). doi:10.1029/2022MS003038. S2CID 254965361.
  4. Hannachi, A.; Jolliffe, I.; Stephenson, D. (2007). "Empirical orthogonal functions and related techniques in atmospheric science: A review". International Journal of Climatology. 27 (9): 1119. Bibcode:2007IJCli..27.1119H. doi:10.1002/joc.1499. S2CID 52232574.
  5. Mestas-Nunez, A. (2000). "Orthogonality properties of rotated empirical modes". International Journal of Climatology. 20 (12): 1509–1516. doi:10.1002/1097-0088(200010)20:12<1509::AID-JOC553>3.0.CO;2-Q.
  6. Fraedrich, K.; McBride, J.; Frank, W.; Wang, R. (1997). "Extended EOF Analysis of Tropical Disturbances: TOGA COARE". Journal of the Atmospheric Sciences. 54 (19): 2363. Bibcode:1997JAtS...54.2363F. doi:10.1175/1520-0469(1997)054<2363:EEAOTD>2.0.CO;2.
  7. Evans, J.; Ji, F.; Abramowitz, G.; Ekstrom, M. (2013). "Optimally choosing small ensemble members to produce robust climate simulations". Environmental Research Letters. 8 (4): 044050. Bibcode:2013ERL.....8d4050E. doi:10.1088/1748-9326/8/4/044050. S2CID 155021417.
  8. Herger, N.; Abramowitz, G.; Knutti, R.; Angelil, O.; Lehmann, K.; Sanderson, B. (2017). "Selecting a climate model subset to optimise key ensemble properties". Earth System Dynamics. 9: 135–151. doi:10.5194/esd-9-135-2018. hdl:20.500.11850/246202.
  9. Harris, I.; Jones, P.; Osborn, T.; Lister, D. (2013). "Updated high-resolution grids of monthly climatic observations – the CRU TS3.10 Dataset". International Journal of Climatology. 34 (3): 623. Bibcode:2014IJCli..34..623H. doi:10.1002/joc.3711. S2CID 54866679.