Iterative rational Krylov algorithm

The iterative rational Krylov algorithm (IRKA) is an iterative algorithm useful for model order reduction (MOR) of single-input single-output (SISO) linear time-invariant dynamical systems. [1] At each iteration, IRKA performs a Hermite-type interpolation of the original system transfer function. Each interpolation requires solving $r$ shifted pairs of linear systems, each of size $n \times n$, where $n$ is the original system order and $r$ is the desired reduced model order (usually $r \ll n$).

The algorithm was first introduced by Gugercin, Antoulas and Beattie in 2008. [2] It is based on a first-order necessary optimality condition, initially investigated by Meier and Luenberger in 1967. [3] The first convergence proof of IRKA was given by Flagg, Beattie and Gugercin in 2012, [4] for a particular class of systems.

MOR as an optimization problem

Consider a SISO linear time-invariant dynamical system, with input $v(t)$ and output $y(t)$:

$$\dot{x}(t) = A x(t) + b\,v(t), \qquad y(t) = c^T x(t), \qquad A \in \mathbb{R}^{n \times n},\quad b, c \in \mathbb{R}^{n}.$$

Applying the Laplace transform, with zero initial conditions, we obtain the transfer function $G$, which is a ratio of polynomials:

$$G(s) = c^T (s I - A)^{-1} b, \qquad s \in \mathbb{C}.$$
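
For concreteness, the sketch below evaluates $G$ at a single point by solving one shifted linear system. It is a minimal illustration in Python/NumPy; the matrices are an arbitrary stable toy system, not an example from the references.

    import numpy as np

    def transfer_function(A, b, c, s):
        """Evaluate G(s) = c^T (s I - A)^{-1} b via one shifted linear solve."""
        n = A.shape[0]
        x = np.linalg.solve(s * np.eye(n) - A, b)  # (s I - A) x = b
        return c @ x

    # Toy stable system with n = 2 (illustrative values only)
    A = np.array([[-2.0, 1.0],
                  [0.0, -3.0]])
    b = np.array([1.0, 1.0])
    c = np.array([1.0, 0.0])
    print(transfer_function(A, b, c, 1.0j))  # value of G on the imaginary axis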

Assume that $G$ is stable. Given $r < n$, MOR tries to approximate the transfer function $G$ by a stable rational transfer function $G_r$ of order $r$:

$$G_r(s) = c_r^T (s I_r - A_r)^{-1} b_r, \qquad A_r \in \mathbb{R}^{r \times r},\quad b_r, c_r \in \mathbb{R}^{r}.$$

A possible approximation criterion is to minimize the absolute error in the $\mathcal{H}_2$ norm:

$$G_r = \arg\min_{\substack{\hat{G}_r \text{ stable},\ \deg(\hat{G}_r) = r}} \| G - \hat{G}_r \|_{\mathcal{H}_2}.$$

This is known as the $\mathcal{H}_2$ optimization problem. This problem has been studied extensively, and it is known to be non-convex, [4] which implies that it is usually difficult to find a global minimizer.
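
The $\mathcal{H}_2$ norm itself can be computed without numerical integration: for stable $G$, $\|G\|_{\mathcal{H}_2}^2 = c^T P c$, where the controllability Gramian $P$ solves the Lyapunov equation $A P + P A^T + b b^T = 0$. The sketch below applies this textbook identity with SciPy; it is a generic illustration, not code from the cited papers.

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    def h2_norm(A, b, c):
        """H2 norm of G(s) = c^T (s I - A)^{-1} b, assuming A is stable."""
        # Controllability Gramian P solves  A P + P A^T + b b^T = 0
        P = solve_continuous_lyapunov(A, -np.outer(b, b))
        return float(np.sqrt(c @ P @ c))

The error $\|G - \hat{G}_r\|_{\mathcal{H}_2}$ can be evaluated the same way by applying h2_norm to the augmented error system with block-diagonal state matrix $\mathrm{diag}(A, A_r)$, stacked input vector $[b;\, b_r]$, and output vector $[c;\, -c_r]$.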

Meier–Luenberger conditions

The following first-order necessary optimality condition for the $\mathcal{H}_2$ problem is of great importance for the IRKA algorithm.

Theorem ([2] [Theorem 3.4], [4] [Theorem 1.2])   Assume that the $\mathcal{H}_2$ optimization problem admits a solution $G_r$ with simple poles, denoted by $\hat{\lambda}_1, \ldots, \hat{\lambda}_r$. Then $G_r$ must be a Hermite interpolator of $G$, through the reflected poles of $G_r$:

$$G_r(-\hat{\lambda}_i) = G(-\hat{\lambda}_i), \qquad G_r'(-\hat{\lambda}_i) = G'(-\hat{\lambda}_i), \qquad i = 1, \ldots, r.$$

Note that the poles $\hat{\lambda}_1, \ldots, \hat{\lambda}_r$ are the eigenvalues of the $r \times r$ reduced matrix $A_r$.

Hermite interpolation

A Hermite interpolant of the rational function $G$, through $r$ distinct points $\sigma_1, \ldots, \sigma_r \in \mathbb{C}$, has components:

$$G_r(s) = c_r^T (s I_r - A_r)^{-1} b_r, \qquad A_r = (W_r^T V_r)^{-1} W_r^T A V_r, \qquad b_r = (W_r^T V_r)^{-1} W_r^T b, \qquad c_r^T = c^T V_r,$$

where the matrices $V_r = [v_1, \ldots, v_r]$ and $W_r = [w_1, \ldots, w_r]$ may be found by solving $r$ dual pairs of linear systems, one for each shift [4] [Theorem 1.1]:

$$(\sigma_i I - A)\, v_i = b, \qquad (\sigma_i I - A)^T w_i = c, \qquad i = 1, \ldots, r.$$
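
A direct transcription of these formulas into NumPy might look as follows. This is an illustrative sketch (dense solves, complex arithmetic throughout), not the implementation from the references; in particular, it omits the realification step that a production code would use to keep the reduced model real when the shifts are complex.

    import numpy as np

    def hermite_interpolant(A, b, c, shifts):
        """Build (A_r, b_r, c_r) so that G_r Hermite-interpolates G at each shift."""
        n, r = A.shape[0], len(shifts)
        V = np.zeros((n, r), dtype=complex)
        W = np.zeros((n, r), dtype=complex)
        I = np.eye(n)
        for i, s in enumerate(shifts):
            V[:, i] = np.linalg.solve(s * I - A, b)      # primal: (sigma_i I - A) v_i = b
            W[:, i] = np.linalg.solve((s * I - A).T, c)  # dual:   (sigma_i I - A)^T w_i = c
        M = W.T @ V                                      # normalization W_r^T V_r
        A_r = np.linalg.solve(M, W.T @ A @ V)            # A_r = (W_r^T V_r)^{-1} W_r^T A V_r
        b_r = np.linalg.solve(M, W.T @ b)                # b_r = (W_r^T V_r)^{-1} W_r^T b
        c_r = V.T @ c                                    # c_r^T = c^T V_r
        return A_r, b_r, c_r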

IRKA algorithm

As can be seen from the previous section, finding a Hermite interpolant of $G$ through $r$ given points is relatively easy. The difficult part is finding the correct interpolation points. IRKA tries to iteratively approximate these "optimal" interpolation points.

For this, it starts with $r$ arbitrary interpolation points $\sigma_1^{(0)}, \ldots, \sigma_r^{(0)}$ (closed under conjugation), and then, at each iteration $m$, it imposes the first-order necessary optimality condition of the $\mathcal{H}_2$ problem:

1. find the Hermite interpolant $G_r^{(m)}$ of $G$ through the current shift points $\sigma_1^{(m)}, \ldots, \sigma_r^{(m)}$;

2. update the shifts by using the poles of the new $G_r^{(m)}$:

$$\sigma_i^{(m+1)} = -\hat{\lambda}_i\big(A_r^{(m)}\big), \qquad i = 1, \ldots, r.$$

The iteration is stopped when the relative change in the set of shifts between two successive iterations is less than a given tolerance. This condition may be stated as:

$$\frac{\big| \sigma_i^{(m+1)} - \sigma_i^{(m)} \big|}{\big| \sigma_i^{(m)} \big|} < \text{tol}, \qquad i = 1, \ldots, r.$$

As already mentioned, each Hermite interpolation requires solving $r$ shifted pairs of linear systems, each of size $n \times n$:

$$(\sigma_i^{(m)} I - A)\, v_i = b, \qquad (\sigma_i^{(m)} I - A)^T w_i = c, \qquad i = 1, \ldots, r.$$

Also, updating the shifts requires finding the poles of the new interpolant $G_r^{(m)}$, that is, finding the eigenvalues of the $r \times r$ reduced matrix $A_r^{(m)}$.

Pseudocode

The following is pseudocode for the IRKA algorithm [2] [Algorithm 4.1].

algorithm IRKA
    input: $A$, $b$, $c$, $\mathrm{tol} > 0$, $\sigma_1, \ldots, \sigma_r$ closed under conjugation
    $(\sigma_i I - A)\, v_i = b$, $i = 1, \ldots, r$               % Solve primal systems
    $(\sigma_i I - A)^T w_i = c$, $i = 1, \ldots, r$               % Solve dual systems
    while relative change in $\{\sigma_i\}$ > tol
        $A_r = (W_r^T V_r)^{-1} W_r^T A V_r$                       % Reduced order matrix
        $\sigma_i \leftarrow -\lambda_i(A_r)$, $i = 1, \ldots, r$  % Update shifts, using poles of $G_r$
        $(\sigma_i I - A)\, v_i = b$, $i = 1, \ldots, r$           % Solve primal systems
        $(\sigma_i I - A)^T w_i = c$, $i = 1, \ldots, r$           % Solve dual systems
    end while
    return $A_r = (W_r^T V_r)^{-1} W_r^T A V_r$, $b_r = (W_r^T V_r)^{-1} W_r^T b$, $c_r^T = c^T V_r$   % Reduced order model
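
A compact NumPy realization of this pseudocode, reusing the hermite_interpolant sketch from the previous section, is given below. The tolerance, the maximum-iteration safeguard, and the sorting used to compare successive shift sets are illustrative implementation choices, not part of the published algorithm.

    import numpy as np

    def irka(A, b, c, shifts, tol=1e-6, max_iter=100):
        """Basic IRKA fixed-point iteration: interpolate, mirror the reduced
        poles into new shifts, and repeat until the shift set stabilizes."""
        shifts = np.sort_complex(np.asarray(shifts, dtype=complex))
        for _ in range(max_iter):
            A_r, b_r, c_r = hermite_interpolant(A, b, c, shifts)
            new_shifts = np.sort_complex(-np.linalg.eigvals(A_r))  # sigma_i <- -lambda_i(A_r)
            if np.max(np.abs(new_shifts - shifts) / np.abs(shifts)) < tol:
                break
            shifts = new_shifts
        return A_r, b_r, c_r

Each pass through the loop costs $r$ pairs of shifted solves of size $n \times n$ plus one $r \times r$ eigenvalue problem, matching the per-iteration cost described above.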

Convergence

A SISO linear system is said to have a symmetric state space (SSS) whenever $A = A^T$ and $b = c$. This type of system appears in many important applications, such as the analysis of RC circuits and inverse problems involving 3D Maxwell's equations. [4] For SSS systems with distinct poles, the following convergence result has been proven: [4] "IRKA is a locally convergent fixed point iteration to a local minimizer of the $\mathcal{H}_2$ optimization problem."

Although there is no convergence proof for the general case, numerous experiments have shown that IRKA often converges rapidly for different kinds of linear dynamical systems. [1] [4]

Extensions

The IRKA algorithm has been extended by the original authors to multiple-input multiple-output (MIMO) systems, and also to discrete-time and differential-algebraic systems. [1] [2] [Remark 4.1]

See also

Model order reduction


References

  1. 1 2 3 "Iterative Rational Krylov Algorithm". MOR Wiki. Retrieved 3 June 2021.
  2. 1 2 3 4 Gugercin, S.; Antoulas, A.C.; Beattie, C. (2008), Model Reduction for Large-Scale Linear Dynamical Systems, Journal on Matrix Analysis and Applications, 30, SIAM, pp. 609–638
  3. L. Meier; D.G. Luenberger (1967), Approximation of linear constant systems, IEEE Transactions on Automatic Control, 12, pp. 585–588
  4. 1 2 3 4 5 6 7 G. Flagg; C. Beattie; S. Gugercin (2012), Convergence of the Iterative Rational Krylov Algorithm, Systems & Control Letters, 61, pp. 688–691