Quasilinearization

Figure (Quasilinearexample.png): Two numerical solutions of the nonlinear example boundary value problem y″ = y², y(−1) = y(1) = 1, solved by a spectral Chebyshev method and quasilinearization. The top curve u1 used 21 interpolation nodes, and the bottom curve u2 used 34. Both used 3 iterations.

In mathematics, quasilinearization is a technique which replaces a nonlinear differential equation or operator equation (or system of such equations) with a sequence of linear problems, which are presumed to be easier, and whose solutions approximate the solution of the original nonlinear problem with increasing accuracy. It is a generalization of Newton's method; the word "quasilinearization" is commonly used when the differential equation is a boundary value problem. [1] [2]


Abstract formulation

Quasilinearization replaces a given nonlinear operator N with a linear operator which, being simpler, can be used in an iterative fashion to approximately solve equations containing the original nonlinear operator. This is typically done when trying to solve an equation such as N(y) = 0 together with certain boundary conditions B for which the equation has a solution y. This solution is sometimes called the "reference solution". For quasilinearization to work, the reference solution needs to exist uniquely (at least locally). The process starts with an initial approximation y0 that satisfies the boundary conditions and is "sufficiently close" to the reference solution y, in a sense made more precise below. The first step is to take the Fréchet derivative of the nonlinear operator N at that initial approximation, in order to find the linear operator L(y0) which best approximates N(y) − N(y0) locally. The nonlinear equation may then be approximated as N(y) = N(yk) + L(yk)(y − yk) + O(‖y − yk‖²), taking k = 0. Setting this expression to zero, ignoring the higher-order terms, and imposing zero boundary conditions on the correction gives the linear equation L(yk)(y − yk) = −N(yk). The solution of this linear equation (with zero boundary conditions on the correction) is called yk+1. Computing yk for k = 1, 2, 3, ... by solving these linear equations in sequence is analogous to Newton's iteration for a single equation, and requires recomputation of the Fréchet derivative at each yk. Under the right conditions, the process converges quadratically to the reference solution. Just as with Newton's method for nonlinear algebraic equations, however, difficulties may arise: for instance, the original nonlinear equation may have no solution, more than one solution, or a multiple solution, in which cases the iteration may converge only very slowly, may not converge at all, or may converge to the wrong solution.
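
The following sketch (not part of the original article) illustrates this iteration on the example problem y″ = y², y(−1) = y(1) = 1 using a simple second-order finite-difference discretization; the grid size, constant initial approximation, and stopping tolerance are illustrative choices only.

```python
import numpy as np

# Model problem: N(y) = y'' - y^2 = 0 with y(-1) = y(1) = 1.
# Quasilinearization: solve L(y_k)(y_{k+1} - y_k) = -N(y_k), where the
# Frechet derivative is L(y_k) v = v'' - 2*y_k*v, with v = 0 at the boundaries.

n = 199                                   # number of interior grid points (illustrative)
x = np.linspace(-1.0, 1.0, n + 2)
h = x[1] - x[0]

# Standard second-difference matrix acting on interior values (Dirichlet problem).
D2 = (np.diag(np.ones(n - 1), -1) - 2.0 * np.eye(n) + np.diag(np.ones(n - 1), 1)) / h**2

y = np.ones(n + 2)                        # initial approximation y0 = 1 satisfies the BCs

for k in range(10):
    yi = y[1:-1]                          # interior values of the current iterate
    bc = np.zeros(n)                      # boundary contributions to the difference stencil
    bc[0] = y[0] / h**2
    bc[-1] = y[-1] / h**2
    N_y = D2 @ yi + bc - yi**2            # residual N(y_k) at the interior points
    L = D2 - 2.0 * np.diag(yi)            # Frechet derivative evaluated at y_k
    delta = np.linalg.solve(L, -N_y)      # correction y_{k+1} - y_k, zero at the boundaries
    y[1:-1] = yi + delta
    if np.max(np.abs(delta)) < 1e-12:     # stop once the updates stall
        break

print(k + 1, "iterations; y(0) =", y[(len(x) - 1) // 2])
```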

In practice, the test of whether an initial approximation was "sufficiently close" is precisely whether the iteration converges to the correct solution. Just as in the case of Newton iteration, there are theorems stating conditions under which one can know ahead of time that the initial approximation is "sufficiently close".

Contrast with discretizing first

One could instead discretize the original nonlinear operator and generate a (typically large) set of nonlinear algebraic equations for the unknowns, and then use Newton's method proper on this system of equations. Generally speaking, the convergence behavior is similar: a similarly good initial approximation will produce similarly good approximate discrete solutions. However, the quasilinearization approach (linearizing the operator equation instead of the discretized equations) seems to be simpler to think about, and has allowed such techniques as adaptive spatial meshes to be used as the iteration proceeds. [3]
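
As a rough sketch of this discretize-first alternative (again not from the article), one can set up the nonlinear algebraic system on the same finite-difference grid as above and hand it to a general-purpose solver. Here scipy.optimize.fsolve, a MINPACK hybrid method rather than Newton's method proper, stands in for the nonlinear algebraic solve; the grid size and initial guess are illustrative.

```python
import numpy as np
from scipy.optimize import fsolve   # MINPACK hybrid solver for nonlinear algebraic systems

# Discretize y'' = y^2 on [-1, 1] with y(-1) = y(1) = 1 first,
# then solve the resulting nonlinear algebraic system for the interior unknowns.

n = 199                              # interior grid points (illustrative)
x = np.linspace(-1.0, 1.0, n + 2)
h = x[1] - x[0]

def residual(u):
    """Discrete residual F(u); u holds the interior values of y."""
    y = np.concatenate(([1.0], u, [1.0]))                       # append the boundary values
    return (y[:-2] - 2.0 * y[1:-1] + y[2:]) / h**2 - y[1:-1]**2

u0 = np.ones(n)                      # initial guess consistent with the boundary conditions
u = fsolve(residual, u0)
print("max |F(u)| =", np.max(np.abs(residual(u))))
```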

Example

As an example to illustrate the process of quasilinearization, we can approximately solve the two-point boundary value problem for the nonlinear ODE y″ = y², where the boundary conditions are y(−1) = 1 and y(1) = 1. The exact solution of the differential equation can be expressed using the Weierstrass elliptic function ℘, like so: y(x) = 6℘(x + c | 0, g3), where the vertical bar notation means that the invariants are g2 = 0 and g3. Finding the values of c and g3 so that the boundary conditions are satisfied requires solving two simultaneous nonlinear equations for the two unknowns c and g3, namely 6℘(−1 + c | 0, g3) = 1 and 6℘(1 + c | 0, g3) = 1. This can be done, in an environment where ℘ and its derivatives are available, for instance by Newton's method. [note 1]

Applying the technique of quasilinearization instead, one finds by taking the Fréchet derivative of N(y) = y″ − y² at an unknown approximation yk that the linear operator is L(yk)u = u″ − 2 yk u. If the initial approximation is identically y0 = 1 on the interval −1 ≤ x ≤ 1 (which satisfies the boundary conditions), then the first iteration (at least) can be solved exactly, but the result is already somewhat complicated. A numerical solution instead, for instance by a Chebyshev spectral method using 21 Chebyshev–Lobatto points, gives after three iterations a solution whose residual is very small; that is, the computed approximation is the exact solution of a perturbed equation y″ = y² + r(x), where the maximum value of |r(x)| on the interval −1 ≤ x ≤ 1 is very small. This approximate solution (call it u1) agrees closely with the corresponding exact solution.
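
A minimal sketch of such a spectral computation is given below; it uses the standard Chebyshev–Lobatto differentiation matrix (as in Trefethen's Spectral Methods in MATLAB), with 21 nodes and three iterations as in the figure, while the constant initial approximation and the way boundary conditions are imposed on the correction are illustrative choices.

```python
import numpy as np

def cheb(N):
    """Chebyshev-Lobatto nodes and differentiation matrix (Trefethen, Spectral Methods in MATLAB)."""
    x = np.cos(np.pi * np.arange(N + 1) / N)            # nodes on [-1, 1]; x[0] = 1, x[N] = -1
    c = np.hstack([2.0, np.ones(N - 1), 2.0]) * (-1.0) ** np.arange(N + 1)
    X = np.tile(x, (N + 1, 1)).T
    dX = X - X.T
    D = np.outer(c, 1.0 / c) / (dX + np.eye(N + 1))     # off-diagonal entries
    D -= np.diag(D.sum(axis=1))                         # diagonal via negative row sums
    return D, x

N = 20                                                  # 21 Chebyshev-Lobatto points, as in the figure
D, x = cheb(N)
D2 = D @ D                                              # spectral second-derivative matrix

y = np.ones(N + 1)                                      # initial approximation y0 = 1 (satisfies the BCs)
for _ in range(3):                                      # three quasilinearization iterations, as in the figure
    # Linearized problem for the correction v:  v'' - 2*y_k*v = -(y_k'' - y_k^2),  v(-1) = v(1) = 0.
    L = D2 - 2.0 * np.diag(y)
    rhs = -(D2 @ y - y**2)
    L[0, :] = 0.0                                       # replace boundary rows to enforce v = 0
    L[0, 0] = 1.0
    rhs[0] = 0.0
    L[-1, :] = 0.0
    L[-1, -1] = 1.0
    rhs[-1] = 0.0
    y = y + np.linalg.solve(L, rhs)

print("boundary values:", y[0], y[-1])
print("max interior residual |y'' - y^2|:",
      np.max(np.abs((D2 @ y - y**2)[1:-1])))
```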

Other values of c and g3 give other continuous solutions to this nonlinear two-point boundary value problem for the ODE; the solution corresponding to one such pair of values, plotted in the figure, is called u2. Yet other values of the parameters can give discontinuous solutions, because ℘ has a double pole at zero and so y has a double pole at x = −c. Finding other continuous solutions by quasilinearization requires different initial approximations from the ones used here; an initial approximation that approximates this second exact solution can be used to generate a sequence of approximations converging to u2. Both approximations u1 and u2 are plotted in the accompanying figure.

Notes

  1. For more information about elliptic functions, see Lawden (1989). [4]

See also

Newton's method
Least squares
Nonlinear system
Linear differential equation
Sturm–Liouville problem
Levenberg–Marquardt algorithm
Gauss–Newton algorithm
Conjugate gradient method
Crank–Nicolson method
Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm
Nash–Moser theorem
Duffing equation
Limited-memory BFGS
Nonlinear conjugate gradient method
Alternating-direction implicit (ADI) method
Non-linear least squares
Sidi's generalized secant method
Linear recurrence with constant coefficients
Numerical certification
Anderson acceleration

References

  1. Ascher, Uri M.; Mattheij, Robert M.; Russell, Robert D. (1995). Numerical solution of boundary value problems for ordinary differential equations. SIAM.
  2. Sylvester, R.J.; Meyer, F. (June 1965). "Two point boundary problems by quasilinearization". Journal of the Society for Industrial and Applied Mathematics. 13 (2): 586–602. doi:10.1137/0113038. JSTOR 2946451. Retrieved 31 March 2022.
  3. Bornemann, Folkmar (1991). An Adaptive Multilevel Approach to Parabolic Equations in Two Space Dimensions. ZIB. Retrieved 6 March 2022.
  4. Lawden, Derek F. (1989). Elliptic Functions and Applications. New York: Springer-Verlag. ISBN 0-387-96965-9.

Further reading