Numerical certification

Last updated

Numerical certification is the process of verifying the correctness of a candidate solution to a system of equations. In (numerical) computational mathematics, such as numerical algebraic geometry, candidate solutions are computed algorithmically, but there is the possibility that errors have corrupted the candidates. For instance, in addition to the inexactness of input data and candidate solutions, numerical errors or errors in the discretization of the problem may result in corrupted candidate solutions. The goal of numerical certification is to provide a certificate which proves which of these candidates are, indeed, approximate solutions.

In mathematics, a set of simultaneous equations, also known as a system of equations or an equation system, is a finite set of equations for which common solutions are sought. An equation system is usually classified in the same manner as single equations, namely as a:

Numerical algebraic geometry is a field of computational mathematics, particularly computational algebraic geometry, which uses methods from numerical analysis to study and manipulate algebraic varieties on a computer.

Contents

Methods for certification can be divided into two flavors: a priori certification and a posteriori certification. A posteriori certification confirms the correctness of the final answers (regardless of how they are generated), while a priori certification confirms the correctness of each step of a specific computation. A typical example of a posteriori certification is Smale's alpha theory, while a typical example of a priori certification is interval arithmetic.

Stephen Smale American mathematician

Stephen Smale is an American mathematician, known for his research in topology, dynamical systems and mathematical economics. He was awarded the Fields Medal in 1966 and spent more than three decades on the mathematics faculty of the University of California, Berkeley.

Interval arithmetic method for bounding the errors of numerical computations

Interval arithmetic, interval mathematics, interval analysis, or interval computation, is a method developed by mathematicians since the 1950s and 1960s, as an approach to putting bounds on rounding errors and measurement errors in mathematical computation and thus developing numerical methods that yield reliable results. Very simply put, it represents each value as a range of possibilities. For example, instead of estimating the height of someone using standard arithmetic as 2.0 metres, using interval arithmetic we might be certain that that person is somewhere between 1.97 and 2.03 metres.

Certificates

A certificate for a root is a computational proof of the correctness of a candidate solution. For instance, a certificate may consist of an approximate solution , a region containing , and a proof that contains exactly one solution to the system of equations.

In this context, an a priori numerical certificate is a certificate in the sense of correctness in computer science. On the other hand, an a posteriori numerical certificate operates only on solutions, regardless of how they are computed. Hence, a posteriori certification is different from algorithmic correctness – for an extreme example, an algorithm could randomly generate candidates and attempt to certify them as approximate roots using a posteriori certification.

In theoretical computer science, correctness of an algorithm is asserted when it is said that the algorithm is correct with respect to a specification. Functional correctness refers to the input-output behavior of the algorithm.

A posteriori certification methods

There are a variety of methods for a posteriori certification, including

Alpha theory

The cornerstone of Smale's alpha theory is bounding the error for Newton's method. Smale's 1986 work [1] introduced the quantity , which quantifies the convergence of Newton's method. More precisely, let be a system of analytic functions in the variables , the derivative operator, and the Newton operator. The quantities

In numerical analysis, Newton's method, also known as the Newton–Raphson method, named after Isaac Newton and Joseph Raphson, is a root-finding algorithm which produces successively better approximations to the roots of a real-valued function. The most basic version starts with a single-variable function f defined for a real variable x, the function's derivative f ′, and an initial guess x0 for a root of f. If the function satisfies sufficient assumptions and the initial guess is close, then

Derivative Operation in calculus

The derivative of a function of a real variable measures the sensitivity to change of the function value with respect to a change in its argument. Derivatives are a fundamental tool of calculus. For example, the derivative of the position of a moving object with respect to time is the object's velocity: this measures how quickly the position of the object changes when time advances.

and

are used to certify a candidate solution. In particular, if

then is an approximate solution for , i.e., the candidate is in the domain of quadratic convergence for Newton's method. In other words, if this inequality holds, then there is a root of so that iterates of the Newton operator converge as

In numerical analysis, the speed at which a convergent sequence approaches its limit is called the rate of convergence. Although strictly speaking, a limit does not give information about any finite first part of the sequence, the concept of rate of convergence is of practical importance when working with a sequence of successive approximations for an iterative method, as then typically fewer iterations are needed to yield a useful approximation if the rate of convergence is higher. This may even make the difference between needing ten or a million iterations.

The software package alphaCertified provides an implementation of the alpha test for polynomials by estimating and . [2]

Interval Newton and Krawczyck methods

Suppose is a function whose fixed points correspond to the roots of . For example, the Newton operator has this property. Suppose that is a region, then,

  1. If maps into itself, i.e., , then by Brouwer fixed-point theorem, has at least one fixed point in , and, hence has at least one root in .
  2. If is contractive in a region containing , then there is at most one root in .

There are versions of the following methods over the complex numbers, but both the interval arithmetic and conditions must be adjusted to reflect this case.

Interval Newton method

In the univariate case, Newton's method can be directly generalized to certify a root over an interval. For an interval , let be the midpoint of . Then, the interval Newton operator applied to is

In practice, any interval containing can be used in this computation. If is a root of , then by the mean value theorem, there is some such that . In other words, . Since contains the inverse of at all points of , it follows that . Therefore, .

Furthermore, if , then either is a root of and or . Therefore, is at most half the width of . Therefore, if there is some root of in , the iterative procedure of replacing by will converge to this root. If, on the other hand, there is no root of in , this iterative procedure will eventually produce an empty interval, a witness to the nonexistence of roots.

See interval Newton method for higher dimensional analogues of this approach.

Krawczyck method

Let be any invertible matrix in . Typically, one takes to be an approximation to . Then, define the function We observe that is a fixed of if and only if is a root of . Therefore the approach above can be used to identify roots of . This approach is similar to a multivariate version of Newton's method, replacing the derivative with the fixed matrix .

We observe that if is a compact and convex region and , then, for any , there exist such that

Let be the Jacobian matrix of evaluated on . In other words, the entry consists of the image of over . It then follows that where the matrix-vector product is computed using interval arithmetic. Then, allowing to vary in , it follows that the image of on satisfies the following containment: where the calculations are, once again, computed using interval arithmetic. Combining this with the formula for , the result is the Krawczyck operator

where is the identity matrix.

If , then has a fixed point in , i.e., has a root in . On the other hand, if the maximum matrix norm using the supremum norm for vectors of all matrices in is less than , then is contractive within , so has a unique fixed point.

A simpler test, when is an axis-aligned parallelepiped, uses , i.e., the midpoint of . In this case, there is a unique root of if

where is the length of the longest side of .

Miranda test

A priori certification methods

Interval arithmetic

Interval arithmetic can be used to provide an a priori numerical certificate by computing intervals containing unique solutions. By using intervals instead of plain numeric types during path tracking, resulting candidates are represented by intervals. The candidate solution-interval is itself the certificate, in the sense that the solution is guaranteed to be inside the interval.

Condition numbers

Numerical algebraic geometry solves polynomial systems using homotopy continuation and path tracking methods. By monitoring the condition number for a tracked homotopy at every step, and ensuring that no two solution paths ever intersect, one can compute a numerical certificate along with a solution. This scheme is called a priori path tracking. [3]

Non-certified numerical path tracking relies on heuristic methods for controlling time step size and precision. [4] In contrast, a priori certified path tracking goes beyond heuristics to provide step size control that guarantees that for every step along the path, the current point is within the domain of quadratic convergence for the current path.

Related Research Articles

Least squares Method in statistics

The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems, i.e., sets of equations in which there are more equations than unknowns. "Least squares" means that the overall solution minimizes the sum of the squares of the residuals made in the results of every single equation.

In mathematics and computing, a root-finding algorithm is an algorithm for finding roots of continuous functions. A root of a function f, from the real numbers to real numbers or from the complex numbers to the complex numbers, is a number x such that f(x) = 0. As, generally, the roots of a function cannot be computed exactly, nor expressed in closed form, root-finding algorithms provide approximations to roots, expressed either as floating point numbers or as small isolating intervals, or disks for complex roots.

Collision detection is the computational problem of detecting the intersection of two or more objects. While collision detection is most often associated with its use in video games and other physical simulations, it also has applications in robotics. In addition to determining whether two objects have collided, collision detection systems may also calculate time of impact (TOI), and report a contact manifold. Collision response deals with simulating what happens when a collision is detected. Solving collision detection problems requires extensive use of concepts from linear algebra and computational geometry.

In mathematics, a linear differential equation is a differential equation that is defined by a linear polynomial in the unknown function and its derivatives, that is an equation of the form

Global optimization is a branch of applied mathematics and numerical analysis that attempts to find the global minima or maxima of a function or a set of functions on a given set. It is usually described as a minimization problem because the maximization of the real-valued function is obviously equivalent to the minimization of the function .

In mathematics, the regula falsi, method of false position, or false position method is a very old method for solving an equation in one unknown, that, in modified form, is still in use. In simple terms, the method is the trial and error technique of using test ("false") values for the variable and then adjusting the test value according to the outcome. This is sometimes also referred to as "guess and check". Versions of the method predate the advent of algebra and the use of equations.

Gauss–Newton algorithm algorithm

The Gauss–Newton algorithm is used to solve non-linear least squares problems. It is a modification of Newton's method for finding a minimum of a function. Unlike Newton's method, the Gauss–Newton algorithm can only be used to minimize a sum of squared function values, but it has the advantage that second derivatives, which can be challenging to compute, are not required.

In Bayesian statistics, a maximum a posteriori probability (MAP) estimate is an estimate of an unknown quantity, that equals the mode of the posterior distribution. The MAP can be used to obtain a point estimate of an unobserved quantity on the basis of empirical data. It is closely related to the method of maximum likelihood (ML) estimation, but employs an augmented optimization objective which incorporates a prior distribution over the quantity one wants to estimate. MAP estimation can therefore be seen as a regularization of ML estimation.

In numerical optimization, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is an iterative method for solving unconstrained nonlinear optimization problems.

Affine arithmetic (AA) is a model for self-validated numerical analysis. In AA, the quantities of interest are represented as affine combinations of certain primitive variables, which stand for sources of uncertainty in the data or approximations made during the computation.

Limited-memory BFGS is an optimization algorithm in the family of quasi-Newton methods that approximates the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm using a limited amount of computer memory. It is a popular algorithm for parameter estimation in machine learning. The algorithm's target problem is to minimize over unconstrained values of the real-vector where is a differentiable scalar function.

Quasi-Newton methods are methods used to either find zeroes or local maxima and minima of functions, as an alternative to Newton's method. They can be used if the Jacobian or Hessian is unavailable or is too expensive to compute at every iteration. The "full" Newton's method requires the Jacobian in order to search for zeros, or the Hessian for finding extrema.

Multi-objective optimization is an area of multiple criteria decision making that is concerned with mathematical optimization problems involving more than one objective function to be optimized simultaneously. Multi-objective optimization has been applied in many fields of science, including engineering, economics and logistics where optimal decisions need to be taken in the presence of trade-offs between two or more conflicting objectives. Minimizing cost while maximizing comfort while buying a car, and maximizing performance whilst minimizing fuel consumption and emission of pollutants of a vehicle are examples of multi-objective optimization problems involving two and three objectives, respectively. In practical problems, there can be more than three objectives.

The Kantorovich theorem, or Newton–Kantorovich theorem, is a mathematical statement on the semi-local convergence of Newton's method. It was first stated by Leonid Kantorovich in 1948. It is similar to the form of the Banach fixed-point theorem, although it does not assume the existence of a fixed point.

In numerical linear algebra, the Alternating Direction Implicit (ADI) method is an iterative method used to solve Sylvester matrix equations. It is a popular method for solving the large matrix equations that arise in systems theory and control, and can be formulated to construct solutions in a memory-efficient, factored form. It is also used to numerically solve parabolic and elliptic partial differential equations, and is a classic method used for modeling heat conduction and solving the diffusion equation in two or more dimensions. It is an example of an operator splitting method.

In the area of mathematics known as numerical ordinary differential equations, the direct multiple shooting method is a numerical method for the solution of boundary value problems. The method divides the interval over which a solution is sought into several smaller intervals, solves an initial value problem in each of the smaller intervals, and imposes additional matching conditions to form a solution on the whole interval. The method constitutes a significant improvement in distribution of nonlinearity and numerical stability over single shooting methods.

In scientific computation and simulation, the method of fundamental solutions (MFS) is getting a growing attention. The method is essentially based on the fundamental solution of a partial differential equation of interest as its basis function. The MFS was developed to overcome the major drawbacks in the boundary element method (BEM) which also uses the fundamental solution to satisfy the governing equation. Consequently, both the MFS and the BEM are of a boundary discretization numerical technique and reduce the computational complexity by one dimensionality and have particular edge over the domain-type numerical techniques such as the finite element and finite volume methods on the solution of infinite domain, thin-walled structures, and inverse problems.

References

  1. Smale, Steve (1986). "Newton's method estimates from data at one point". The merging of disciplines: new directions in pure, applied, and computational mathematics: 185–196.
  2. Hauenstein, Jonathan; Sottile, Frank (2012). "Algorithm 921: alphaCertified: certifying solutions to polynomial systems". ACM Transactions on Mathematical Software. 38 (4): 28. doi:10.1145/2331130.2331136.
  3. Beltran, Carlos; Leykin, Anton (2012). "Certified numerical homotopy tracking". Experimental Mathematics. 21 (1): 69–83.
  4. Bates, Daniel; Hauenstein, Jonathan; Sommese, Andrew; Wampler, Charles (2009). "Stepsize control for path tracking". Contemporary Mathematics. 496 (21).