Implicit function theorem

In multivariable calculus, the implicit function theorem [note 1] is a tool that allows relations to be converted to functions of several real variables. It does so by representing the relation as the graph of a function. There may not be a single function whose graph can represent the entire relation, but there may be such a function on a restriction of the domain of the relation. The implicit function theorem gives a sufficient condition to ensure that there is such a function.

More precisely, given a system of m equations fi(x1, ..., xn, y1, ..., ym) = 0, i = 1, ..., m (often abbreviated as F(x, y) = 0), the theorem states that, under a mild condition on the partial derivatives (with respect to each yi) at a point, the m variables yi are differentiable functions of the xj in some neighborhood of the point. As these functions generally cannot be expressed in closed form, they are implicitly defined by the equations, and this motivated the name of the theorem. [1]

In other words, under a mild condition on the partial derivatives, the set of zeros of a system of equations is locally the graph of a function.

History

Augustin-Louis Cauchy (1789–1857) is credited with the first rigorous form of the implicit function theorem. Ulisse Dini (1845–1918) generalized the real-variable version of the implicit function theorem to the context of functions of any number of real variables. [2]

First example

[Figure: Implicit circle.svg. The unit circle as the level curve f(x, y) = 1 of the function f(x, y) = x² + y². Around point A, y can be expressed as a function y(x); in this example the function can be written explicitly as g1(x) = √(1 − x²). In many cases no such explicit expression exists, but one can still refer to the implicit function y(x). No such function exists around point B.]

If we define the function f(x, y) = x² + y², then the equation f(x, y) = 1 cuts out the unit circle as the level set {(x, y) | f(x, y) = 1}. There is no way to represent the unit circle as the graph of a function of one variable y = g(x), because for each choice of x ∈ (−1, 1) there are two choices of y, namely ±√(1 − x²).

However, it is possible to represent part of the circle as the graph of a function of one variable. If we let g1(x) = √(1 − x²) for −1 ≤ x ≤ 1, then the graph of y = g1(x) provides the upper half of the circle. Similarly, if g2(x) = −√(1 − x²), then the graph of y = g2(x) gives the lower half of the circle.

The purpose of the implicit function theorem is to tell us that functions like g1(x) and g2(x) exist under a mild condition on the partial derivatives, even in situations where we cannot write down explicit formulas. It guarantees that g1(x) and g2(x) are differentiable, and it even works in situations where we do not have a formula for f(x, y).
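
As a concrete check of this first example, the following short Python sketch (an illustration added here, not part of the original article; it assumes NumPy is available) samples the two explicit branches g1 and g2 and verifies that their graphs lie on the level set f(x, y) = 1.

```python
import numpy as np

def f(x, y):
    # The function whose level set f(x, y) = 1 is the unit circle.
    return x**2 + y**2

def g1(x):
    # Upper half of the circle, defined for -1 <= x <= 1.
    return np.sqrt(1 - x**2)

def g2(x):
    # Lower half of the circle.
    return -np.sqrt(1 - x**2)

x = np.linspace(-1, 1, 201)
# Both branches satisfy the defining relation up to rounding error.
assert np.allclose(f(x, g1(x)), 1.0)
assert np.allclose(f(x, g2(x)), 1.0)
```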

Definitions

Let f : ℝ^(n+m) → ℝ^m be a continuously differentiable function. We think of ℝ^(n+m) as the Cartesian product ℝ^n × ℝ^m and we write a point of this product as (x, y) = (x1, ..., xn, y1, ..., ym). Starting from the given function f, our goal is to construct a function g : ℝ^n → ℝ^m whose graph (x, g(x)) is precisely the set of all (x, y) such that f(x, y) = 0.

As noted above, this may not always be possible. We will therefore fix a point (a, b) = (a1, ..., an, b1, ..., bm) which satisfies f(a, b) = 0, and we will ask for a g that works near the point (a, b). In other words, we want an open set U ⊂ ℝ^n containing a, an open set V ⊂ ℝ^m containing b, and a function g : U → V such that the graph of g satisfies the relation f = 0 on U × V, and that no other points within U × V do so. In symbols,

{(x, g(x)) : x ∈ U} = {(x, y) ∈ U × V : f(x, y) = 0}.

To state the implicit function theorem, we need the Jacobian matrix of f, which is the matrix of the partial derivatives of f. Abbreviating (a1, ..., an, b1, ..., bm) to (a, b), the Jacobian matrix is

(Df)(a, b) = [ ∂fi/∂xj(a, b) | ∂fi/∂yj(a, b) ] = [ X | Y ],

where X = [∂fi/∂xj(a, b)] is the m × n matrix of partial derivatives in the variables xj and Y = [∂fi/∂yj(a, b)] is the m × m matrix of partial derivatives in the variables yj. The implicit function theorem says that if Y is an invertible matrix, then there are U, V, and g as desired. Writing all the hypotheses together gives the following statement.
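
To make the block decomposition [ X | Y ] concrete, here is a small illustrative sketch (not from the article; it assumes SymPy and uses a hypothetical system with n = 2 and m = 2) that computes the two panels of the Jacobian separately.

```python
import sympy as sp

x1, x2, y1, y2 = sp.symbols('x1 x2 y1 y2')

# A hypothetical system f(x, y) = 0 with n = 2 and m = 2 (chosen for illustration).
f = sp.Matrix([x1**2 + y1 + sp.sin(y2),
               x2 + y1*y2 + y2 - 1])

X = f.jacobian([x1, x2])   # m x n panel: partial derivatives in the x variables
Y = f.jacobian([y1, y2])   # m x m panel: partial derivatives in the y variables

# The theorem asks whether Y is invertible at a chosen point (a, b) with f(a, b) = 0.
point = {x1: 0, x2: 1, y1: 0, y2: 0}
print(X.subs(point))
print(Y.subs(point), Y.subs(point).det())   # determinant 1, so Y is invertible here
```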

Statement of the theorem

Let f : ℝ^(n+m) → ℝ^m be a continuously differentiable function, and let ℝ^(n+m) have coordinates (x, y). Fix a point (a, b) = (a1, ..., an, b1, ..., bm) with f(a, b) = 0, where 0 ∈ ℝ^m is the zero vector. If the Jacobian matrix

Y(a, b) = [ ∂fi/∂yj(a, b) ]

(this is the right-hand panel of the Jacobian matrix shown in the previous section) is invertible, then there exists an open set U ⊂ ℝ^n containing a such that there exists a unique function g : U → ℝ^m such that g(a) = b and f(x, g(x)) = 0 for all x ∈ U. Moreover, g is continuously differentiable and, denoting the left-hand panel of the Jacobian matrix shown in the previous section as

X(x, y) = [ ∂fi/∂xj(x, y) ],

the Jacobian matrix of partial derivatives of g in U is given by the matrix product: [3]

[ ∂gi/∂xj(x) ] = − Y(x, g(x))⁻¹ X(x, g(x)).
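
The matrix product above can be checked numerically. The sketch below is a minimal illustration (assuming NumPy; the sphere example and the finite-difference check are ours, not from the article): for f(x1, x2, y) = x1² + x2² + y² − 1 it compares −Y⁻¹X with the Jacobian of the known explicit solution.

```python
import numpy as np

# Unit sphere: f(x1, x2, y) = x1^2 + x2^2 + y^2 - 1 = 0, solved for y near (0, 0, 1).
def g(x):
    # Explicit solution on the upper hemisphere, used only to check the formula.
    return np.sqrt(1.0 - x[0]**2 - x[1]**2)

x = np.array([0.3, 0.4])
y = g(x)

J_fx = np.array([[2*x[0], 2*x[1]]])   # 1 x 2 panel X: derivatives in the x variables
J_fy = np.array([[2*y]])              # 1 x 1 panel Y: derivative in the y variable

# Jacobian of g predicted by the implicit function theorem.
Dg_theorem = -np.linalg.inv(J_fy) @ J_fx

# Central finite-difference Jacobian of the explicit solution, for comparison.
eps = 1e-6
Dg_numeric = np.array([[(g(x + eps*np.eye(2)[i]) - g(x - eps*np.eye(2)[i])) / (2*eps)
                        for i in range(2)]])

assert np.allclose(Dg_theorem, Dg_numeric, atol=1e-6)
```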

Higher derivatives

If, moreover, f is analytic or continuously differentiable k times in a neighborhood of (a, b), then one may choose U in order that the same holds true for g inside U. [4] In the analytic case, this is called the analytic implicit function theorem.

Proof for 2D case

Suppose F : ℝ² → ℝ is a continuously differentiable function defining a curve F(r) = F(x, y) = 0. Let (x0, y0) be a point on the curve. The statement of the theorem above can be rewritten for this simple case as follows:

Theorem. If ∂F/∂y(x0, y0) ≠ 0, then in a neighbourhood of the point (x0, y0) we can write F(x, y) = 0 ⇒ y = f(x), where f is a real function.

Proof. Since F is differentiable we write the differential of F through partial derivatives:

dF = grad F ⋅ dr = (∂F/∂x) dx + (∂F/∂y) dy.

Since we are restricted to movement on the curve, dF = 0, and by assumption ∂F/∂y ≠ 0 around the point (x0, y0) (since ∂F/∂y is continuous at (x0, y0) and ∂F/∂y(x0, y0) ≠ 0). Therefore we have a first-order ordinary differential equation:

(∂F/∂x) dx + (∂F/∂y) dy = 0,  y(x0) = y0.

Now we are looking for a solution to this ODE in an open interval around the point (x0, y0) for which, at every point in it, ∂F/∂y ≠ 0. Since F is continuously differentiable and from the assumption we have

|∂F/∂x| < ∞,  |∂F/∂y| < ∞,  ∂F/∂y ≠ 0.

From this we know that dy/dx = −(∂F/∂x)/(∂F/∂y) is continuous and bounded on both ends. From here we know that −(∂F/∂x)/(∂F/∂y) is Lipschitz continuous in both x and y. Therefore, by the Cauchy–Lipschitz theorem, there exists a unique y(x) that is the solution to the given ODE with the initial condition y(x0) = y0. Q.E.D.
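
The proof reduces the problem to the ODE dy/dx = −(∂F/∂x)/(∂F/∂y) with initial condition y(x0) = y0. The following sketch is our own illustration of that reduction (assuming NumPy; the unit circle F(x, y) = x² + y² − 1 is used as the test case): it integrates the ODE with a simple explicit Euler scheme and compares the result with the known explicit branch.

```python
import numpy as np

def Fx(x, y):
    return 2*x          # dF/dx for F(x, y) = x^2 + y^2 - 1

def Fy(x, y):
    return 2*y          # dF/dy, nonzero away from y = 0

def slope(x, y):
    # The ODE from the proof: dy/dx = -Fx/Fy, valid while Fy != 0.
    return -Fx(x, y) / Fy(x, y)

# Integrate from (x0, y0) = (0, 1) with a small explicit Euler step.
x, y, h = 0.0, 1.0, 1e-4
while x < 0.6:
    y += h * slope(x, y)
    x += h

# The integrated implicit function agrees with the explicit branch sqrt(1 - x^2).
print(y, np.sqrt(1 - x**2))
```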

The circle example

Let us go back to the example of the unit circle. In this case n = m = 1 and f(x, y) = x² + y² − 1. The matrix of partial derivatives is just a 1 × 2 matrix, given by

(Df)(a, b) = [ 2a  2b ].

Thus, here, the Y in the statement of the theorem is just the number 2b; the linear map defined by it is invertible if and only if b ≠ 0. By the implicit function theorem we see that we can locally write the circle in the form y = g(x) for all points where y ≠ 0. For (±1, 0) we run into trouble, as noted before. The implicit function theorem may still be applied to these two points, by writing x as a function of y, that is, x = h(y); now the graph of the function will be (h(y), y), since where b = 0 we have a = ±1, and the conditions to locally express the function in this form are satisfied.

The implicit derivative of y with respect to x, and that of x with respect to y, can be found by totally differentiating the implicit function x² + y² − 1 and equating to 0:

2x dx + 2y dy = 0,

giving

dy/dx = −x/y

and

dx/dy = −y/x.
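
These implicit derivatives can also be obtained mechanically. A brief sketch (illustrative only, assuming SymPy is available) treats y as an unknown function of x, differentiates the relation, and recovers dy/dx = −x/y.

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')(x)            # treat y as an unknown function of x

circle = x**2 + y**2 - 1           # the implicit relation f(x, y) = 0

# Differentiate the relation with respect to x and solve for dy/dx.
dydx = sp.solve(sp.diff(circle, x), sp.Derivative(y, x))[0]
print(sp.simplify(dydx))           # prints -x/y(x)
```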

Application: change of coordinates

Suppose we have an m-dimensional space, parametrised by a set of coordinates (x1, ..., xm). We can introduce a new coordinate system (x'1, ..., x'm) by supplying m functions h1, ..., hm, each being continuously differentiable. These functions allow us to calculate the new coordinates (x'1, ..., x'm) of a point, given the point's old coordinates (x1, ..., xm), using x'1 = h1(x1, ..., xm), ..., x'm = hm(x1, ..., xm). One might want to verify if the opposite is possible: given coordinates (x'1, ..., x'm), can we 'go back' and calculate the same point's original coordinates (x1, ..., xm)? The implicit function theorem will provide an answer to this question. The (new and old) coordinates (x'1, ..., x'm, x1, ..., xm) are related by f = 0, with

f(x'1, ..., x'm, x1, ..., xm) = (h1(x1, ..., xm) − x'1, ..., hm(x1, ..., xm) − x'm).

Now the Jacobian matrix of f at a certain point (a, b) [where a = (x'1, ..., x'm) and b = (x1, ..., xm)] is given by

(Df)(a, b) = [ −Im | J ],

where Im denotes the m × m identity matrix, and J is the m × m matrix of partial derivatives ∂hi/∂xj, evaluated at (a, b). (In the above, these blocks were denoted by X and Y. As it happens, in this particular application of the theorem, neither matrix depends on a.) The implicit function theorem now states that we can locally express (x1, ..., xm) as a function of (x'1, ..., x'm) if J is invertible. Demanding that J is invertible is equivalent to det J ≠ 0, thus we see that we can go back from the primed to the unprimed coordinates if the determinant of the Jacobian J is non-zero. This statement is also known as the inverse function theorem.
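
As a sketch of what 'going back' means computationally (illustrative only; the coordinate change h and the use of Newton's method are our own choices, assuming NumPy), one can invert a continuously differentiable coordinate change numerically wherever its Jacobian J is invertible.

```python
import numpy as np

def h(x):
    # A hypothetical continuously differentiable coordinate change on R^2.
    return np.array([x[0] + 0.1*np.sin(x[1]), x[1] + 0.1*x[0]**2])

def J(x):
    # Jacobian of h; invertibility of J is the condition in the theorem.
    return np.array([[1.0, 0.1*np.cos(x[1])],
                     [0.2*x[0], 1.0]])

def invert(x_new, x_guess, steps=20):
    # Newton's method for h(x) = x_new; a local solution exists when det J != 0.
    x = x_guess.copy()
    for _ in range(steps):
        x -= np.linalg.solve(J(x), h(x) - x_new)
    return x

x_old = np.array([0.5, 1.2])
x_new = h(x_old)
print(invert(x_new, np.array([0.0, 0.0])), x_old)   # the recovered point matches x_old
```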

Example: polar coordinates

As a simple application of the above, consider the plane, parametrised by polar coordinates (R, θ). We can go to a new coordinate system (Cartesian coordinates) by defining functions x(R, θ) = R cos(θ) and y(R, θ) = R sin(θ). This makes it possible, given any point (R, θ), to find corresponding Cartesian coordinates (x, y). When can we go back and convert Cartesian into polar coordinates? By the previous example, it is sufficient to have det J ≠ 0, with

J = [ ∂x/∂R   ∂x/∂θ ]  =  [ cos(θ)   −R sin(θ) ]
    [ ∂y/∂R   ∂y/∂θ ]     [ sin(θ)    R cos(θ) ].

Since det J = R, conversion back to polar coordinates is possible if R ≠ 0. So it remains to check the case R = 0. It is easy to see that in case R = 0, our coordinate transformation is not invertible: at the origin, the value of θ is not well-defined.
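
The determinant computation can be verified symbolically. A short sketch (illustrative, assuming SymPy is available, and not part of the article) builds the Jacobian of (x, y) = (R cos θ, R sin θ) and confirms det J = R.

```python
import sympy as sp

R, theta = sp.symbols('R theta', real=True)

# Cartesian coordinates as functions of the polar coordinates.
cart = sp.Matrix([R*sp.cos(theta), R*sp.sin(theta)])

J = cart.jacobian([R, theta])
print(sp.simplify(J.det()))   # prints R, so the map is locally invertible for R != 0
```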

Generalizations

Banach space version

Based on the inverse function theorem in Banach spaces, it is possible to extend the implicit function theorem to Banach space valued mappings. [5] [6]

Let X, Y, Z be Banach spaces. Let the mapping f : X × Y → Z be continuously Fréchet differentiable. If (x0, y0) ∈ X × Y, f(x0, y0) = 0, and y ↦ Df(x0, y0)(0, y) is a Banach space isomorphism from Y onto Z, then there exist neighbourhoods U of x0 and V of y0 and a Fréchet differentiable function g : U → V such that f(x, g(x)) = 0 and f(x, y) = 0 if and only if y = g(x), for all (x, y) ∈ U × V.

Implicit functions from non-differentiable functions

Various forms of the implicit function theorem exist for the case when the function f is not differentiable. It is standard that local strict monotonicity suffices in one dimension. [7] The following more general form was proven by Kumagai based on an observation by Jittorntrum. [8] [9]

Consider a continuous function f : ℝ^n × ℝ^m → ℝ^n such that f(x0, y0) = 0. If there exist open neighbourhoods A ⊂ ℝ^n and B ⊂ ℝ^m of x0 and y0, respectively, such that, for all y in B, f(·, y) : A → ℝ^n is locally one-to-one, then there exist open neighbourhoods A0 ⊂ ℝ^n and B0 ⊂ ℝ^m of x0 and y0, such that, for all y ∈ B0, the equation f(x, y) = 0 has a unique solution

x = g(y) ∈ A0,

where g is a continuous function from B0 into A0.
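
In one dimension, the continuous version can be made concrete with a bisection search: if f(·, y) is continuous and strictly monotone in x near the root, the solution x = g(y) can be bracketed and found without any derivatives. The sketch below is our own illustration of that idea (assuming NumPy; the non-differentiable f is a hypothetical example), not Kumagai's construction.

```python
import numpy as np

def f(x, y):
    # Continuous, strictly increasing in x, but not differentiable at x = 0.
    return x + 0.5*np.abs(x) - y

def g(y, lo=-10.0, hi=10.0, tol=1e-12):
    # Bisection: solve f(x, y) = 0 for x, using only continuity and monotonicity.
    while hi - lo > tol:
        mid = 0.5*(lo + hi)
        if f(mid, y) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5*(lo + hi)

for y in (-1.0, 0.0, 2.0):
    x = g(y)
    print(y, x, f(x, y))   # f(g(y), y) is ~0, and g depends continuously on y
```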

Collapsing manifolds

Perelman’s collapsing theorem for 3-manifolds, the capstone of his proof of Thurston's geometrization conjecture, can be understood as an extension of the implicit function theorem. [10]

See also

Notes

  1. Also called Dini's theorem by the Pisan school in Italy. In the English-language literature, Dini's theorem is a different theorem in mathematical analysis.

References

  1. Chiang, Alpha C. (1984). Fundamental Methods of Mathematical Economics (3rd ed.). McGraw-Hill. pp. 204–206. ISBN 0-07-010813-7.
  2. Krantz, Steven; Parks, Harold (2003). The Implicit Function Theorem. Modern Birkhauser Classics. Birkhauser. ISBN 0-8176-4285-4.
  3. de Oliveira, Oswaldo (2013). "The Implicit and Inverse Function Theorems: Easy Proofs". Real Anal. Exchange. 39 (1): 214–216. arXiv:1212.2066. doi:10.14321/realanalexch.39.1.0207. S2CID 118792515.
  4. Fritzsche, K.; Grauert, H. (2002). From Holomorphic Functions to Complex Manifolds. Springer. p. 34. ISBN 9780387953953.
  5. Lang, Serge (1999). Fundamentals of Differential Geometry. Graduate Texts in Mathematics. New York: Springer. pp. 15–21. ISBN 0-387-98593-X.
  6. Edwards, Charles Henry (1994) [1973]. Advanced Calculus of Several Variables. Mineola, New York: Dover Publications. pp. 417–418. ISBN 0-486-68336-2.
  7. Kudryavtsev, Lev Dmitrievich (2001) [1994]. "Implicit function". Encyclopedia of Mathematics. EMS Press.
  8. Jittorntrum, K. (1978). "An Implicit Function Theorem". Journal of Optimization Theory and Applications. 25 (4): 575–577. doi:10.1007/BF00933522. S2CID 121647783.
  9. Kumagai, S. (1980). "An implicit function theorem: Comment". Journal of Optimization Theory and Applications. 31 (2): 285–288. doi:10.1007/BF00934117. S2CID 119867925.
  10. Cao, Jianguo; Ge, Jian (2011). "A simple proof of Perelman's collapsing theorem for 3-manifolds". J. Geom. Anal. 21 (4): 807–869. arXiv:1003.2215. doi:10.1007/s12220-010-9169-5. S2CID 514106.

Further reading