Envelope theorem

Last updated

In mathematics and economics, the envelope theorem is a major result about the differentiability properties of the value function of a parameterized optimization problem. [1] As we change parameters of the objective, the envelope theorem shows that, in a certain sense, changes in the optimizer of the objective do not contribute to the change in the objective function. The envelope theorem is an important tool for comparative statics of optimization models. [2]

Contents

The term envelope derives from describing the graph of the value function as the "upper envelope" of the graphs of the parameterized family of functions that are optimized.

Statement

Let and be real-valued continuously differentiable functions on , where are choice variables and are parameters, and consider the problem of choosing , for a given , so as to:

subject to and .

The Lagrangian expression of this problem is given by

where are the Lagrange multipliers. Now let and together be the solution that maximizes the objective function f subject to the constraints (and hence are saddle points of the Lagrangian),

and define the value function

Then we have the following theorem. [3] [4]

Theorem:Assume that and are continuously differentiable. Then

where .

For arbitrary choice sets

Let denote the choice set and let the relevant parameter be . Letting denote the parameterized objective function, the value function and the optimal choice correspondence (set-valued function) are given by:

 

 

 

 

(1)

 

 

 

 

(2)

"Envelope theorems" describe sufficient conditions for the value function to be differentiable in the parameter and describe its derivative as

 

 

 

 

(3)

where denotes the partial derivative of with respect to . Namely, the derivative of the value function with respect to the parameter equals the partial derivative of the objective function with respect to holding the maximizer fixed at its optimal level.

Traditional envelope theorem derivations use the first-order condition for ( 1 ), which requires that the choice set have the convex and topological structure, and the objective function be differentiable in the variable . (The argument is that changes in the maximizer have only a "second-order effect" at the optimum and so can be ignored.) However, in many applications such as the analysis of incentive constraints in contract theory and game theory, nonconvex production problems, and "monotone" or "robust" comparative statics, the choice sets and objective functions generally lack the topological and convexity properties required by the traditional envelope theorems.

Paul Milgrom and Segal (2002) observe that the traditional envelope formula holds for optimization problems with arbitrary choice sets at any differentiability point of the value function, [5] provided that the objective function is differentiable in the parameter:

Theorem 1: Let and . If both and exist, the envelope formula ( 3 ) holds.

Proof: Equation ( 1 ) implies that for ,

Under the assumptions, the objective function of the displayed maximization problem is differentiable at , and the first-order condition for this maximization is exactly equation ( 3 ). Q.E.D.

While differentiability of the value function in general requires strong assumptions, in many applications weaker conditions such as absolute continuity, differentiability almost everywhere, or left- and right-differentiability, suffice. In particular, Milgrom and Segal's (2002) Theorem 2 offers a sufficient condition for to be absolutely continuous, [5] which means that it is differentiable almost everywhere and can be represented as an integral of its derivative:

Theorem 2: Suppose that is absolutely continuous for all . Suppose also that there exists an integrable function such that for all and almost all . Then is absolutely continuous. Suppose, in addition, that is differentiable for all , and that almost everywhere on . Then for any selection ,

 

 

 

 

(4)

Proof: Using ( 1 )(1), observe that for any with ,

This implies that is absolutely continuous. Therefore, is differentiable almost everywhere, and using ( 3 ) yields ( 4 ). Q.E.D.

This result dispels the common misconception that nice behavior of the value function requires correspondingly nice behavior of the maximizer. Theorem 2 ensures the absolute continuity of the value function even though the maximizer may be discontinuous. In a similar vein, Milgrom and Segal's (2002) Theorem 3 implies that the value function must be differentiable at and hence satisfy the envelope formula ( 3 ) when the family is equi-differentiable at and is single-valued and continuous at , even if the maximizer is not differentiable at (e.g., if is described by a set of inequality constraints and the set of binding constraints changes at ). [5]

Applications

Applications to producer theory

Theorem 1 implies Hotelling's lemma at any differentiability point of the profit function, and Theorem 2 implies the producer surplus formula. Formally, let denote the indirect profit function of a price-taking firm with production set facing prices , and let denote the firm's supply function, i.e.,

Let (the price of good ) and fix the other goods' prices at . Applying Theorem 1 to yields (the firm's optimal supply of good ). Applying Theorem 2 (whose assumptions are verified when is restricted to a bounded interval) yields

i.e. the producer surplus can be obtained by integrating under the firm's supply curve for good .

Applications to mechanism design and auction theory

Consider an agent whose utility function over outcomes depends on his type . Let represent the "menu" of possible outcomes the agent could obtain in the mechanism by sending different messages. The agent's equilibrium utility in the mechanism is then given by (1), and the set of the mechanism's equilibrium outcomes is given by (2). Any selection is a choice rule implemented by the mechanism. Suppose that the agent's utility function is differentiable and absolutely continuous in for all , and that is integrable on . Then Theorem 2 implies that the agent's equilibrium utility in any mechanism implementing a given choice rule must satisfy the integral condition (4).

The integral condition (4) is a key step in the analysis of mechanism design problems with continuous type spaces. In particular, in Myerson's (1981) analysis of single-item auctions, the outcome from the viewpoint of one bidder can be described as , where is the bidder's probability of receiving the object and is his expected payment, and the bidder's expected utility takes the form . In this case, letting denote the bidder's lowest possible type, the integral condition (4) for the bidder's equilibrium expected utility takes the form

(This equation can be interpreted as the producer surplus formula for the firm whose production technology for converting numeraire into probability of winning the object is defined by the auction and which resells the object at a fixed price ). This condition in turn yields Myerson's (1981) celebrated revenue equivalence theorem: the expected revenue generated in an auction in which bidders have independent private values is fully determined by the bidders' probabilities of getting the object for all types as well as by the expected payoffs of the bidders' lowest types. Finally, this condition is a key step in Myerson's (1981) of optimal auctions. [6]

For other applications of the envelope theorem to mechanism design see Mirrlees (1971), [7] Holmstrom (1979), [8] Laffont and Maskin (1980), [9] Riley and Samuelson (1981), [10] Fudenberg and Tirole (1991), [11] and Williams (1999). [12] While these authors derived and exploited the envelope theorem by restricting attention to (piecewise) continuously differentiable choice rules or even narrower classes, it may sometimes be optimal to implement a choice rule that is not piecewise continuously differentiable. (One example is the class of trading problems with linear utility described in chapter 6.5 of Myerson (1991). [13] ) Note that the integral condition (3) still holds in this setting and implies such important results as Holmstrom's lemma (Holmstrom, 1979), [8] Myerson's lemma (Myerson, 1981), [6] the revenue equivalence theorem (for auctions), the Green–Laffont–Holmstrom theorem (Green and Laffont, 1979; Holmstrom, 1979), [14] [8] the Myerson–Satterthwaite inefficiency theorem (Myerson and Satterthwaite, 1983), [15] the Jehiel–Moldovanu impossibility theorems (Jehiel and Moldovanu, 2001), [16] the McAfee–McMillan weak-cartels theorem (McAfee and McMillan, 1992), [17] and Weber's martingale theorem (Weber, 1983), [18] etc. The details of these applications are provided in Chapter 3 of Milgrom (2004), [19] who offers an elegant and unifying framework in auction and mechanism design analysis mainly based on the envelope theorem and other familiar techniques and concepts in demand theory.

Applications to multidimensional parameter spaces

For a multidimensional parameter space , Theorem 1 can be applied to partial and directional derivatives of the value function.[ citation needed ] If both the objective function and the value function are (totally) differentiable in , Theorem 1 implies the envelope formula for their gradients:[ citation needed ] for each . While total differentiability of the value function may not be easy to ensure, Theorem 2 can be still applied along any smooth path connecting two parameter values and .[ citation needed ] Namely, suppose that functions are differentiable for all with for all . A smooth path from to is described by a differentiable mapping with a bounded derivative, such that and .[ citation needed ] Theorem 2 implies that for any such smooth path, the change of the value function can be expressed as the path integral of the partial gradient of the objective function along the path:[ citation needed ]

In particular, for , this establishes that cyclic path integrals along any smooth path must be zero:[ citation needed ]

This "integrability condition" plays an important role in mechanism design with multidimensional types, constraining what kind of choice rules can be sustained by mechanism-induced menus .[ citation needed ] In application to producer theory, with being the firm's production vector and being the price vector, , and the integrability condition says that any rationalizable supply function must satisfy

When is continuously differentiable, this integrability condition is equivalent to the symmetry of the substitution matrix . (In consumer theory, the same argument applied to the expenditure minimization problem yields symmetry of the Slutsky matrix.)

Applications to parameterized constraints

Suppose now that the feasible set depends on the parameter, i.e.,

where for some

Suppose that is a convex set, and are concave in , and there exists such that for all . Under these assumptions, it is well known that the above constrained optimization program can be represented as a saddle-point problem for the Lagrangian , where is the vector of Lagrange multipliers chosen by the adversary to minimize the Lagrangian. [20] [ page needed ] [21] [ page needed ] This allows the application of Milgrom and Segal's (2002, Theorem 4) envelope theorem for saddle-point problems, [5] under the additional assumptions that is a compact set in a normed linear space, and are continuous in , and and are continuous in . In particular, letting denote the Lagrangian's saddle point for parameter value , the theorem implies that is absolutely continuous and satisfies

For the special case in which is independent of , , and , the formula implies that for a.e. . That is, the Lagrange multiplier on the constraint is its "shadow price" in the optimization program. [21] [ page needed ]

Other applications

Milgrom and Segal (2002) demonstrate that the generalized version of the envelope theorems can also be applied to convex programming, continuous optimization problems, saddle-point problems, and optimal stopping problems. [5]

See also

Related Research Articles

In differential geometry, a Riemannian manifold or Riemannian space(M, g), so called after the German mathematician Bernhard Riemann, is a real, smooth manifold M equipped with a positive-definite inner product gp on the tangent space TpM at each point p.

<span class="mw-page-title-main">Noether's theorem</span> Statement relating differentiable symmetries to conserved quantities

Noether's theorem or Noether's first theorem states that every differentiable symmetry of the action of a physical system with conservative forces has a corresponding conservation law. The theorem was proven by mathematician Emmy Noether in 1915 and published in 1918. The action of a physical system is the integral over time of a Lagrangian function, from which the system's behavior can be determined by the principle of least action. This theorem only applies to continuous and smooth symmetries over physical space.

In mathematics, a self-adjoint operator on an infinite-dimensional complex vector space V with inner product is a linear map A that is its own adjoint. If V is finite-dimensional with a given orthonormal basis, this is equivalent to the condition that the matrix of A is a Hermitian matrix, i.e., equal to its conjugate transpose A. By the finite-dimensional spectral theorem, V has an orthonormal basis such that the matrix of A relative to this basis is a diagonal matrix with entries in the real numbers. This article deals with applying generalizations of this concept to operators on Hilbert spaces of arbitrary dimension.

In mathematics, the covariant derivative is a way of specifying a derivative along tangent vectors of a manifold. Alternatively, the covariant derivative is a way of introducing and working with a connection on a manifold by means of a differential operator, to be contrasted with the approach given by a principal connection on the frame bundle – see affine connection. In the special case of a manifold isometrically embedded into a higher-dimensional Euclidean space, the covariant derivative can be viewed as the orthogonal projection of the Euclidean directional derivative onto the manifold's tangent space. In this case the Euclidean derivative is broken into two parts, the extrinsic normal component and the intrinsic covariant derivative component.

A directional derivative is a concept in multivariable calculus that measures the rate at which a function changes in a particular direction at a given point.

In mathematics, the Newton polygon is a tool for understanding the behaviour of polynomials over local fields, or more generally, over ultrametric fields. In the original case, the local field of interest was essentially the field of formal Laurent series in the indeterminate X, i.e. the field of fractions of the formal power series ring , over , where was the real number or complex number field. This is still of considerable utility with respect to Puiseux expansions. The Newton polygon is an effective device for understanding the leading terms of the power series expansion solutions to equations where is a polynomial with coefficients in , the polynomial ring; that is, implicitly defined algebraic functions. The exponents here are certain rational numbers, depending on the branch chosen; and the solutions themselves are power series in with for a denominator corresponding to the branch. The Newton polygon gives an effective, algorithmic approach to calculating .

In probability theory and related fields, Malliavin calculus is a set of mathematical techniques and ideas that extend the mathematical field of calculus of variations from deterministic functions to stochastic processes. In particular, it allows the computation of derivatives of random variables. Malliavin calculus is also called the stochastic calculus of variations. P. Malliavin first initiated the calculus on infinite dimensional space. Then, the significant contributors such as S. Kusuoka, D. Stroock, J-M. Bismut, S. Watanabe, I. Shigekawa, and so on finally completed the foundations.

In mathematics and mathematical optimization, the convex conjugate of a function is a generalization of the Legendre transformation which applies to non-convex functions. It is also known as Legendre–Fenchel transformation, Fenchel transformation, or Fenchel conjugate. It allows in particular for a far reaching generalization of Lagrangian duality.

<span class="mw-page-title-main">Lambert series</span> Mathematical term

In mathematics, a Lambert series, named for Johann Heinrich Lambert, is a series taking the form

In quantum mechanics, the Hellmann–Feynman theorem relates the derivative of the total energy with respect to a parameter to the expectation value of the derivative of the Hamiltonian with respect to that same parameter. According to the theorem, once the spatial distribution of the electrons has been determined by solving the Schrödinger equation, all the forces in the system can be calculated using classical electrostatics.

In continuum mechanics, the finite strain theory—also called large strain theory, or large deformation theory—deals with deformations in which strains and/or rotations are large enough to invalidate assumptions inherent in infinitesimal strain theory. In this case, the undeformed and deformed configurations of the continuum are significantly different, requiring a clear distinction between them. This is commonly the case with elastomers, plastically-deforming materials and other fluids and biological soft tissue.

In mathematical optimization, the Karush–Kuhn–Tucker (KKT) conditions, also known as the Kuhn–Tucker conditions, are first derivative tests for a solution in nonlinear programming to be optimal, provided that some regularity conditions are satisfied.

In mathematics, the Gauss–Manin connection is a connection on a certain vector bundle over a base space S of a family of algebraic varieties . The fibers of the vector bundle are the de Rham cohomology groups of the fibers of the family. It was introduced by Yuri Manin (1958) for curves S and by Alexander Grothendieck (1966) in higher dimensions.

<span class="mw-page-title-main">Elliptic boundary value problem</span>

In mathematics, an elliptic boundary value problem is a special kind of boundary value problem which can be thought of as the stable state of an evolution problem. For example, the Dirichlet problem for the Laplacian gives the eventual distribution of heat in a room several hours after the heating is turned on.

In mathematics, the spectral theory of ordinary differential equations is the part of spectral theory concerned with the determination of the spectrum and eigenfunction expansion associated with a linear ordinary differential equation. In his dissertation, Hermann Weyl generalized the classical Sturm–Liouville theory on a finite closed interval to second order differential operators with singularities at the endpoints of the interval, possibly semi-infinite or infinite. Unlike the classical case, the spectrum may no longer consist of just a countable set of eigenvalues, but may also contain a continuous part. In this case the eigenfunction expansion involves an integral over the continuous part with respect to a spectral measure, given by the Titchmarsh–Kodaira formula. The theory was put in its final simplified form for singular differential equations of even degree by Kodaira and others, using von Neumann's spectral theorem. It has had important applications in quantum mechanics, operator theory and harmonic analysis on semisimple Lie groups.

In mathematics, the Plancherel theorem for spherical functions is an important result in the representation theory of semisimple Lie groups, due in its final form to Harish-Chandra. It is a natural generalisation in non-commutative harmonic analysis of the Plancherel formula and Fourier inversion formula in the representation theory of the group of real numbers in classical harmonic analysis and has a similarly close interconnection with the theory of differential equations. It is the special case for zonal spherical functions of the general Plancherel theorem for semisimple Lie groups, also proved by Harish-Chandra. The Plancherel theorem gives the eigenfunction expansion of radial functions for the Laplacian operator on the associated symmetric space X; it also gives the direct integral decomposition into irreducible representations of the regular representation on L2(X). In the case of hyperbolic space, these expansions were known from prior results of Mehler, Weyl and Fock.

The derivatives of scalars, vectors, and second-order tensors with respect to second-order tensors are of considerable use in continuum mechanics. These derivatives are used in the theories of nonlinear elasticity and plasticity, particularly in the design of algorithms for numerical simulations.

For computer science, in statistical learning theory, a representer theorem is any of several related results stating that a minimizer of a regularized empirical risk functional defined over a reproducing kernel Hilbert space can be represented as a finite linear combination of kernel products evaluated on the input points in the training set data.

In mathematics, there are many kinds of inequalities involving matrices and linear operators on Hilbert spaces. This article covers some important operator inequalities connected with traces of matrices.

In mathematics, calculus on Euclidean space is a generalization of calculus of functions in one or several variables to calculus of functions on Euclidean space as well as a finite-dimensional real vector space. This calculus is also known as advanced calculus, especially in the United States. It is similar to multivariable calculus but is somewhat more sophisticated in that it uses linear algebra more extensively and covers some concepts from differential geometry such as differential forms and Stokes' formula in terms of differential forms. This extensive use of linear algebra also allows a natural generalization of multivariable calculus to calculus on Banach spaces or topological vector spaces.

References

  1. Border, Kim C. (2019). "Miscellaneous Notes on Optimization Theory and Related Topics". Lecture Notes. California Institute of Technology: 154.
  2. Carter, Michael (2001). Foundations of Mathematical Economics. Cambridge: MIT Press. pp. 603–609. ISBN   978-0-262-53192-4.
  3. Afriat, S. N. (1971). "Theory of Maxima and the Method of Lagrange". SIAM Journal on Applied Mathematics . 20 (3): 343–357. doi:10.1137/0120037.
  4. Takayama, Akira (1985). Mathematical Economics (Second ed.). New York: Cambridge University Press. pp.  137–138. ISBN   978-0-521-31498-5.
  5. 1 2 3 4 5 Milgrom, Paul; Ilya Segal (2002). "Envelope Theorems for Arbitrary Choice Sets". Econometrica. 70 (2): 583–601. CiteSeerX   10.1.1.217.4736 . doi:10.1111/1468-0262.00296.
  6. 1 2 Myerson, Roger B. (1981). "Optimal Auction Design". Mathematics of Operations Research . 6 (1): 58–73. doi:10.1287/moor.6.1.58. S2CID   12282691.
  7. Mirrlees, James (2002). "An Exploration in the Theory of Optimal Taxation". Review of Economic Studies. 38 (2): 175–208. doi:10.2307/2296779. JSTOR   2296779.
  8. 1 2 3 Holmstrom, Bengt (1979). "Groves Schemes on Restricted Domains". Econometrica. 47 (5): 1137–1144. doi:10.2307/1911954. JSTOR   1911954. S2CID   55414969.
  9. Laffont, Jean-Jacques; Eric Maskin (1980). "A Differentiable Approach to Dominant Strategy Mechanisms". Econometrica. 48 (6): 1507–1520. doi:10.2307/1912821. JSTOR   1912821.
  10. Riley, John G.; Samuelson, William S. (1981). "Optimal Auctions". American Economic Review. 71 (3): 381–392. JSTOR   1802786.
  11. Fudenberg, Drew; Tirole, Jean (1991). Game Theory. Cambridge: MIT Press. ISBN   0-262-06141-4.
  12. Williams, Steven (1999). "A Characterization of Efficient, Bayesian Incentive Compatible Mechanism". Economic Theory. 14: 155–180. doi:10.1007/s001990050286. S2CID   154378924.
  13. Myerson, Roger (1991). Game Theory. Cambridge: Harvard University Press. ISBN   0-674-34115-5.
  14. Green, J.; Laffont, J. J. (1979). Incentives in Public Decision Making. Amsterdam: North-Holland. ISBN   0-444-85144-5.
  15. Myerson, R.; M. Satterthwaite (1983). "Efficient Mechanisms for Bilateral Trading" (PDF). Journal of Economic Theory. 29 (2): 265–281. doi:10.1016/0022-0531(83)90048-0. hdl: 10419/220829 .
  16. Jehiel, Philippe; Moldovanu, Benny (2001). "Efficient Design with Interdependent Valuations". Econometrica. 69 (5): 1237–1259. CiteSeerX   10.1.1.23.7639 . doi:10.1111/1468-0262.00240.
  17. McAfee, R. Preston; John McMillan (1992). "Bidding Rings". American Economic Review. 82 (3): 579–599. JSTOR   2117323.
  18. Weber, Robert (1983). "Multiple-Object Auctions" (PDF). In Engelbrecht-Wiggans, R.; Shubik, M.; Stark, R. M. (eds.). Auctions, Bidding, and Contracting: Uses and Theory. New York: New York University Press. pp. 165–191. ISBN   0-8147-7827-5.
  19. Milgrom, Paul (2004). Putting Auction Theory to Work. Cambridge University Press. ISBN   9780521536721.
  20. Luenberger, D. G. (1969). Optimization by Vector Space Methods. New York: John Wiley & Sons. ISBN   9780471181170.
  21. 1 2 Rockafellar, R. T. (1970). Convex Analysis. Princeton: Princeton University Press. ISBN   0691015864.