Pseudoconvex function


In convex analysis and the calculus of variations, both branches of mathematics, a pseudoconvex function is a function that behaves like a convex function with respect to finding its local minima, but need not actually be convex. Informally, a differentiable function is pseudoconvex if it is increasing in any direction where it has a positive directional derivative. The property must hold at all points of the domain, not only for nearby points.


Formal definition

Consider a differentiable function $f : X \subseteq \mathbb{R}^n \to \mathbb{R}$, defined on a (nonempty) convex open set $X$ of the finite-dimensional Euclidean space $\mathbb{R}^n$. This function is said to be pseudoconvex if the following property holds: [1]

$$\nabla f(x) \cdot (y - x) \geq 0 \implies f(y) \geq f(x) \qquad \text{for all } x, y \in X.$$

Equivalently:

$$f(y) < f(x) \implies \nabla f(x) \cdot (y - x) < 0 \qquad \text{for all } x, y \in X.$$

Here $\nabla f$ is the gradient of $f$, defined by:

$$\nabla f = \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right).$$

Note that the definition may also be stated in terms of the directional derivative of $f$, in the direction given by the vector $v = y - x$. This is because, as $f$ is differentiable, this directional derivative is given by:

$$D_v f(x) = \nabla f(x) \cdot v = \nabla f(x) \cdot (y - x).$$
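The defining implication can be probed numerically on a grid. The following is a minimal Python sketch (entirely illustrative: the helper name, grid, and tolerance are assumptions, not from the source) that checks, for sampled pairs $(x, y)$, whether $f'(x)(y - x) \geq 0$ forces $f(y) \geq f(x)$; it accepts $x^3 + x$ and rejects $x^3$, anticipating the examples below.

```python
# Minimal numerical probe of 1D pseudoconvexity (illustrative sketch).
# For sampled pairs (x, y): whenever f'(x) * (y - x) >= 0, require f(y) >= f(x).

def is_pseudoconvex_on_grid(f, fprime, xs, tol=1e-12):
    """Return a counterexample pair (x, y) if the sampled condition fails, else None."""
    for x in xs:
        for y in xs:
            if fprime(x) * (y - x) >= 0 and f(y) < f(x) - tol:
                return (x, y)  # condition violated at this pair
    return None

xs = [i / 10 for i in range(-30, 31)]  # grid on [-3, 3]

# x^3 + x is pseudoconvex: no violation should be found (prints None).
print(is_pseudoconvex_on_grid(lambda x: x**3 + x, lambda x: 3 * x**2 + 1, xs))

# x^3 is not pseudoconvex: the flat derivative at x = 0 yields a violation,
# e.g. f'(0) * (y - 0) = 0 >= 0 but f(-1) = -1 < f(0) = 0.
print(is_pseudoconvex_on_grid(lambda x: x**3, lambda x: 3 * x**2, xs))
```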

Properties

Relation to other types of "convexity"

Every convex function is pseudoconvex, but the converse is not true. For example, the function $f(x) = x^3 + x$ is pseudoconvex but not convex. Similarly, any pseudoconvex function is quasiconvex; but the converse is not true, since the function $f(x) = x^3$ is quasiconvex but not pseudoconvex. This can be summarized schematically as:

$$\text{convex} \implies \text{pseudoconvex} \implies \text{quasiconvex}$$
Figure (Pseudoquasiconvex.png): the functions $x^3$ (quasiconvex but not pseudoconvex) and $x^3 + x$ (pseudoconvex and thus quasiconvex); neither is convex.

To see that $f(x) = x^3$ is not pseudoconvex, consider its derivative at $x = 0$: $f'(0) = 0$. Then, since $f'(0) \cdot (y - 0) = 0 \geq 0$ for every $y$, if $f$ were pseudoconvex we should have:

$$f(y) \geq f(0) = 0 \qquad \text{for all } y \in \mathbb{R}.$$

In particular, this should be true for $y = -1$. But it is not, as $f(-1) = (-1)^3 = -1 < 0$.
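By contrast, $x^3$ does satisfy the defining inequality of quasiconvexity, $f(\lambda x + (1 - \lambda) y) \leq \max\{f(x), f(y)\}$. A small sampled check (an illustrative sketch with assumed helper names, not from the source):

```python
# Sampled check of quasiconvexity for f(x) = x^3 (illustrative sketch):
# f(lam*x + (1 - lam)*y) <= max(f(x), f(y)) for all sampled x, y, lam.

def violates_quasiconvexity(f, xs, lambdas, tol=1e-9):
    for x in xs:
        for y in xs:
            for lam in lambdas:
                z = lam * x + (1 - lam) * y
                if f(z) > max(f(x), f(y)) + tol:
                    return (x, y, lam)
    return None

xs = [i / 10 for i in range(-30, 31)]
lambdas = [j / 10 for j in range(11)]
print(violates_quasiconvexity(lambda t: t**3, xs, lambdas))  # expected: None
```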

Sufficient optimality condition

For any differentiable function, Fermat's theorem gives a necessary condition of optimality: if $f$ has a local minimum at $x^* \in X$, then $x^*$ must be a stationary point of $f$ (that is, $\nabla f(x^*) = 0$).

Pseudoconvexity is of great interest in the area of optimization, because the converse is also true for any pseudoconvex function. That is: [2] if $x^*$ is a stationary point of a pseudoconvex function $f$, then $f$ has a global minimum at $x^*$. Note that the result guarantees a global minimum, not only a local one.

This last result is also true for a convex function, but it is not true for a quasiconvex function. Consider for example the quasiconvex function:

$$f(x) = x^3.$$

This function is not pseudoconvex, but it is quasiconvex. Also, the point $x^* = 0$ is a critical point of $f$, as $f'(0) = 0$. However, $f$ does not have a global minimum at $x^* = 0$ (not even a local minimum).

Figure (Quasiconvex optimality.png): example of a quasiconvex function that is not pseudoconvex. The function has a critical point at $x = 0$, but this is not a minimum.

Finally, note that a pseudoconvex function may not have any critical point. Take for example the pseudoconvex function $f(x) = x^3 + x$, whose derivative is always positive: $f'(x) = 3x^2 + 1 > 0$.
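The practical content of the sufficient optimality condition can be illustrated with a tiny gradient-descent sketch (entirely illustrative: the function $x^2/(x^2+1)$, step size, and names are assumptions, not from the source). On a pseudoconvex function, reaching a stationary point certifies a global minimum; on the quasiconvex $x^3$, the stationary point $0$ is no minimum at all.

```python
# Stationary point vs. global minimum (illustrative sketch).
# Pseudoconvex g(x) = x^2 / (x^2 + 1): the stationary point x* = 0 is a global minimum.
# Quasiconvex h(x) = x^3: x* = 0 is stationary, yet not even a local minimum.

def gradient_descent(fprime, x0, step=0.25, iters=5000):
    x = x0
    for _ in range(iters):
        x -= step * fprime(x)
    return x

g = lambda x: x**2 / (x**2 + 1)
g_prime = lambda x: 2 * x / (x**2 + 1) ** 2

x_star = gradient_descent(g_prime, x0=5.0)
print(round(x_star, 6))                           # approx 0.0: the stationary point
xs = [i / 100 for i in range(-500, 501)]
print(all(g(x) >= g(x_star) - 1e-9 for x in xs))  # True: stationarity certifies global min

h = lambda x: x**3
print(h(-0.1) < h(0.0))  # True: for x^3, the stationary point 0 is not a minimum
```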

Examples

An example of a function that is pseudoconvex, but not convex, is:

$$f(x) = \frac{x^2}{x^2 + k}, \qquad k > 0.$$

The figure shows this function for the case where $k = 0.2$. This example may be generalized to two variables as:

$$f(x, y) = \frac{x^2 + y^2}{x^2 + y^2 + k}.$$

Figure (Pseudoconvex.png): a pseudoconvex function that is not convex.

The previous example may be modified to obtain a function that is not convex, nor pseudoconvex, but is quasiconvex:

$$f(x) = \frac{|x|}{|x| + k}, \qquad k > 0.$$

The figure shows this function for the case where $k = 0.2$. As can be seen, this function is not convex because of the concavity, and it is not pseudoconvex because it is not differentiable at $x = 0$.

Figure (Quasiconvex.png): a quasiconvex function that is not convex, nor pseudoconvex.
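A quick numerical sketch (illustrative assumptions throughout, including the value $k = 0.2$) shows why this function fails to be pseudoconvex: its one-sided difference quotients at $x = 0$ disagree, so no derivative exists there.

```python
# One-sided difference quotients at x = 0 (illustrative sketch, k = 0.2 assumed).
k = 0.2
f = lambda x: abs(x) / (abs(x) + k)

h = 1e-8
right = (f(h) - f(0.0)) / h      # approx +1/k = +5.0
left = (f(-h) - f(0.0)) / (-h)   # approx -1/k = -5.0
print(right, left)  # the two limits differ, so f is not differentiable at 0
```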

Generalization to nondifferentiable functions

The notion of pseudoconvexity can be generalized to nondifferentiable functions as follows. [3] Given any function $f : X \to \mathbb{R}$, we can define the upper Dini derivative of $f$ by:

$$f^+(x, u) = \limsup_{h \to 0^+} \frac{f(x + hu) - f(x)}{h},$$

where $u$ is any unit vector. The function is said to be pseudoconvex if it is increasing in any direction where the upper Dini derivative is positive. More precisely, this is characterized in terms of the subdifferential $\partial f$ as follows:

For all $x, y \in X$: if $x^* \in \partial f(x)$ is such that $\langle x^*, y - x \rangle \geq 0$, then $f(x) \leq f(z)$, for all $z \in [x, y]$;

where $[x, y]$ denotes the line segment adjoining $x$ and $y$.
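As a rough numerical illustration (the finite sequence of step sizes stands in for the true limsup, and all names are assumptions, not from the source), the upper Dini derivative of $f(x) = |x|$ at $x = 0$ equals $1$ in both directions $u = \pm 1$, even though the ordinary derivative does not exist there.

```python
# Crude numerical stand-in for the upper Dini derivative (illustrative sketch).
# f_plus(x, u) = limsup_{h -> 0+} (f(x + h*u) - f(x)) / h, approximated by
# the max difference quotient over a decreasing sequence of h values.

def upper_dini(f, x, u, hs=(1e-3, 1e-4, 1e-5, 1e-6)):
    return max((f(x + h * u) - f(x)) / h for h in hs)

f = lambda x: abs(x)
print(upper_dini(f, 0.0, 1.0))   # 1.0: quotient (|h| - 0)/h = 1 for u = +1
print(upper_dini(f, 0.0, -1.0))  # 1.0: quotient (|-h| - 0)/h = 1 for u = -1
```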

A pseudoconcave function is a function whose negative is pseudoconvex. A pseudolinear function is a function that is both pseudoconvex and pseudoconcave. [4] For example, linear-fractional programs have pseudolinear objective functions and linear-inequality constraints. These properties allow linear-fractional problems to be solved by a variant of the simplex algorithm (of George B. Dantzig). [5] [6] [7]
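As context for why such problems are tractable, here is a minimal sketch solving a toy linear-fractional program. It uses the Charnes-Cooper transformation to reduce the problem to an ordinary LP (a standard reduction, though not the simplex variant the cited sources describe), and every number in it is an invented example.

```python
# Toy linear-fractional program via the Charnes-Cooper transformation
# (illustrative sketch; the problem data below are invented):
#   minimize (-2*x1 + x2 + 2) / (x1 + 3*x2 + 4)
#   subject to x1 >= 0, x2 >= 0, x1 + x2 <= 2   (denominator > 0 on this set)
# Substituting y = x / (d.x + beta), t = 1 / (d.x + beta) yields the LP below.
from scipy.optimize import linprog

res = linprog(
    c=[-2, 1, 2],                      # numerator coefficients: c.y + alpha*t
    A_ub=[[1, 1, -2]],                 # x1 + x2 <= 2 becomes y1 + y2 - 2*t <= 0
    b_ub=[0],
    A_eq=[[1, 3, 4]],                  # normalization: d.y + beta*t = 1
    b_eq=[1],
    bounds=[(0, None)] * 3,            # y1, y2 >= 0 (from x >= 0) and t >= 0
)
t = res.x[2]
x = [res.x[0] / t, res.x[1] / t]       # recover the original variables
print(x, res.fun)                      # expected: x = [2, 0], optimal value -1/3
```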

Given a vector-valued function $\eta$, there is a more general notion of $\eta$-pseudoconvexity [8] [9] and $\eta$-pseudolinearity, wherein classical pseudoconvexity and pseudolinearity pertain to the case when $\eta(x, y) = y - x$.


Notes

  1. Mangasarian 1965
  2. Mangasarian 1965
  3. Floudas & Pardalos 2001
  4. Rapcsák 1991
  5. Chapter five: Craven, B. D. (1988). Fractional Programming. Sigma Series in Applied Mathematics. Vol. 4. Berlin: Heldermann Verlag. p. 145. ISBN 3-88538-404-3. MR 0949209.
  6. Kruk, Serge; Wolkowicz, Henry (1999). "Pseudolinear programming". SIAM Review. 41 (4): 795–805. Bibcode:1999SIAMR..41..795K. doi:10.1137/S0036144598335259. JSTOR 2653207. MR 1723002.
  7. Mathis, Frank H.; Mathis, Lenora Jane (1995). "A nonlinear programming algorithm for hospital management". SIAM Review. 37 (2): 230–234. doi:10.1137/1037046. JSTOR 2132826. MR 1343214. S2CID 120626738.
  8. Ansari, Qamrul Hasan; Lalitha, C. S.; Mehta, Monika (2013). Generalized Convexity, Nonsmooth Variational Inequalities, and Nonsmooth Optimization. CRC Press. p. 107. ISBN 9781439868218.
  9. Mishra, Shashi K.; Giorgi, Giorgio (2008). Invexity and Optimization. Springer Science & Business Media. p. 39. ISBN 9783540785613.


References

  Floudas, Christodoulos A.; Pardalos, Panos M. (2001). Encyclopedia of Optimization. Dordrecht: Kluwer Academic Publishers.
  Mangasarian, O. L. (1965). "Pseudo-Convex Functions". Journal of the Society for Industrial and Applied Mathematics, Series A: Control. 3 (2): 281–290.
  Rapcsák, T. (1991). "On pseudolinear functions". European Journal of Operational Research. 50 (3): 353–360.