Pseudoconvex function


In convex analysis and the calculus of variations, both branches of mathematics, a pseudoconvex function is a function that behaves like a convex function with respect to finding its local minima, but need not actually be convex. Informally, a differentiable function is pseudoconvex if it is increasing in any direction where it has a positive directional derivative. The property must hold at all points of the domain, not only for nearby points.


Formal definition

Consider a differentiable function $f : X \subseteq \mathbb{R}^n \to \mathbb{R}$, defined on a (nonempty) convex open set $X$ of the finite-dimensional Euclidean space $\mathbb{R}^n$. This function is said to be pseudoconvex if the following property holds: [1]

$$\nabla f(x) \cdot (y - x) \geq 0 \implies f(y) \geq f(x) \qquad \text{for all } x, y \in X.$$

Equivalently:

$$f(y) < f(x) \implies \nabla f(x) \cdot (y - x) < 0 \qquad \text{for all } x, y \in X.$$

Here $\nabla f$ is the gradient of $f$, defined by:

$$\nabla f = \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right).$$

Note that the definition may also be stated in terms of the directional derivative of $f$, in the direction given by the vector $v = y - x$. This is because, as $f$ is differentiable, this directional derivative is given by:

$$D_v f(x) = \nabla f(x) \cdot v = \nabla f(x) \cdot (y - x).$$
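The defining implication can be probed numerically on a grid. The following is a minimal Python sketch (entirely illustrative: the helper name, grid, and tolerance are assumptions, not from the source) that checks, for sampled pairs $(x, y)$, whether $f'(x)(y - x) \geq 0$ forces $f(y) \geq f(x)$; it accepts $x^3 + x$ and rejects $x^3$, anticipating the examples below.

```python
# Minimal numerical probe of 1D pseudoconvexity (illustrative sketch).
# For sampled pairs (x, y): whenever f'(x) * (y - x) >= 0, require f(y) >= f(x).

def is_pseudoconvex_on_grid(f, fprime, xs, tol=1e-12):
    """Return a counterexample pair (x, y) if the sampled condition fails, else None."""
    for x in xs:
        for y in xs:
            if fprime(x) * (y - x) >= 0 and f(y) < f(x) - tol:
                return (x, y)  # condition violated at this pair
    return None

xs = [i / 10 for i in range(-30, 31)]  # grid on [-3, 3]

# x^3 + x is pseudoconvex: no violation should be found (prints None).
print(is_pseudoconvex_on_grid(lambda x: x**3 + x, lambda x: 3 * x**2 + 1, xs))

# x^3 is not pseudoconvex: the flat derivative at x = 0 yields a violation,
# e.g. f'(0) * (y - 0) = 0 >= 0 but f(-1) = -1 < f(0) = 0.
print(is_pseudoconvex_on_grid(lambda x: x**3, lambda x: 3 * x**2, xs))
```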

Properties

Relation to other types of "convexity"

Every convex function is pseudoconvex, but the converse is not true. For example, the function $f(x) = x^3 + x$ is pseudoconvex but not convex. Similarly, any pseudoconvex function is quasiconvex; but the converse is not true, since the function $f(x) = x^3$ is quasiconvex but not pseudoconvex. This can be summarized schematically as:

$$\text{convex} \implies \text{pseudoconvex} \implies \text{quasiconvex}$$
Figure (Pseudoquasiconvex.png): the functions $x^3$ (quasiconvex but not pseudoconvex) and $x^3 + x$ (pseudoconvex and thus quasiconvex); neither is convex.

To see that $f(x) = x^3$ is not pseudoconvex, consider its derivative at $x = 0$: $f'(0) = 0$. Then, since $f'(0) \cdot (y - 0) = 0 \geq 0$ for every $y$, if $f$ were pseudoconvex we should have:

$$f(y) \geq f(0) = 0 \qquad \text{for all } y \in \mathbb{R}.$$

In particular, this should be true for $y = -1$. But it is not, as $f(-1) = (-1)^3 = -1 < 0$.
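By contrast, $x^3$ does satisfy the defining inequality of quasiconvexity, $f(\lambda x + (1 - \lambda) y) \leq \max\{f(x), f(y)\}$. A small sampled check (an illustrative sketch with assumed helper names, not from the source):

```python
# Sampled check of quasiconvexity for f(x) = x^3 (illustrative sketch):
# f(lam*x + (1 - lam)*y) <= max(f(x), f(y)) for all sampled x, y, lam.

def violates_quasiconvexity(f, xs, lambdas, tol=1e-9):
    for x in xs:
        for y in xs:
            for lam in lambdas:
                z = lam * x + (1 - lam) * y
                if f(z) > max(f(x), f(y)) + tol:
                    return (x, y, lam)
    return None

xs = [i / 10 for i in range(-30, 31)]
lambdas = [j / 10 for j in range(11)]
print(violates_quasiconvexity(lambda t: t**3, xs, lambdas))  # expected: None
```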

Sufficient optimality condition

For any differentiable function, Fermat's theorem gives a necessary condition of optimality: if $f$ has a local minimum at $x^* \in X$, then $x^*$ must be a stationary point of $f$ (that is, $\nabla f(x^*) = 0$).

Pseudoconvexity is of great interest in the area of optimization, because the converse is also true for any pseudoconvex function. That is: [2] if $x^*$ is a stationary point of a pseudoconvex function $f$, then $f$ has a global minimum at $x^*$. Note that the result guarantees a global minimum, not only a local one.

This last result is also true for a convex function, but it is not true for a quasiconvex function. Consider for example the quasiconvex function:

$$f(x) = x^3.$$

This function is not pseudoconvex, but it is quasiconvex. Also, the point $x^* = 0$ is a critical point of $f$, as $f'(0) = 0$. However, $f$ does not have a global minimum at $x^* = 0$ (not even a local minimum).

Figure (Quasiconvex optimality.png): example of a quasiconvex function that is not pseudoconvex. The function has a critical point at $x = 0$, but this is not a minimum.

Finally, note that a pseudoconvex function may not have any critical point. Take for example the pseudoconvex function $f(x) = x^3 + x$, whose derivative is always positive: $f'(x) = 3x^2 + 1 > 0$.
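The practical content of the sufficient optimality condition can be illustrated with a tiny gradient-descent sketch (entirely illustrative: the function $x^2/(x^2+1)$, step size, and names are assumptions, not from the source). On a pseudoconvex function, reaching a stationary point certifies a global minimum; on the quasiconvex $x^3$, the stationary point $0$ is no minimum at all.

```python
# Stationary point vs. global minimum (illustrative sketch).
# Pseudoconvex g(x) = x^2 / (x^2 + 1): the stationary point x* = 0 is a global minimum.
# Quasiconvex h(x) = x^3: x* = 0 is stationary, yet not even a local minimum.

def gradient_descent(fprime, x0, step=0.25, iters=5000):
    x = x0
    for _ in range(iters):
        x -= step * fprime(x)
    return x

g = lambda x: x**2 / (x**2 + 1)
g_prime = lambda x: 2 * x / (x**2 + 1) ** 2

x_star = gradient_descent(g_prime, x0=5.0)
print(round(x_star, 6))                           # approx 0.0: the stationary point
xs = [i / 100 for i in range(-500, 501)]
print(all(g(x) >= g(x_star) - 1e-9 for x in xs))  # True: stationarity certifies global min

h = lambda x: x**3
print(h(-0.1) < h(0.0))  # True: for x^3, the stationary point 0 is not a minimum
```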

Examples

An example of a function that is pseudoconvex, but not convex, is:

$$f(x) = \frac{x^2}{x^2 + k}, \qquad k > 0.$$

The figure shows this function for the case where $k = 0.2$. This example may be generalized to two variables as:

$$f(x, y) = \frac{x^2 + y^2}{x^2 + y^2 + k}.$$

Figure (Pseudoconvex.png): a pseudoconvex function that is not convex.

The previous example may be modified to obtain a function that is not convex, nor pseudoconvex, but is quasiconvex:

$$f(x) = \frac{|x|}{|x| + k}, \qquad k > 0.$$

The figure shows this function for the case where $k = 0.2$. As can be seen, this function is not convex because of the concavity, and it is not pseudoconvex because it is not differentiable at $x = 0$.

Figure (Quasiconvex.png): a quasiconvex function that is not convex, nor pseudoconvex.
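A quick numerical sketch (illustrative assumptions throughout, including the value $k = 0.2$) shows why this function fails to be pseudoconvex: its one-sided difference quotients at $x = 0$ disagree, so no derivative exists there.

```python
# One-sided difference quotients at x = 0 (illustrative sketch, k = 0.2 assumed).
k = 0.2
f = lambda x: abs(x) / (abs(x) + k)

h = 1e-8
right = (f(h) - f(0.0)) / h      # approx +1/k = +5.0
left = (f(-h) - f(0.0)) / (-h)   # approx -1/k = -5.0
print(right, left)  # the two limits differ, so f is not differentiable at 0
```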

Generalization to nondifferentiable functions

The notion of pseudoconvexity can be generalized to nondifferentiable functions as follows. [3] Given any function $f : X \to \mathbb{R}$, we can define the upper Dini derivative of $f$ by:

$$f^+(x, u) = \limsup_{h \to 0^+} \frac{f(x + hu) - f(x)}{h},$$

where $u$ is any unit vector. The function is said to be pseudoconvex if it is increasing in any direction where the upper Dini derivative is positive. More precisely, this is characterized in terms of the subdifferential $\partial f$ as follows:

For all $x, y \in X$: if $x^* \in \partial f(x)$ is such that $\langle x^*, y - x \rangle \geq 0$, then $f(x) \leq f(z)$, for all $z \in [x, y]$;

where $[x, y]$ denotes the line segment adjoining $x$ and $y$.
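As a rough numerical illustration (the finite sequence of step sizes stands in for the true limsup, and all names are assumptions, not from the source), the upper Dini derivative of $f(x) = |x|$ at $x = 0$ equals $1$ in both directions $u = \pm 1$, even though the ordinary derivative does not exist there.

```python
# Crude numerical stand-in for the upper Dini derivative (illustrative sketch).
# f_plus(x, u) = limsup_{h -> 0+} (f(x + h*u) - f(x)) / h, approximated by
# the max difference quotient over a decreasing sequence of h values.

def upper_dini(f, x, u, hs=(1e-3, 1e-4, 1e-5, 1e-6)):
    return max((f(x + h * u) - f(x)) / h for h in hs)

f = lambda x: abs(x)
print(upper_dini(f, 0.0, 1.0))   # 1.0: quotient (|h| - 0)/h = 1 for u = +1
print(upper_dini(f, 0.0, -1.0))  # 1.0: quotient (|-h| - 0)/h = 1 for u = -1
```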

A pseudoconcave function is a function whose negative is pseudoconvex. A pseudolinear function is a function that is both pseudoconvex and pseudoconcave. [4] For example, linear-fractional programs have pseudolinear objective functions and linear-inequality constraints. These properties allow linear-fractional problems to be solved by a variant of the simplex algorithm (of George B. Dantzig). [5] [6] [7]
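As context for why such problems are tractable, here is a minimal sketch solving a toy linear-fractional program. It uses the Charnes-Cooper transformation to reduce the problem to an ordinary LP (a standard reduction, though not the simplex variant the cited sources describe), and every number in it is an invented example.

```python
# Toy linear-fractional program via the Charnes-Cooper transformation
# (illustrative sketch; the problem data below are invented):
#   minimize (-2*x1 + x2 + 2) / (x1 + 3*x2 + 4)
#   subject to x1 >= 0, x2 >= 0, x1 + x2 <= 2   (denominator > 0 on this set)
# Substituting y = x / (d.x + beta), t = 1 / (d.x + beta) yields the LP below.
from scipy.optimize import linprog

res = linprog(
    c=[-2, 1, 2],                      # numerator coefficients: c.y + alpha*t
    A_ub=[[1, 1, -2]],                 # x1 + x2 <= 2 becomes y1 + y2 - 2*t <= 0
    b_ub=[0],
    A_eq=[[1, 3, 4]],                  # normalization: d.y + beta*t = 1
    b_eq=[1],
    bounds=[(0, None)] * 3,            # y1, y2 >= 0 (from x >= 0) and t >= 0
)
t = res.x[2]
x = [res.x[0] / t, res.x[1] / t]       # recover the original variables
print(x, res.fun)                      # expected: x = [2, 0], optimal value -1/3
```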

Given a vector-valued function $\eta$, there is a more general notion of $\eta$-pseudoconvexity [8] [9] and $\eta$-pseudolinearity, wherein classical pseudoconvexity and pseudolinearity pertain to the case when $\eta(x, y) = y - x$.


Notes

  1. Mangasarian 1965
  2. Mangasarian 1965
  3. Floudas & Pardalos 2001
  4. Rapcsák 1991
  5. Chapter five: Craven, B. D. (1988). Fractional Programming. Sigma Series in Applied Mathematics. Vol. 4. Berlin: Heldermann Verlag. p. 145. ISBN 3-88538-404-3. MR 0949209.
  6. Kruk, Serge; Wolkowicz, Henry (1999). "Pseudolinear programming". SIAM Review. 41 (4): 795–805. Bibcode:1999SIAMR..41..795K. doi:10.1137/S0036144598335259. JSTOR 2653207. MR 1723002.
  7. Mathis, Frank H.; Mathis, Lenora Jane (1995). "A nonlinear programming algorithm for hospital management". SIAM Review. 37 (2): 230–234. doi:10.1137/1037046. JSTOR 2132826. MR 1343214. S2CID 120626738.
  8. Ansari, Qamrul Hasan; Lalitha, C. S.; Mehta, Monika (2013). Generalized Convexity, Nonsmooth Variational Inequalities, and Nonsmooth Optimization. CRC Press. p. 107. ISBN 9781439868218.
  9. Mishra, Shashi K.; Giorgi, Giorgio (2008). Invexity and Optimization. Springer Science & Business Media. p. 39. ISBN 9783540785613.


References

  Floudas, Christodoulos A.; Pardalos, Panos M. (2001). Encyclopedia of Optimization. Dordrecht: Kluwer Academic Publishers.
  Mangasarian, O. L. (1965). "Pseudo-Convex Functions". Journal of the Society for Industrial and Applied Mathematics, Series A: Control. 3 (2): 281–290.
  Rapcsák, T. (1991). "On pseudolinear functions". European Journal of Operational Research. 50 (3): 353–360.