Hyperplane separation theorem

Hyperplane separation theorem
	Illustration of the hyperplane separation theorem.
Type	Theorem
Field	Convex geometry; Topological vector spaces; Collision detection
Conjectured by	Hermann Minkowski
Open problem	No
Generalizations	Hahn–Banach separation theorem

Last updated August 02, 2023

In geometry, the hyperplane separation theorem is a theorem about disjoint convex sets in n-dimensional Euclidean space. There are several rather similar versions. In one version of the theorem, if both these sets are closed and at least one of them is compact, then there is a hyperplane in between them and even two parallel hyperplanes in between them separated by a gap. In another version, if both disjoint convex sets are open, then there is a hyperplane in between them, but not necessarily any gap. An axis which is orthogonal to a separating hyperplane is a separating axis, because the orthogonal projections of the convex bodies onto the axis are disjoint.

Statements and proof

In all cases, assume $A,B$ to be disjoint, nonempty, and convex subsets of $\mathbb {R} ^{n}$. The results are summarized as follows:

summary table
$A$	$B$	$\langle x,v\rangle$	$\langle y,v\rangle$
		$\geq c$	$\leq c$
closed compact	closed	$>c_{1}$	$<c_{2}$ with $c_{2}<c_{1}$
closed	closed compact	$>c_{1}$	$<c_{2}$ with $c_{2}<c_{1}$
open		$>c$	$\leq c$
open	open	$>c$	$<c$

Hyperplane separation theorem^[4] — Let $A$ and $B$ be two disjoint nonempty convex subsets of $\mathbb {R} ^{n}$ . Then there exist a nonzero vector $v$ and a real number $c$ such that

$$\langle x,v\rangle \geq c\,{\text{ and }}\langle y,v\rangle \leq c$$

for all $x$ in $A$ and $y$ in $B$ ; i.e., the hyperplane $\langle \cdot ,v\rangle =c$ , $v$ the normal vector, separates $A$ and $B$ .

If both sets are closed, and at least one of them is compact, then the separation can be strict, that is, $\langle x,v\rangle >c_{1}\,{\text{ and }}\langle y,v\rangle <c_{2}$ for some $c_{1}>c_{2}$.
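As an illustrative sketch (not taken from the cited sources), the inequalities can be checked numerically when $A$ and $B$ are convex polygons given by their vertices. A linear functional attains its extreme values over a polytope at vertices, so testing the vertices suffices. The polygons, the vector `v`, the constant `c`, and the helper names below are assumptions chosen for the example.

```python
def dot(u, w):
    """Euclidean inner product of two points given as tuples."""
    return sum(a * b for a, b in zip(u, w))

def separates(v, c, a_vertices, b_vertices):
    """Weak separation: <x,v> >= c on A and <y,v> <= c on B (checked on vertices)."""
    return (all(dot(x, v) >= c for x in a_vertices)
            and all(dot(y, v) <= c for y in b_vertices))

def strict_bounds(v, a_vertices, b_vertices):
    """Return (min of <x,v> over A, max of <y,v> over B).

    Strict separation along v is possible exactly when the first value
    exceeds the second; any c2 < c1 strictly between them then works.
    """
    return (min(dot(x, v) for x in a_vertices),
            max(dot(y, v) for y in b_vertices))

# Hypothetical example data: a triangle A to the right of a unit square B.
A = [(2.0, 0.0), (3.0, 1.0), (3.0, -1.0)]
B = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
v = (1.0, 0.0)

weak_ok = separates(v, 1.5, A, B)
min_a, max_b = strict_bounds(v, A, B)
```

Here both polygons are compact, so the strict version applies: any constants with `max_b < c2 < c1 < min_a` give a strict separation along `v`.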

For the general, non-strict statement, the number of dimensions must be finite. In infinite-dimensional spaces there are examples of two closed, convex, disjoint sets which cannot be separated by a closed hyperplane (a hyperplane where a continuous linear functional equals some constant) even in the weak sense where the inequalities are not strict.^[5]

In the strict version, the compactness hypothesis cannot be relaxed; see the example in the section Counterexamples and uniqueness. The strict version does generalize to infinite dimensions; the generalization is more commonly known as the Hahn–Banach separation theorem.

The proof is based on the following lemma:

Lemma — Let $A$ and $B$ be two disjoint closed subsets of $\mathbb {R} ^{n}$ , and assume $A$ is compact. Then there exist points $a_{0}\in A$ and $b_{0}\in B$ minimizing the distance $\|a-b\|$ over $a\in A$ and $b\in B$ .

Proof of lemma

Let $a\in A$ and $b\in B$ be any pair of points, and let $r_{1}=\|b-a\|$. Since $A$ is compact, it is contained in some ball centered on $a$; let the radius of this ball be $r_{2}$. Let $S=B\cap {\overline {B_{r_{1}+r_{2}}(a)}}$ be the intersection of $B$ with a closed ball of radius $r_{1}+r_{2}$ around $a$. Then $S$ is compact, being closed and bounded, and it is nonempty because it contains $b$. Since the distance function is continuous, there exist points $a_{0}$ and $b_{0}$ whose distance $\|a_{0}-b_{0}\|$ is the minimum over all pairs of points in $A\times S$. It remains to show that $a_{0}$ and $b_{0}$ in fact have the minimum distance over all pairs of points in $A\times B$. Suppose for contradiction that there exist points $a'\in A$ and $b'\in B$ such that $\|a'-b'\|<\|a_{0}-b_{0}\|$. Since $(a,b)\in A\times S$, we have $\|a_{0}-b_{0}\|\leq \|a-b\|=r_{1}$, so in particular $\|a'-b'\|<r_{1}$, and by the triangle inequality, $\|a-b'\|\leq \|a'-b'\|+\|a-a'\|<r_{1}+r_{2}$. Therefore $b'$ is contained in $S$, which contradicts the fact that $a_{0}$ and $b_{0}$ had minimum distance over $A\times S$. $\square$
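The truncation step of the proof can be illustrated on finite point sets, which are compact. The sketch below is only an illustration under assumed data: the sets `A` and `B`, the starting pair `(a, b)`, and the helper `min_pair` are hypothetical. It restricts $B$ to the ball of radius $r_{1}+r_{2}$ around $a$ and checks that the minimum distance is unchanged.

```python
import math
from itertools import product

def min_pair(first, second):
    """Brute-force pair (p, q) minimizing the Euclidean distance |p - q|."""
    return min(product(first, second), key=lambda pq: math.dist(pq[0], pq[1]))

# Hypothetical data: A is a small finite set, B a long row of points.
A = [(0.0, 0.0), (1.0, 0.5), (0.4, 1.0)]
B = [(float(k), 3.0) for k in range(-50, 51)]

a, b = A[0], (5.0, 3.0)                    # an arbitrary starting pair
r1 = math.dist(a, b)                       # r1 = |b - a|
r2 = max(math.dist(a, p) for p in A)       # A lies in the ball of radius r2 around a
S = [q for q in B if math.dist(a, q) <= r1 + r2]   # B cut down to the closed ball

a0, b0 = min_pair(A, S)                    # minimizer over A x S
a_all, b_all = min_pair(A, B)              # minimizer over A x B, for comparison
```

The list `S` is much shorter than `B`, yet the two minimizations agree, which is the content of the last part of the proof.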

Proof of theorem

We first prove the second case, in which both sets are closed and at least one of them is compact.

Without loss of generality, $A$ is compact. By the lemma, there exist points $a_{0}\in A$ and $b_{0}\in B$ of minimum distance to each other. Since $A$ and $B$ are disjoint, we have $a_{0}\neq b_{0}$. Now construct two hyperplanes $L_{A},L_{B}$ perpendicular to the line segment $[a_{0},b_{0}]$, with $L_{A}$ passing through $a_{0}$ and $L_{B}$ through $b_{0}$. We claim that neither $A$ nor $B$ enters the space between $L_{A}$ and $L_{B}$, and thus every hyperplane perpendicular to the segment that crosses its interior $(a_{0},b_{0})$ satisfies the requirement of the theorem.

Algebraically, the hyperplanes $L_{A},L_{B}$ are defined by the vector $v:=b_{0}-a_{0}$ , and two constants $c_{A}:=\langle v,a_{0}\rangle <c_{B}:=\langle v,b_{0}\rangle$ , such that $L_{A}=\{x:\langle v,x\rangle =c_{A}\},L_{B}=\{x:\langle v,x\rangle =c_{B}\}$ . Our claim is that $\forall a\in A,\langle v,a\rangle \leq c_{A}$ and $\forall b\in B,\langle v,b\rangle \geq c_{B}$ .
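For two disjoint closed disks the closest points are known in closed form, which gives a small numerical check of this construction. This is a sketch under assumptions: the disks, their radii, and the function name `separating_slab` are hypothetical and serve only to illustrate $v$, $c_{A}$ and $c_{B}$.

```python
import math

def dot(u, w):
    return u[0] * w[0] + u[1] * w[1]

def separating_slab(p, r, q, s):
    """Closest points of the disjoint disks D(p, r), D(q, s) and the resulting v, c_A, c_B."""
    d = math.dist(p, q)
    if d <= r + s:
        raise ValueError("disks are not disjoint")
    u = ((q[0] - p[0]) / d, (q[1] - p[1]) / d)   # unit vector from p towards q
    a0 = (p[0] + r * u[0], p[1] + r * u[1])      # point of A nearest to B
    b0 = (q[0] - s * u[0], q[1] - s * u[1])      # point of B nearest to A
    v = (b0[0] - a0[0], b0[1] - a0[1])           # v := b0 - a0
    return v, dot(v, a0), dot(v, b0)             # (v, c_A, c_B)

p, r, q, s = (0.0, 0.0), 1.0, (4.0, 0.0), 2.0
v, c_A, c_B = separating_slab(p, r, q, s)

# Sample boundary points; a linear functional on a disk is extremal on the boundary.
thetas = [2 * math.pi * k / 360 for k in range(360)]
circle_A = [(p[0] + r * math.cos(t), p[1] + r * math.sin(t)) for t in thetas]
circle_B = [(q[0] + s * math.cos(t), q[1] + s * math.sin(t)) for t in thetas]
claim_A = all(dot(v, a) <= c_A + 1e-12 for a in circle_A)   # <v,a> <= c_A on A
claim_B = all(dot(v, b) >= c_B - 1e-12 for b in circle_B)   # <v,b> >= c_B on B
```

Any constant strictly between `c_A` and `c_B` then defines a hyperplane $\langle v,x\rangle =c$ lying in the gap between the two disks.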

Suppose there is some $a\in A$ such that $\langle v,a\rangle >c_{A}$, and let $a'$ be the point of the line segment $[a_{0},a]$ nearest to $b_{0}$. Since $A$ is convex, $a'$ lies in $A$. Because $\langle v,a-a_{0}\rangle >0$, the segment leaves $a_{0}$ at an acute angle to $b_{0}-a_{0}$, so by planar geometry $a'$ is strictly closer to $b_{0}$ than $a_{0}$ is, a contradiction. A similar argument applies to $B$.
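The planar-geometry step can also be written out algebraically. The following short derivation is a sketch using only the symbols already defined ($v=b_{0}-a_{0}$, $c_{A}=\langle v,a_{0}\rangle$); the parametrization $a_{t}$ is introduced here for illustration.

```latex
% Points on the segment from a_0 to a; they lie in A by convexity.
a_t := a_0 + t\,(a - a_0), \qquad t \in [0,1].

% Squared distance from b_0, expanded using b_0 - a_t = v - t(a - a_0):
\|b_0 - a_t\|^2 = \|v\|^2 - 2t\,\langle v, a - a_0\rangle + t^2\,\|a - a_0\|^2 .

% If <v,a> > c_A = <v,a_0>, then <v, a - a_0> > 0, and hence
\|b_0 - a_t\|^2 < \|v\|^2 = \|b_0 - a_0\|^2
\quad\text{for}\quad
0 < t < \min\!\left(1,\ \frac{2\,\langle v, a - a_0\rangle}{\|a - a_0\|^2}\right),

% which contradicts the minimality of \|b_0 - a_0\|.
```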

Now for the first case.

Approximate both $A,B$ from the inside by $A_{1}\subseteq A_{2}\subseteq \cdots \subseteq A$ and $B_{1}\subseteq B_{2}\subseteq \cdots \subseteq B$, such that each $A_{k},B_{k}$ is compact and convex, and the unions are the relative interiors $\mathrm {relint} (A),\mathrm {relint} (B)$. (See the article on relative interior for details.)

Now by the second case, for each pair $A_{k},B_{k}$ there exists some unit vector $v_{k}$ and real number $c_{k}$ , such that $\langle v_{k},A_{k}\rangle <c_{k}<\langle v_{k},B_{k}\rangle$ .

Since the unit sphere is compact, we can take a convergent subsequence, so that $v_{k}\to v$ . Let $c_{A}:=\sup _{a\in A}\langle v,a\rangle ,c_{B}:=\inf _{b\in B}\langle v,b\rangle$ . We claim that $c_{A}\leq c_{B}$ , thus separating $A,B$ .

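Assume not. Then there exist $a\in A,b\in B$ such that $\langle v,a\rangle >\langle v,b\rangle$. Since the relative interior of a convex set is dense in the set, we may take $a\in \mathrm {relint} (A)$ and $b\in \mathrm {relint} (B)$, so that $a\in A_{k}$ and $b\in B_{k}$ for all large enough $k$. Since $v_{k}\to v$, for large enough $k$ we also have $\langle v_{k},a\rangle >\langle v_{k},b\rangle$, which contradicts $\langle v_{k},A_{k}\rangle <c_{k}<\langle v_{k},B_{k}\rangle$.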

Since a separating hyperplane cannot intersect the interiors of open convex sets, we have a corollary:

Separation theorem I — Let $A$ and $B$ be two disjoint nonempty convex sets. If $A$ is open, then there exist a nonzero vector $v$ and real number $c$ such that

$$\langle x,v\rangle >c\,{\text{ and }}\langle y,v\rangle \leq c$$

for all $x$ in $A$ and $y$ in $B$ . If both sets are open, then there exist a nonzero vector $v$ and real number $c$ such that

$$\langle x,v\rangle >c\,{\text{ and }}\langle y,v\rangle <c$$

for all $x$ in $A$ and $y$ in $B$ .

Case with possible intersections

If the sets $A,B$ have possible intersections, but their relative interiors are disjoint, then the proof of the first case still applies with no change, thus yielding:

Separation theorem II — Let $A$ and $B$ be two nonempty convex subsets of $\mathbb {R} ^{n}$ with disjoint relative interiors. Then there exist a nonzero vector $v$ and a real number $c$ such that

$$\langle x,v\rangle \geq c\,{\text{ and }}\langle y,v\rangle \leq c$$

for all $x$ in $A$ and $y$ in $B$. In particular, we have the supporting hyperplane theorem.

Supporting hyperplane theorem — If $A$ is a convex set in $\mathbb {R} ^{n}$ and $a_{0}$ is a point on the boundary of $A$, then there exists a supporting hyperplane of $A$ containing $a_{0}$.

Proof

If the affine span of $A$ is not all of $\mathbb {R} ^{n}$, then extend the affine span to a hyperplane; it contains $A$ and is therefore a supporting hyperplane. Otherwise, $\mathrm {relint} (A)=\mathrm {int} (A)$ is disjoint from $\mathrm {relint} (\{a_{0}\})=\{a_{0}\}$, so the above theorem applies to $A$ and $\{a_{0}\}$.

Converse of theorem

Note that the existence of a hyperplane that only "separates" two convex sets in the weak sense of both inequalities being non-strict obviously does not imply that the two sets are disjoint. Both sets could have points located on the hyperplane.

Counterexamples and uniqueness

If one of A or B is not convex, then there are many possible counterexamples. For example, A and B could be concentric circles. A more subtle counterexample is one in which A and B are both closed but neither one is compact. For example, if A is a closed half plane and B is bounded by one arm of a hyperbola, then there is no strictly separating hyperplane:

$$A=\{(x,y):x\leq 0\}$$

$$B=\{(x,y):x>0,\ y\geq 1/x\}.$$

(Although, by an instance of the second theorem, there is a hyperplane that separates their interiors.) Another type of counterexample has A compact and B open. For example, A can be a closed square and B can be an open square that touches A.
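The failure of strict separation in the half-plane and hyperbola example comes from the fact that the distance between the two sets has infimum zero without being attained. A minimal numerical sketch, with hypothetical helper names `in_A` and `in_B` and an assumed sequence of sample points:

```python
import math

def in_A(p):
    """Membership in the closed half-plane A = {(x, y) : x <= 0}."""
    return p[0] <= 0

def in_B(p):
    """Membership in B = {(x, y) : x > 0, y >= 1/x}."""
    return p[0] > 0 and p[1] >= 1 / p[0]

gaps = []
for k in range(1, 7):
    t = 10.0 ** -k
    a = (0.0, 1 / t)      # point of A on the boundary line x = 0
    b = (t, 1 / t)        # point of B on the hyperbola arm, at the same height
    assert in_A(a) and in_B(b)
    gaps.append(math.dist(a, b))   # equals t up to rounding
```

The gaps shrink towards zero as the points move up the hyperbola, so no pair of constants $c_{1}>c_{2}$ can fit between the two sets, even though they are disjoint.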

In the first version of the theorem, evidently the separating hyperplane is never unique. In the second version, it may or may not be unique. Technically a separating axis is never unique because it can be translated; in the second version of the theorem, a separating axis can be unique up to translation.

The horn angle provides a good counterexample to many hyperplane separations. For example, in $\mathbb {R} ^{2}$, the unit disk is disjoint from the open interval $((1,0),(1,1))$, but the only line separating them contains the entirety of $((1,0),(1,1))$. This shows that if $A$ is closed and $B$ is relatively open, then there does not necessarily exist a separation that is strict for $B$. However, if $A$ is a closed polytope then such a separation exists.^[6]
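The uniqueness claim for the disk and the open interval can be probed numerically by scanning candidate normal directions. The sketch below is an illustration only: the sampling of 720 angles, the tolerance, and the function name `separates` are assumptions. For a unit vector $v$, the supremum of $\langle v,\cdot \rangle$ over the unit disk is 1, and the infimum over the interval is approached at one of its endpoints $(1,0)$ or $(1,1)$.

```python
import math

def separates(phi, tol=1e-12):
    """True if the direction v = (cos phi, sin phi) weakly separates disk and interval."""
    v = (math.cos(phi), math.sin(phi))
    disk_sup = 1.0                           # sup of <v, p> over the unit disk
    seg_inf = min(v[0], v[0] + v[1])         # inf of <v, (1, s)> over 0 < s < 1
    return disk_sup <= seg_inf + tol

angles = [2 * math.pi * k / 720 for k in range(720)]
good = [k for k, phi in enumerate(angles) if separates(phi)]
```

Among the sampled directions only $\varphi =0$, that is $v=(1,0)$, works. For this $v$ every point $(1,s)$ of the interval satisfies $\langle v,(1,s)\rangle =1$, so the interval lies entirely on the separating line $x=1$.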

More variants

Farkas' lemma and related results can be understood as hyperplane separation theorems when the convex bodies are defined by finitely many linear inequalities.

Further results may be found in Stoer and Witzgall.^[6]

Use in collision detection

In collision detection, the hyperplane separation theorem is usually used in the following form:

Separating axis theorem — Two closed convex objects are disjoint if there exists a line ("separating axis") onto which the two objects' projections are disjoint.

Regardless of dimensionality, the separating axis is always a line. For example, in 3D, the space is separated by planes, but the separating axis is perpendicular to the separating plane.

The separating axis theorem can be applied for fast collision detection between polygon meshes. Each face's normal or other feature direction is used as a separating axis. Note that this yields possible separating axes, not separating lines/planes.
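A minimal sketch of this test for convex polygons in 2D, where the edge normals of both polygons are the only candidate axes needed. The function names and the sample polygons are hypothetical, and the polygons are assumed to be convex with vertices listed in order.

```python
def edge_normals(poly):
    """Yield one (unnormalized) normal per edge of a polygon given as ordered vertices."""
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        yield (y1 - y2, x2 - x1)             # edge direction rotated by 90 degrees

def project(poly, axis):
    """Interval covered by the polygon's projection onto the axis."""
    dots = [x * axis[0] + y * axis[1] for x, y in poly]
    return min(dots), max(dots)

def find_separating_axis(p, q):
    """Return an axis on which the projections are disjoint, or None if there is none."""
    for axis in list(edge_normals(p)) + list(edge_normals(q)):
        lo_p, hi_p = project(p, axis)
        lo_q, hi_q = project(q, axis)
        if hi_p < lo_q or hi_q < lo_p:       # strict gap: touching counts as contact
            return axis
    return None

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangle = [(2.0, 0.5), (0.5, 2.0), (2.0, 2.0)]          # disjoint from the square
shifted = [(x + 0.5, y + 0.5) for x, y in square]        # overlaps the square

axis = find_separating_axis(square, triangle)
no_axis = find_separating_axis(square, shifted)
```

In this example none of the square's axis-aligned normals separates the shapes; the axis that is found is the normal of the triangle's slanted edge. The comparison uses strict inequalities because the theorem concerns closed objects, so polygons that merely touch are reported as not separated.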

In 3D, using face normals alone will fail to separate some edge-on-edge non-colliding cases. Additional axes, consisting of the cross-products of pairs of edges, one taken from each object, are required.^[7]
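The edge-on-edge situation can be reproduced with two boxes. The sketch below assumes each box is described by a center, three orthonormal axes, and half-extents; this representation, the function names, and the example configuration are assumptions for illustration and are not taken from the cited source.

```python
import math
from itertools import product

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def radius(axes, half, L):
    """Half-length of a box's projection onto the (not necessarily unit) axis L."""
    return sum(h * abs(dot(u, L)) for u, h in zip(axes, half))

def boxes_disjoint(ca, axes_a, ha, cb, axes_b, hb, use_edge_axes=True):
    """Separating axis test for two oriented boxes."""
    t = tuple(q - p for p, q in zip(ca, cb))          # vector between the centers
    candidates = list(axes_a) + list(axes_b)          # face normals of both boxes
    if use_edge_axes:
        for u, w in product(axes_a, axes_b):          # cross products of edge pairs
            c = cross(u, w)
            if dot(c, c) > 1e-12:                     # skip parallel edges
                candidates.append(c)
    return any(abs(dot(t, L)) > radius(axes_a, ha, L) + radius(axes_b, hb, L)
               for L in candidates)

s = 1 / math.sqrt(2)
unit = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tilted = [(s, -s, 0.0), (0.5, 0.5, s), (0.5, 0.5, -s)]   # an edge of B faces an edge of A
center_b = (3 * s, 3 * s, 0.0)
half = (1.0, 1.0, 1.0)

faces_only = boxes_disjoint((0.0, 0.0, 0.0), unit, half, center_b, tilted, half,
                            use_edge_axes=False)
with_edges = boxes_disjoint((0.0, 0.0, 0.0), unit, half, center_b, tilted, half)
```

In this configuration the two cubes are disjoint, but every face normal gives overlapping projections; the gap only shows up along the cross product of the vertical edge direction of the first cube with the edge direction `(s, -s, 0)` of the second.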

For increased efficiency, parallel axes may be calculated as a single axis.

Notes

[1] Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome (2008). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (PDF) (Second ed.). New York: Springer. pp. 129–135.
[2] Witten, Ian H.; Frank, Eibe; Hall, Mark A.; Pal, Christopher J. (2016). Data Mining: Practical Machine Learning Tools and Techniques (Fourth ed.). Morgan Kaufmann. pp. 253–254. ISBN 9780128043578.
[3] Deisenroth, Marc Peter; Faisal, A. Aldo; Ong, Cheng Soon (2020). Mathematics for Machine Learning. Cambridge University Press. pp. 337–338. ISBN 978-1-108-45514-5.
[4] Boyd & Vandenberghe 2004, Exercise 2.22.
[5] Haïm Brezis, Analyse fonctionnelle : théorie et applications, 1983, remarque 4, p. 7.
[6] Stoer, Josef; Witzgall, Christoph (1970). Convexity and Optimization in Finite Dimensions I. Springer Berlin, Heidelberg. (2.12.9). doi:10.1007/978-3-642-46216-0. ISBN 978-3-642-46216-0.
[7] "Advanced vector math".

References

Boyd, Stephen P.; Vandenberghe, Lieven (2004). Convex Optimization (PDF). Cambridge University Press. ISBN 978-0-521-83378-3.
Golshtein, E. G.; Tretyakov, N.V. (1996). Modified Lagrangians and monotone maps in optimization. New York: Wiley. p. 6. ISBN 0-471-54821-9.
Shimizu, Kiyotaka; Ishizuka, Yo; Bard, Jonathan F. (1997). Nondifferentiable and two-level mathematical programming. Boston: Kluwer Academic Publishers. p. 19. ISBN 0-7923-9821-1.

Soltan, V. (2021). Support and separation properties of convex sets in finite dimension. Extracta Math. Vol. 36, no. 2, 241-278.

External links

Collision detection and response

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
