Hyperplane separation theorem

In geometry, the hyperplane separation theorem is a theorem about disjoint convex sets in n-dimensional Euclidean space. There are several rather similar versions. In one version of the theorem, if both these sets are closed and at least one of them is compact, then there is a hyperplane in between them and even two parallel hyperplanes in between them separated by a gap. In another version, if both disjoint convex sets are open, then there is a hyperplane in between them, but not necessarily any gap. An axis which is orthogonal to a separating hyperplane is a separating axis, because the orthogonal projections of the convex bodies onto the axis are disjoint.

Illustration of the hyperplane separation theorem.

The hyperplane separation theorem is due to Hermann Minkowski. The Hahn–Banach separation theorem generalizes the result to topological vector spaces.

A related result is the supporting hyperplane theorem.

In the context of support-vector machines, the optimally separating hyperplane or maximum-margin hyperplane is a hyperplane which separates two convex hulls of points and is equidistant from the two.[1][2]

Statements and proof

Hyperplane separation theorem[3] — Let A and B be two disjoint nonempty convex subsets of Rⁿ. Then there exist a nonzero vector v and a real number c such that

\langle x,v\rangle \geq c\,{\text{ and }}\langle y,v\rangle \leq c

for all x in A and y in B; i.e., the hyperplane $\langle \cdot ,v\rangle =c$, with normal vector v, separates A and B.

The proof is based on the following lemma:

Lemma — Let $K$ be a nonempty closed convex subset of Rⁿ. Then there exists a unique vector in $K$ of minimum norm (length).

Proof of lemma: Let $\delta =\inf\{|x|:x\in K\}.$ Let $x_{j}$ be a sequence in $K$ such that $|x_{j}|\to \delta$. Note that $(x_{i}+x_{j})/2$ is in $K$ since $K$ is convex, and so $|x_{i}+x_{j}|^{2}\geq 4\delta ^{2}$. Since

|x_{i}-x_{j}|^{2}=2|x_{i}|^{2}+2|x_{j}|^{2}-|x_{i}+x_{j}|^{2}\leq 2|x_{i}|^{2}+2|x_{j}|^{2}-4\delta ^{2}\to 0

as $i,j\to \infty$, $x_{i}$ is a Cauchy sequence and so has a limit x, which lies in $K$ because $K$ is closed. It is unique since if y is in $K$ and has norm δ, then $|x-y|^{2}\leq 2|x|^{2}+2|y|^{2}-4\delta ^{2}=0$ and x = y. $\square$
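
As a concrete instance of the lemma, take $K$ to be an axis-aligned box. Its minimum-norm point can be found by clamping the origin into the box one coordinate at a time. The following is a minimal sketch under that assumption; the function names `min_norm_point_in_box` and `norm` are hypothetical and chosen only for this illustration.

```python
import math

def min_norm_point_in_box(lo, hi):
    """Return the point of the box [lo[0], hi[0]] x ... x [lo[k], hi[k]]
    that is closest to the origin (assumes lo[i] <= hi[i] for every i)."""
    # Clamping 0 into each interval minimizes each squared coordinate separately.
    return [min(max(0.0, l), h) for l, h in zip(lo, hi)]

def norm(x):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(t * t for t in x))

p = min_norm_point_in_box([1.0, -2.0], [3.0, 5.0])
print(p, norm(p))  # [1.0, 0.0] 1.0
```

The box is closed and convex, so the lemma guarantees that this point is the only one of minimum norm.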

Proof of theorem: Given disjoint nonempty convex sets A, B, let

K=A+(-B)=\{x-y\mid x\in A,y\in B\}.

Since $-B$ is convex and the sum of convex sets is convex, $K$ is convex. By the lemma, the closure ${\overline {K}}$ of $K$, which is convex, contains a vector v of minimum norm. Since ${\overline {K}}$ is convex, for any $n$ in $K$, the line segment

v+t(n-v),\,0\leq t\leq 1

lies in ${\overline {K}}$ and so

|v|^{2}\leq |v+t(n-v)|^{2}=|v|^{2}+2t\langle v,n-v\rangle +t^{2}|n-v|^{2}.

For $0<t\leq 1$, dividing by $t$ thus gives

0\leq 2\langle v,n\rangle -2|v|^{2}+t|n-v|^{2}

and letting $t\to 0$ gives $\langle n,v\rangle \geq |v|^{2}$. Hence, for any x in A and y in B, we have $\langle x-y,v\rangle \geq |v|^{2}$. Thus, if v is nonzero, the proof is complete since

\inf _{x\in A}\langle x,v\rangle \geq |v|^{2}+\sup _{y\in B}\langle y,v\rangle .
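
The construction in the proof can be carried out by hand when A and B are disjoint closed discs in the plane. In that case $K=A-B$ is itself a disc, so its minimum-norm vector has a closed form. The sketch below assumes this special case only; `separate_discs` and its parameters are hypothetical names, not part of any library.

```python
import math

def separate_discs(a, r, b, s):
    """For disjoint closed discs A (centre a, radius r) and B (centre b, radius s),
    return (v, inf_a, sup_b): the minimum-norm vector v of K = A - B,
    the infimum of <x, v> over A, and the supremum of <y, v> over B."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    dist = math.hypot(dx, dy)
    if dist <= r + s:
        raise ValueError("discs are not disjoint")
    # K = A - B is the disc of radius r + s centred at a - b, so its point
    # nearest the origin lies on the segment from a - b towards the origin.
    scale = 1.0 - (r + s) / dist
    v = (dx * scale, dy * scale)
    norm_v = math.hypot(v[0], v[1])
    inf_a = a[0] * v[0] + a[1] * v[1] - r * norm_v  # inf of <x, v> over A
    sup_b = b[0] * v[0] + b[1] * v[1] + s * norm_v  # sup of <y, v> over B
    return v, inf_a, sup_b

v, inf_a, sup_b = separate_discs((0.0, 0.0), 1.0, (5.0, 0.0), 2.0)
print(v, inf_a, sup_b)
```

For these two discs the inequality above holds with equality: the infimum over A exceeds the supremum over B by exactly $|v|^{2}$.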

More generally (covering the case v = 0), let us first take the case when the interior of $K$ is nonempty. The interior can be exhausted by a nested sequence of nonempty compact convex subsets $K_{1}\subset K_{2}\subset K_{3}\subset \cdots$. Since 0 is not in $K$, each $K_{n}$ contains a nonzero vector $v_{n}$ of minimum length, and by the argument in the first part of the proof we have $\langle x,v_{n}\rangle \geq 0$ for any $x\in K_{n}$. We can normalize the $v_{n}$ to have length one. Then the sequence $v_{n}$ contains a convergent subsequence (because the unit sphere is compact) with limit v, which has length one and is therefore nonzero. We have $\langle x,v\rangle \geq 0$ for any x in the interior of $K$, and by continuity the same holds for all x in $K$. We now finish the proof as before.

Finally, if $K$ has empty interior, the affine set that it spans has dimension less than that of the whole space. Consequently $K$ is contained in some hyperplane $\langle \cdot ,v\rangle =c$; thus, $\langle x,v\rangle \geq c$ for all x in $K$ and we finish the proof as before. $\square$

The number of dimensions must be finite. In infinite-dimensional spaces there are examples of two closed, convex, disjoint sets which cannot be separated by a closed hyperplane (a hyperplane where a continuous linear functional equals some constant) even in the weak sense where the inequalities are not strict.[4]

The above proof also proves the first version of the theorem mentioned in the introduction. To see this, note that under the hypotheses of the theorem below, the set $K$ in the proof is closed.

Separation theorem I — Let A and B be two disjoint nonempty closed convex sets, one of which is compact. Then there exist a nonzero vector v and real numbers $c_{1}<c_{2}$ such that

\langle x,v\rangle >c_{2}\,{\text{ and }}\langle y,v\rangle <c_{1}

for all x in A and y in B.

Here, the compactness in the hypothesis cannot be relaxed; see an example in the next section. This version of the separation theorem does generalize to infinite-dimensional spaces; the generalization is more commonly known as the Hahn–Banach separation theorem.

We also have:

Separation theorem II — Let A and B be two disjoint nonempty convex sets. If A is open, then there exist a nonzero vector v and real number $c$ such that

\langle x,v\rangle >c\,{\text{ and }}\langle y,v\rangle \leq c

for all x in A and y in B. If both sets are open, then there exist a nonzero vector v and real number $c$ such that

\langle x,v\rangle >c\,{\text{ and }}\langle y,v\rangle <c

for all x in A and y in B.

This follows from the standard version, since a separating hyperplane cannot intersect the interior of either convex set, and an open set coincides with its interior.

Converse of theorem

Note that the existence of a hyperplane that only "separates" two convex sets in the weak sense of both inequalities being non-strict obviously does not imply that the two sets are disjoint. Both sets could have points located on the hyperplane.
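
A one-dimensional instance makes the point concrete. The particular sets, vector and constant below are chosen here purely for illustration:

```latex
A=[0,1],\quad B=[-1,0],\quad v=1,\quad c=0:
\qquad \langle x,v\rangle =x\geq 0\ \text{ for all } x\in A,
\qquad \langle y,v\rangle =y\leq 0\ \text{ for all } y\in B,
\qquad A\cap B=\{0\}\neq \varnothing .
```

Both non-strict inequalities hold, yet the two sets share the point 0, which lies on the hyperplane.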

Counterexamples and uniqueness

The theorem does not apply if one of the bodies is not convex.

If one of A or B is not convex, then there are many possible counterexamples. For example, A and B could be concentric circles. A more subtle counterexample is one in which A and B are both closed but neither one is compact. For example, if A is a closed half plane and B is bounded by one arm of a hyperbola, then there is no strictly separating hyperplane:

A=\{(x,y):x\leq 0\}

B=\{(x,y):x>0,y\geq 1/x\}.

(Although, by an instance of the second theorem, there is a hyperplane that separates their interiors.) Another type of counterexample has A compact and B open. For example, A can be a closed square and B can be an open square that touches A.
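
To see why no gap is possible in the half-plane and hyperbola example, one can compare points at the same height on the two sets. The parametrization by $t$ below is introduced only for this sketch:

```latex
p_{t}=(0,t)\in A,\qquad q_{t}=(1/t,\,t)\in B\qquad (t>0),
\qquad |p_{t}-q_{t}|=1/t\to 0\quad \text{as } t\to \infty .
```

If a nonzero vector v and numbers $c_{1}<c_{2}$ separated A and B as in Separation theorem I, then $\langle p_{t}-q_{t},v\rangle >c_{2}-c_{1}>0$ would hold for every $t$, contradicting $|\langle p_{t}-q_{t},v\rangle |\leq |p_{t}-q_{t}|\,|v|\to 0$.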

In the first version of the theorem, evidently the separating hyperplane is never unique. In the second version, it may or may not be unique. Technically a separating axis is never unique because it can be translated; in the second version of the theorem, a separating axis can be unique up to translation.

Use in collision detection

The separating axis theorem (SAT) says that:

Two convex objects do not overlap if there exists a line (called axis) onto which the two objects' projections do not overlap.

SAT suggests an algorithm for testing whether two convex solids intersect or not.

Regardless of dimensionality, the separating axis is always a line. For example, in 3D, the space is separated by planes, but the separating axis is perpendicular to the separating plane.

The separating axis theorem can be applied for fast collision detection between polygon meshes. Each face's normal or other feature direction is used as a candidate separating axis, as are, in 3D, the cross products of pairs of edges, one taken from each object. Note that this yields possible separating axes, not separating lines/planes.

If the cross products were not used, certain edge-on-edge non-colliding cases would be treated as colliding. For increased efficiency, parallel axes may be calculated as a single axis.
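
The test described above can be sketched for the simplest setting, two convex polygons in the plane, where the edge normals of both polygons are the only candidate axes needed. This is a minimal illustration and not a production implementation: the function names are hypothetical, the polygons are assumed convex and given as vertex lists, and touching polygons are reported as overlapping. In 3D the candidate set would also need the edge cross products.

```python
def edge_normals(poly):
    """Yield one (unnormalized) normal per edge of a polygon given as (x, y) vertices."""
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        yield (y1 - y2, x2 - x1)  # edge direction rotated by 90 degrees

def project(poly, axis):
    """Return the interval (min, max) of the polygon's projection onto axis."""
    dots = [x * axis[0] + y * axis[1] for x, y in poly]
    return min(dots), max(dots)

def convex_polygons_overlap(p, q):
    """Separating axis test for two convex polygons in the plane."""
    for poly in (p, q):
        for axis in edge_normals(poly):
            min_p, max_p = project(p, axis)
            min_q, max_q = project(q, axis)
            if max_p < min_q or max_q < min_p:
                return False  # the projections are disjoint: separating axis found
    return True  # no candidate axis separates the polygons

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle = [(2, 0.6), (0.6, 2), (2, 2)]
print(convex_polygons_overlap(square, triangle))  # False
```

In this usage example the projections onto the x and y axes overlap, and only the normal of the triangle's slanted edge separates the two shapes, which is why every edge normal has to be tried before reporting an overlap.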

Notes

[1] Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome (2008). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (PDF) (Second ed.). New York: Springer. pp. 129–135.
[2] Witten, Ian H.; Frank, Eibe; Hall, Mark A.; Pal, Christopher J. (2016). Data Mining: Practical Machine Learning Tools and Techniques (Fourth ed.). Morgan Kaufmann. pp. 253–254.
[3] Boyd–Vandenberghe, Exercise 2.22.
[4] Haïm Brezis, Analyse fonctionnelle : théorie et applications, 1983, remarque 4, p. 7.

References

Boyd, Stephen P.; Vandenberghe, Lieven (2004). Convex Optimization (pdf). Cambridge University Press. ISBN 978-0-521-83378-3.
Golshtein, E. G.; Tretyakov, N.V. (1996). Modified Lagrangians and monotone maps in optimization. New York: Wiley. p. 6. ISBN 0-471-54821-9.
Shimizu, Kiyotaka; Ishizuka, Yo; Bard, Jonathan F. (1997). Nondifferentiable and two-level mathematical programming. Boston: Kluwer Academic Publishers. p. 19. ISBN 0-7923-9821-1.

External links

Collision detection and response

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.

