Tukey depth

From Wikipedia, the free encyclopedia

In statistics and computational geometry, the Tukey depth [1] is a measure of the depth of a point in a fixed set of points. The concept is named after its inventor, John Tukey. Given a set of n points in d-dimensional space, Tukey's depth of a point x is the smallest fraction (or number) of points in any closed halfspace that contains x.

Tukey's depth measures how extreme a point is with respect to a point cloud. It is used to define the bagplot, a bivariate generalization of the boxplot.

For example, for any extreme point of the convex hull there is always a (closed) halfspace that contains only that point, and hence its Tukey depth as a fraction is 1/n.

Definitions

[Figure: Tukey's halfspace depth] Tukey's depth of a point x with respect to a point cloud. The blue region illustrates a closed halfspace containing x on its boundary, chosen to be as extreme as possible: it contains x but as few observations of the point cloud as possible. The proportion of points contained in this halfspace is the value of Tukey's depth at x.

Sample Tukey's depth of a point x, or Tukey's depth of x with respect to the point cloud $\mathbb{X}_n = \{X_1, \ldots, X_n\}$, is defined as

$$D(x; \mathbb{X}_n) = \inf_{\|v\| = 1} \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{\langle v, X_i - x \rangle \ge 0\},$$

where $\mathbf{1}\{\cdot\}$ is the indicator function that equals 1 if its argument holds true and 0 otherwise, and the infimum runs over all unit vectors v, i.e. over all closed halfspaces with x on the boundary.
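In the plane, the sample depth can be approximated directly from this definition by scanning a finite grid of directions: for each scanned unit vector v, count the fraction of points in the closed halfspace through x with inward normal v, and take the minimum. The sketch below is illustrative (the function name and the direction-scanning scheme are choices made here, not a standard library routine); since only finitely many directions are scanned, it returns an upper bound on the exact depth.

```python
import numpy as np

def tukey_depth_2d(x, points, n_dirs=360):
    """Approximate sample Tukey depth (as a fraction) of x in a 2-D cloud.

    For each of n_dirs unit directions v, compute the fraction of points
    X_i with <v, X_i - x> >= 0 (the closed halfspace with x on its
    boundary), then return the minimum fraction over all scanned v.
    """
    angles = np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False)
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (n_dirs, 2)
    proj = dirs @ (points - x).T                               # <v, X_i - x>
    return (proj >= 0).mean(axis=1).min()

rng = np.random.default_rng(0)
cloud = rng.standard_normal((200, 2))

# A central point of a roughly symmetric cloud has depth close to 1/2;
# a point far outside the cloud has depth close to 0.
d_center = tukey_depth_2d(np.zeros(2), cloud)
d_far = tukey_depth_2d(np.array([5.0, 5.0]), cloud)
print(d_center, d_far)
```

Exact algorithms in the plane instead sweep only the directions determined by the data points around x, which takes O(n log n) time per query point.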

Population Tukey's depth of x with respect to a distribution $P$ is

$$D(x; P) = \inf_{\|v\| = 1} P(\langle v, X - x \rangle \ge 0),$$

where X is a random variable following the distribution $P$.
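As a worked example that follows directly from this definition: if $X \sim N(0, I_d)$, then for any unit vector $v$ the projection $\langle v, X \rangle$ is standard normal, so

$$D(x; P) = \inf_{\|v\| = 1} P\big(\langle v, X \rangle \ge \langle v, x \rangle\big) = 1 - \Phi(\|x\|) = \Phi(-\|x\|),$$

since $\langle v, x \rangle$ is maximized by $v = x / \|x\|$. The depth thus decreases monotonically with the distance of x from the mean, reaching its maximum value 1/2 at the center of symmetry.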


Tukey mean and relation to centerpoint

A centerpoint c of a point set of size n in d dimensions is precisely a point of Tukey depth at least n/(d + 1); by the centerpoint theorem, every finite point set has at least one such point.
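This bound can be illustrated numerically. The sketch below (an illustration, not an exact algorithm: depth is only approximated by scanning finitely many directions, and only the data points themselves are searched rather than all of the plane) checks that for a well-spread planar sample the deepest data point already clears the n/(d + 1) = n/3 centerpoint bound:

```python
import numpy as np

def approx_depth(x, points, n_dirs=180):
    """Approximate Tukey depth (as a fraction) by scanning n_dirs directions."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False)
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    proj = dirs @ (points - x).T          # <v, X_i - x> for every v and i
    return (proj >= 0).mean(axis=1).min()

rng = np.random.default_rng(1)
pts = rng.standard_normal((150, 2))

# The centerpoint theorem guarantees a point (not necessarily a data
# point) of depth at least a 1/(d + 1) = 1/3 fraction in the plane; for
# this sample the deepest data point already exceeds that fraction.
deepest = max(approx_depth(p, pts) for p in pts)
print(deepest >= 1.0 / 3.0)
```

The point maximizing Tukey depth is called a Tukey median; unlike the guaranteed 1/(d + 1) fraction for centerpoints, its depth depends on the configuration of the data.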

References

  1. Tukey, John W. (1975). "Mathematics and the Picturing of Data". Proceedings of the International Congress of Mathematicians. pp. 523–531.
