Offset filtration

Last updated
The offset filtration at six scale parameters on a point cloud sampled from two circles of different sizes. Offset (union-of-balls) filtration.png
The offset filtration at six scale parameters on a point cloud sampled from two circles of different sizes.

The offset filtration (also called the "union-of-balls" [1] or "union-of-disks" [2] filtration) is a growing sequence of metric balls used to detect the size and scale of topological features of a data set. The offset filtration commonly arises in persistent homology and the field of topological data analysis. Utilizing a union of balls to approximate the shape of geometric objects was first suggested by Frosini in 1992 in the context of submanifolds of Euclidean space. [3] The construction was independently explored by Robins in 1998, and expanded to considering the collection of offsets indexed over a series of increasing scale parameters (i.e., a growing sequence of balls), in order to observe the stability of topological features with respect to attractors. [4] Homological persistence as introduced in these papers by Frosini and Robins was subsequently formalized by Edelsbrunner et al. in their seminal 2002 paper Topological Persistence and Simplification. [5] Since then, the offset filtration has become a primary example in the study of computational topology and data analysis.

Contents

Definition

Let be a finite set in a metric space , and for any let be the closed ball of radius centered at . Then the union is known as the offset of with respect to the parameter (or simply the -offset of ).

By considering the collection of offsets over all we get a family of spaces where whenever . So is a family of nested topological spaces indexed over , which defines a filtration known as the offset filtration on . [6]

Note that it is also possible to view the offset filtration as a functor from the poset category of non-negative real numbers to the category of topological spaces and continuous maps. [7] [8] There are some advantages to the categorical viewpoint, as explored by Bubenik and others. [9]

Properties

A standard application of the nerve theorem shows that the union of balls has the same homotopy type as its nerve, since closed balls are convex and the intersection of convex sets is convex. [10] The nerve of the union of balls is also known as the Čech complex, [11] which is a subcomplex of the Vietoris-Rips complex. [12] Therefore the offset filtration is weakly equivalent to the Čech filtration (defined as the nerve of each offset across all scale parameters), so their homology groups are isomorphic. [13]

Although the Vietoris-Rips filtration is not identical to the Čech filtration in general, it is an approximation in a sense. In particular, for a set we have a chain of inclusions between the Rips and Čech complexes on whenever . [14] In general metric spaces, we have that for all , implying that the Rips and Cech filtrations are 2-interleaved with respect to the interleaving distance as introduced by Chazal et al. in 2009. [15] [16]

It is a well-known result of Niyogi, Smale, and Weinberger that given a sufficiently dense random point cloud sample of a smooth submanifold in Euclidean space, the union of balls of a certain radius recovers the homology of the object via a deformation retraction of the Čech complex. [17]

The offset filtration is also known to be stable with respect to perturbations of the underlying data set. This follows from the fact that the offset filtration can be viewed as a sublevel-set filtration with respect to the distance function of the metric space. The stability of sublevel-set filtrations can be stated as follows: Given any two real-valued functions on a topological space such that for all , the -dimensional homology modules on the sublevel-set filtrations with respect to are point-wise finite dimensional, we have where and denote the bottleneck and sup-norm distances, respectively, and denotes the -dimensional persistent homology barcode. [18] While first stated in 2005, this sublevel stability result also follows directly from an algebraic stability property sometimes known as the "Isometry Theorem," [9] which was proved in one direction in 2009, [16] and the other direction in 2011. [19] [20]

A multiparameter extension of the offset filtration defined by considering points covered by multiple balls is given by the multicover bifiltration, and has also been an object of interest in persistent homology and computational geometry. [21] [22]

Related Research Articles

In mathematics, a continuous function is a function such that a continuous variation of the argument induces a continuous variation of the value of the function. This means there are no abrupt changes in value, known as discontinuities. More precisely, a function is continuous if arbitrarily small changes in its value can be assured by restricting to sufficiently small changes of its argument. A discontinuous function is a function that is not continuous. Until the 19th century, mathematicians largely relied on intuitive notions of continuity and considered only continuous functions. The epsilon–delta definition of a limit was introduced to formalize the definition of continuity.

<span class="mw-page-title-main">Barycentric subdivision</span>

In mathematics, the barycentric subdivision is a standard way to subdivide a given simplex into smaller ones. Its extension on simplicial complexes is a canonical method to refine them. Therefore, the barycentric subdivision is an important tool in algebraic topology.

In mathematics, Hochschild homology (and cohomology) is a homology theory for associative algebras over rings. There is also a theory for Hochschild homology of certain functors. Hochschild cohomology was introduced by Gerhard Hochschild (1945) for algebras over a field, and extended to algebras over more general rings by Henri Cartan and Samuel Eilenberg (1956).

<span class="mw-page-title-main">Vietoris–Rips complex</span> Topological space formed from distances

In topology, the Vietoris–Rips complex, also called the Vietoris complex or Rips complex, is a way of forming a topological space from distances in a set of points. It is an abstract simplicial complex that can be defined from any metric space M and distance δ by forming a simplex for every finite set of points that has diameter at most δ. That is, it is a family of finite subsets of M, in which we think of a subset of k points as forming a (k − 1)-dimensional simplex (an edge for two points, a triangle for three points, a tetrahedron for four points, etc.); if a finite set S has the property that the distance between every pair of points in S is at most δ, then we include S as a simplex in the complex.

Discrete Morse theory is a combinatorial adaptation of Morse theory developed by Robin Forman. The theory has various practical applications in diverse fields of applied mathematics and computer science, such as configuration spaces, homology computation, denoising, mesh compression, and topological data analysis.

In applied mathematics, topological data analysis (TDA) is an approach to the analysis of datasets using techniques from topology. Extraction of information from datasets that are high-dimensional, incomplete and noisy is generally challenging. TDA provides a general framework to analyze such data in a manner that is insensitive to the particular metric chosen and provides dimensionality reduction and robustness to noise. Beyond this, it inherits functoriality, a fundamental concept of modern mathematics, from its topological nature, which allows it to adapt to new mathematical tools.

Persistent homology is a method for computing topological features of a space at different spatial resolutions. More persistent features are detected over a wide range of spatial scales and are deemed more likely to represent true features of the underlying space rather than artifacts of sampling, noise, or particular choice of parameters.

<span class="mw-page-title-main">Sauer–Shelah lemma</span> Notion in combinatorics

In combinatorial mathematics and extremal set theory, the Sauer–Shelah lemma states that every family of sets with small VC dimension consists of a small number of sets. It is named after Norbert Sauer and Saharon Shelah, who published it independently of each other in 1972. The same result was also published slightly earlier and again independently, by Vladimir Vapnik and Alexey Chervonenkis, after whom the VC dimension is named. In his paper containing the lemma, Shelah gives credit also to Micha Perles, and for this reason the lemma has also been called the Perles–Sauer–Shelah lemma.

In mathematics, the mean (topological) dimension of a topological dynamical system is a non-negative extended real number that is a measure of the complexity of the system. Mean dimension was first introduced in 1999 by Gromov. Shortly after it was developed and studied systematically by Lindenstrauss and Weiss. In particular they proved the following key fact: a system with finite topological entropy has zero mean dimension. For various topological dynamical systems with infinite topological entropy, the mean dimension can be calculated or at least bounded from below and above. This allows mean dimension to be used to distinguish between systems with infinite topological entropy. Mean dimension is also related to the problem of embedding topological dynamical systems in shift spaces.

<span class="mw-page-title-main">Čech complex</span>

In algebraic topology and topological data analysis, the Čech complex is an abstract simplicial complex constructed from a point cloud in any metric space which is meant to capture topological information about the point cloud or the distribution it is drawn from. Given a finite point cloud X and an ε > 0, we construct the Čech complex as follows: Take the elements of X as the vertex set of . Then, for each , let if the set of ε-balls centered at points of σ has a nonempty intersection. In other words, the Čech complex is the nerve of the set of ε-balls centered at points of X. By the nerve lemma, the Čech complex is homotopy equivalent to the union of the balls, also known as the Offset Filtration.

<span class="mw-page-title-main">Cubical complex</span>

In mathematics, a cubical complex is a set composed of points, line segments, squares, cubes, and their n-dimensional counterparts. They are used analogously to simplicial complexes and CW complexes in the computation of the homology of topological spaces.

In Category theory and related fields of mathematics, an envelope is a construction that generalizes the operations of "exterior completion", like completion of a locally convex space, or Stone–Čech compactification of a topological space. A dual construction is called refinement.

The degree-Rips bifiltration is a simplicial filtration used in topological data analysis for analyzing the shape of point cloud data. It is a multiparameter extension of the Vietoris–Rips filtration that possesses greater stability to data outliers than single-parameter filtrations, and which is more amenable to practical computation than other multiparameter constructions. Introduced in 2015 by Lesnick and Wright, the degree-Rips bifiltration is a parameter-free and density-sensitive vehicle for performing persistent homology computations on point cloud data.

<span class="mw-page-title-main">Multicover bifiltration</span>

The multicover bifiltration is a two-parameter sequence of nested topological spaces derived from the covering of a finite set in a metric space by growing metric balls. It is a multidimensional extension of the offset filtration that captures density information about the underlying data set by filtering the points of the offsets at each index according to how many balls cover each point. The multicover bifiltration has been an object of study within multidimensional persistent homology and topological data analysis.

A persistence module is a mathematical structure in persistent homology and topological data analysis that formally captures the persistence of topological features of an object across a range of scale parameters. A persistence module often consists of a collection of homology groups corresponding to a filtration of topological spaces, and a collection of linear maps induced by the inclusions of the filtration. The concept of a persistence module was first introduced in 2005 as an application of graded modules over polynomial rings, thus importing well-developed algebraic ideas from classical commutative algebra theory to the setting of persistent homology. Since then, persistence modules have been one of the primary algebraic structures studied in the field of applied topology.

In topological data analysis, a subdivision bifiltration is a collection of filtered simplicial complexes, typically built upon a set of data points in a metric space, that captures shape and density information about the underlying data set. The subdivision bifiltration relies on a natural filtration of the barycentric subdivision of a simplicial complex by flags of minimum dimension, which encodes density information about the metric space upon which the complex is built. The subdivision bifiltration was first introduced by Donald Sheehy in 2011 as part of his doctoral thesis as a discrete model of the multicover bifiltration, a continuous construction whose underlying framework dates back to the 1970s. In particular, Sheehy applied the construction to both the Vietoris-Rips and Čech filtrations, two common objects in the field of topological data analysis. Whereas single parameter filtrations are not robust with respect to outliers in the data, the subdivision-Rips and -Cech bifiltrations satisfy several desirable stability properties.

In topological data analysis, the Vietoris-Rips filtration is the collection of nested Vietoris-Rips complexes on a metric space created by taking the sequence of Vietoris-Rips complexes over an increasing scale parameter. Often, the Vietoris-Rips filtration is used to create a discrete, simplicial model on point cloud data embedded in an ambient metric space. The Vietoris-Rips filtration is a multiscale extension of the Vietoris-Rips complex that enables researchers to detect and track the persistence of topological features, over a range of parameters, by way of computing the persistent homology of the entire filtration.

In persistent homology, a persistent Betti number is a multiscale analog of a Betti number that tracks the number of topological features that persist over multiple scale parameters in a filtration. Whereas the classical Betti number equals the rank of the homology group, the persistent Betti number is the rank of the persistent homology group. The concept of a persistent Betti number was introduced by Herbert Edelsbrunner, David Letscher, and Afra Zomorodian in the 2002 paper Topological Persistence and Simplification, one of the seminal papers in the field of persistent homology and topological data analysis. Applications of the persistent Betti number appear in a variety of fields including data analysis, machine learning, and physics.

In persistent homology, a persistent homology group is a multiscale analog of a homology group that captures information about the evolution of topological features across a filtration of spaces. While the ordinary homology group represents nontrivial homology classes of an individual topological space, the persistent homology group tracks only those classes that remain nontrivial across multiple parameters in the underlying filtration. Analogous to the ordinary Betti number, the ranks of the persistent homology groups are known as the persistent Betti numbers. Persistent homology groups were first introduced by Herbert Edelsbrunner, David Letscher, and Afra Zomorodian in a 2002 paper Topological Persistence and Simplification, one of the foundational papers in the fields of persistent homology and topological data analysis, based largely on the persistence barcodes and the persistence algorithm, that were first described by Serguei Barannikov in the 1994 paper. Since then, the study of persistent homology groups has led to applications in data science, machine learning, materials science, biology, and economics.

In topological data analysis, a persistence barcode, sometimes shortened to barcode, is an algebraic invariant of a persistence module that characterizes the stability of topological features throughout a growing family of spaces. Formally, a persistence barcode consists of a multiset of intervals in the extended real line, where the length of each interval corresponds to the lifetime of a topological feature in a filtration, usually built on a point cloud, a graph, a function, or, more generally, a simplicial complex or a chain complex. Generally, longer intervals in a barcode correspond to more robust features, whereas shorter intervals are more likely to be noise in the data. A persistence barcode is a complete invariant that captures all the topological information in a filtration. In algebraic topology, the persistence barcodes were first introduced by Sergey Barannikov in 1994 as the "canonical forms" invariants consisting of a multiset of line segments with ends on two parallel lines, and later, in geometry processing, by Gunnar Carlsson et al. in 2004.

References

  1. Adams, Henry; Moy, Michael (2021). "Topology Applied to Machine Learning: From Global to Local". Frontiers in Artificial Intelligence. 4: 2. doi: 10.3389/frai.2021.668302 . ISSN   2624-8212. PMC   8160457 . PMID   34056580.
  2. Edelsbrunner, Herbert (2014). A short course in computational geometry and topology. Cham. p. 35. ISBN   978-3-319-05957-0. OCLC   879343648.{{cite book}}: CS1 maint: location missing publisher (link)
  3. Frosini, Patrizio (1992-02-01). Casasent, David P. (ed.). "Measuring shapes by size functions". Intelligent Robots and Computer Vision X: Algorithms and Techniques. Boston, MA. 1607: 122–133. Bibcode:1992SPIE.1607..122F. doi:10.1117/12.57059. S2CID   121295508.
  4. Robins, Vanessa (1999-01-01). "Towards computing homology from approximations" (PDF). Topology Proceedings. 24: 503–532.
  5. Edelsbrunner; Letscher; Zomorodian (2002). "Topological Persistence and Simplification". Discrete & Computational Geometry. 28 (4): 511–533. doi: 10.1007/s00454-002-2885-2 . ISSN   0179-5376.
  6. Halperin, Dan; Kerber, Michael; Shaharabani, Doron (2015), Bansal, Nikhil; Finocchi, Irene (eds.), "The Offset Filtration of Convex Objects", Algorithms - ESA 2015, Lecture Notes in Computer Science, Berlin, Heidelberg: Springer Berlin Heidelberg, vol. 9294, pp. 705–716, arXiv: 1407.6132 , doi:10.1007/978-3-662-48350-3_59, ISBN   978-3-662-48349-7, S2CID   660889 , retrieved 2023-02-25
  7. Bauer, Ulrich; Kerber, Michael; Roll, Fabian; Rolle, Alexander (2023-02-16). "A unified view on the functorial nerve theorem and its variations". Expositiones Mathematicae. 41 (4): 8. arXiv: 2203.03571 . doi:10.1016/j.exmath.2023.04.005. S2CID   247291819.
  8. Blumberg, Andrew J.; Lesnick, Michael (2022-10-17). "Stability of 2-Parameter Persistent Homology". Foundations of Computational Mathematics. arXiv: 2010.09628 . doi:10.1007/s10208-022-09576-6. ISSN   1615-3375. S2CID   224705357.
  9. 1 2 Bubenik, Peter; Scott, Jonathan A. (2014). "Categorification of Persistent Homology". Discrete & Computational Geometry. 51 (3): 600–627. arXiv: 1205.3669 . doi:10.1007/s00454-014-9573-x. ISSN   0179-5376. S2CID   254027425.
  10. Edelsbrunner, Herbert (1993). "The union of balls and its dual shape". Proceedings of the ninth annual symposium on Computational geometry - SCG '93. San Diego, California, United States: ACM Press. pp. 218–231. doi:10.1145/160985.161139. ISBN   978-0-89791-582-3. S2CID   9599628.
  11. Kim, Jisu; Shin, Jaehyeok; Chazal, Frédéric; Rinaldo, Alessandro; Wasserman, Larry (2020-05-12). "Homotopy Reconstruction via the Cech Complex and the Vietoris-Rips Complex". arXiv: 1903.06955 [math.AT].
  12. Edelsbrunner, Herbert (2010). Computational topology : an introduction. J. Harer. Providence, R.I.: American Mathematical Society. p. 61. ISBN   978-0-8218-4925-5. OCLC   427757156.
  13. Chazal, Frédéric; Michel, Bertrand (2021). "An Introduction to Topological Data Analysis: Fundamental and Practical Aspects for Data Scientists". Frontiers in Artificial Intelligence. 4: 667963. doi: 10.3389/frai.2021.667963 . ISSN   2624-8212. PMC   8511823 . PMID   34661095.
  14. de Silva, Vin; Ghrist, Robert (2007-04-25). "Coverage in sensor networks via persistent homology". Algebraic & Geometric Topology. 7 (1): 339–358. doi: 10.2140/agt.2007.7.339 . ISSN   1472-2739.
  15. Anai, Hirokazu; Chazal, Frédéric; Glisse, Marc; Ike, Yuichi; Inakoshi, Hiroya; Tinarrage, Raphaël; Umeda, Yuhei (2020-05-26). Topological Data Analysis. Abel Symposia. Vol. 15. arXiv: 1811.04757 . doi:10.1007/978-3-030-43408-3. ISBN   978-3-030-43407-6. S2CID   242491854.
  16. 1 2 Chazal, Frédéric; Cohen-Steiner, David; Glisse, Marc; Guibas, Leonidas J.; Oudot, Steve Y. (2009-06-08). "Proximity of persistence modules and their diagrams". Proceedings of the twenty-fifth annual symposium on Computational geometry. Aarhus Denmark: ACM. pp. 237–246. doi:10.1145/1542362.1542407. ISBN   978-1-60558-501-7. S2CID   840484.
  17. Niyogi, Partha; Smale, Stephen; Weinberger, Shmuel (2008). "Finding the Homology of Submanifolds with High Confidence from Random Samples". Discrete & Computational Geometry. 39 (1–3): 419–441. doi: 10.1007/s00454-008-9053-2 . ISSN   0179-5376. S2CID   1788129.
  18. Cohen-Steiner, David; Edelsbrunner, Herbert; Harer, John (2007). "Stability of Persistence Diagrams". Discrete & Computational Geometry. 37 (1): 103–120. doi: 10.1007/s00454-006-1276-5 . ISSN   0179-5376.
  19. Lesnick, Michael (2015). "The Theory of the Interleaving Distance on Multidimensional Persistence Modules". Foundations of Computational Mathematics. 15 (3): 613–650. arXiv: 1106.5305 . doi:10.1007/s10208-015-9255-y. ISSN   1615-3375. S2CID   254158297.
  20. Lesnick, Michael (2023). "Lecture notes for AMAT 840: Multiparameter Persistence" (PDF). University at Albany, SUNY.
  21. Corbet, René; Kerber, Michael; Lesnick, Michael; Osang, Georg (2023-02-20). "Computing the Multicover Bifiltration". Discrete & Computational Geometry. 70 (2): 376–405. arXiv: 2103.07823 . doi:10.1007/s00454-022-00476-8. ISSN   0179-5376. PMC   10423148 . PMID   37581017.
  22. Edelsbrunner, Herbert; Osang, Georg (2021). "The Multi-Cover Persistence of Euclidean Balls". Discrete & Computational Geometry. 65 (4): 1296–1313. doi:10.1007/s00454-021-00281-9. ISSN   0179-5376. PMC   8550220 . PMID   34720303.