Persistent homology group

In persistent homology, a persistent homology group is a multiscale analog of a homology group that captures information about the evolution of topological features across a filtration of spaces. While the ordinary homology group represents the nontrivial homology classes of an individual topological space, the persistent homology group tracks only those classes that remain nontrivial across multiple parameter values in the underlying filtration. Analogous to the ordinary Betti numbers, the ranks of the persistent homology groups are known as the persistent Betti numbers. Persistent homology groups were first introduced by Herbert Edelsbrunner, David Letscher, and Afra Zomorodian in the 2002 paper Topological Persistence and Simplification, one of the foundational papers in the fields of persistent homology and topological data analysis, [1] building largely on the persistence barcodes and the persistence algorithm first described by Serguei Barannikov in a 1994 paper. [2] Since then, the study of persistent homology groups has led to applications in data science, [3] machine learning, [4] materials science, [5] biology, [6] [7] and economics. [8]

Definition

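Let $K$ be a simplicial complex, and let $f: K \to \mathbb{R}$ be a real-valued monotonic function. Then for some values $a_0 < a_1 < \cdots < a_n \in \mathbb{R}$ the sublevel-sets $K(a) := f^{-1}(-\infty, a]$ yield a sequence of nested subcomplexes $\emptyset = K_0 \subseteq K_1 \subseteq \cdots \subseteq K_n = K$, with $K_i := K(a_i)$, known as a filtration of $K$.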
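The construction of a sublevel-set filtration can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the encoding of simplices as sorted vertex tuples, and the example values on a filled triangle are all assumptions made here, not part of the definition. Unlike the definition above, the sketch does not insert an empty complex at the start of the sequence.

```python
from itertools import combinations

def sublevel_filtration(simplices, f, thresholds):
    """Return the nested subcomplexes K(a) = {s : f(s) <= a}, one per threshold a."""
    return [{s for s in simplices if f[s] <= a} for a in sorted(thresholds)]

def is_monotonic(f):
    """Check that f(face) <= f(simplex) for every proper nonempty face."""
    return all(
        f[face] <= f[s]
        for s in f
        for k in range(1, len(s))
        for face in combinations(s, k)
    )

# Hypothetical example: a filled triangle on vertices 0, 1, 2.
# Simplices are sorted tuples of vertices; the values are made up for illustration.
f = {
    (0,): 0.0, (1,): 0.0, (2,): 1.0,
    (0, 1): 1.0, (0, 2): 2.0, (1, 2): 2.0,
    (0, 1, 2): 3.0,
}

filtration = sublevel_filtration(f.keys(), f, thresholds=[0.0, 1.0, 2.0, 3.0])
```

Monotonicity is what guarantees that every sublevel set is itself a simplicial complex: a simplex can only enter the filtration once all of its faces are already present.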

Applying $p^{\text{th}}$ homology to each complex yields a sequence of homology groups $0 = H_p(K_0) \to H_p(K_1) \to \cdots \to H_p(K_n) = H_p(K)$ connected by homomorphisms induced by the inclusion maps of the underlying filtration. When homology is taken over a field, we get a sequence of vector spaces and linear maps known as a persistence module.

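Let $f_p^{i,j}: H_p(K_i) \to H_p(K_j)$ be the homomorphism induced by the inclusion $K_i \hookrightarrow K_j$. Then the $p^{\text{th}}$ persistent homology groups are defined as the images $H_p^{i,j} := \operatorname{im} f_p^{i,j}$ for all $0 \leq i \leq j \leq n$. In particular, the persistent homology group $H_p^{i,i} = H_p(K_i)$.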

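More precisely, the $p^{\text{th}}$ persistent homology group can be defined as $H_p^{i,j} = Z_p(K_i) / \left( B_p(K_j) \cap Z_p(K_i) \right)$, where $Z_p(\bullet)$ and $B_p(\bullet)$ are the standard p-cycle and p-boundary groups, respectively. [9]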
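The quotient description lends itself to a direct rank computation. The sketch below works over the two-element field GF(2) and uses the identity dim(Z ∩ B) = dim Z + dim B − dim(Z + B), so that the persistent Betti number equals rank(Z ∪ B) − rank(B), where Z is a basis of the p-cycles of the earlier complex and B spans the p-boundaries of the later one. All function names, the bitmask encoding of chains, and the small triangle filtration are assumptions chosen for illustration; the list is zero-indexed and omits the empty complex.

```python
def boundary(s, index):
    """Boundary of simplex s over GF(2), as a bitmask over face indices."""
    mask = 0
    if len(s) > 1:  # vertices have zero boundary
        for k in range(len(s)):
            mask ^= 1 << index[s[:k] + s[k + 1:]]
    return mask

def rank_gf2(vectors):
    """Rank over GF(2) of chains encoded as integer bitmasks."""
    pivots = {}  # leading bit -> reduced vector
    for v in vectors:
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                pivots[h] = v
                break
            v ^= pivots[h]
    return len(pivots)

def cycle_basis(K, p, index):
    """Basis of the p-cycles of complex K (kernel of the boundary map)."""
    pivots, cycles = {}, []
    for s in sorted(K):
        if len(s) != p + 1:
            continue
        b, c = boundary(s, index), 1 << index[s]
        while b:
            h = b.bit_length() - 1
            if h not in pivots:
                pivots[h] = (b, c)
                break
            b, c = b ^ pivots[h][0], c ^ pivots[h][1]
        else:  # boundary reduced to zero, so the chain c is a cycle
            cycles.append(c)
    return cycles

def persistent_betti(filtration, p, i, j):
    """Rank of the p-th persistent homology group for list positions i <= j."""
    index = {s: k for k, s in enumerate(sorted(filtration[-1]))}
    Z = cycle_basis(filtration[i], p, index)
    B = [boundary(s, index) for s in filtration[j] if len(s) == p + 2]
    return rank_gf2(Z + B) - rank_gf2(B)

# Hypothetical filtration of a filled triangle (simplices as sorted vertex tuples).
F = [
    {(0,), (1,)},
    {(0,), (1,), (2,), (0, 1)},
    {(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)},
    {(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)},
]
```

For instance, `persistent_betti(F, 1, 2, 3)` is 0: the loop formed by the three edges is a cycle in the third complex, but it becomes a boundary once the triangle is added.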

Sometimes the elements of $H_p^{i,j}$ are described as the homology classes that are "born" at or before $K_i$ and that have not yet "died" entering $K_j$. These notions can be made precise as follows. A homology class $\gamma \in H_p(K_i)$ is said to be born at $K_i$ if it is not contained in the image of the previous persistent homology group, i.e., $\gamma \notin H_p^{i-1,i}$. Conversely, $\gamma$ is said to die entering $K_j$ if $\gamma$ is subsumed by (i.e., merges with) another older class as the sequence proceeds from $K_{j-1} \to K_j$. That is to say, $f_p^{i,j-1}(\gamma) \notin H_p^{i-1,j-1}$ but $f_p^{i,j}(\gamma) \in H_p^{i-1,j}$. The convention that the older class persists when it merges with a younger class, rather than the other way around, is sometimes known as the Elder Rule. [10] [11]

The indices $i, j$ at which a homology class $\gamma$ is born and dies entering are known as the birth and death indices of $\gamma$. The difference $j - i$ is known as the index persistence of $\gamma$, while the corresponding difference $a_j - a_i$ in the function values at those indices is known as the persistence of $\gamma$. If there exists no index at which $\gamma$ dies, it is assigned an infinite death index. Thus, the persistence of each class can be represented as an interval in the extended real line $\mathbb{R} \cup \{\pm\infty\}$ of either the form $[a_i, a_j)$ or $[a_i', \infty)$. Over an infinite field, infinitely many classes always share the same persistence, so the collection of such intervals over all classes does not give meaningful multiplicities for a multiset of intervals. Instead, such multiplicities and a multiset of intervals in the extended real line are given by the structure theorem of persistent homology. [2] This multiset is known as a persistence barcode. [12]
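One common way to obtain these intervals in practice is the standard column reduction of the boundary matrix. The sketch below is a minimal version over GF(2), not an optimized or authoritative implementation. It assumes simplices are given as sorted vertex tuples with a monotonic value for each; the function name, the tie-breaking order, and the example values are illustrative assumptions. Pairs whose birth and death values coincide are dropped, since they correspond to empty intervals.

```python
import math

def barcode(f):
    """Return sorted (dimension, birth, death) triples for a filtered complex.

    f maps each simplex (a sorted tuple of vertices) to a monotonic value.
    """
    # Order simplices by value, then dimension, so faces precede cofaces.
    order = sorted(f, key=lambda s: (f[s], len(s), s))
    pos = {s: k for k, s in enumerate(order)}
    columns, low_to_col, paired, bars = [], {}, set(), []
    for j, s in enumerate(order):
        # Column j holds the positions of the codimension-one faces of s.
        col = {pos[s[:k] + s[k + 1:]] for k in range(len(s))} if len(s) > 1 else set()
        # Add earlier reduced columns (mod 2) until the lowest entry is unique.
        while col and max(col) in low_to_col:
            col ^= columns[low_to_col[max(col)]]
        columns.append(col)
        if col:  # s kills the class created by the simplex at position max(col)
            i = max(col)
            low_to_col[i] = j
            paired.update((i, j))
            if f[order[i]] < f[s]:
                bars.append((len(order[i]) - 1, f[order[i]], f[s]))
    # Unpaired simplices create classes that never die.
    for j, s in enumerate(order):
        if j not in paired:
            bars.append((len(s) - 1, f[s], math.inf))
    return sorted(bars)

# Hypothetical filtered triangle; the values are made up for illustration.
f = {
    (0,): 0.0, (1,): 0.0, (2,): 1.0,
    (0, 1): 1.0, (0, 2): 2.0, (1, 2): 2.0,
    (0, 1, 2): 3.0,
}
bars = barcode(f)
```

In this example the component created by vertex 2 at value 1.0 dies at value 2.0 when an edge joins it to the older component, which is the Elder Rule in action, while the loop born at 2.0 dies at 3.0 when the triangle fills it in.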

Geometrically, a barcode can be plotted as a multiset of points (with possibly infinite coordinates) in the extended plane $(\mathbb{R} \cup \{\pm\infty\})^2$, with the birth value as the horizontal coordinate and the death value as the vertical coordinate. By the above definitions, each point lies above the diagonal, and its vertical distance to the diagonal is exactly equal to the persistence of the corresponding class. This construction is known as a persistence diagram, and it provides a way of visualizing the structure of the persistence of homology classes in the sequence of persistent homology groups. [1]
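The diagram also encodes the persistent Betti numbers: the rank of the persistent homology group between function values $a_i \leq a_j$ equals the number of diagram points (with multiplicity) whose birth is at most $a_i$ and whose death is strictly greater than $a_j$, i.e., the points in the upper-left quadrant with corner $(a_i, a_j)$. The following sketch illustrates that count together with the persistence of a point; the function names and the sample points are assumptions made for illustration.

```python
import math

def persistence(point):
    """Vertical distance of a diagram point (birth, death) to the diagonal."""
    birth, death = point
    return death - birth

def persistent_betti_from_diagram(points, a_i, a_j):
    """Count points in the upper-left quadrant with corner (a_i, a_j)."""
    return sum(1 for birth, death in points if birth <= a_i and death > a_j)

# Hypothetical degree-0 diagram: two finite points and one point at infinity.
points = [(0.0, 1.0), (0.0, math.inf), (1.0, 2.0)]
```

Here `persistent_betti_from_diagram(points, 0.0, 1.0)` counts only the point at infinity, since the class with interval `[0.0, 1.0)` has already died on entering value 1.0.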

Related Research Articles

In the mathematical field of algebraic topology, the fundamental group of a topological space is the group of the equivalence classes under homotopy of the loops contained in the space. It records information about the basic shape, or holes, of the topological space. The fundamental group is the first and simplest homotopy group. The fundamental group is a homotopy invariant: topological spaces that are homotopy equivalent have isomorphic fundamental groups. The fundamental group of a topological space $X$ is denoted by $\pi_1(X)$.

In functional analysis and related areas of mathematics, Fréchet spaces, named after Maurice Fréchet, are special topological vector spaces. They are generalizations of Banach spaces. All Banach and Hilbert spaces are Fréchet spaces. Spaces of infinitely differentiable functions are typical examples of Fréchet spaces, many of which are typically not Banach spaces.

In mathematics, specifically in differential topology, Morse theory enables one to analyze the topology of a manifold by studying differentiable functions on that manifold. According to the basic insights of Marston Morse, a typical differentiable function on a manifold will reflect the topology quite directly. Morse theory allows one to find CW structures and handle decompositions on manifolds and to obtain substantial information about their homology.

In topology, a branch of mathematics, intersection homology is an analogue of singular homology especially well-suited for the study of singular spaces, discovered by Mark Goresky and Robert MacPherson in the fall of 1974 and developed by them over the next few years.

In mathematics, in particular in algebraic topology and differential geometry, the Stiefel–Whitney classes are a set of topological invariants of a real vector bundle that describe the obstructions to constructing everywhere independent sets of sections of the vector bundle. Stiefel–Whitney classes are indexed from 0 to n, where n is the rank of the vector bundle. If the Stiefel–Whitney class of index i is nonzero, then there cannot exist $(n - i + 1)$ everywhere linearly independent sections of the vector bundle. A nonzero nth Stiefel–Whitney class indicates that every section of the bundle must vanish at some point. A nonzero first Stiefel–Whitney class indicates that the vector bundle is not orientable. For example, the first Stiefel–Whitney class of the Möbius strip, as a line bundle over the circle, is not zero, whereas the first Stiefel–Whitney class of the trivial line bundle over the circle, $S^1 \times \mathbb{R}$, is zero.

In mathematics, specifically algebraic topology, an Eilenberg–MacLane space is a topological space with a single nontrivial homotopy group.

In mathematics, specifically in operator K-theory, the Baum–Connes conjecture suggests a link between the K-theory of the reduced C*-algebra of a group and the K-homology of the classifying space of proper actions of that group. The conjecture sets up a correspondence between different areas of mathematics, with the K-homology of the classifying space being related to geometry, differential operator theory, and homotopy theory, while the K-theory of the group's reduced C*-algebra is a purely analytical object.

Size functions are shape descriptors, in a geometrical/topological sense. They are functions from the half-plane $x < y$ to the natural numbers, counting certain connected components of a topological space. They are used in pattern recognition and topology.

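Given a size pair $(M, f)$ where $M$ is a manifold of dimension $n$ and $f$ is an arbitrary real continuous function defined on it, the $i$-th size functor, with $i = 0, \ldots, n$, denoted by $F_i$, is the functor in $\mathrm{Fun}(\mathrm{Rord}, \mathrm{Ab})$, where $\mathrm{Rord}$ is the category of ordered real numbers, and $\mathrm{Ab}$ is the category of Abelian groups, defined in the following way. For $x \leq y$, setting $M_x = \{p \in M : f(p) \leq x\}$, $M_y = \{p \in M : f(p) \leq y\}$, $j_{xy}$ equal to the inclusion from $M_x$ into $M_y$, and $k_{xy}$ equal to the morphism in $\mathrm{Rord}$ from $x$ to $y$,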

In applied mathematics, topological data analysis (TDA) is an approach to the analysis of datasets using techniques from topology. Extraction of information from datasets that are high-dimensional, incomplete and noisy is generally challenging. TDA provides a general framework to analyze such data in a manner that is insensitive to the particular metric chosen and provides dimensionality reduction and robustness to noise. Beyond this, it inherits functoriality, a fundamental concept of modern mathematics, from its topological nature, which allows it to adapt to new mathematical tools.

Persistent homology is a method for computing topological features of a space at different spatial resolutions. More persistent features are detected over a wide range of spatial scales and are deemed more likely to represent true features of the underlying space rather than artifacts of sampling, noise, or particular choice of parameters.

In algebraic geometry, a mixed Hodge structure is an algebraic structure containing information about the cohomology of general algebraic varieties. It is a generalization of a Hodge structure, which is used to study smooth projective varieties.

The degree-Rips bifiltration is a simplicial filtration used in topological data analysis for analyzing the shape of point cloud data. It is a multiparameter extension of the Vietoris–Rips filtration that possesses greater stability to data outliers than single-parameter filtrations, and which is more amenable to practical computation than other multiparameter constructions. Introduced in 2015 by Lesnick and Wright, the degree-Rips bifiltration is a parameter-free and density-sensitive vehicle for performing persistent homology computations on point cloud data.

The offset filtration is a growing sequence of metric balls used to detect the size and scale of topological features of a data set. The offset filtration commonly arises in persistent homology and the field of topological data analysis. Utilizing a union of balls to approximate the shape of geometric objects was first suggested by Frosini in 1992 in the context of submanifolds of Euclidean space. The construction was independently explored by Robins in 1998, and expanded to considering the collection of offsets indexed over a series of increasing scale parameters, in order to observe the stability of topological features with respect to attractors. Homological persistence as introduced in these papers by Frosini and Robins was subsequently formalized by Edelsbrunner et al. in their seminal 2002 paper Topological Persistence and Simplification. Since then, the offset filtration has become a primary example in the study of computational topology and data analysis.

The multicover bifiltration is a two-parameter sequence of nested topological spaces derived from the covering of a finite set in a metric space by growing metric balls. It is a multidimensional extension of the offset filtration that captures density information about the underlying data set by filtering the points of the offsets at each index according to how many balls cover each point. The multicover bifiltration has been an object of study within multidimensional persistent homology and topological data analysis.

A persistence module is a mathematical structure in persistent homology and topological data analysis that formally captures the persistence of topological features of an object across a range of scale parameters. A persistence module often consists of a collection of homology groups corresponding to a filtration of topological spaces, and a collection of linear maps induced by the inclusions of the filtration. The concept of a persistence module was first introduced in 2005 as an application of graded modules over polynomial rings, thus importing well-developed algebraic ideas from classical commutative algebra theory to the setting of persistent homology. Since then, persistence modules have been one of the primary algebraic structures studied in the field of applied topology.

In topological data analysis, a subdivision bifiltration is a collection of filtered simplicial complexes, typically built upon a set of data points in a metric space, that captures shape and density information about the underlying data set. The subdivision bifiltration relies on a natural filtration of the barycentric subdivision of a simplicial complex by flags of minimum dimension, which encodes density information about the metric space upon which the complex is built. The subdivision bifiltration was first introduced by Donald Sheehy in 2011 as part of his doctoral thesis as a discrete model of the multicover bifiltration, a continuous construction whose underlying framework dates back to the 1970s. In particular, Sheehy applied the construction to both the Vietoris-Rips and Čech filtrations, two common objects in the field of topological data analysis. Whereas single-parameter filtrations are not robust with respect to outliers in the data, the subdivision-Rips and subdivision-Čech bifiltrations satisfy several desirable stability properties.

In topological data analysis, the Vietoris-Rips filtration is the collection of nested Vietoris-Rips complexes on a metric space created by taking the sequence of Vietoris-Rips complexes over an increasing scale parameter. Often, the Vietoris-Rips filtration is used to create a discrete, simplicial model on point cloud data embedded in an ambient metric space. The Vietoris-Rips filtration is a multiscale extension of the Vietoris-Rips complex that enables researchers to detect and track the persistence of topological features, over a range of parameters, by way of computing the persistent homology of the entire filtration.

In persistent homology, a persistent Betti number is a multiscale analog of a Betti number that tracks the number of topological features that persist over multiple scale parameters in a filtration. Whereas the classical Betti number equals the rank of the homology group, the persistent Betti number is the rank of the persistent homology group. The concept of a persistent Betti number was introduced by Herbert Edelsbrunner, David Letscher, and Afra Zomorodian in the 2002 paper Topological Persistence and Simplification, one of the seminal papers in the field of persistent homology and topological data analysis. Applications of the persistent Betti number appear in a variety of fields including data analysis, machine learning, and physics.

In topological data analysis, a persistence barcode, sometimes shortened to barcode, is an algebraic invariant of a persistence module that characterizes the stability of topological features throughout a growing family of spaces. Formally, a persistence barcode consists of a multiset of intervals in the extended real line, where the length of each interval corresponds to the lifetime of a topological feature in a filtration, usually built on a point cloud, a graph, a function, or, more generally, a simplicial complex or a chain complex. Generally, longer intervals in a barcode correspond to more robust features, whereas shorter intervals are more likely to be noise in the data. A persistence barcode is a complete invariant that captures all the topological information in a filtration. In algebraic topology, the persistence barcodes were first introduced by Sergey Barannikov in 1994 as the "canonical forms" invariants consisting of a multiset of line segments with ends on two parallel lines, and later, in geometry processing, by Gunnar Carlsson et al. in 2004.

References

  1. Edelsbrunner; Letscher; Zomorodian (2002). "Topological Persistence and Simplification". Discrete & Computational Geometry. 28 (4): 511–533. doi:10.1007/s00454-002-2885-2. ISSN 0179-5376.
  2. Barannikov, Sergey (1994). "Framed Morse complex and its invariants" (PDF). Advances in Soviet Mathematics. ADVSOV. 21: 93–115. doi:10.1090/advsov/021/03. ISBN 9780821802373. S2CID 125829976.
  3. Chen, Li M. (2015). Mathematical problems in data science : theoretical and practical methods. Zhixun Su, Bo Jiang. Cham. pp. 120–124. ISBN 978-3-319-25127-1. OCLC 932464024.
  4. Machine Learning and Knowledge Extraction : First IFIP TC 5, WG 8.4, 8.9, 12.9 International Cross-Domain Conference, CD-MAKE 2017, Reggio, Italy, August 29 - September 1, 2017, Proceedings. Andreas Holzinger, Peter Kieseberg, A. Min Tjoa, Edgar R. Weippl. Cham. 2017. pp. 23–24. ISBN 978-3-319-66808-6. OCLC 1005114370.
  5. Hirata, Akihiko (2016). Structural analysis of metallic glasses with computational homology. Kaname Matsue, Mingwei Chen. Japan. pp. 63–65. ISBN 978-4-431-56056-2. OCLC 946084762.
  6. Moraleda, Rodrigo Rojas (2020). Computational topology for biomedical image and data analysis : theory and applications. Nektarios A. Valous, Wei Xiong, Niels Halama. Boca Raton, FL. ISBN 978-0-429-81099-2. OCLC 1108919429.
  7. Rabadán, Raúl (2020). Topological data analysis for genomics and evolution : topology in biology. Andrew J. Blumberg. Cambridge, United Kingdom. pp. 132–158. ISBN 978-1-316-67166-5. OCLC 1129044889.
  8. Yen, Peter Tsung-Wen; Cheong, Siew Ann (2021). "Using Topological Data Analysis (TDA) and Persistent Homology to Analyze the Stock Markets in Singapore and Taiwan". Frontiers in Physics. 9: 20. Bibcode:2021FrP.....9...20Y. doi:10.3389/fphy.2021.572216. hdl:10356/155402. ISSN 2296-424X.
  9. Edelsbrunner, Herbert (2010). Computational topology : an introduction. J. Harer. Providence, R.I.: American Mathematical Society. pp. 149–153. ISBN 978-0-8218-4925-5. OCLC 427757156.
  10. Nielsen, Frank, ed. (2021). Progress in information geometry : theory and applications. Cham. p. 224. ISBN 978-3-030-65459-7. OCLC 1243544872.
  11. Oudot, Steve Y. (2015). Persistence theory : from quiver representations to data analysis. Providence, Rhode Island. pp. 2–3. ISBN 978-1-4704-2545-6. OCLC 918149730.
  12. Ghrist, Robert (2008). "Barcodes: The persistent topology of data". Bulletin of the American Mathematical Society. 45 (1): 61–75. doi:10.1090/S0273-0979-07-01191-3. ISSN 0273-0979.