Simplex tree

Last updated
An example of simplicial complex, and the corresponding simplex tree data structure. Notice the two lowest nodes have a path of 4 to the node, indicating the 2 3-dimensional simplexes composed of 4 vertices each. SimplexTree.png
An example of simplicial complex, and the corresponding simplex tree data structure. Notice the two lowest nodes have a path of 4 to the node, indicating the 2 3-dimensional simplexes composed of 4 vertices each.

In topological data analysis, a simplex tree is a type of trie used to represent efficiently any general simplicial complex. Through its nodes, this data structure notably explicitly represents all the simplices. Its flexible structure allows the implementation of many basic operations useful to computing persistent homology. This data structure was invented by Jean-Daniel Boissonnat and Clément Maria in 2014, in the article The Simplex Tree: An Efficient Data Structure for General Simplicial Complexes. [1] This data structure offers efficient operations on sparse simplicial complexes. For dense or maximal simplices, Skeleton-Blocker [2] representations or Toplex Map [3] representations are used.

Contents

Definitions

Many researchers in topological data analysis consider the simplex tree to be the most compact simplex-based data structure for simplicial complexes, and a data structure allowing an intuitive understanding of simplicial complexes due to integrated usage of their mathematical properties. [1] [3] [4]

Heuristic definition

Consider any simplicial complex is a set composed of points (0 dimensions), line segments (1 dimension), triangles (2 dimensions), and their n-dimensional counterparts, called n-simplexes within a topological space. By the mathematical properties of simplexes, any n-simplex is composed of multiple -simplexes. Thus, lines are composed of points, triangles of lines, and tetrahedrons of triangles. Notice each higher level adds 1 vertex to the vertices of the n-simplex. The data structure is simplex-based, therefore, it should represent all simplexes uniquely by the points defining the simplex. A simple way to achieve this is to define each simplex by its points in sorted order.

Let be a simplicial complex of dimension k, its vertex set, where vertices are labeled from 1 to and ordered accordingly. Now, construct a dictionary size containing all vertex labels in order. This represents the 0-dimensional simplexes. Then, for the path to the initial dictionary of each entry in the initial dictionary, add as a child dictionary all vertices fully-connected to the current set of vertices, all of which have a label greater than . Represent this step on k levels. Clearly, considering the first dictionary as depth 0, any entry at depth of any dictionary in this data structure uniquely represents a -simplex within . For completeness, the point to the initial dictionary is considered the representation of the empty simplex. For the practicality of the operations, labels that are repeated on the same level are linked together, forming a looped linked list. Finally, child dictionaries also have pointers to their parent dictionary, for fast ancestor access. [1]

Constructive definition

Let be a simplicial complex of dimension k. We begin by decomposing the simplicial complex into mutually exclusive simplexes. This can be achieved in a greedy way by iteratively removing from the simplicial complex the highest order simplexes until the simplicial complex is empty. We then need to label each vertex from 1 to and associate each simplex with its corresponding "word", that is the ordered list of its vertices by label. Ordering the labels ensures no repetition in the simplex tree, as there is only one way to describe a simplex. We start with a null root, representing the null simplex. Then, we iterate through all simplexes, and through each label of each simplex word. If the label is available as a child to the current root, make that child the temporary root of the insertion process, otherwise, create a new node for the child, make it the new temporary root, and continue with the rest of the word. During this process, k dictionaries are maintained with all the labels and insert the address of the node for the corresponding label. If an address is already at that space in the dictionary, a pointer is created from the old node to the new node. Once the process is finished, all children of each node are entered into a dictionary, and all pointers are looped to make looped linked lists. A wide range of dictionaries could be applied here, like hash tables, but some operations assume the possibility of an ordered traversal of the entries, leading most of the implementations to use red-black trees are dictionaries. [1]

Operations

While simplex trees are not the most space efficient data structures for simplicial complex representation, their operations on sparse data are considered state-of-art. Here, we give the bounds of different useful operations possible through this representation. Many implementations of these operations are available. [1] [4] [5] [6] [7]

We first introduce the notation. Consider is a given simplex, is a given node corresponding to the last vertex of , is the label associate to that node, is the depth of that node, is the dimension of the simplicial complex, is the maximal number of operations to access in a dictionary (if the dictionary is a red-black tree, is the complexity) . Consider is the number of cofaces of , and is the number of nodes of the simplex tree ending with the label at depth greater than . Notice .

  1. Search, insert and remove words are done in . [1]
  2. Insert and remove an entire simplex is done in . [1]
  3. Computing persistent homology , or in a more involved way, computing Betti numbers, using a simplex tree most efficiently remains an open problem, however, current algorithms for this task on sparse simplicial complexes achieve state-of-art performance. [4]
  4. The structure of simplex trees allows for elementary collapse of collapsible simplexes, however the bounds of this operation in the general case are unknown. [1] [5] [7]
  5. A subcase of elementary collapse is edge-contraction. Edge contraction can be

achieved in . [1]

  1. Locating cofaces of given simplex can be achieved in . [1]
  2. Locating cofacets of given simplex can be achieved in . [1]

As for construction, as seen in the constructive definition, construction is proportional to the number and complexity of simplexes in the simplicial complex. This can be especially expensive if the simplicial complex is dense. However, some optimizations for particular simplicial complexes, including for Flag complexes, Rips complexes and Witness complexes. [1] [8]

Applications

Simplex trees are efficient in sparse simplicial complexes. For this purpose, many persistent homology algorithms focusing on high-dimensional real data (often sparse) use simplex trees within these algorithms. While simplex trees are not as efficient as incidence matrices, their simplex-based structure allows them to be useful and efficient for simplicial complex storage within persistent homology algorithms. [9]

Related Research Articles

<span class="mw-page-title-main">Simplex</span> Multi-dimensional generalization of triangle

In geometry, a simplex is a generalization of the notion of a triangle or tetrahedron to arbitrary dimensions. The simplex is so-named because it represents the simplest possible polytope in any given dimension. For example,

<span class="mw-page-title-main">Breadth-first search</span> Algorithm to search the nodes of a graph

Breadth-first search (BFS) is an algorithm for searching a tree data structure for a node that satisfies a given property. It starts at the tree root and explores all nodes at the present depth prior to moving on to the nodes at the next depth level. Extra memory, usually a queue, is needed to keep track of the child nodes that were encountered but not yet explored.

In mathematics, the term homology, originally introduced in algebraic topology, has three primary, closely-related usages. The most direct usage of the term is to take the homology of a chain complex, resulting in a sequence of abelian groups called homology groups. This operation, in turn, allows one to associate various named homologies or homology theories to various other types of mathematical objects. Lastly, since there are many homology theories for topological spaces that produce the same answer, one also often speaks of the homology of a topological space. There is also a related notion of the cohomology of a cochain complex, giving rise to various cohomology theories, in addition to the notion of the cohomology of a topological space.

<span class="mw-page-title-main">Simplicial complex</span> Mathematical set

In mathematics, a simplicial complex is a set composed of points, line segments, triangles, and their n-dimensional counterparts. Simplicial complexes should not be confused with the more abstract notion of a simplicial set appearing in modern simplicial homotopy theory. The purely combinatorial counterpart to a simplicial complex is an abstract simplicial complex. To distinguish a simplicial complex from an abstract simplicial complex, the former is often called a geometric simplicial complex.

In algebraic topology, singular homology refers to the study of a certain set of algebraic invariants of a topological space , the so-called homology groups Intuitively, singular homology counts, for each dimension , the -dimensional holes of a space. Singular homology is a particular example of a homology theory, which has now grown to be a rather broad collection of theories. Of the various theories, it is perhaps one of the simpler ones to understand, being built on fairly concrete constructions.

<span class="mw-page-title-main">Barycentric subdivision</span>

In mathematics, the barycentric subdivision is a standard way to subdivide a given simplex into smaller ones. Its extension on simplicial complexes is a canonical method to refine them. Therefore, the barycentric subdivision is an important tool in algebraic topology.

In computing, a persistent data structure or not ephemeral data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not (visibly) update the structure in-place, but instead always yield a new updated structure. The term was introduced in Driscoll, Sarnak, Sleator, and Tarjan's 1986 article.

<span class="mw-page-title-main">Abstract simplicial complex</span> Mathematical object

In combinatorics, an abstract simplicial complex (ASC), often called an abstract complex or just a complex, is a family of sets that is closed under taking subsets, i.e., every subset of a set in the family is also in the family. It is a purely combinatorial description of the geometric notion of a simplicial complex. For example, in a 2-dimensional simplicial complex, the sets in the family are the triangles, their edges, and their vertices.

In mathematics, a simplicial set is an object composed of simplices in a specific way. Simplicial sets are higher-dimensional generalizations of directed graphs, partially ordered sets and categories. Formally, a simplicial set may be defined as a contravariant functor from the simplex category to the category of sets. Simplicial sets were introduced in 1950 by Samuel Eilenberg and Joseph A. Zilber.

In algebraic topology, simplicial homology is the sequence of homology groups of a simplicial complex. It formalizes the idea of the number of holes of a given dimension in the complex. This generalizes the number of connected components.

<span class="mw-page-title-main">Triangulation (topology)</span> Representation of mathematical space

In mathematics, triangulation describes the replacement of topological spaces by piecewise linear spaces, i.e. the choice of a homeomorphism in a suitable simplicial complex. Spaces being homeomorphic to a simplicial complex are called triangulable. Triangulation has various uses in different branches of mathematics, for instance in algebraic topology, in complex analysis or in modeling.

<span class="mw-page-title-main">Centrality</span> Degree of connectedness within a graph

In graph theory and network analysis, indicators of centrality assign numbers or rankings to nodes within a graph corresponding to their network position. Applications include identifying the most influential person(s) in a social network, key infrastructure nodes in the Internet or urban networks, super-spreaders of disease, and brain networks. Centrality concepts were first developed in social network analysis, and many of the terms used to measure centrality reflect their sociological origin.

<span class="mw-page-title-main">Link (simplicial complex)</span>

The link in a simplicial complex is a generalization of the neighborhood of a vertex in a graph. The link of a vertex encodes information about the local structure of the complex at the vertex.

In mathematics, a Δ-set, often called a Δ-complex or a semi-simplicial set, is a combinatorial object that is useful in the construction and triangulation of topological spaces, and also in the computation of related algebraic invariants of such spaces. A Δ-set is somewhat more general than a simplicial complex, yet not quite as sophisticated as a simplicial set. Simplicial sets have additional structure, so that every simplicial set is also a semi-simplicial set.

<span class="mw-page-title-main">Clique complex</span> Abstract simplicial complex describing a graphs cliques

Clique complexes, independence complexes, flag complexes, Whitney complexes and conformal hypergraphs are closely related mathematical objects in graph theory and geometric topology that each describe the cliques of an undirected graph.

In mathematics, a Stanley–Reisner ring, or face ring, is a quotient of a polynomial algebra over a field by a square-free monomial ideal. Such ideals are described more geometrically in terms of finite simplicial complexes. The Stanley–Reisner ring construction is a basic tool within algebraic combinatorics and combinatorial commutative algebra. Its properties were investigated by Richard Stanley, Melvin Hochster, and Gerald Reisner in the early 1970s.

A simplicial map is a function between two simplicial complexes, with the property that the images of the vertices of a simplex always span a simplex. Simplicial maps can be used to approximate continuous functions between topological spaces that can be triangulated; this is formalized by the simplicial approximation theorem.

<span class="mw-page-title-main">Pseudomanifold</span>

In mathematics, a pseudomanifold is a special type of topological space. It looks like a manifold at most of its points, but it may contain singularities. For example, the cone of solutions of forms a pseudomanifold.

Persistent homology is a method for computing topological features of a space at different spatial resolutions. More persistent features are detected over a wide range of spatial scales and are deemed more likely to represent true features of the underlying space rather than artifacts of sampling, noise, or particular choice of parameters.

In topological data analysis, a subdivision bifiltration is a collection of filtered simplicial complexes, typically built upon a set of data points in a metric space, that captures shape and density information about the underlying data set. The subdivision bifiltration relies on a natural filtration of the barycentric subdivision of a simplicial complex by flags of minimum dimension, which encodes density information about the metric space upon which the complex is built. The subdivision bifiltration was first introduced by Donald Sheehy in 2011 as part of his doctoral thesis as a discrete model of the multicover bifiltration, a continuous construction whose underlying framework dates back to the 1970s. In particular, Sheehy applied the construction to both the Vietoris-Rips and Čech filtrations, two common objects in the field of topological data analysis. Whereas single parameter filtrations are not robust with respect to outliers in the data, the subdivision-Rips and -Cech bifiltrations satisfy several desirable stability properties.

References

  1. 1 2 3 4 5 6 7 8 9 10 11 12 Boissonnat, Jean-Daniel; Maria, Clément (November 2014). "The Simplex Tree: an Efficient Data Structure for General Simplicial Complexes". Algorithmica. 70 (3): 406–427. arXiv: 2001.02581 . doi:10.1007/s00453-014-9887-3. ISSN   0178-4617. S2CID   15335393.
  2. Salinas, David (7 February 2020). "Skeleton-Blocker". Geometry Understanding in Higher Dimensions. Retrieved 9 December 2021.
  3. 1 2 Godi, Francois (7 February 2020). "Toplex Map". Geometry Understanding in Higher Dimensions. Retrieved 9 December 2021.
  4. 1 2 3 Boissonnat, Jean-Daniel. "Simplex tree reference manual". Geometry Understanding in Higher Dimensions. Retrieved 9 December 2021.
  5. 1 2 Piekenbrock, Matt (13 September 2020). "simplex_tree: Simplex Tree". rdrr.io. Retrieved 9 December 2021.
  6. Nanda, Vidit. "Perseus, the Persistent Homology Software". The Perseus Software Project for Rapid Computation of Persistent Homology. Retrieved 9 December 2021.
  7. 1 2 Morozov, Dmitriy (2019). "Basics". Dionysus 2. Retrieved 9 December 2021.
  8. Morozov, Dmitriy (2019). "Vietoris–Rips Complexes". Dionysus 2. Retrieved 9 December 2021.
  9. Mandal, Sayan (2020). "Applications of Persistent Homology and Cycles". The Ohio State University. PHD Thesis.