Topological deep learning


Topological deep learning (TDL) [1] [2] [3] is a field at the intersection of topology and deep learning that offers approaches for analyzing and learning from data supported on topological spaces, leveraging the principles of topology to understand and process such data. [4] [5] [6]


Motivation

Conventional deep learning often operates under the assumption that the data under examination reside in a linear vector space and can be effectively characterized using feature vectors. However, there is a growing recognition that this conventional perspective may be inadequate for describing various real-world datasets. For instance, molecules are more aptly represented as graphs rather than feature vectors. Similarly, three-dimensional objects, such as those encountered in computer graphics and geometry processing, are better represented as meshes. Additionally, data originating from social networks, where actors are interconnected in complex ways, defy simple vector-based descriptions. Consequently, there has been a surge in interest in integrating concepts from topology into traditional deep learning frameworks to gain a more accurate structural representation of the underlying data. [1]

An introduction to topological domains

One of the core concepts in topological deep learning is the domain upon which data is defined and supported. In the case of Euclidean data, such as images, this domain is a grid, upon which the pixel values of the image are supported. In a more general setting this domain might be a topological domain. Next, we introduce the most common topological domains that are encountered in a deep learning setting. These domains include, but are not limited to, graphs, simplicial complexes, cell complexes, combinatorial complexes, and hypergraphs.

Given a finite set S of abstract entities, a neighborhood function on S is an assignment that attaches to every point in S a subset of S or a relation. Such a function can be induced by equipping S with an auxiliary structure. Edges provide one way of defining relations among the entities of S. More specifically, edges in a graph allow one to define the notion of neighborhood using, for instance, the one-hop neighborhood notion. Edges are, however, limited in their modeling capacity, as they can only be used to model binary relations among entities of S, since every edge typically connects exactly two entities. In many applications, it is desirable to permit relations that incorporate more than two entities. The idea of using relations that involve more than two entities is central to topological domains. Such higher-order relations allow for a broader range of neighborhood functions to be defined on S to capture multi-way interactions among entities of S.
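
As an illustration, the following sketch contrasts a one-hop neighborhood induced by edges with a neighborhood induced by higher-order relations; the helper names and the small example set are hypothetical, chosen only to make the distinction concrete.

```python
# A minimal sketch: entities are integers, relations are sets of entities.
# With only edges (binary relations), a neighborhood function can be the
# one-hop neighborhood; a higher-order relation yields a neighborhood that
# captures a multi-way interaction within a single relation.

def one_hop_neighborhood(s, edges):
    """Entities sharing an edge (binary relation) with s."""
    return {v for e in edges if s in e for v in e if v != s}

def relation_neighborhood(s, relations):
    """Entities co-occurring with s in any relation, possibly involving more than two entities."""
    return {v for r in relations if s in r for v in r if v != s}

S = {1, 2, 3, 4}
edges = [{1, 2}, {2, 3}]          # binary relations only
relations = [{1, 2, 3}, {3, 4}]   # one ternary relation and one binary relation

print(one_hop_neighborhood(2, edges))       # {1, 3}
print(relation_neighborhood(1, relations))  # {2, 3}, via the single 3-way relation
```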

Next we review the main properties, advantages, and disadvantages of some commonly studied topological domains in the context of deep learning, including (abstract) simplicial complexes, regular cell complexes, hypergraphs, and combinatorial complexes.

(a): A set S is made up of basic parts (vertices) without any connections. (b): A graph represents simple connections between its parts (vertices) that are elements of S. (c): A simplicial complex shows a way parts (relations) are connected to each other, but with strict rules about how they're connected. (d): Like simplicial complexes, a cell complex shows how parts (relations) are connected, but it's more flexible in how they're shaped (like 'cells'). (f): A hypergraph shows any kind of connections between parts of S, but these connections aren't organized in any particular order. (e): A CC mixes elements from cell complexes (connections with order) and hypergraphs (varied connections), covering both kinds of setups.

Comparisons among topological domains

Each of the enumerated topological domains has its own characteristics, advantages, and limitations, which are discussed below.

Hierarchical structure and set-type relations

The properties of simplicial complexes, cell complexes, and hypergraphs give rise to two main features of relations on higher-order domains, namely hierarchies of relations and set-type relations. [1]

Rank function

A rank function on a higher-order domain X is an order-preserving function rk : X → ℤ≥0, where rk(x) attaches a non-negative integer value to each relation x in X, preserving set inclusion in X. Cell and simplicial complexes are common examples of higher-order domains equipped with rank functions and therefore with hierarchies of relations. [1]
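
As an example, on an abstract simplicial complex the dimension map rk(x) = |x| − 1 is a rank function. The following sketch, which uses a hypothetical frozenset representation of cells, checks the order-preserving property on a small complex.

```python
# A minimal sketch, assuming cells of an abstract simplicial complex are
# represented as frozensets of vertices. The dimension map is a rank function:
# if x is a subset of y, then |x| - 1 <= |y| - 1, so set inclusion is preserved.

def rank(cell: frozenset) -> int:
    """Rank of an abstract simplex: its dimension, i.e. cardinality minus one."""
    return len(cell) - 1

# A filled triangle {1, 2, 3} together with all of its faces.
cells = [frozenset(s) for s in
         [{1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}]]

# Order preservation: x ⊆ y implies rank(x) <= rank(y).
assert all(rank(x) <= rank(y) for x in cells for y in cells if x <= y)
print({tuple(sorted(c)): rank(c) for c in cells})
```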

Set-type relations

Relations in a higher-order domain are called set-type relations if the existence of a relation is not implied by another relation in the domain. Hypergraphs constitute examples of higher-order domains equipped with set-type relations. Given the modeling limitations of simplicial complexes, cell complexes, and hypergraphs, the combinatorial complex was introduced as a higher-order domain that features both hierarchies of relations and set-type relations. [1]
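
To make the contrast concrete, the sketch below (using a hypothetical frozenset encoding of relations) checks whether a collection of relations is downward closed: a simplicial complex must contain every face of each of its relations, whereas a hypergraph may contain a set-type relation without any of its subsets.

```python
# A minimal sketch, assuming relations are frozensets of entities.
# Downward closure distinguishes hierarchical relations (simplicial complexes)
# from set-type relations (hypergraphs).

from itertools import combinations

def is_downward_closed(relations: set) -> bool:
    """True if every nonempty proper subset of each relation is also a relation."""
    return all(
        frozenset(face) in relations
        for rel in relations
        for k in range(1, len(rel))
        for face in combinations(rel, k)
    )

simplicial = {frozenset(s) for s in
              [{1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}]}
hypergraph = {frozenset(s) for s in [{1}, {2}, {3}, {1, 2, 3}]}

print(is_downward_closed(simplicial))  # True: a hierarchy of relations
print(is_downward_closed(hypergraph))  # False: {1, 2, 3} is a set-type relation
```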

Learning on topological spaces

Learning tasks on topological domains can be broadly classified into three categories: cell classification, cell prediction, and complex classification.

The learning tasks in TDL can be broadly classified into three categories: [1] cell classification, where a target label is predicted for each individual cell of a complex; cell prediction, where the existence or properties of cells (relations) are predicted; and complex classification, where a target label is predicted for a complex as a whole.
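
A minimal sketch of how these categories differ in the granularity of the prediction, using illustrative tensor shapes (the dimensions and layers below are assumptions for illustration, not taken from the cited works):

```python
# A minimal sketch contrasting the output granularity of the three task types.
import torch

n_cells, d, n_classes = 10, 16, 3
h = torch.randn(n_cells, d)  # features supported on the cells of one complex

# Cell classification: one prediction per cell of the complex.
cell_logits = torch.nn.Linear(d, n_classes)(h)                 # shape (n_cells, n_classes)

# Cell (relation) prediction: score a candidate relation from the cells involved.
candidate_score = torch.sigmoid(h[0] @ h[1])                   # scalar score for one pair

# Complex classification: pool over all cells to predict one label per complex.
complex_logits = torch.nn.Linear(d, n_classes)(h.mean(dim=0))  # shape (n_classes,)
```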

In practice, to perform the aforementioned tasks, deep learning models designed for specific topological spaces must be constructed and implemented. These models, known as topological neural networks, are tailored to operate effectively within these spaces.

Topological neural networks

Central to TDL are topological neural networks (TNNs), specialized architectures designed to operate on data structured in topological domains. [2] [1] Unlike traditional neural networks tailored for grid-like structures, TNNs are adept at handling more intricate data representations, such as graphs, simplicial complexes, and cell complexes. By harnessing the inherent topology of the data, TNNs can capture both local and global relationships, enabling nuanced analysis and interpretation.

Message passing topological neural networks

In a general topological domain, higher-order message passing involves exchanging messages among entities and cells using a set of neighborhood functions.

Definition: Higher-Order Message Passing on a General Topological Domain

Higher-order message passing is a deep learning model defined on a topological domain that relies on passing messages among entities in the underlying domain in order to perform a learning task.

Let X be a topological domain. We define a set of neighborhood functions N = {N_1, …, N_n} on X. Consider a cell x and let y ∈ N_k(x) for some N_k ∈ N. A message m_{x,y} between cells x and y is a computation dependent on these two cells or the data supported on them. Denote N(x) as the multi-set {{N_1(x), …, N_n(x)}}, and let h_x^(l) represent some data supported on cell x at layer l. Higher-order message passing on X, [1] [7] induced by N, is defined by the following four update rules:

  1. m_{x,y} = α_{N_k}(h_x^(l), h_y^(l)), the message between cells x and y computed with respect to the neighborhood function N_k.
  2. m_x^k = ⊕_{y ∈ N_k(x)} m_{x,y}, where ⊕ is the intra-neighborhood aggregation function.
  3. m_x = ⊗_{N_k ∈ N} m_x^k, where ⊗ is the inter-neighborhood aggregation function.
  4. h_x^(l+1) = β(h_x^(l), m_x), where α_{N_k} and β are differentiable functions.

Some remarks on the definition above are as follows.

First, Equation 1 describes how messages are computed between cells x and y. The message m_{x,y} is influenced by both the data h_x^(l) and h_y^(l) associated with cells x and y, respectively. Additionally, it incorporates characteristics specific to the cells themselves, such as orientation in the case of cell complexes. This allows for a richer representation of spatial relationships compared to traditional graph-based message passing frameworks.

Second, Equation 2 defines how messages from neighboring cells are aggregated within each neighborhood. The function ⊕ aggregates these messages, allowing information to be exchanged effectively between adjacent cells within the same neighborhood.

Third, Equation 3 outlines the process of combining messages from different neighborhoods. The function ⊗ aggregates messages across various neighborhoods, facilitating communication between cells that may not be directly connected but share common neighborhood relationships.

Fourth, Equation 4 specifies how the aggregated messages influence the state of a cell in the next layer. Here, the function β updates the state of cell x based on its current state h_x^(l) and the aggregated message m_x obtained from neighboring cells.
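
The following is a minimal sketch of one higher-order message passing layer implementing the four rules, assuming cell states are stored in a dict of vectors and each neighborhood function is given as a dict from a cell to its neighbors. The class name, the choice of mean and sum as aggregations, and the linear message and update functions are illustrative assumptions, not a prescribed architecture.

```python
# A minimal sketch of rules 1-4, not a reference implementation.
import torch
import torch.nn as nn

class HigherOrderMessagePassingLayer(nn.Module):
    def __init__(self, dim: int, num_neighborhoods: int):
        super().__init__()
        # One learnable message function alpha_k per neighborhood function N_k (rule 1).
        self.alphas = nn.ModuleList(
            [nn.Linear(2 * dim, dim) for _ in range(num_neighborhoods)]
        )
        self.beta = nn.Linear(2 * dim, dim)  # update function beta (rule 4)

    def forward(self, h: dict, neighborhoods: list) -> dict:
        h_next = {}
        for x, h_x in h.items():
            per_neighborhood = []
            for alpha_k, N_k in zip(self.alphas, neighborhoods):
                # Rule 1: messages m_{x,y} for every y in N_k(x).
                msgs = [alpha_k(torch.cat([h_x, h[y]])) for y in N_k.get(x, [])]
                if msgs:
                    # Rule 2: intra-neighborhood aggregation (here: mean).
                    per_neighborhood.append(torch.stack(msgs).mean(dim=0))
            # Rule 3: inter-neighborhood aggregation (here: sum).
            m_x = (torch.stack(per_neighborhood).sum(dim=0)
                   if per_neighborhood else torch.zeros_like(h_x))
            # Rule 4: update the state of cell x from its current state and m_x.
            h_next[x] = torch.relu(self.beta(torch.cat([h_x, m_x])))
        return h_next

# Example: a tiny complex with two neighborhood functions (vertex adjacency and
# vertex-edge incidence), using random 8-dimensional cell states.
cells = ["v0", "v1", "e01"]
h = {c: torch.randn(8) for c in cells}
adjacency = {"v0": ["v1"], "v1": ["v0"]}
incidence = {"v0": ["e01"], "v1": ["e01"], "e01": ["v0", "v1"]}
layer = HigherOrderMessagePassingLayer(dim=8, num_neighborhoods=2)
h_new = layer(h, [adjacency, incidence])
```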

Non-message passing topological neural networks

While the majority of TNNs follow the message passing paradigm, a few models have been suggested that do not follow this paradigm. For instance, the model in [8] leverages geometric information from simplicial complexes embedded in multidimensional spaces using node coordinates. This offers interpretability and geometric consistency without relying on message passing. Furthermore, in [9] a contrastive loss-based method was suggested to learn simplicial representations.

Applications

TDL is rapidly finding new applications across various fields, including data compression, [10] enhancing the expressive capacity of graph neural networks, [11] action recognition, [12] and trajectory prediction. [13]


References

  1. Hajij, M.; Zamzmi, G.; Papamarkou, T.; Miolane, N.; Guzmán-Sáenz, A.; Ramamurthy, K. N.; Schaub, M. T. (2022), Topological deep learning: Going beyond graph data, arXiv: 2206.00606
  2. Papillon, M.; Sanborn, S.; Hajij, M.; Miolane, N. (2023), Architectures of topological deep learning: A survey on topological neural networks, arXiv: 2304.10031
  3. Ebli, S.; Defferrard, M.; Spreemann, G. (2020), Simplicial neural networks, arXiv: 2010.03633
  4. Battiloro, C.; Testa, L.; Giusti, L.; Sardellitti, S.; Di Lorenzo, P.; Barbarossa, S. (2023), Generalized simplicial attention neural networks, arXiv: 2309.02138
  5. Yang, M.; Isufi, E. (2023), Convolutional learning on simplicial complexes, arXiv: 2301.11163
  6. Chen, Y.; Gel, Y. R.; Poor, H. V. (2022), "BScNets: Block simplicial complex neural networks", Proceedings of the AAAI Conference on Artificial Intelligence, 36 (6): 6333–6341, doi:10.1609/aaai.v36i6.20583
  7. Hajij, M.; Istvan, K.; Zamzmi, G. (2020), Cell complex neural networks, arXiv: 2010.00743
  8. Maggs, K.; Hacker, C.; Rieck, B. (2023), Simplicial Representation Learning with Neural k-forms, arXiv: 2312.08515
  9. Ramamurthy, K. N.; Guzmán-Sáenz, A.; Hajij, M. (2023), Topo-mlp: A simplicial network without message passing, pp. 1–5
  10. Battiloro, C.; Di Lorenzo, P.; Ribeiro, A. (September 2023), Parametric dictionary learning for topological signal representation, IEEE, pp. 1958–1962
  11. Bodnar, C.; Frasca, F.; Wang, Y.; Otter, N.; Montufar, G. F.; Lio, P.; Bronstein, M. (July 2021), Weisfeiler and lehman go topological: Message passing simplicial networks, PMLR, arXiv: 2103.03212
  12. Wang, C.; Ma, N.; Wu, Z.; Zhang, J.; Yao, Y. (August 2022), Survey of Hypergraph Neural Networks and Its Application to Action Recognition, Springer Nature Switzerland, pp. 387–398
  13. Roddenberry, T. M.; Glaze, N.; Segarra, S. (July 2021), Principled simplicial neural networks for trajectory prediction, PMLR, pp. 9020–9029, arXiv: 2102.10058
