Graphical time warping

Last updated

Graphical time warping (GTW) is a framework for jointly aligning multiple pairs of time series or sequences. [1] GTW considers both the alignment accuracy of each sequence pair and the similarity among pairs. On contrary, alignment with dynamic time warping (DTW) considers the pairs independently and minimizes only the distance between the two sequences in a given pair. Therefore, GTW generalizes DTW and could achieve a better alignment performance when similarity among pairs is expected.

Contents

One application of GTW is signal propagation analysis in time-lapse bio-imaging data, where the propagation patterns in adjacent pixels are generally similar. Other applications include signature identification, binocular stereo depth calculation, and liquid chromatography–mass spectrometry (LC-MS) profile alignment in proteomics data analysis. [2] Indeed, as long as the data are structured with inter-dependent time series/sequences, they can be analyzed with GTW.

GTW is able to model constraints or similarities between warping paths by transforming the DTW-equivalent shortest path problem to the maximum flow problem in the dual graph, which can be solved by most max-flow algorithms. However, when the data is large, these algorithms become time-consuming and the memory usage is high. An efficient algorithm, Bidirectional pushing with Linear Component Operations (BILCO), [3] was developed to solve the GTW problem. It could achieve an average 10-fold improvement in both computational and memory usage compared with the state of art generic maximum flow algorithms in GTW applications.

Joint alignment and GTW formulation

Joint alignment

Assume there are pairs of time series , and each pair has a corresponding warping path . Some pairs of warping paths are known to be similar, and the set of all such pairs is denoted as . For example, if is in this set, warping paths and are similar. To optimize both the similarity between the aligned time series and the warping paths distances, the joint alignment problem is formulated as a minimization problem:

Here denotes the distance between and after alignment with warping function , is the distance between warping paths and defined by the area of the region bounded by and , and is a hyperparameter balancing the time series alignment cost term and warping function distance term.

Notice that the similarity strength can be application-specific or user-designed. For different related warping paths pair , we can set different parameters . For simplicity, here we use the unified hyperparameter .

The above minimization problem is intuitively formulated. However, it is not clear how to efficiently solve it in its original form, and a naïve enumeration of the warping paths leads to an NP-hard problem.

GTW formulation

GTW graph construction. The first row demonstrates the process for constructing GTW subgraph from one time-series pair, where the min-cut of GTW subgraph is dual to the shortest path of DTW graph, as well as the optimal warping path. The second row shows the process of adding cross edges between related time-series pairs, where the cross edges are colored green. The area size between adjacent warping paths is proportional to the number of edges cut among cross edges. Figure for Graphical time warping graph.png
GTW graph construction. The first row demonstrates the process for constructing GTW subgraph from one time-series pair, where the min-cut of GTW subgraph is dual to the shortest path of DTW graph, as well as the optimal warping path. The second row shows the process of adding cross edges between related time-series pairs, where the cross edges are colored green. The area size between adjacent warping paths is proportional to the number of edges cut among cross edges.

This minimization problem can be reformulated into a minimum cut problem on a special graph termed GTW graph, where the minimum cut and the warping paths are equivalent. [1] The formulation could be described as:

Explanation of the equivalence

Each GTW subgraph is the dual graph of the DTW graph representing the alignment of a single time-series pair. As a result, the cut within a GTW subgraph is dual to a warping path in DTW graph, and the profile alignment cost term can be represented by the cut cost within subgraphs. The infinite capacities of reverse edges are used to guarantee the monotonicity and continuity of warping paths.

Cross edges constrain the similarity of warping paths and contribute to the distance term in the objective function. Notice that in a minimum cut problem, the nodes would eventually be assigned to the source side or the sink side, and the final cut is defined by the edges between two sides. Each pair of mismatched nodes in and contribute to the distance between and and would result in an extra cost. Thus, the distance term could be represented by the cut cost in cross edges.

Therefore, the cut cost in the GTW graph corresponds to the cost terms in the objective function. Recalling the fact that the cut within each subgraph corresponds to the warping path of one time-series pair, the minimum cut of GTW graph corresponds to the optimal solution of warping paths in the joint alignment.

Extension

Neighbor-wise Compound-specific Graphical Time Warping (ncGTW)

In multiple sequence alignment, the purpose is to align all sequences to a common reference. However, this common reference is usually unknown. In addition, there is also structural information among the sequences. Though GTW cannot be directly applied in these applications, a two-stage framework called ncGTW was built upon GTW to solve this problem. In the first stage, the prior structural knowledge among the sequences is utilized to obtain the warping functions. In the second stage, these warping functions help to jointly align all sequences to a virtual reference, which does not need to be explicitly specified. ncGTW was applied to LC-MS profile alignment problems in proteomics data and performed better than existing approaches. [2]

Efficient algorithm

Bidirectional pushing with Linear Component Operations (BILCO)

Solving the minimum cut problem on GTW graph through traditional maximum flow algorithms would take a long running time and large memory usage due to the large graph size, which limits the usage of GTW. BILCO algorithm utilizes two important properties of the joint alignment problem and achieves an average 10-fold improvement in both running time and memory usage. The two properties are:

The max-flow problem is analogized to pumping water from connected water tanks. Each GTW subgraph is a water tank and the flow is just the water flow. The algorithm iteratively operates Drain operations (drain water from water tank) and Discharge operations (exchange water between adjacent water tanks) until reaching the max-flow. Diagram for BILCO algorithm.png
The max-flow problem is analogized to pumping water from connected water tanks. Each GTW subgraph is a water tank and the flow is just the water flow. The algorithm iteratively operates Drain operations (drain water from water tank) and Discharge operations (exchange water between adjacent water tanks) until reaching the max-flow.

According to the first property, BILCO divides the flow exchange into two types: (1) Flow exchange within GTW subgraph; (2) Flow exchange across related GTW subgraphs. The process can be analogized to the process of pumping water from connected water tanks, and two types of flow exchange are termed Drain and Discharge. To fully utilize such property, components (each component is a connected subset of GTW subgraph), rather than single nodes, are used as the operation unit. Both Drain and Discharge component operations can be implemented in linear time.

The second property inspires the bidirectional-pushing strategy. In this strategy, BILCO first segments the graph into two parts using the initial approximate solution, and then pushes excess/deficit in obtained sink/source parts, respectively. Compared with existing push-relabel-based maximum flow algorithms, BILCO significantly reduces redundant computation. It is worth noting that such a strategy could be also utilized to help accelerate other push-relabel-based algorithms. [3]

Applications

Signal propagation analysis

In time-lapse bio-imaging data, signal propagation is a widely observed phenomenon in many cell types. [4] Studying signal propagation may help uncover the function of these cells in both normal and pathological conditions. The propagation information could be derived from the warping paths by aligning pixels’ curves with a reference signal. Due to the low signal-to-noise ratio in bio-imaging data, pairwise alignment methods usually lead to unsatisfactory results. Considering the spatial correlation of the signals, the similarity of warping paths between adjacent pixels can be utilized in GTW to enhance the alignment performance, which may lead to a more accurate calculation of propagation properties.

Depth extraction

In binocular stereo images, alignment technique can be used to extract depth information. [5] The depth could be derived by the disparity of the same row between left image and right image. Since the depth of adjacent rows should be similar, GTW could be utilized to enhance the extraction result.

Signature identification

A signature usually contains multiple feature sequences, such as the x location, the y location, and the pressure. [6] Those feature sequences are correlated, which indicates that when comparing two signatures, the distance measure obtained by pairwise alignment is not optimal. GTW could take the dependency between features into account and provide a better distance measure.

Biological sequence alignment

In a biological sequence data set, it is common that there is some structural information among the sequences. In LC-MS data, the samples of nearby profiles tend to have similar patterns of distortion and GTW is extended to jointly align these profiles. The same technique may also be applied to the joint alignment of other sequences. Structural information between sequences also exists in DNA and amino acids data. For example, the sequences between related species are more similar compared with sequences from more remotely related species. This information could be utilized by GTW.

See also

Related Research Articles

In computer science, the Floyd–Warshall algorithm is an algorithm for finding shortest paths in a directed weighted graph with positive or negative edge weights. A single execution of the algorithm will find the lengths of shortest paths between all pairs of vertices. Although it does not return details of the paths themselves, it is possible to reconstruct the paths with simple modifications to the algorithm. Versions of the algorithm can also be used for finding the transitive closure of a relation , or widest paths between all pairs of vertices in a weighted graph.

<span class="mw-page-title-main">Component (graph theory)</span> Maximal subgraph whose vertices can reach each other

In graph theory, a component of an undirected graph is a connected subgraph that is not part of any larger connected subgraph. The components of any graph partition its vertices into disjoint sets, and are the induced subgraphs of those sets. A graph that is itself connected has exactly one component, consisting of the whole graph. Components are sometimes called connected components.

In graph theory, a branch of mathematics, the (binary) cycle space of an undirected graph is the set of its even-degree subgraphs.

<span class="mw-page-title-main">Dynamic time warping</span> An algorithm for measuring similarity between two temporal sequences, which may vary in speed

In time series analysis, dynamic time warping (DTW) is an algorithm for measuring similarity between two temporal sequences, which may vary in speed. For instance, similarities in walking could be detected using DTW, even if one person was walking faster than the other, or if there were accelerations and decelerations during the course of an observation. DTW has been applied to temporal sequences of video, audio, and graphics data — indeed, any data that can be turned into a one-dimensional sequence can be analyzed with DTW. A well-known application has been automatic speech recognition, to cope with different speaking speeds. Other applications include speaker recognition and online signature recognition. It can also be used in partial shape matching applications.

<span class="mw-page-title-main">Strongly connected component</span> Partition of a graph whose components are reachable from all vertices

In the mathematical theory of directed graphs, a graph is said to be strongly connected if every vertex is reachable from every other vertex. The strongly connected components of an arbitrary directed graph form a partition into subgraphs that are themselves strongly connected. It is possible to test the strong connectivity of a graph, or to find its strongly connected components, in linear time (that is, Θ(V + E)).

In mathematics, computer science and especially graph theory, a distance matrix is a square matrix containing the distances, taken pairwise, between the elements of a set. Depending upon the application involved, the distance being used to define this matrix may or may not be a metric. If there are N elements, this matrix will have size N×N. In graph-theoretic applications, the elements are more often referred to as points, nodes or vertices.

In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering. For instance, the vertices of the graph may represent tasks to be performed, and the edges may represent constraints that one task must be performed before another; in this application, a topological ordering is just a valid sequence for the tasks. Precisely, a topological sort is a graph traversal in which each node v is visited only after all its dependencies are visited. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG). Any DAG has at least one topological ordering, and algorithms are known for constructing a topological ordering of any DAG in linear time. Topological sorting has many applications especially in ranking problems such as feedback arc set. Topological sorting is possible even when the DAG has disconnected components.

<span class="mw-page-title-main">Needleman–Wunsch algorithm</span> Method for aligning biological sequences

The Needleman–Wunsch algorithm is an algorithm used in bioinformatics to align protein or nucleotide sequences. It was one of the first applications of dynamic programming to compare biological sequences. The algorithm was developed by Saul B. Needleman and Christian D. Wunsch and published in 1970. The algorithm essentially divides a large problem into a series of smaller problems, and it uses the solutions to the smaller problems to find an optimal solution to the larger problem. It is also sometimes referred to as the optimal matching algorithm and the global alignment technique. The Needleman–Wunsch algorithm is still widely used for optimal global alignment, particularly when the quality of the global alignment is of the utmost importance. The algorithm assigns a score to every possible alignment, and the purpose of the algorithm is to find all possible alignments having the highest score.

<span class="mw-page-title-main">Connectivity (graph theory)</span> Basic concept of graph theory

In mathematics and computer science, connectivity is one of the basic concepts of graph theory: it asks for the minimum number of elements that need to be removed to separate the remaining nodes into two or more isolated subgraphs. It is closely related to the theory of network flow problems. The connectivity of a graph is an important measure of its resilience as a network.

<span class="mw-page-title-main">Feedback arc set</span> Edges that hit all cycles in a graph

In graph theory and graph algorithms, a feedback arc set or feedback edge set in a directed graph is a subset of the edges of the graph that contains at least one edge out of every cycle in the graph. Removing these edges from the graph breaks all of the cycles, producing a directed acyclic graph, an acyclic subgraph of the given graph. The feedback arc set with the fewest possible edges is the minimum feedback arc set and its removal leaves the maximum acyclic subgraph; weighted versions of these optimization problems are also used. If a feedback arc set is minimal, meaning that removing any edge from it produces a subset that is not a feedback arc set, then it has an additional property: reversing all of its edges, rather than removing them, produces a directed acyclic graph.

In computational phylogenetics, tree alignment is a computational problem concerned with producing multiple sequence alignments, or alignments of three or more sequences of DNA, RNA, or protein. Sequences are arranged into a phylogenetic tree, modeling the evolutionary relationships between species or taxa. The edit distances between sequences are calculated for each of the tree's internal vertices, such that the sum of all edit distances within the tree is minimized. Tree alignment can be accomplished using one of several algorithms with various trade-offs between manageable tree size and computational effort.

In graph theory, a connected graph is k-edge-connected if it remains connected whenever fewer than k edges are removed.

In computer science, the Hopcroft–Karp algorithm is an algorithm that takes a bipartite graph as input and produces a maximum-cardinality matching as output — a set of as many edges as possible with the property that no two edges share an endpoint. It runs in time in the worst case, where is set of edges in the graph, is set of vertices of the graph, and it is assumed that . In the case of dense graphs the time bound becomes , and for sparse random graphs it runs in time with high probability.

In graph theory, the planar separator theorem is a form of isoperimetric inequality for planar graphs, that states that any planar graph can be split into smaller pieces by removing a small number of vertices. Specifically, the removal of vertices from an n-vertex graph can partition the graph into disjoint subgraphs each of which has at most vertices.

In computer science, a property testing algorithm for a decision problem is an algorithm whose query complexity to its input is much smaller than the instance size of the problem. Typically property testing algorithms are used to distinguish if some combinatorial structure S satisfies some property P, or is "far" from having this property, using only a small number of "local" queries to the object.

In computer science and graph theory, the term color-coding refers to an algorithmic technique which is useful in the discovery of network motifs. For example, it can be used to detect a simple path of length k in a given graph. The traditional color-coding algorithm is probabilistic, but it can be derandomized without much overhead in the running time.

<span class="mw-page-title-main">Cycle basis</span> Cycles in a graph that generate all cycles

In graph theory, a branch of mathematics, a cycle basis of an undirected graph is a set of simple cycles that forms a basis of the cycle space of the graph. That is, it is a minimal set of cycles that allows every even-degree subgraph to be expressed as a symmetric difference of basis cycles.

In theoretical computer science and network routing, Suurballe's algorithm is an algorithm for finding two disjoint paths in a nonnegatively-weighted directed graph, so that both paths connect the same pair of vertices and have minimum total length. The algorithm was conceived by John W. Suurballe and published in 1974. The main idea of Suurballe's algorithm is to use Dijkstra's algorithm to find one path, to modify the weights of the graph edges, and then to run Dijkstra's algorithm a second time. The output of the algorithm is formed by combining these two paths, discarding edges that are traversed in opposite directions by the paths, and using the remaining edges to form the two paths to return as the output. The modification to the weights is similar to the weight modification in Johnson's algorithm, and preserves the non-negativity of the weights while allowing the second instance of Dijkstra's algorithm to find the correct second path.

In network theory, the Wiener connector is a means of maximizing efficiency in connecting specified "query vertices" in a network. Given a connected, undirected graph and a set of query vertices in a graph, the minimum Wiener connector is an induced subgraph that connects the query vertices and minimizes the sum of shortest path distances among all pairs of vertices in the subgraph. In combinatorial optimization, the minimum Wiener connector problem is the problem of finding the minimum Wiener connector. It can be thought of as a version of the classic Steiner tree problem, where instead of minimizing the size of the tree, the objective is to minimize the distances in the subgraph.

<span class="mw-page-title-main">Matroid parity problem</span> Largest independent set of paired elements

In combinatorial optimization, the matroid parity problem is a problem of finding the largest independent set of paired elements in a matroid. The problem was formulated by Lawler (1976) as a common generalization of graph matching and matroid intersection. It is also known as polymatroid matching, or the matchoid problem.

References

  1. 1 2 Wang, Yizhi; Miller, David J; Poskanzer, Kira; Wang, Yue; Tian, Lin; Yu, Guoqiang (2016). "Graphical Time Warping for Joint Alignment of Multiple Curves". Advances in Neural Information Processing Systems. Curran Associates, Inc. 29.
  2. 1 2 Wu, Chiung-Ting; Wang, Yizhi; Wang, Yinxue; Ebbels, Timothy; Karaman, Ibrahim; Graça, Gonçalo; Pinto, Rui; Herrington, David M; Wang, Yue; Yu, Guoqiang (1 May 2020). "Targeted realignment of LC-MS profiles by neighbor-wise compound-specific graphical time warping with misalignment detection". Bioinformatics. 36 (9): 2862–2871. doi:10.1093/bioinformatics/btaa037. PMC   7203744 . PMID   31950989.
  3. 1 2 Mi, Xuelong; Wang, Mengfan; Chen, Alex; Lim, Jing-Xuan; Wang, Yizhi; Ahrens, Misha B.; Yu, Guoqiang (6 December 2022). "BILCO: An Efficient Algorithm for Joint Alignment of Time Series". Advances in Neural Information Processing Systems. 35: 36270–36281.
  4. Wang, Yizhi; DelRosso, Nicole V.; Vaidyanathan, Trisha V.; Cahill, Michelle K.; Reitman, Michael E.; Pittolo, Silvia; Mi, Xuelong; Yu, Guoqiang; Poskanzer, Kira E. (November 2019). "Accurate quantification of astrocyte and neurotransmitter fluorescence dynamics for single-cell and population-level physiology". Nature Neuroscience. 22 (11): 1936–1944. doi:10.1038/s41593-019-0492-2. ISSN   1546-1726. PMC   6858541 . PMID   31570865.
  5. Ishikawa, Hiroshi; Geiger, Davi (1998). "Occlusions, discontinuities, and epipolar lines in stereo". Computer Vision — ECCV'98. Lecture Notes in Computer Science. Vol. 1406. Springer. pp. 232–248. doi:10.1007/BFb0055670. ISBN   978-3-540-64569-6.
  6. Okawa, Manabu (2019). "Template Matching Using Time-Series Averaging and DTW With Dependent Warping for Online Signature Verification". IEEE Access. 7: 81010–81019. doi: 10.1109/ACCESS.2019.2923093 . S2CID   195774867.