Suffix tree clustering

Last updated December 15, 2025

Suffix tree clustering, often abbreviated as STC, is an approach to clustering that uses suffix trees.^[1] The suffix tree keeps track of all n-grams of any length that occur in the word strings inserted into it, while allowing further strings to be inserted incrementally in a linear order. This has the advantage that a large number of clusters can be handled sequentially. A potential disadvantage is that it also increases the number of candidate documents that must be examined when handling large data sets. Suffix tree clusters can be either decompositional or agglomerative in nature, depending on the type of data being handled.^[2]
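
The idea of tracking shared n-grams while inserting documents one at a time can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the cited sources: it uses a plain, uncompressed word-level suffix trie rather than a true suffix tree, it caps phrase length with a hypothetical max_len parameter, and the names Node, insert_document and base_clusters are invented for illustration. Each trie node corresponds to one word n-gram and records which documents contain it; n-grams shared by at least min_docs documents are reported as candidate clusters.

```python
class Node:
    """One trie node: the n-gram spelled out by the path from the root."""

    def __init__(self):
        self.children = {}  # next word -> child Node
        self.docs = set()   # ids of documents containing this n-gram


def insert_document(root, doc_id, text, max_len=4):
    """Insert every word-level suffix of text, truncated to max_len words."""
    words = text.lower().split()
    for start in range(len(words)):  # one suffix per starting position
        node = root
        for word in words[start:start + max_len]:
            node = node.children.setdefault(word, Node())
            node.docs.add(doc_id)


def base_clusters(root, min_docs=2):
    """Return {phrase tuple: set of doc ids} for phrases shared by >= min_docs documents."""
    clusters = {}
    stack = [((), root)]
    while stack:
        phrase, node = stack.pop()
        if phrase and len(node.docs) >= min_docs:
            clusters[phrase] = set(node.docs)
        for word, child in node.children.items():
            stack.append((phrase + (word,), child))
    return clusters


# Documents are added incrementally, one after another.
root = Node()
for doc_id, text in enumerate(["cat ate cheese",
                               "mouse ate cheese too",
                               "cat ate mouse too"]):
    insert_document(root, doc_id, text)

clusters = base_clusters(root)
for phrase, docs in sorted(clusters.items()):
    print(" ".join(phrase), sorted(docs))
```

In this toy run the phrase "ate cheese" groups the first two documents, while "mouse too" occurs in only one document and is therefore not reported. A fuller implementation would typically go on to merge overlapping candidate clusters; that step is left out of this sketch.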

References

[1] Branson, Steve; Greenberg, Ari. "Clustering Web Search Results Using Suffix Tree Methods, CS276A Final Project" (PDF). www.stanford.edu. Stanford University. Retrieved 2 January 2015.
[2] Davis, Ernest. "Lecture 4: Clustering". www.cs.nyu.edu. New York University. Retrieved 2 January 2015.

