Threaded binary tree

Last updated
A threaded tree, with the special threading links shown by dashed arrows Threaded tree.svg
A threaded tree, with the special threading links shown by dashed arrows

In computing, a threaded binary tree is a binary tree variant that facilitates traversal in a particular order (often the same order already defined for the tree).

Contents

An entire binary search tree can be easily traversed in order of the main key, but given only a pointer to a node, finding the node which comes next may be slow or impossible. For example, leaf nodes by definition have no descendants, so given only a pointer to a leaf node no other node can be reached. A threaded tree adds extra information in some or all nodes, so that for any given single node the "next" node can be found quickly, allowing tree traversal without recursion and the extra storage (proportional to the tree's depth) that recursion requires.

Threading

"A binary tree is threaded by making all right child pointers that would normally be null point to the in-order successor of the node (if it exists), and all left child pointers that would normally be null point to the in-order predecessor of the node." [1]

This assumes the traversal order is the same as in-order traversal of the tree. However, pointers can instead (or in addition) be added to tree nodes, rather than replacing. Linked lists thus defined are also commonly called "threads", and can be used to enable traversal in any order(s) desired. For example, a tree whose nodes represent information about people might be sorted by name, but have extra threads allowing quick traversal in order of birth date, weight, or any other known characteristic.

Motivation

Trees, including (but not limited to) binary search trees, can be used to store items in a particular order, such as the value of some property stored in each node, often called a key. One useful operation on such a tree is traversal: visiting all the items in order of the key.

A simple recursive traversal algorithm that visits each node of a binary search tree is the following. Assume t is a pointer to a node, or nil. "Visiting" t can mean performing any action on the node t or its contents.

Algorithm traverse(t):

  • Input: a pointer t to a node (or nil)
  • If t = nil, return.
  • Else:
    • traverse(left-child(t))
    • Visit t
    • traverse(right-child(t))

One problem with this algorithm is that, because of its recursion, it uses stack space proportional to the height of a tree. If the tree is fairly balanced, this amounts to O(log n) space for a tree containing n elements. In the worst case, when the tree takes the form of a chain, the height of the tree is n so the algorithm takes O(n) space. A second problem is that all traversals must begin at the root when nodes have pointers only to their children. It is common to have a pointer to a particular node, but that is not sufficient to get back to the rest of the tree unless extra information is added, such as thread pointers.

In this approach, it may not be possible to tell whether the left and/or right pointers in a given node actually point to children, or are a consequence of threading. If the distinction is necessary, adding a single bit to each node is enough to record it.

In a 1968 textbook, Donald Knuth asked whether a non-recursive algorithm for in-order traversal exists, that uses no stack and leaves the tree unmodified. [2] One of the solutions to this problem is tree threading, presented by Joseph M. Morris in 1979. [3] [4] In the 1969 follow-up edition, [5] Knuth attributed the threaded tree representation to Perlis and Thornton (1960). [6]

Relation to parent pointers

Another way to achieve similar goals is to include a pointer in every node, to that node's parent node. Given that, the "next" node can always be reached. "right" pointers are still null whenever there are no right children. To find the "next" node from a node whose right pointer is null, walk up through "parent" pointers until reaching a node whose right pointer is not null, and is not the child you just came up from. That node is the "next" node, and after it come its descendants on the right.

It is also possible to discover the parent of a node from a threaded binary tree, without explicit use of parent pointers or a stack, although it is slower. To see this, consider a node k with right child r. Then the left pointer of r must be either a child or a thread back to k. In the case that r has a left child, that left child must in turn have either a left child of its own or a thread back to k, and so on for all successive left children. So by following the chain of left pointers from r, we will eventually find a thread pointing back to k. The situation is symmetrically similar when q is the left child of pwe can follow q's right children to a thread pointing ahead to p.

In Python:

defparent(node):ifnodeisnode.tree.root:returnNonex=nodey=nodewhileTrue:ifis_thread(y):p=y.rightifpisNoneorp.leftisnotnode:p=xwhilenotis_thread(p.left):p=p.leftp=p.leftreturnpelifis_thread(x):p=x.leftifpisNoneorp.rightisnotnode:p=ywhilenotis_thread(p.right):p=p.rightp=p.rightreturnpx=x.lefty=y.right

Types

  1. Single threaded: each node is threaded towards either the in-order predecessor or successor (left or right).
  2. Double threaded: each node is threaded towards both the in-order predecessor and successor (left and right).

The array of in-order traversal

Threads are reference to the predecessors and successors of the node according to an inorder traversal.

In-order traversal of the threaded tree is A,B,C,D,E,F,G,H,I, the predecessor of E is D, the successor of E is F.

ThreadTree Inorder Array.png

Example

ThreadTree Inorder Array123456789.png

Let's make the Threaded Binary tree out of a normal binary tree:

Normal Binary Tree.png

The in-order traversal for the above tree is — D B A E C. So, the respective Threaded Binary tree will be --

Threaded Binary Tree.png

In an m-way threaded binary tree with n nodes, there are n*m - (n-1) void links.

Related Research Articles

<span class="mw-page-title-main">AVL tree</span> Self-balancing binary search tree

In computer science, an AVL tree is a self-balancing binary search tree (BST). It was the first such data structure to be invented. In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases, where is the number of nodes in the tree prior to the operation. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations.

Binary search tree Rooted binary tree data structure

In computer science, a binary search tree (BST), also called an ordered or sorted binary tree, is a rooted binary tree data structure with the key of each internal node being greater than all the keys in the respective node's left subtree and less than the ones in its right subtree. The time complexity of operations on the binary search tree is directly proportional to the height of the tree.

<span class="mw-page-title-main">Binary tree</span> Limited form of tree data structure

In computer science, a binary tree is a tree data structure in which each node has at most two children, which are referred to as the left child and the right child. A recursive definition using just set theory notions is that a (non-empty) binary tree is a tuple, where L and R are binary trees or the empty set and S is a singleton set containing the root. Some authors allow the binary tree to be the empty set as well.

In computer science, a red–black tree is a kind of self-balancing binary search tree. Each node stores an extra bit representing "color", used to ensure that the tree remains balanced during insertions and deletions.

A splay tree is a binary search tree with the additional property that recently accessed elements are quick to access again. Like self-balancing binary search trees, a splay tree performs basic operations such as insertion, look-up and removal in O(log n) amortized time. For random access patterns drawn from a non-uniform random distribution, their amortized time can be faster than logarithmic, proportional to the entropy of the access pattern. For many patterns of non-random operations, also, splay trees can take better than logarithmic time, without requiring advance knowledge of the pattern. According to the unproven dynamic optimality conjecture, their performance on all access patterns is within a constant factor of the best possible performance that could be achieved by any other self-adjusting binary search tree, even one selected to fit that pattern. The splay tree was invented by Daniel Sleator and Robert Tarjan in 1985.

Tree (data structure) Abstract data type simulating a hierarchical tree structure and represented as a set of linked nodes

In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children, but must be connected to exactly one parent, except for the root node, which has no parent. These constraints mean there are no cycles or "loops", and also that each child can be treated like the root node of its own subtree, making recursion a useful technique for tree traversal. In contrast to linear data structures, many trees cannot be represented by relationships between neighboring nodes in a single straight line.

Tree rotation A local change in a binary tree that preserves leaf order

In discrete mathematics, tree rotation is an operation on a binary tree that changes the structure without interfering with the order of the elements. A tree rotation moves one node up in the tree and one node down. It is used to change the shape of the tree, and in particular to decrease its height by moving smaller subtrees down and larger subtrees up, resulting in improved performance of many tree operations.

Trie K-ary search tree data structure

In computer science, a trie, also called digital tree or prefix tree, is a type of k-ary search tree, a tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes defined not by the entire key, but by individual characters. In order to access a key, the trie is traversed depth-first, following the links between nodes, which represent each character in the key.

Treap

In computer science, the treap and the randomized binary search tree are two closely related forms of binary search tree data structures that maintain a dynamic set of ordered keys and allow binary searches among the keys. After any sequence of insertions and deletions of keys, the shape of the tree is a random variable with the same probability distribution as a random binary tree; in particular, with high probability its height is proportional to the logarithm of the number of keys, so that each search, insertion, or deletion operation takes logarithmic time to perform.

In computer science, tree traversal is a form of graph traversal and refers to the process of visiting each node in a tree data structure, exactly once. Such traversals are classified by the order in which the nodes are visited. The following algorithms are described for a binary tree, but they may be generalized to other trees as well.

Radix tree Data structure

In computer science, a radix tree is a data structure that represents a space-optimized trie in which each node that is the only child is merged with its parent. The result is that the number of children of every internal node is at most the radix r of the radix tree, where r is a positive integer and a power x of 2, having x ≥ 1. Unlike regular trees, edges can be labeled with sequences of elements as well as single elements. This makes radix trees much more efficient for small sets and for sets of strings that share long prefixes.

An AA tree in computer science is a form of balanced tree used for storing and retrieving ordered data efficiently. AA trees are named after Arne Andersson, the one who theorized them.

The Day–Stout–Warren (DSW) algorithm is a method for efficiently balancing binary search trees – that is, decreasing their height to O(log n) nodes, where n is the total number of nodes. Unlike a self-balancing binary search tree, it does not do this incrementally during each operation, but periodically, so that its cost can be amortized over many operations. The algorithm was designed by Quentin F. Stout and Bette Warren in a 1986 CACM paper, based on work done by Colin Day in 1976.

<i>m</i>-ary tree

In graph theory, an m-ary tree is a rooted tree in which each node has no more than m children. A binary tree is the special case where m = 2, and a ternary tree is another case with m = 3 that limits its children to three.

In computer science, a leftist tree or leftist heap is a priority queue implemented with a variant of a binary heap. Every node x has an s-value which is the distance to the nearest leaf in subtree rooted at x. In contrast to a binary heap, a leftist tree attempts to be very unbalanced. In addition to the heap property, leftist trees are maintained so the right descendant of each node has the lower s-value.

Recursion (computer science) Use of functions that call themselves

In computer science, recursion is a method of solving a computational problem where the solution depends on solutions to smaller instances of the same problem. Recursion solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science.

The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions.

A link/cut tree is a data structure for representing a forest, a set of rooted trees, and offers the following operations:

Left-child right-sibling binary tree

Every multi-way or k-ary tree structure studied in computer science admits a representation as a binary tree, which goes by various names including child-sibling representation, left-child, right-sibling binary tree, doubly chained tree or filial-heir chain.

Cartesian tree

In computer science, a Cartesian tree is a binary tree derived from a sequence of numbers; it can be uniquely defined from the properties that it is heap-ordered and that a symmetric (in-order) traversal of the tree returns the original sequence. Introduced by Vuillemin (1980) in the context of geometric range searching data structures, Cartesian trees have also been used in the definition of the treap and randomized binary search tree data structures for binary search problems. The Cartesian tree for a sequence may be constructed in linear time using a stack-based algorithm for finding all nearest smaller values in a sequence.

A binary expression tree is a specific kind of a binary tree used to represent expressions. Two common types of expressions that a binary expression tree can represent are algebraic and boolean. These trees can represent expressions that contain both unary and binary operators.

References

  1. Van Wyk, Christopher J. Data Structures and C Programs, Addison-Wesley, 1988, p. 175. ISBN   978-0-201-16116-8.
  2. Knuth, D.E. (1968). Fundamental Algorithms. The Art of Computer Programming. Vol. 1 (1st ed.). Reading/MA: Addison Wesley.
  3. Morris, Joseph H. (1979). "Traversing binary trees simply and cheaply". Information Processing Letters . 9 (5). doi:10.1016/0020-0190(79)90068-1.
  4. Mateti, Prabhaker; Manghirmalani, Ravi (1988). "Morris' tree traversal algorithm reconsidered". Science of Computer Programming. 11: 29–43. doi: 10.1016/0167-6423(88)90063-9 .
  5. Knuth, D.E. (1969). Fundamental Algorithms. The Art of Computer Programming. Vol. 1 (2 ed.). Addison Wesley. Hre: Sect.2.3.1 "Traversing Binary Trees".
  6. Perlis, Alan Jay; Thornton, C. (Apr 1960). "Symbol manipulation by threaded lists". Communications of the ACM. 3 (4): 195–204. doi:10.1145/367177.367202.