Stack-sortable permutation

Last updated

In mathematics and computer science, a stack-sortable permutation (also called a tree permutation) [1] is a permutation whose elements may be sorted by an algorithm whose internal storage is limited to a single stack data structure. The stack-sortable permutations are exactly the permutations that do not contain the permutation pattern 231; they are counted by the Catalan numbers, and may be placed in bijection with many other combinatorial objects with the same counting function including Dyck paths and binary trees.

Contents

Sorting with a stack

The problem of sorting an input sequence using a stack was first posed by Knuth (1968), who gave the following linear time algorithm (closely related to algorithms for the later all nearest smaller values problem):

Knuth observed that this algorithm correctly sorts some input sequences, and fails to sort others. For instance, the sequence 3,2,1 is correctly sorted: the three elements are all pushed onto the stack, and then popped in the order 1,2,3. However, the sequence 2,3,1 is not correctly sorted: the algorithm first pushes 2, and pops it when it sees the larger input value 3, causing 2 to be output before 1 rather than after it.

Because this algorithm is a comparison sort, its success or failure does not depend on the numerical values of the input sequence, but only on their relative order; that is, an input may be described by the permutation needed to form that input from a sorted sequence of the same length. Knuth characterized the permutations that this algorithm correctly sorts as being exactly the permutations that do not contain the permutation pattern 231: three elements x, y, and z, appearing in the input in that respective order, with z < x < y. Moreover, he observed that, if the algorithm fails to sort an input, then that input cannot be sorted with a single stack.

As well as inspiring much subsequent work on sorting using more complicated systems of stacks and related data structures, [2] Knuth's research kicked off the study of permutation patterns and of permutation classes defined by forbidden patterns.

Bijections and enumeration

The sequence of pushes and pops performed by Knuth's sorting algorithm as it sorts a stack-sortable permutation form a Dyck language: reinterpreting a push as a left parenthesis and a pop as a right parenthesis produces a string of balanced parentheses. Moreover, every Dyck string comes from a stack-sortable permutation in this way, and every two different stack-sortable permutations produce different Dyck strings. For this reason, the number of stack-sortable permutations of length n is the same as the number of Dyck strings of length 2n, the Catalan number

[3]
Bijection between binary trees (with nodes numbered left-to-right) and stack-sortable permutations, generated by listing the same node numbers in preorder Tree permutation bijection.svg
Bijection between binary trees (with nodes numbered left-to-right) and stack-sortable permutations, generated by listing the same node numbers in preorder

Stack-sortable permutations may also be translated directly to and from (unlabeled) binary trees, another combinatorial class whose counting function is the sequence of Catalan numbers. A binary tree may be transformed into a stack-sortable permutation by numbering its nodes in left-to-right order, and then listing these numbers in the order they would be visited by a preorder traversal of the tree: the root first, then the left subtree, then the right subtree, continuing recursively within each subtree. In the reverse direction, a stack-sortable permutation may be decoded into a tree in which the first value x of the permutation corresponds to the root of the tree, the next x  1 values are decoded recursively to give the left child of the root, and the remaining values are again decoded recursively to give the right child. [1]

Several other classes of permutations may also be placed in bijection with the stack-sortable permutations. For instance, the permutations that avoid the patterns 132, 213, and 312 may be formed respectively from the stack-sortable (231-avoiding) permutations by reversing the permutation, replacing each value x in the permutation by n + 1  x, or both operations combined. The 312-avoiding permutations are also the inverses of the 231-avoiding permutations, and have been called the stack-realizable permutations as they are the permutations that can be formed from the identity permutation by a sequence of push-from-input and pop-to-output operations on a stack. [4] As Knuth (1968) noted, the 123-avoiding and 321-avoiding permutations also have the same counting function despite being less directly related to the stack-sortable permutations.

Random stack-sortable permutations

Rotem (1981) investigates the properties of stack-sortable permutations chosen uniformly at random among all such permutations of a given length. The expected length of the longest descending subsequence in such a permutation is , differing by a constant factor from unconstrained random permutations (for which the expected length is approximately ). The expected length of the longest ascending sequence differs even more strongly from unconstrained permutations: it is . The expected number of values within the permutation that are larger than all previous values is only , smaller than its logarithmic value for unconstrained permutations. And the expected number of inversions is , in contrast to its value of for unconstrained permutations.

Additional properties

Every permutation defines a permutation graph, a graph whose vertices are the elements of the permutation and whose edges connect pairs of elements that are inverted by the permutation. The permutation graphs of stack-sortable permutations are trivially perfect. [4]

For each element i of a permutation p, define bi to be the number of other elements that are to the left of and greater than i. Then p is stack-sortable if and only if, for all i, bi  bi + 1  1. [1]

Algorithms

Knott (1977) uses the bijection between stack-sortable permutations and binary trees to define a numerical rank for each binary tree, and to construct efficient algorithms for computing the rank of a tree ("ranking") and for computing the tree with a given rank ("unranking").

Micheli & Rossin (2006) defined two edit operations on permutations: deletion (making a permutation pattern) and its inverse. Using the same correspondence between trees and permutations, they observed that these operations correspond to edge contraction in a tree and its inverse. By applying a polynomial time dynamic programming algorithm for edit distance in trees, they showed that the edit distance between two stack-sortable permutations (and hence also the longest common pattern) can be found in polynomial time. This technique was later generalized to algorithms for finding longest common patterns of separable permutations; [5] however, the longest common pattern problem is NP-complete for arbitrary permutations. [6]

Notes

Related Research Articles

<span class="mw-page-title-main">AVL tree</span> Self-balancing binary search tree

In computer science, an AVL tree is a self-balancing binary search tree. In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases, where is the number of nodes in the tree prior to the operation. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations.

<span class="mw-page-title-main">Binary search algorithm</span> Search algorithm finding the position of a target value within a sorted array

In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the middle element of the array. If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half, again taking the middle element to compare to the target value, and repeating this until the target value is found. If the search ends with the remaining half being empty, the target is not in the array.

<span class="mw-page-title-main">Binary search tree</span> Rooted binary tree data structure

In computer science, a binary search tree (BST), also called an ordered or sorted binary tree, is a rooted binary tree data structure with the key of each internal node being greater than all the keys in the respective node's left subtree and less than the ones in its right subtree. The time complexity of operations on the binary search tree is linear with respect to the height of the tree.

<span class="mw-page-title-main">Binary tree</span> Limited form of tree data structure

In computer science, a binary tree is a tree data structure in which each node has at most two children, referred to as the left child and the right child. That is, it is a k-ary tree with k = 2. A recursive definition using set theory is that a binary tree is a tuple (L, S, R), where L and R are binary trees or the empty set and S is a singleton set containing the root.

A splay tree is a binary search tree with the additional property that recently accessed elements are quick to access again. Like self-balancing binary search trees, a splay tree performs basic operations such as insertion, look-up and removal in O(log n) amortized time. For random access patterns drawn from a non-uniform random distribution, their amortized time can be faster than logarithmic, proportional to the entropy of the access pattern. For many patterns of non-random operations, also, splay trees can take better than logarithmic time, without requiring advance knowledge of the pattern. According to the unproven dynamic optimality conjecture, their performance on all access patterns is within a constant factor of the best possible performance that could be achieved by any other self-adjusting binary search tree, even one selected to fit that pattern. The splay tree was invented by Daniel Sleator and Robert Tarjan in 1985.

<span class="mw-page-title-main">Trie</span> K-ary search tree data structure

In computer science, a trie, also called digital tree or prefix tree, is a type of k-ary search tree, a tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes defined not by the entire key, but by individual characters. In order to access a key, the trie is traversed depth-first, following the links between nodes, which represent each character in the key.

In computer science, patience sorting is a sorting algorithm inspired by, and named after, the card game patience. A variant of the algorithm efficiently computes the length of a longest increasing subsequence in a given array.

<span class="mw-page-title-main">Comparison sort</span> Type of sorting algorithm that works by comparing pairs of elements

A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation that determines which of two elements should occur first in the final sorted list. The only requirement is that the operator forms a total preorder over the data, with:

  1. if ab and bc then ac (transitivity)
  2. for all a and b, ab or ba (connexity).

In computer science, the longest increasing subsequence problem aims to find a subsequence of a given sequence in which the subsequence's elements are sorted in an ascending order and in which the subsequence is as long as possible. This subsequence is not necessarily contiguous or unique. The longest increasing subsequences are studied in the context of various disciplines related to mathematics, including algorithmics, random matrix theory, representation theory, and physics. The longest increasing subsequence problem is solvable in time where denotes the length of the input sequence.

In computer science, the prefix sum, cumulative sum, inclusive scan, or simply scan of a sequence of numbers x0, x1, x2, ... is a second sequence of numbers y0, y1, y2, ..., the sums of prefixes of the input sequence:

<span class="mw-page-title-main">Tarjan's strongly connected components algorithm</span> Graph algorithm

Tarjan's strongly connected components algorithm is an algorithm in graph theory for finding the strongly connected components (SCCs) of a directed graph. It runs in linear time, matching the time bound for alternative methods including Kosaraju's algorithm and the path-based strong component algorithm. The algorithm is named for its inventor, Robert Tarjan.

<span class="mw-page-title-main">Cartesian tree</span> Binary tree derived from a sequence of numbers

In computer science, a Cartesian tree is a binary tree derived from a sequence of distinct numbers. To construct the Cartesian tree, set its root to be the minimum number in the sequence, and recursively construct its left and right subtrees from the subsequences before and after this number. It is uniquely defined as a min-heap whose symmetric (in-order) traversal returns the original sequence.

In computer science, the all nearest smaller values problem is the following task: for each position in a sequence of numbers, search among the previous positions for the last position that contains a smaller value. This problem can be solved efficiently both by parallel and non-parallel algorithms: Berkman, Schieber & Vishkin (1993), who first identified the procedure as a useful subroutine for other parallel programs, developed efficient algorithms to solve it in the Parallel Random Access Machine model; it may also be solved in linear time on a non-parallel computer using a stack-based algorithm. Later researchers have studied algorithms to solve it in other models of parallel computation.

In computer science and probability theory, a random binary tree is a binary tree selected at random from some probability distribution on binary trees. Two different distributions are commonly used: binary trees formed by inserting nodes one at a time according to a random permutation, and binary trees chosen from a uniform discrete distribution in which all distinct trees are equally likely. It is also possible to form other distributions, for instance by repeated splitting. Adding and removing nodes directly in a random binary tree will in general disrupt its random structure, but the treap and related randomized binary search tree data structures use the principle of binary trees formed from a random permutation in order to maintain a balanced binary search tree dynamically as nodes are inserted and deleted.

<span class="mw-page-title-main">Separable permutation</span>

In combinatorial mathematics, a separable permutation is a permutation that can be obtained from the trivial permutation 1 by direct sums and skew sums. Separable permutations may be characterized by the forbidden permutation patterns 2413 and 3142; they are also the permutations whose permutation graphs are cographs and the permutations that realize the series-parallel partial orders. It is possible to test in polynomial time whether a given separable permutation is a pattern in a larger permutation, or to find the longest common subpattern of two separable permutations.

In combinatorial mathematics and theoretical computer science, a permutation pattern is a sub-permutation of a longer permutation. Any permutation may be written in one-line notation as a sequence of digits representing the result of applying the permutation to the digit sequence 123...; for instance the digit sequence 213 represents the permutation on three elements that swaps elements 1 and 2. If π and σ are two permutations represented in this way, then π is said to contain σ as a pattern if some subsequence of the digits of π has the same relative order as all of the digits of σ.

In computer science, an optimal binary search tree (Optimal BST), sometimes called a weight-balanced binary tree, is a binary search tree which provides the smallest possible search time (or expected search time) for a given sequence of accesses (or access probabilities). Optimal BSTs are generally divided into two types: static and dynamic.

In computer science, merge-insertion sort or the Ford–Johnson algorithm is a comparison sorting algorithm published in 1959 by L. R. Ford Jr. and Selmer M. Johnson. It uses fewer comparisons in the worst case than the best previously known algorithms, binary insertion sort and merge sort, and for 20 years it was the sorting algorithm with the fewest known comparisons. Although not of practical significance, it remains of theoretical interest in connection with the problem of sorting with a minimum number of comparisons. The same algorithm may have also been independently discovered by Stanisław Trybuła and Czen Ping.

The Garsia–Wachs algorithm is an efficient method for computers to construct optimal binary search trees and alphabetic Huffman codes, in linearithmic time. It is named after Adriano Garsia and Michelle L. Wachs.

Interpolation sort is a kind of bucket sort. It uses an interpolation formula to assign data to the bucket. A general interpolation formula is:

References