Longest alternating subsequence

Last updated

In combinatorial mathematics, probability, and computer science, in the longest alternating subsequence problem, one wants to find a subsequence of a given sequence in which the elements are in alternating order, and in which the sequence is as long as possible.

Contents

Formally, if is a sequence of distinct real numbers, then the subsequence is alternating [1] (or zigzag or down-up) if

Similarly, is reverse alternating (or up-down) if

Note that every sequence of length 1 is both alternating and reverse alternating.

Let denote the length (number of terms) of the longest alternating subsequence of . For example, if we consider some of the permutations of the integers 1,2,3,4,5, we have that

Efficient algorithms

In a sequence of distinct elements, the subsequence of local extrema (elements larger than both adjacent elements, or smaller than both adjacent elements) forms a canonical longest alternating sequence. [2] As a consequence, the longest alternating subsequence of a sequence of elements can be found in time . In sequences that allow repetitions, the same method can be applied after first replacing each run of repeated elements by a single copy of that element.[ citation needed ]

Distributional results

If is a random permutation of the integers and , then it is possible to show [3] [4] [5] that

Moreover, as , the random variable , appropriately centered and scaled, converges to a standard normal distribution.

Online algorithms

The longest alternating subsequence problem has also been studied in the setting of online algorithms, in which the elements of are presented in an online fashion, and a decision maker needs to decide whether to include or exclude each element at the time it is first presented, without any knowledge of the elements that will be presented in the future, and without the possibility of recalling on preceding observations.

Given a sequence of independent random variables with common continuous distribution , it is possible to construct a selection procedure that maximizes the expected number of alternating selections. Such expected values can be tightly estimated, and it equals . [6]

As , the optimal number of online alternating selections appropriately centered and scaled converges to a normal distribution. [7]

See also

Related Research Articles

<span class="mw-page-title-main">Binomial coefficient</span> Number of subsets of a given size

In mathematics, the binomial coefficients are the positive integers that occur as coefficients in the binomial theorem. Commonly, a binomial coefficient is indexed by a pair of integers nk ≥ 0 and is written It is the coefficient of the xk term in the polynomial expansion of the binomial power (1 + x)n; this coefficient can be computed by the multiplicative formula

In mathematics, the determinant is a scalar-valued function of the entries of a square matrix. The determinant of a matrix A is commonly denoted det(A), det A, or |A|. Its value characterizes some properties of the matrix and the linear map represented, on a given basis, by the matrix. In particular, the determinant is nonzero if and only if the matrix is invertible and the corresponding linear map is an isomorphism.

<span class="mw-page-title-main">Discrete Fourier transform</span> Type of Fourier transform in discrete mathematics

In mathematics, the discrete Fourier transform (DFT) converts a finite sequence of equally-spaced samples of a function into a same-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex-valued function of frequency. The interval at which the DTFT is sampled is the reciprocal of the duration of the input sequence. An inverse DFT (IDFT) is a Fourier series, using the DTFT samples as coefficients of complex sinusoids at the corresponding DTFT frequencies. It has the same sample-values as the original input sequence. The DFT is therefore said to be a frequency domain representation of the original input sequence. If the original sequence spans all the non-zero values of a function, its DTFT is continuous, and the DFT provides discrete samples of one cycle. If the original sequence is one cycle of a periodic function, the DFT provides all the non-zero values of one DTFT cycle.

<span class="mw-page-title-main">Central limit theorem</span> Fundamental theorem in probability theory and statistics

In probability theory, the central limit theorem (CLT) states that, under appropriate conditions, the distribution of a normalized version of the sample mean converges to a standard normal distribution. This holds even if the original variables themselves are not normally distributed. There are several versions of the CLT, each applying in the context of different conditions.

<span class="mw-page-title-main">Simplex</span> Multi-dimensional generalization of triangle

In geometry, a simplex is a generalization of the notion of a triangle or tetrahedron to arbitrary dimensions. The simplex is so-named because it represents the simplest possible polytope in any given dimension. For example,

<span class="mw-page-title-main">Permutation</span> Mathematical version of an order change

In mathematics, a permutation of a set can mean one of two different things:

In mathematics, Fatou's lemma establishes an inequality relating the Lebesgue integral of the limit inferior of a sequence of functions to the limit inferior of integrals of these functions. The lemma is named after Pierre Fatou.

<span class="mw-page-title-main">Dirichlet distribution</span> Probability distribution

In probability and statistics, the Dirichlet distribution, often denoted , is a family of continuous multivariate probability distributions parameterized by a vector of positive reals. It is a multivariate generalization of the beta distribution, hence its alternative name of multivariate beta distribution (MBD). Dirichlet distributions are commonly used as prior distributions in Bayesian statistics, and in fact, the Dirichlet distribution is the conjugate prior of the categorical distribution and multinomial distribution.

<span class="mw-page-title-main">Real coordinate space</span> Space formed by the n-tuples of real numbers

In mathematics, the real coordinate space or real coordinate n-space, of dimension n, denoted Rn or , is the set of all ordered n-tuples of real numbers, that is the set of all sequences of n real numbers, also known as coordinate vectors. Special cases are called the real lineR1, the real coordinate planeR2, and the real coordinate three-dimensional spaceR3. With component-wise addition and scalar multiplication, it is a real vector space.

Inversive congruential generators are a type of nonlinear congruential pseudorandom number generator, which use the modular multiplicative inverse to generate the next number in a sequence. The standard formula for an inversive congruential generator, modulo some prime q is:

In mathematics, especially in combinatorics, Stirling numbers of the first kind arise in the study of permutations. In particular, the unsigned Stirling numbers of the first kind count permutations according to their number of cycles.

In computer science, the longest increasing subsequence problem aims to find a subsequence of a given sequence in which the subsequence's elements are sorted in an ascending order and in which the subsequence is as long as possible. This subsequence is not necessarily contiguous or unique. The longest increasing subsequences are studied in the context of various disciplines related to mathematics, including algorithmics, random matrix theory, representation theory, and physics. The longest increasing subsequence problem is solvable in time where denotes the length of the input sequence.

In statistics, an exchangeable sequence of random variables is a sequence X1X2X3, ... whose joint probability distribution does not change when the positions in the sequence in which finitely many of them appear are altered. In other words, the joint distribution is invariant to finite permutation. Thus, for example the sequences

A repeating decimal or recurring decimal is a decimal representation of a number whose digits are eventually periodic ; if this sequence consists only of zeros, the decimal is said to be terminating, and is not considered as repeating.

In mathematics and in particular in combinatorics, the Lehmer code is a particular way to encode each possible permutation of a sequence of n numbers. It is an instance of a scheme for numbering permutations and is an example of an inversion table.

For certain applications in linear algebra, it is useful to know properties of the probability distribution of the largest eigenvalue of a finite sum of random matrices. Suppose is a finite sequence of random matrices. Analogous to the well-known Chernoff bound for sums of scalars, a bound on the following is sought for a given parameter t:

In statistics, an approximate entropy (ApEn) is a technique used to quantify the amount of regularity and the unpredictability of fluctuations over time-series data. For example, consider two series of data:

In combinatorial mathematics, the hook length formula is a formula for the number of standard Young tableaux whose shape is a given Young diagram. It has applications in diverse areas such as representation theory, probability, and algorithm analysis; for example, the problem of longest increasing subsequences. A related formula gives the number of semi-standard Young tableaux, which is a specialization of a Schur polynomial.

In mathematics, Kingman's subadditive ergodic theorem is one of several ergodic theorems. It can be seen as a generalization of Birkhoff's ergodic theorem. Intuitively, the subadditive ergodic theorem is a kind of random variable version of Fekete's lemma. As a result, it can be rephrased in the language of probability, e.g. using a sequence of random variables and expected values. The theorem is named after John Kingman.

<span class="mw-page-title-main">Affine symmetric group</span> Mathematical structure

The affine symmetric groups are a family of mathematical structures that describe the symmetries of the number line and the regular triangular tiling of the plane, as well as related higher-dimensional objects. In addition to this geometric description, the affine symmetric groups may be defined in other ways: as collections of permutations (rearrangements) of the integers that are periodic in a certain sense, or in purely algebraic terms as a group with certain generators and relations. They are studied in combinatorics and representation theory.

References

  1. Stanley, Richard P. (2011), Enumerative Combinatorics, Volume I, second edition, Cambridge University Press
  2. Romik, Dan (2011), "Local extrema in random permutations and the structure of longest alternating subsequences", 23rd International Conference on Formal Power Series and Algebraic Combinatorics (FPSAC 2011), Discrete Math. Theor. Comput. Sci. Proc., vol. AO, Assoc. Discrete Math. Theor. Comput. Sci., Nancy, pp. 825–834, MR   2820763
  3. Widom, Harold (2006), "On the limiting distribution for the length of the longest alternating sequence in a random permutation", Electron. J. Combin., 13: Research Paper 25, 7, doi: 10.37236/1051
  4. Stanley, Richard P. (2008), "Longest alternating subsequences of permutations", Michigan Math. J., 57: 675–687, arXiv: math/0511419 , doi:10.1307/mmj/1220879431
  5. Houdré, Christian; Restrepo, Ricardo (2010), "A probabilistic approach to the asymptotics of the length of the longest alternating subsequence", Electron. J. Combin., 17: Research Paper 168, 19, arXiv: 1005.1893 , doi:10.37236/440
  6. Arlotto, Alessandro; Chen, Robert W.; Shepp, Lawrence A.; Steele, J. Michael (2011), "Online selection of alternating subsequences from a random sample", J. Appl. Probab., 48 (4): 1114–1132, arXiv: 1105.1558 , doi:10.1239/jap/1324046022
  7. Arlotto, Alessandro; Steele, J. Michael (2014), "Optimal online selection of an alternating subsequence: a central limit theorem", Adv. Appl. Probab., 46 (2): 536–559, doi: 10.1239/aap/1401369706