Longest alternating subsequence

Last updated

In combinatorial mathematics, probability, and computer science, in the longest alternating subsequence problem, one wants to find a subsequence of a given sequence in which the elements are in alternating order, and in which the sequence is as long as possible.

Contents

Formally, if is a sequence of distinct real numbers, then the subsequence is alternating [1] (or zigzag or down-up) if

Similarly, is reverse alternating (or up-down) if

Let denote the length (number of terms) of the longest alternating subsequence of . For example, if we consider some of the permutations of the integers 1,2,3,4,5, we have that

Efficient algorithms

The longest alternating subsequence problem is solvable in time , where is the length of the original sequence.[ citation needed ]

Distributional results

If is a random permutation of the integers and , then it is possible to show [2] [3] [4] that

Moreover, as , the random variable , appropriately centered and scaled, converges to a standard normal distribution.

Online algorithms

The longest alternating subsequence problem has also been studied in the setting of online algorithms, in which the elements of are presented in an online fashion, and a decision maker needs to decide whether to include or exclude each element at the time it is first presented, without any knowledge of the elements that will be presented in the future, and without the possibility of recalling on preceding observations.

Given a sequence of independent random variables with common continuous distribution , it is possible to construct a selection procedure that maximizes the expected number of alternating selections. Such expected values can be tightly estimated, and it equals . [5]

As , the optimal number of online alternating selections appropriately centered and scaled converges to a normal distribution. [6]

See also

Related Research Articles

<span class="mw-page-title-main">Binomial coefficient</span> Number of subsets of a given size

In mathematics, the binomial coefficients are the positive integers that occur as coefficients in the binomial theorem. Commonly, a binomial coefficient is indexed by a pair of integers nk ≥ 0 and is written It is the coefficient of the xk term in the polynomial expansion of the binomial power (1 + x)n; this coefficient can be computed by the multiplicative formula

In mathematics, the determinant is a scalar value that is a function of the entries of a square matrix. The determinant of a matrix A is commonly denoted det(A), det A, or |A|. Its value characterizes some properties of the matrix and the linear map represented by the matrix. In particular, the determinant is nonzero if and only if the matrix is invertible and the linear map represented by the matrix is an isomorphism. The determinant of a product of matrices is the product of their determinants (which follows directly from the above properties).

<span class="mw-page-title-main">Discrete Fourier transform</span> Type of Fourier transform in discrete mathematics

In mathematics, the discrete Fourier transform (DFT) converts a finite sequence of equally-spaced samples of a function into a same-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex-valued function of frequency. The interval at which the DTFT is sampled is the reciprocal of the duration of the input sequence. An inverse DFT (IDFT) is a Fourier series, using the DTFT samples as coefficients of complex sinusoids at the corresponding DTFT frequencies. It has the same sample-values as the original input sequence. The DFT is therefore said to be a frequency domain representation of the original input sequence. If the original sequence spans all the non-zero values of a function, its DTFT is continuous, and the DFT provides discrete samples of one cycle. If the original sequence is one cycle of a periodic function, the DFT provides all the non-zero values of one DTFT cycle.

<span class="mw-page-title-main">Independence (probability theory)</span> When the occurrence of one event does not affect the likelihood of another

Independence is a fundamental notion in probability theory, as in statistics and the theory of stochastic processes. Two events are independent, statistically independent, or stochastically independent if, informally speaking, the occurrence of one does not affect the probability of occurrence of the other or, equivalently, does not affect the odds. Similarly, two random variables are independent if the realization of one does not affect the probability distribution of the other.

In probability theory, the central limit theorem (CLT) establishes that, in many situations, for independent and identically distributed random variables, the sampling distribution of the standardized sample mean tends towards the standard normal distribution even if the original variables themselves are not normally distributed.

<span class="mw-page-title-main">Permutation</span> Mathematical version of an order change

In mathematics, a permutation of a set is, loosely speaking, an arrangement of its members into a sequence or linear order, or if the set is already ordered, a rearrangement of its elements. The word "permutation" also refers to the act or process of changing the linear order of an ordered set.

<span class="mw-page-title-main">Covariance matrix</span> Measure of covariance of components of a random vector

In probability theory and statistics, a covariance matrix is a square matrix giving the covariance between each pair of elements of a given random vector.

In mathematics, Fatou's lemma establishes an inequality relating the Lebesgue integral of the limit inferior of a sequence of functions to the limit inferior of integrals of these functions. The lemma is named after Pierre Fatou.

In physics, the S-matrix or scattering matrix relates the initial state and the final state of a physical system undergoing a scattering process. It is used in quantum mechanics, scattering theory and quantum field theory (QFT).

<span class="mw-page-title-main">Dirichlet distribution</span> Probability distribution

In probability and statistics, the Dirichlet distribution (after Peter Gustav Lejeune Dirichlet), often denoted , is a family of continuous multivariate probability distributions parameterized by a vector of positive reals. It is a multivariate generalization of the beta distribution, hence its alternative name of multivariate beta distribution (MBD). Dirichlet distributions are commonly used as prior distributions in Bayesian statistics, and in fact, the Dirichlet distribution is the conjugate prior of the categorical distribution and multinomial distribution.

<span class="mw-page-title-main">Random forest</span> Binary search tree based ensemble machine learning method


Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of the random forest is the class selected by most trees. For regression tasks, the mean or average prediction of the individual trees is returned. Random decision forests correct for decision trees' habit of overfitting to their training set. Random forests generally outperform decision trees, but their accuracy is lower than gradient boosted trees. However, data characteristics can affect their performance.

<span class="mw-page-title-main">Real coordinate space</span> Space formed by the n-tuples of real numbers

In mathematics, the real coordinate space of dimension n, denoted Rn or , is the set of the n-tuples of real numbers, that is the set of all sequences of n real numbers. Special cases are called the real lineR1 and the real coordinate planeR2. With component-wise addition and scalar multiplication, it is a real vector space, and its elements are called coordinate vectors.

In computer science, the longest increasing subsequence problem aims to find a subsequence of a given sequence in which the subsequence's elements are sorted in an ascending order and in which the subsequence is as long as possible. This subsequence is not necessarily contiguous or unique. The longest increasing subsequences are studied in the context of various disciplines related to mathematics, including algorithmics, random matrix theory, representation theory, and physics. The longest increasing subsequence problem is solvable in time where denotes the length of the input sequence.

In statistics, an exchangeable sequence of random variables is a sequence X1X2X3, ... whose joint probability distribution does not change when the positions in the sequence in which finitely many of them appear are altered. Thus, for example the sequences

In probability theory, Bernstein inequalities give bounds on the probability that the sum of random variables deviates from its mean. In the simplest case, let X1, ..., Xn be independent Bernoulli random variables taking values +1 and −1 with probability 1/2, then for every positive ,

In multilinear algebra, the tensor rank decomposition or the decomposition of a tensor is the decomposition of a tensor in terms of a sum of minimum tensors. This is an open problem.

For certain applications in linear algebra, it is useful to know properties of the probability distribution of the largest eigenvalue of a finite sum of random matrices. Suppose is a finite sequence of random matrices. Analogous to the well-known Chernoff bound for sums of scalars, a bound on the following is sought for a given parameter t:

In statistics, an approximate entropy (ApEn) is a technique used to quantify the amount of regularity and the unpredictability of fluctuations over time-series data. For example, consider two series of data:

In combinatorial mathematics, the hook length formula is a formula for the number of standard Young tableaux whose shape is a given Young diagram. It has applications in diverse areas such as representation theory, probability, and algorithm analysis; for example, the problem of longest increasing subsequences. A related formula gives the number of semi-standard Young tableaux, which is a specialization of a Schur polynomial.

<span class="mw-page-title-main">Affine symmetric group</span> Mathematical structure

The affine symmetric groups are a family of mathematical structures that describe the symmetries of the number line and the regular triangular tiling of the plane, as well as related higher-dimensional objects. In addition to this geometric description, the affine symmetric groups may be defined in other ways: as collections of permutations (rearrangements) of the integers that are periodic in a certain sense, or in purely algebraic terms as a group with certain generators and relations. They are studied as part of the fields of combinatorics and representation theory.

References

  1. Stanley, Richard P. (2011), Enumerative Combinatorics, Volume I, second edition, Cambridge University Press
  2. Widom, Harold (2006), "On the limiting distribution for the length of the longest alternating sequence in a random permutation", Electron. J. Combin., 13: Research Paper 25, 7
  3. Stanley, Richard P. (2008), "Longest alternating subsequences of permutations", Michigan Math. J., 57: 675–687, arXiv: math/0511419 , doi:10.1307/mmj/1220879431
  4. Houdré, Christian; Restrepo, Ricardo (2010), "A probabilistic approach to the asymptotics of the length of the longest alternating subsequence", Electron. J. Combin., 17: Research Paper 168, 19
  5. Arlotto, Alessandro; Chen, Robert W.; Shepp, Lawrence A.; Steele, J. Michael (2011), "Online selection of alternating subsequences from a random sample", J. Appl. Probab., 48 (4): 1114–1132, arXiv: 1105.1558 , doi:10.1239/jap/1324046022
  6. Arlotto, Alessandro; Steele, J. Michael (2014), "Optimal online selection of an alternating subsequence: a central limit theorem", Adv. Appl. Probab., 46 (2): 536–559, doi: 10.1239/aap/1401369706