Iterative Viterbi decoding

Last updated

Iterative Viterbi decoding is an algorithm that spots the subsequence S of an observation O = {o1, ..., on} having the highest average probability (i.e., probability scaled by the length of S) of being generated by a given hidden Markov model M with m states. The algorithm uses a modified Viterbi algorithm as an internal step.

Contents

The scaled probability measure was first proposed by John S. Bridle. An early algorithm to solve this problem, sliding window, was proposed by Jay G. Wilpon et al., 1989, with constant cost T = mn2/2.

A faster algorithm consists of an iteration of calls to the Viterbi algorithm, reestimating a filler score until convergence.

The algorithm

A basic (non-optimized) version, finding the sequence s with the smallest normalized distance from some subsequence of t is:

// input is placed in observation s[1..n], template t[1..m], // and [[distance matrix]] d[1..n,1..m] // remaining elements in matrices are solely for internal computations (int, int, int) AverageSubmatchDistance(char s[0..(n+1)], char t[0..(m+1)], int d[1..n,0..(m+1)]) {     // score, subsequence start, subsequence end     declare int e, B, E     t'[0] := t'[m+1] := s'[0] := s'[n+1] := 'e'      e := random()     do         e' := e  for i := 1 to n do d'[i,0] := d'[i,m+1] := e  (e, B, E)  := ViterbiDistance(s', t', d')         e := e/(E-B+1)     until (e == e')      return (e, B, E) } 

The ViterbiDistance() procedure returns the tuple (e, B, E), i.e., the Viterbi score "e" for the match of t and the selected entry (B) and exit (E) points from it. "B" and "E" have to be recorded using a simple modification to Viterbi.

A modification that can be applied to CYK tables, proposed by Antoine Rozenknop, consists in subtracting e from all elements of the initial matrix d.

References

Further reading