Finite-state transducer

Last updated January 19, 2025

A finite-state transducer (FST) is a finite-state machine with two memory tapes, following the terminology for Turing machines: an input tape and an output tape. This contrasts with an ordinary finite-state automaton (FSA), which has a single tape. An FST maps between two sets of symbols^[1] and is in this sense more general than an FSA: an FSA defines a formal language by defining a set of accepted strings, while an FST defines a relation between sets of strings.

Overview

An automaton can be said to recognize a string if we view the content of its tape as input. In other words, the automaton computes a function that maps strings into the set {0,1}. Alternatively, we can say that an automaton generates strings, which means viewing its tape as an output tape. On this view, the automaton generates a formal language, which is a set of strings. The two views of automata are equivalent: the function that the automaton computes is precisely the indicator function of the set of strings it generates. The class of languages generated by finite automata is known as the class of regular languages.

The two tapes of a transducer are typically viewed as an input tape and an output tape. On this view, a transducer is said to transduce (i.e., translate) the contents of its input tape to its output tape, by accepting a string on its input tape and generating another string on its output tape. It may do so nondeterministically and it may produce more than one output for each input string. A transducer may also produce no output for a given input string, in which case it is said to reject the input. In general, a transducer computes a relation between two formal languages.

Each string-to-string finite-state transducer relates strings over the input alphabet Σ to strings over the output alphabet Γ. Relations R on Σ*×Γ* that can be implemented as finite-state transducers are called rational relations. Rational relations that are partial functions, i.e. that relate every input string from Σ* to at most one output string in Γ*, are called rational functions.

Finite-state transducers are often used for phonological and morphological analysis in natural language processing research and applications. Pioneers in this field include Ronald Kaplan, Lauri Karttunen, Martin Kay and Kimmo Koskenniemi.^[2]^{[ non-primary source needed ]} A common way of using transducers is in a so-called "cascade", where transducers for various operations are combined into a single transducer by repeated application of the composition operator (defined below).

Formal construction

Formally, a finite transducer T is a 6-tuple ( $Q, Σ, Γ, I, F, δ$ ) such that:

$Q$ is a finite set, the set of states;
$Σ$ is a finite set, called the input alphabet;
$Γ$ is a finite set, called the output alphabet;
$I$ is a subset of $Q$ , the set of initial states;
$F$ is a subset of $Q$ , the set of final states; and
$\delta \subseteq Q\times (\Sigma \cup \{\epsilon \})\times (\Gamma \cup \{\epsilon \})\times Q$ (where ε is the empty string) is the transition relation.

We can view (Q, δ) as a labeled directed graph, known as the transition graph of T: the set of vertices is Q, and $(q,a,b,r)\in \delta$ means that there is a labeled edge going from vertex q to vertex r. We also say that a is the input label and b the output label of that edge.

NOTE: A finite transducer defined in this way is also called a letter transducer (Roche and Schabes 1997); alternative definitions are possible, but can all be converted into transducers following this one.
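The 6-tuple above maps directly onto a small data structure. The following is a minimal sketch, not a reference implementation: the class name `FST`, its field names, the helper `edges_from`, and the use of the empty Python string for ε are all assumptions made for illustration.

```python
from dataclasses import dataclass

EPS = ""  # assumption: the empty string stands for ε


@dataclass(frozen=True)
class FST:
    states: frozenset           # Q
    input_alphabet: frozenset   # Σ
    output_alphabet: frozenset  # Γ
    initial: frozenset          # I, a subset of Q
    final: frozenset            # F, a subset of Q
    delta: frozenset            # δ, a set of (q, a, b, r) tuples

    def __post_init__(self):
        # Enforce the side conditions of the 6-tuple definition.
        if not (self.initial <= self.states and self.final <= self.states):
            raise ValueError("I and F must be subsets of Q")
        for (q, a, b, r) in self.delta:
            if q not in self.states or r not in self.states:
                raise ValueError("transition endpoints must be in Q")
            if a != EPS and a not in self.input_alphabet:
                raise ValueError("input label must be in Σ or be ε")
            if b != EPS and b not in self.output_alphabet:
                raise ValueError("output label must be in Γ or be ε")

    def edges_from(self, q):
        """Labeled edges (a, b, r) of the transition graph leaving vertex q."""
        return {(a, b, r) for (p, a, b, r) in self.delta if p == q}


# Hypothetical one-state example: rewrites each "a" as "x" and deletes each "b".
T = FST(
    states=frozenset({0}),
    input_alphabet=frozenset({"a", "b"}),
    output_alphabet=frozenset({"x"}),
    initial=frozenset({0}),
    final=frozenset({0}),
    delta=frozenset({(0, "a", "x", 0), (0, "b", EPS, 0)}),
)
```

Storing δ as a set of 4-tuples keeps the code close to the formal definition; a practical implementation would more likely index transitions by source state.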

Define the extended transition relation $\delta ^{*}$ as the smallest set such that:

$\delta \subseteq \delta ^{*}$ ;
$(q,\epsilon ,\epsilon ,q)\in \delta ^{*}$ for all $q\in Q$ ; and
whenever $(q,x,y,r)\in \delta ^{*}$ and $(r,a,b,s)\in \delta$ then $(q,xa,yb,s)\in \delta ^{*}$ .

The extended transition relation is essentially the reflexive transitive closure of the transition graph that has been augmented to take edge labels into account. The elements of $\delta ^{*}$ are known as paths. The edge labels of a path are obtained by concatenating the edge labels of its constituent transitions in order.

The behavior of the transducer T is the rational relation [T] defined as follows: $x[T]y$ if and only if there exists $i\in I$ and $f\in F$ such that $(i,x,y,f)\in \delta ^{*}$ . This is to say that T transduces a string $x\in \Sigma ^{*}$ into a string $y\in \Gamma ^{*}$ if there exists a path from an initial state to a final state whose input label is x and whose output label is y.
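Whether $x[T]y$ holds for two given strings can be checked by searching for such a path. The sketch below assumes that symbols are single characters, that ε is the empty Python string, and that δ is given as a set of `(q, a, b, r)` tuples; the function name `relates` is hypothetical. It explores configurations (state, symbols of x consumed, symbols of y produced), of which there are only finitely many, so the search terminates even when the transducer contains ε-cycles.

```python
from collections import deque

EPS = ""  # assumption: the empty string stands for ε


def relates(delta, initial, final, x, y):
    """Return True iff some path from I to F has input label x and output label y."""
    start = {(q, 0, 0) for q in initial}
    seen = set(start)
    queue = deque(start)
    while queue:
        q, i, j = queue.popleft()
        if q in final and i == len(x) and j == len(y):
            return True
        for (p, a, b, r) in delta:
            if p != q:
                continue
            # A non-ε label must match the next unread symbol on its tape.
            if a != EPS and x[i:i + 1] != a:
                continue
            if b != EPS and y[j:j + 1] != b:
                continue
            nxt = (r, i + len(a), j + len(b))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical transducer: rewrites each "a" as "x" and deletes each "b".
delta = {(0, "a", "x", 0), (0, "b", EPS, 0)}
ok = relates(delta, {0}, {0}, "abba", "xx")
```

The number of configurations is at most |Q|·(|x|+1)·(|y|+1), which bounds the running time of the search up to the cost of scanning δ.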

Weighted automata

Finite-state transducers can be weighted, where each transition is labelled with a weight in addition to the input and output labels. A weighted finite-state transducer (WFST) over a set K of weights can be defined similarly to an unweighted one as an 8-tuple $T=(Q, Σ, Γ, I, F, E, λ, ρ)$ , where:

$Q, Σ, Γ, I, F$ are defined as above;
$E\subseteq Q\times (\Sigma \cup \{\epsilon \})\times (\Gamma \cup \{\epsilon \})\times Q\times K$ (where ε is the empty string) is the finite set of transitions;
$\lambda :I\rightarrow K$ maps initial states to weights;
$\rho :F\rightarrow K$ maps final states to weights.

In order to make certain operations on WFSTs well-defined, it is convenient to require the set of weights to form a semiring.^[3] Two typical semirings used in practice are the log semiring and the tropical semiring; nondeterministic automata may be regarded as having weights in the Boolean semiring.^[4]
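To illustrate why a semiring is convenient: weights along a single path are combined with the semiring product ⊗, and the weights of alternative paths for the same pair of strings are combined with the semiring sum ⊕. The sketch below is illustrative only. The names `Semiring`, `path_weight` and `total_weight` are hypothetical, the tropical semiring is taken in its min-plus form, and log-semiring weights are assumed to be negative log probabilities.

```python
import math
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]   # ⊕, combines alternative paths
    times: Callable[[Any, Any], Any]  # ⊗, combines weights along one path
    zero: Any                         # identity for ⊕
    one: Any                          # identity for ⊗


def log_plus(a, b):
    """-log(exp(-a) + exp(-b)), computed stably; inf is the identity."""
    if math.isinf(a):
        return b
    if math.isinf(b):
        return a
    return min(a, b) - math.log1p(math.exp(-abs(a - b)))


TROPICAL = Semiring(min, lambda a, b: a + b, math.inf, 0.0)
LOG = Semiring(log_plus, lambda a, b: a + b, math.inf, 0.0)
BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)


def path_weight(sr, weights):
    # weights: initial weight, transition weights, final weight, in order.
    return reduce(sr.times, weights, sr.one)


def total_weight(sr, paths):
    return reduce(sr.plus, (path_weight(sr, p) for p in paths), sr.zero)


# Two hypothetical accepting paths for the same pair of strings.
paths = [[1.0, 2.0], [0.5, 0.5]]
best = total_weight(TROPICAL, paths)   # cost of the cheapest path
summed = total_weight(LOG, paths)      # -log of the summed path probabilities
accepted = total_weight(BOOLEAN, [[True, True], [True, False]])
```

With Boolean weights, `total_weight` reduces to asking whether at least one accepting path exists, which is the sense in which an unweighted nondeterministic automaton can be regarded as weighted over the Boolean semiring.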

Stochastic FST

Stochastic FSTs (also known as probabilistic FSTs or statistical FSTs) are presumably a form of weighted FST.^{[ citation needed ]}

Operations on finite-state transducers

The following operations defined on finite automata also apply to finite transducers:

Union. Given transducers $T$ and $S$ , there exists a transducer $T\cup S$ such that $x[T\cup S]y$ if and only if $x[T]y$ or $x[S]y$ .
Concatenation. Given transducers $T$ and $S$ , there exists a transducer $T\cdot S$ such that $x[T\cdot S]y$ if and only if there exist $x_{1},x_{2},y_{1},y_{2}$ with $x=x_{1}x_{2},y=y_{1}y_{2},x_{1}[T]y_{1}$ and $x_{2}[S]y_{2}.$
Kleene closure. Given a transducer $T$ , there might exist a transducer $T^{*}$ with the following properties:^[5] (k1) $\epsilon [T^{*}]\epsilon$ ; (k2) if $w[T^{*}]y$ and $x[T]z$ , then $wx[T^{*}]yz$ ; and $x[T^{*}]y$ does not hold unless mandated by (k1) or (k2).

Composition. Given a transducer $T$ on alphabets Σ and Γ and a transducer $S$ on alphabets Γ and Δ, there exists a transducer $T\circ S$ on Σ and Δ such that $x[T\circ S]z$ if and only if there exists a string $y\in \Gamma ^{*}$ such that $x[T]y$ and $y[S]z$ . This operation extends to the weighted case.^[6]

This definition uses the same notation used in mathematics for relation composition. However, the conventional reading for relation composition is the other way around: given two relations $T$ and $S$ , $(x,z)\in T\circ S$ when there exists some $y$ such that $(x,y)\in S$ and $(y,z)\in T$ .

Projection to an automaton. There are two projection functions: $\pi _{1}$ preserves the input tape, and $\pi _{2}$ preserves the output tape. The first projection, $\pi _{1}$ , is defined as follows: given a transducer $T$ , there exists a finite automaton $\pi _{1}T$ such that $\pi _{1}T$ accepts $x$ if and only if there exists a string $y$ for which $x[T]y$ . The second projection, $\pi _{2}$ , is defined similarly.

Determinization. Given a transducer $T$ , we want to build an equivalent transducer that has a unique initial state and such that no two transitions leaving any state share the same input label. The powerset construction can be extended to transducers, or even weighted transducers, but sometimes fails to halt; indeed, some non-deterministic transducers do not admit equivalent deterministic transducers.^[7] Characterizations of determinizable transducers have been proposed^[8] along with efficient algorithms to test them:^[9] they rely on the semiring used in the weighted case as well as a general property on the structure of the transducer (the twins property).
Weight pushing for the weighted case.^[10]
Minimization for the weighted case.^[11]
Removal of epsilon-transitions.
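Of the operations listed above, composition is the one most often used to build cascades. The following is a minimal sketch of one textbook-style product construction for the unweighted case, under stated assumptions: a transducer is represented as a hypothetical triple `(initial, final, delta)`, ε is the empty Python string, and the function name `compose` is illustrative. States of the result are pairs (p, q); the two machines advance together on a shared intermediate symbol, and each may advance alone when it does not touch the intermediate tape. The construction may create redundant paths, which is harmless here but would need an additional filter in the weighted case.

```python
EPS = ""  # assumption: the empty string stands for ε


def compose(t, s):
    """Build a transducer for T∘S: x relates to z iff x[T]y and y[S]z for some y."""
    t_init, t_final, t_delta = t
    s_init, s_final, s_delta = s
    start = {(p, q) for p in t_init for q in s_init}
    delta, seen, stack = set(), set(start), list(start)
    while stack:
        p, q = stack.pop()
        moves = []
        for (p0, a, b, p1) in t_delta:
            if p0 != p:
                continue
            if b == EPS:
                # T writes nothing on the intermediate tape: S stays put.
                moves.append((a, EPS, (p1, q)))
            else:
                # T writes b, so S must read the same b in the same step.
                for (q0, b2, c, q1) in s_delta:
                    if q0 == q and b2 == b:
                        moves.append((a, c, (p1, q1)))
        for (q0, b2, c, q1) in s_delta:
            if q0 == q and b2 == EPS:
                # S reads nothing from the intermediate tape: T stays put.
                moves.append((EPS, c, (p, q1)))
        for a, c, nxt in moves:
            delta.add(((p, q), a, c, nxt))
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    final = {(p, q) for (p, q) in seen if p in t_final and q in s_final}
    return start, final, delta


# Hypothetical example: T rewrites "a" as "x", S rewrites "x" as "1".
T1 = ({0}, {0}, {(0, "a", "x", 0)})
S1 = ({0}, {0}, {(0, "x", "1", 0)})
composed = compose(T1, S1)
```

Only pairs reachable from an initial pair are constructed, so the result can be much smaller than the full product Q_T × Q_S.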

Additional properties of finite-state transducers

It is decidable whether the relation [T] of a transducer T is empty.
It is decidable whether there exists a string y such that x[T]y for a given string x.
It is undecidable whether two transducers are equivalent.^[12] Equivalence is however decidable in the special case where the relation [T] of a transducer T is a (partial) function.
If one defines the alphabet of labels $L=(\Sigma \cup \{\epsilon \})\times (\Gamma \cup \{\epsilon \})$ , finite-state transducers are isomorphic to nondeterministic finite automata (NDFA) over the alphabet $L$ , and may therefore be determinized (turned into deterministic finite automata over the alphabet $L=[(\Sigma \cup \{\epsilon \})\times \Gamma ]\cup [\Sigma \times (\Gamma \cup \{\epsilon \})]$ ) and subsequently minimized so that they have the minimum number of states.^{[ citation needed ]}
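The first of these properties follows from plain graph reachability: the relation [T] is empty exactly when no final state can be reached from an initial state in the transition graph, regardless of edge labels. A minimal sketch, assuming δ is given as a set of `(q, a, b, r)` tuples and using the hypothetical function name `is_empty`:

```python
def is_empty(initial, final, delta):
    """Return True iff no path leads from an initial state to a final state."""
    # Successor map of the transition graph; edge labels are irrelevant here.
    succ = {}
    for (q, _a, _b, r) in delta:
        succ.setdefault(q, set()).add(r)
    seen, stack = set(initial), list(initial)
    while stack:
        q = stack.pop()
        if q in final:
            return False
        for r in succ.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return True


# Hypothetical examples.
nonempty = is_empty({0}, {1}, {(0, "a", "x", 1)})     # state 1 is reachable
unreachable = is_empty({0}, {2}, {(0, "a", "x", 1)})  # state 2 is not
```

A state that is both initial and final makes the relation non-empty, since the empty path relates ε to ε.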

Applications

FSTs are used in the lexical analysis phase of compilers to associate semantic value with the discovered tokens.^[13]

Context-sensitive rewriting rules of the form a → b / c _ d, used in linguistics to model phonological rules and sound change, are computationally equivalent to finite-state transducers, provided that application is nonrecursive, i.e. the rule is not allowed to rewrite the same substring twice.^[14]

Weighted FSTs have found applications in natural language processing, including machine translation, and in machine learning.^[15]^[16] An implementation for part-of-speech tagging can be found as one component of the OpenGrm^[17] library.

Notes

↑ Jurafsky, Daniel (2009). Speech and Language Processing. Pearson. ISBN 9789332518414.
↑ Koskenniemi 1983
↑ Berstel, Jean; Reutenauer, Christophe (2011). Noncommutative rational series with applications. Encyclopedia of Mathematics and Its Applications. Vol. 137. Cambridge: Cambridge University Press. p. 16. ISBN 978-0-521-19022-0. Zbl 1250.68007.
↑ Lothaire, M. (2005). Applied combinatorics on words. Encyclopedia of Mathematics and Its Applications. Vol. 105. A collective work by Jean Berstel, Dominique Perrin, Maxime Crochemore, Eric Laporte, Mehryar Mohri, Nadia Pisanti, Marie-France Sagot, Gesine Reinert, Sophie Schbath, Michael Waterman, Philippe Jacquet, Wojciech Szpankowski, Dominique Poulalhon, Gilles Schaeffer, Roman Kolpakov, Gregory Koucherov, Jean-Paul Allouche and Valérie Berthé. Cambridge: Cambridge University Press. p. 211. ISBN 0-521-84802-4. Zbl 1133.68067.
↑ Boigelot, Bernard; Legay, Axel; Wolper, Pierre (2003). "Iterating Transducers in the Large". Computer Aided Verification. Lecture Notes in Computer Science. Vol. 2725. Springer Berlin Heidelberg. pp. 223–235. doi:10.1007/978-3-540-45069-6_24. eISSN 1611-3349. ISBN 978-3-540-40524-5. ISSN 0302-9743.
↑ Mohri 2004 , pp. 3–5
↑ "Determinization of Transducers".
↑ Mohri 2004 , pp. 5–6
↑ Allauzen & Mohri 2003
↑ Mohri 2004 , pp. 7–9
↑ Mohri 2004 , pp. 9–11
↑ Griffiths 1968
↑ Charles N. Fischer; Ron K. Cytron; Richard J. LeBlanc, Jr. (2010). "Scanning - Theory and Practice". Crafting a Compiler. Addison-Wesley. ISBN 978-0-13-606705-4.
↑ "Regular Models of Phonological Rule Systems" (PDF). Archived from the original (PDF) on October 11, 2010. Retrieved August 25, 2012.
↑ Kevin Knight; Jonathan May (2009). "Applications of Weighted Automata in Natural Language Processing". In Manfred Droste; Werner Kuich; Heiko Vogler (eds.). Handbook of Weighted Automata. Springer Science & Business Media. ISBN 978-3-642-01492-5.
↑ "Learning with Weighted Transducers" (PDF). Retrieved April 29, 2017.
↑ OpenGrm

References

Allauzen, Cyril; Mohri, Mehryar (2003). "Efficient Algorithms for Testing the Twins Property" (PDF). Journal of Automata, Languages and Combinatorics. 8 (2): 117–144.
Koskenniemi, Kimmo (1983), Two-level morphology: A general computational model of word-form recognition and production (PDF), Department of General Linguistics, University of Helsinki, archived from the original (PDF) on 2018-12-21, retrieved 2010-01-10
Mohri, Mehryar (2004). "Weighted Finite-State Transducer Algorithms. An Overview" (PDF). Formal Languages and Applications. Studies in Fuzziness and Soft Computing. Vol. 148. pp. 551–564. doi:10.1007/978-3-540-39886-8_29. ISBN 978-3-642-53554-3.
Griffiths, T. V. (1968). "The unsolvability of the Equivalence Problem for Λ-Free nondeterministic generalized machines". Journal of the ACM. 15 (3). ACM: 409–413. doi:10.1145/321466.321473.

External links

OpenFst, an open-source library for FST operations.
Finite State Morphology--The Book (archived 2022-03-25 at the Wayback Machine), XFST/LEXC, a description of Xerox's implementation of finite-state transducers intended for linguistic applications.
The Helsinki open-source implementation and extension of the Xerox FST.
FOMA, an open-source implementation of most of the capabilities of the Xerox XFST/LEXC implementation.
Stuttgart Finite State Transducer Tools, another open-source FST toolkit.
Java FST Framework, an open-source Java FST framework capable of handling the OpenFst text format.
Vcsn (archived 2020-06-23 at the Wayback Machine), an open-source platform (C++ & IPython) for weighted automata and rational expressions.