Diagrammatic Compositional Theory of FSTs

Updated 1 January 2026

Diagrammatic Compositional Theory of FSTs is an algebraic framework that unifies automata constructions with categorical semantics using string-diagram syntax.
It employs foundational generators like tuplefy, determinization-run maps, and feedback operators to synthesize and decompose finite-state transducers.
The theory integrates equational axioms and Krohn–Rhodes decompositions to offer diagrammatic proofs of functional and language equivalence.

The diagrammatic compositional theory of finite-state transducers (FSTs) provides an algebraic and categorical framework for the synthesis, decomposition, and equivalence reasoning of FSTs. This theory unifies classical automata-theoretic constructions—such as determinization, run harvesting, and Krohn–Rhodes decompositions—with string diagram syntax and equational axioms. Finite-state transducers are interpreted both as morphisms in strict symmetric monoidal categories and as uniform relations on words, allowing for compositional constructions, factorization results, and diagrammatic proofs of functional and language equivalence. The framework supports not only regular functions but also the manipulation and comparison of general uniform relations represented by FSTs.

1. Generators and Basic Construction Primitives

Regular functions $f:\Sigma_0^*\times\cdots\times\Sigma_{n-1}^* \to \Sigma_n^*$ are generated by multi-ary composition from a small collection of foundational primitives (Kern, 2016). Core generators include:

Tuple-fy: Synchronizes multiple input streams of possibly different lengths via $tuplefy$, padding the shorter streams with a special symbol $\#$. This enables single-automaton processing of $n$-ary inputs.
Character-wise Maps: For $g:A_0\times\cdots\times A_{k-1}\to B$, the lift $_g$ applies $g$ position-wise to padded input tuples.
Moore-run Maps: For any regular function $f$, one constructs a nondeterministic finite automaton whose runs, read as a Moore machine, lift each $w\in \Sigma^*$ faithfully to an accepting path; the output in $\Gamma$ is then projected from that path.
Determinization-run Map ($\delta^*$): Captures the sequence of subsets of states (reachable sets, elements of $P(Q)$) traversed during input consumption, corresponding to the powerset construction.
Harvester-run Map ($harvest$): Reconstructs an accepting run from a sequence of reachable sets and input letters by backtracking, capturing the "reverse Moore" operation.
Unpadding ($unpad$): Removes trailing padding, thus allowing the output to be shorter than the input and of variable length.
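The three simplest generators in this list can be sketched in a few lines of Python. This is an illustrative reading of the descriptions above, not code from the cited work: the function names `charwise`, `keep_first`, and `first_projection` are hypothetical, and words are modelled as Python strings or lists.

```python
from itertools import zip_longest

PAD = "#"  # padding symbol from the text


def tuplefy(*words):
    """Zip words of different lengths into one word of tuples, padding with PAD."""
    return list(zip_longest(*words, fillvalue=PAD))


def charwise(g):
    """Lift g so that it acts position-wise on a word of tuples."""
    return lambda tuples: [g(*t) for t in tuples]


def unpad(word):
    """Remove trailing PAD symbols, allowing the output to be shorter."""
    end = len(word)
    while end > 0 and word[end - 1] == PAD:
        end -= 1
    return word[:end]


def keep_first(a, b):
    """Hypothetical letter function g: project onto the first component."""
    return a


def first_projection(u, v):
    """Composite: unpad after charwise(keep_first) after tuplefy."""
    return "".join(unpad(charwise(keep_first)(tuplefy(u, v))))
```

Here `first_projection("ab", "xyz")` pads the first word to length three, projects each tuple onto its first letter, and then strips the padding again, which shows why $unpad$ is needed to recover outputs shorter than the padded input.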

Every length-preserving regular function factors as the composite

$f = {_{(\varepsilon)}} \circ harvest \circ tuplefy(Id_{\Sigma^*}, \delta^*)$

and all regular functions (not just length-preserving ones) are generated by composing these with $unpad$, padding, and multi-ary $tuplefy$ operations (Kern, 2016).
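The shape of this factorization can be illustrated with a small Python sketch, under several assumptions that are not taken from the source: the automaton is given by a hypothetical successor function `delta`, the output letter is read off each state of the run after the initial one, and the example function (replace every letter by the last letter of the word) is chosen only because it needs lookahead. A forward pass computes the reachable sets, a backward pass harvests one accepting run, and a character-wise projection produces the output.

```python
LETTERS = "ab"


def delta(q, x):
    """Hypothetical NFA: guess the last letter up front, remember the letter just read."""
    if q == "s":
        return {(c, x) for c in LETTERS}
    guess, _ = q
    return {(guess, x)}


INITIAL = {"s"}
# Accept when the guess equals the letter read last (or on the empty word).
FINAL = {"s"} | {(c, c) for c in LETTERS}


def delta_star(word):
    """Determinization run: the sequence of reachable state sets."""
    current = frozenset(INITIAL)
    subsets = [current]
    for x in word:
        current = frozenset(q2 for q in current for q2 in delta(q, x))
        subsets.append(current)
    return subsets


def harvest(word, subsets):
    """Backtrack through the reachable sets to recover one accepting run."""
    candidates = sorted(subsets[-1] & FINAL, key=repr)
    if not candidates:
        return None  # word is not accepted
    run = [candidates[0]]
    for i in range(len(word) - 1, -1, -1):
        preds = sorted(
            (q for q in subsets[i] if run[-1] in delta(q, word[i])), key=repr
        )
        run.append(preds[0])  # non-empty, since run[-1] is reachable
    run.reverse()
    return run


def last_letter_everywhere(word):
    """Character-wise output projection applied to the harvested run."""
    run = harvest(word, delta_star(word))
    return "".join(q[0] for q in run[1:])
```

The forward pass never commits to a guess; only the backward pass, which starts from a final state, selects the run whose guess was correct. That division of labour is the point of pairing the determinization run with the harvester.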

2. Diagrammatic Syntax and Monoidal Categories

FSTs are embedded within strict symmetric monoidal categories—specifically, the syntactic category $\mathrm{Trans}$ and the category $\mathcal{R}eg$ of regular functions (Carette et al., 10 Feb 2025). In these frameworks:

Morphisms represent FSTs or regular functions.
Objects are finite alphabets, interpreted via free monoids on the alphabet.
Generators include double-line boxes for arbitrary finite relations, triangle nodes for initial and final state subsets, and a feedback-loop operator ($\mathsf{fb}_Q$).
String diagrams encode morphism composition (vertical stacking), tensor product (horizontal juxtaposition), and state feedback (looping wires).
Functorial semantics map diagrams to uniform word relations by lifting letterwise, mapping automaton states, and applying existential quantification over feedback/state wires via a finite shift $\sigma_Q$.

The framework subsumes the usual automaton-theoretic constructions by encoding composition, synchronization (tuplefy), and character-wise operations as explicit morphisms, and feedback as existentially quantified looping over state wires.
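One way to read "feedback as existential quantification" is sketched below in Python. This is a simplified model and not the paper's definition of $\sigma_Q$: a single finite relation box is assumed to be a set of hypothetical quadruples (state, input letter, output letter, next state), and two words of equal length are related when some state sequence threads through the box at every position.

```python
def related(step, initial, final, u, v):
    """Is (u, v) in the word relation of the box `step` with its state wire fed back?

    The state sequence is existentially quantified: we track every state that
    some consistent sequence could currently be in.
    """
    if len(u) != len(v):
        return False  # letter-to-letter boxes relate words of equal length only
    states = set(initial)
    for x, y in zip(u, v):
        states = {q2 for (q, a, b, q2) in step if q in states and a == x and b == y}
    return bool(states & set(final))


# Hypothetical example box: a one-step delay over {0, 1}.
# The state stores the previous input letter and emits it as output.
DELAY = {(q, x, q, x) for q in "01" for x in "01"}
```

With initial state `"0"` and every state final, `related(DELAY, {"0"}, {"0", "1"}, "110", "011")` holds because the state sequence 0, 1, 1, 0 witnesses it, while the pair `("110", "110")` has no witnessing sequence.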

3. Equational Axioms and Rewriting Rules

The compositional theory employs a layered system of equational axioms (Carette et al., 10 Feb 2025):

Strict Symmetric Monoidal Laws: Governing associativity, unit, and symmetry for sequential/parallel composition and wire crossing.
Compact Closed Structure: Cups and caps realize duality for state wires, enforcing yanking and sliding identities.
Feedback/Trace Axioms: Laws for $\mathsf{fb}_Q(D)$ formalize the passing of state information back into the input, encoding existential quantification ("tracing out" state wires).
Faithful Embedding of FinRel: Any composite of finite relation boxes can be collapsed using relational composition, compatible with the monoidal structure.

A single additional axiom, the simulation principle, encodes the classical forward and backward simulation relations between states. Together with the axioms above, it makes diagrammatic rewriting sufficient to prove equivalence of FSTs, and it subsumes the trace-equivalence, language-equivalence, and minimization arguments of classical automata theory.

4. Krohn–Rhodes Factorization and Cascade/Wreath Decomposition

The compositional theory incorporates and clarifies the Krohn–Rhodes theorem using both inductive and algebraic approaches (Kern, 2016):

Cascade-Product (Inductive Proof): Any transparent permutation-reset Moore machine $M$ can be factored into a cascade of simpler machines $M_0, \ldots, M_{n-1}$, each acting on a tuplefy-synchronized stream. The composite output is then computed by a final character-wise map.
Algebraic (Wreath-Product Proof): The transition monoid $T$ of any such machine divides a wreath product of finite simple groups and reset (flip-flop) monoids acting on two-element state sets, corresponding exactly to cascade products of machines.
Categorical Embedding: Each factor in the Krohn–Rhodes decomposition becomes a generator box in the string diagrammatic language, allowing compositional diagram construction of any regular function or transducer morphism.
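The shape of a cascade product can be illustrated with two tiny machines in Python. This sketch does not implement the decomposition itself; it only shows how a second machine reads the input together with the state of the first. The machines are hypothetical examples: `parity` is a permutation machine (a two-state counter) and `latch` is a reset machine whose every input acts either as the identity or as a constant map.

```python
def parity(p, x):
    """Permutation component: flip the state on 'a', keep it on 'b'."""
    return p ^ 1 if x == "a" else p


def latch(q, seen):
    """Reset component: on 'b' copy the upstream state, otherwise keep the state."""
    p, x = seen
    return p if x == "b" else q


def cascade(word, start=(0, 0)):
    """Run both machines; the second reads (state of the first, input letter)."""
    q0, q1 = start
    trace = []
    for x in word:
        # The right-hand side uses the old q0, so latch sees the state before the step.
        q0, q1 = parity(q0, x), latch(q1, (q0, x))
        trace.append((q0, q1))
    return trace


def output(word):
    """Final character-wise map applied to the pair of component states."""
    return [q0 ^ q1 for (q0, q1) in cascade(word)]
```

Information flows in one direction only: `parity` never sees the state of `latch`. That one-way dependency is what distinguishes a cascade from an arbitrary product of machines.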

5. Interpretation Functor and Completeness

The interpretation functor $\llbracket-\rrbracket$ establishes rigorous semantics by mapping string diagrams to relations on words. For each generator, the assignment is as follows (Carette et al., 10 Feb 2025):

Finite Relation Boxes ($R$): Letterwise lifted to $R^* : X^* \to Y^*$.
Initial/Final State Nodes ($I$, $F$): Indicators for allowed initial/final state words.
Feedback ($\mathsf{fb}_Q$): Interpreted as the existential trace (elimination) of state wires via shifting and feedback relations.

Completeness: If two diagrams denote the same uniform relation on words, they can be rewritten to each other diagrammatically by the axioms above and the simulation principle. Any diagram, via minimization and the simulation principle, reduces to a unique normal form coinciding with the minimal DFA/transducer (Carette et al., 10 Feb 2025).

6. Applications, Worked Example, and Extensions

Diagrammatic compositional reasoning enables direct proofs of FST equivalence and minimization. For example, consider the merging of a two-state chain and a one-state loop on a unary alphabet (Carette et al., 10 Feb 2025):

Both transducers accept $a^+\cup\{\epsilon\} = a^*$.
The simulation principle establishes diagrammatic equivalence via an explicit surjective relation between state sets and checks of initial, final, and transition compatibilities.
All classical structure (determinization, minimization, forward/backward simulation) is subsumed in the diagrammatic calculus.
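The checks named in this example can be spelled out in Python for one plausible reading of the two machines. The automata below are assumptions, since the source does not give their transition tables: the chain has states 0 and 1 with both final, and the loop has a single state `"p"`. The function `is_simulation` is the classical state-based check, not the diagrammatic axiom itself.

```python
from collections import namedtuple

# delta maps (state, letter) to the set of successor states.
NFA = namedtuple("NFA", "initial final delta")


def is_simulation(a, b, rel, alphabet):
    """Check that rel, a set of (state of a, state of b) pairs, lets b simulate a."""
    # Initial compatibility: every initial state of a is related to one of b.
    if not all(any((p, q) in rel for q in b.initial) for p in a.initial):
        return False
    for p, q in rel:
        # Final compatibility.
        if p in a.final and q not in b.final:
            return False
        # Transition compatibility.
        for x in alphabet:
            for p2 in a.delta.get((p, x), set()):
                if not any((p2, q2) in rel for q2 in b.delta.get((q, x), set())):
                    return False
    return True


CHAIN = NFA({0}, {0, 1}, {(0, "a"): {1}, (1, "a"): {1}})
LOOP = NFA({"p"}, {"p"}, {("p", "a"): {"p"}})

REL = {(0, "p"), (1, "p")}           # surjective onto the loop's single state
INVERSE = {(q, p) for (p, q) in REL}

EQUIVALENT = is_simulation(CHAIN, LOOP, REL, "a") and is_simulation(
    LOOP, CHAIN, INVERSE, "a"
)
```

Because the relation and its inverse both pass, the two machines accept the same language. Making state 0 non-final in the chain breaks the second check, which matches the fact that the modified chain would accept only $a^+$.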

The method generalizes readily to the synthesis and decomposition of multi-ary regular functions, tree automata, and bi-infinite word systems. Any extension (e.g., to $\omega$-automata or tree automata) corresponds to refining or augmenting the collection of diagrammatic generators or introducing modified trace/feedback operations (Kern, 2016).

In summary, the diagrammatic compositional theory of FSTs unifies automata-theoretic constructions, categorical semantics, and algebraic decompositions using a minimal set of syntactic generators, rewriting rules, and completeness theorems for functional and relational equivalence. All aspects of regular function synthesis, factorization, and comparison can be conducted in the string-diagrammatic language, grounded in strict symmetric monoidal categories and realized via finite-state machinery (Kern, 2016, Carette et al., 10 Feb 2025).