Universal FSA Emulation by Neural Finite-State Machines
- Universal FSA Emulation is a method where finite-depth feedforward ReLU and threshold networks simulate any deterministic finite automaton by encoding state transitions for bounded-length inputs.
- It employs explicit layerwise constructions, including one-hot and binary encodings with two-layer transition modules, to implement regular language recognition.
- The approach demonstrates exponential state compression, latent embeddings of Myhill–Nerode equivalence classes, and a formal expressivity boundary for fixed-depth networks.
Universal finite-state automaton (FSA) emulation refers to the capacity of certain neural network architectures—specifically, finite-depth feedforward ReLU and threshold networks—to exactly simulate any deterministic finite automaton (DFA) on bounded-length inputs, thus acting as "neural finite-state machines" (N-FSMs). This is achieved through explicit layerwise constructions that encode DFA state transitions in the network’s depth, enabling precise realization of regular languages and delineating a formal expressivity boundary for such networks. The central results formalize layer and parameter requirements, provide state compression strategies, establish embeddings of Myhill–Nerode equivalence classes into continuous latent spaces, and rigorously show that fixed-depth networks cannot recognize non-regular languages (Dhayalkar, 16 May 2025).
1. Definition and Theoretical Framework
A deterministic finite automaton is defined as a 5-tuple $A = (Q, \Sigma, \delta, q_0, F)$, where $Q$ is a finite set of states, $\Sigma$ a finite input alphabet of size $|\Sigma|$, $\delta: Q \times \Sigma \to Q$ the deterministic transition function, $q_0 \in Q$ the initial state, and $F \subseteq Q$ the set of accepting states. On input $w = w_1 w_2 \cdots w_T \in \Sigma^T$, the DFA recursively applies $q_t = \delta(q_{t-1}, w_t)$ for $t = 1, \dots, T$ and accepts if $q_T \in F$.
An N-FSM corresponding to $A$ is a feedforward network $f$ satisfying $f(w) = 1$ if $w \in L(A)$ and $0$ otherwise. At each network layer, the hidden representation encodes the current DFA state $q_t$, and the final layer tests for membership in $F$ (Dhayalkar, 16 May 2025).
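For concreteness, the following minimal Python sketch (an illustrative assumption on representation, not code from the paper) stores the 5-tuple directly and runs it on a bounded-length input; the constructions below compile exactly this object into network weights.

```python
from dataclasses import dataclass

@dataclass
class DFA:
    """A deterministic finite automaton A = (Q, Sigma, delta, q0, F)."""
    states: list       # Q
    alphabet: list     # Sigma
    delta: dict        # transition table: (state, symbol) -> state
    q0: object         # initial state
    accepting: set     # F, a subset of Q

    def run(self, w):
        """Apply q_t = delta(q_{t-1}, w_t) for t = 1..T and accept iff q_T is in F."""
        q = self.q0
        for symbol in w:
            q = self.delta[(q, symbol)]
        return q in self.accepting

# Example: the regular language "even number of a's" over {a, b}.
even_as = DFA(states=[0, 1], alphabet=["a", "b"],
              delta={(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
              q0=0, accepting={0})
assert even_as.run("abab") is True   # two a's  -> accept
assert even_as.run("ab") is False    # one a    -> reject
```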
2. Explicit Construction of Feedforward Emulators
Given a DFA $A = (Q, \Sigma, \delta, q_0, F)$, the emulation proceeds through explicit neural architectures:
2.1. One-Hot Encodings
- Symbols: Each $\sigma \in \Sigma$ is represented by a one-hot vector $e_\sigma \in \{0,1\}^{|\Sigma|}$.
- States: Each $q \in Q$ is mapped to a one-hot vector $e_q \in \{0,1\}^{|Q|}$.
- At step $t$, the hidden state $h_{t-1} = e_{q_{t-1}}$ is concatenated with $e_{w_t}$ to form the input to the transition module.
2.2. Two-Layer Transition Modules
- Hidden Layer: For each pair $(q, \sigma) \in Q \times \Sigma$, an “AND-unit” ensures activation if and only if the network is in state $q$ and receives symbol $\sigma$.
- Output Layer: The next one-hot state $e_{q_t}$ is computed as a sum over the activated AND-units, with weight $1$ from unit $(q, \sigma)$ to state $q'$ iff $\delta(q, \sigma) = q'$, and $0$ otherwise.
2.3. Readout Layer
Once all $T$ symbols are processed, the network produces $h_T = e_{q_T}$. The indicator vector $v \in \{0,1\}^{|Q|}$ (with $v_q = 1$ iff $q \in F$) determines acceptance via the linear-threshold readout $\mathbb{1}[v^\top h_T \geq 1]$.
2.4. Depth and Width Bounds
| Construction | Depth | Width |
|---|---|---|
| One-hot + ReLU | $2T + 1$ | $O(\lvert Q\rvert \cdot \lvert\Sigma\rvert)$ |
| Binary + threshold | $2T + 1$ | $O(\lvert Q\rvert \cdot \lvert\Sigma\rvert)$ |
The construction ensures exact simulation for all inputs $w$ of length $T$: each of the $T$ symbols is processed by two layers (the AND layer and the summation layer), and a single readout layer follows, giving depth $2T+1$. For inputs restricted to at most $T$ symbols, $f(w) = \mathbb{1}[w \in L(A)]$ for every such $w$ (see the sketch below).
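The following numpy sketch is a minimal illustration of the one-hot construction of Sections 2.1–2.3, not the paper's reference implementation; the names `emulate_dfa`, `W_and`, and `W_sum` are assumptions. Per symbol it applies the ReLU AND layer over all $|Q| \cdot |\Sigma|$ state-symbol pairs and the summation layer producing the next one-hot state, then applies the threshold readout over $F$.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def emulate_dfa(states, alphabet, delta, q0, accepting, w):
    """Unrolled feedforward one-hot emulation of a DFA on the input string w.

    Per symbol: a ReLU 'AND' layer with |Q|*|Sigma| units (unit (q, sigma) is
    active iff the network is in state q and reads sigma) and a summation layer
    producing the next one-hot state; a final threshold readout tests F."""
    nQ, nS = len(states), len(alphabet)
    q_idx = {q: i for i, q in enumerate(states)}
    s_idx = {s: i for i, s in enumerate(alphabet)}

    # AND layer: unit (q, sigma) computes relu(h[q] + x[sigma] - 1).
    W_and = np.zeros((nQ * nS, nQ + nS))
    b_and = -np.ones(nQ * nS)
    # Summation layer: route unit (q, sigma) to state delta(q, sigma).
    W_sum = np.zeros((nQ, nQ * nS))
    for i, q in enumerate(states):
        for j, s in enumerate(alphabet):
            u = i * nS + j
            W_and[u, i] = 1.0         # reads the current state bit
            W_and[u, nQ + j] = 1.0    # reads the current symbol bit
            W_sum[q_idx[delta[(q, s)]], u] = 1.0

    h = np.eye(nQ)[q_idx[q0]]         # one-hot initial state e_{q0}
    for symbol in w:                  # two layers per symbol (depth 2T + 1 readout)
        x = np.concatenate([h, np.eye(nS)[s_idx[symbol]]])
        z = relu(W_and @ x + b_and)   # AND layer
        h = relu(W_sum @ z)           # next one-hot state
    v = np.array([1.0 if q in accepting else 0.0 for q in states])
    return int(v @ h >= 0.5)          # threshold readout over F

# Cross-check against direct simulation on the "even number of a's" DFA.
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
for w in ["", "a", "ab", "abab", "baab"]:
    direct = 1 if sum(c == "a" for c in w) % 2 == 0 else 0
    assert emulate_dfa([0, 1], ["a", "b"], delta, 0, {0}, w) == direct
```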
3. Exponential State Compression
State encodings can be exponentially compressed by representing DFA states as $\lceil \log_2 |Q| \rceil$-bit binary codes:
- Binary State Encoding: Map each $q \in Q$ to a code $\mathrm{bin}(q) \in \{0,1\}^{b}$ with $b = \lceil \log_2 |Q| \rceil$.
- Transition Realization: Each output bit $i$ is given by a Boolean function $f_i$ of the current state code and input symbol, specifying the $i$-th bit of $\mathrm{bin}(\delta(q, \sigma))$.
- Threshold Circuits: Classical results guarantee a depth-2 threshold circuit for any finite Boolean function. Each bit $f_i$ is realized via a two-layer block of threshold gates, with at most one gate per state-symbol pair, i.e., $O(|Q| \cdot |\Sigma|)$ gates.
This approach achieves hidden-state width $O(\log |Q|)$. Depth and overall layer width are preserved at $2T+1$ and $O(|Q| \cdot |\Sigma|)$, respectively (Dhayalkar, 16 May 2025).
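A hedged numpy sketch of this compression route follows; the names `build_bit_circuits`, `next_state_bits`, and `to_bits` are illustrative assumptions rather than the paper's API. States are packed into $\lceil \log_2 |Q| \rceil$ bits, and each next-state bit is computed by a depth-2 threshold circuit read directly off its truth table: one pattern-detecting gate per satisfying state-symbol pair, followed by an OR gate.

```python
import numpy as np
from math import ceil, log2

def step(x):
    # Threshold (Heaviside) gate: outputs 1.0 iff the preactivation is >= 0.
    return 1.0 if x >= 0 else 0.0

def to_bits(k, width):
    # Little-endian binary code of the integer k on `width` bits.
    return np.array([(k >> i) & 1 for i in range(width)], dtype=float)

def build_bit_circuits(states, alphabet, delta):
    """One depth-2 threshold circuit per next-state bit, read off the truth
    table of the Boolean function for that bit of bin(delta(q, sigma))."""
    b = max(1, ceil(log2(len(states))))     # state code width ~ ceil(log2 |Q|)
    s = max(1, ceil(log2(len(alphabet))))   # symbol code width
    q_idx = {q: i for i, q in enumerate(states)}
    s_idx = {a: i for i, a in enumerate(alphabet)}
    circuits = []
    for bit in range(b):
        patterns = []                       # one first-layer gate per satisfying input
        for q in states:
            for a in alphabet:
                x = np.concatenate([to_bits(q_idx[q], b), to_bits(s_idx[a], s)])
                if int(to_bits(q_idx[delta[(q, a)]], b)[bit]) == 1:
                    patterns.append(x)
        circuits.append(patterns)
    return b, s, q_idx, s_idx, circuits

def next_state_bits(state_bits, symbol_bits, circuits):
    """Two threshold layers per output bit: pattern detectors, then an OR gate."""
    x = np.concatenate([state_bits, symbol_bits])
    out = []
    for patterns in circuits:
        fires = [step(np.dot(2 * p - 1, x) - p.sum()) for p in patterns]  # layer 1
        out.append(step(sum(fires) - 1) if fires else 0.0)                # layer 2 (OR)
    return np.array(out)

# Quick check on the "even number of a's" DFA: the state code needs only 1 bit.
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
b, s, q_idx, s_idx, circuits = build_bit_circuits([0, 1], ["a", "b"], delta)
h = to_bits(q_idx[0], b)                    # binary code of the initial state
for symbol in "abab":
    h = next_state_bits(h, to_bits(s_idx[symbol], s), circuits)
assert int(h[0]) == 0                       # even number of a's -> back in state 0
```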
4. Myhill–Nerode Equivalence and Latent Embeddings
The Myhill–Nerode relation partitions strings into equivalence classes corresponding to DFA states:
- Embedding Theorem: There exists a feedforward network $\Phi$ such that $\Phi(x) = \Phi(y)$ iff $x \equiv_L y$. If $x \not\equiv_L y$, then $\|\Phi(x) - \Phi(y)\| \geq \epsilon$ for some fixed margin $\epsilon > 0$.
- Construction: Run the DFA simulation to obtain $h_T = e_{q_T}$, then project using a linear map that sends each $e_q$ to a distinct vector $z_q \in \mathbb{R}^d$.
- Johnson–Lindenstrauss Compression: The set $\{z_q\}_{q \in Q}$ can be further reduced to dimension $O(\log |Q|)$ while preserving linear separability, thus embedding equivalence classes in a low-dimensional latent space (Dhayalkar, 16 May 2025); a short sketch follows below.
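The embedding and its random-projection compression can be illustrated with the following sketch, assuming a minimal DFA for $L$ (so that reached states coincide with Nerode classes). The names `make_embedding` and `embed`, the dimensions `d` and `k`, and the Gaussian projection are illustrative choices in the spirit of Johnson–Lindenstrauss, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def dfa_state(delta, q0, w):
    # Run the DFA; for a minimal DFA the reached state indexes the Nerode class of w.
    q = q0
    for symbol in w:
        q = delta[(q, symbol)]
    return q

def make_embedding(states, d, k):
    """Distinct latent code z_q in R^d per state, plus a random projection to R^k."""
    Z = rng.normal(size=(len(states), d))     # distinct codes with probability 1
    P = rng.normal(size=(k, d)) / np.sqrt(k)  # Johnson-Lindenstrauss-style projection
    return Z, P

def embed(w, delta, q0, state_index, Z, P):
    # Phi(w): equal for Nerode-equivalent strings, separated otherwise.
    return P @ Z[state_index[dfa_state(delta, q0, w)]]

# "Even number of a's" DFA (minimal): equivalent strings collide, others separate.
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
Z, P = make_embedding([0, 1], d=8, k=3)
idx = {0: 0, 1: 1}
x = embed("abab", delta, 0, idx, Z, P)
y = embed("bb", delta, 0, idx, Z, P)
z = embed("a", delta, 0, idx, Z, P)
assert np.allclose(x, y)         # same Nerode class (even number of a's)
assert not np.allclose(x, z)     # different class, separated with probability 1
```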
5. Expressivity Limitations: Boundary for Regular Languages
Feedforward networks of fixed depth and width possess a finite partitioning capacity:
- Linear Region Bound: For depth $d$ and width $n$, any such network partitions its input space into a finite number of linear regions, bounded by a function of $d$ and $n$ alone.
- Non-Regular Language Limitation: Languages such as $\{a^n b^n : n \geq 0\}$ have infinitely many Myhill–Nerode equivalence classes and thus require infinitely many regions, which exceeds what is possible for fixed $d$ and $n$. Thus, such networks cannot recognize non-regular languages.
- Formal Lower Bound: For every such network there exists an input length $n_0$ such that, for all $n \geq n_0$, correct classification of all strings of length at most $n$ in a non-regular language is impossible. Only regular languages are exactly recognizable (Dhayalkar, 16 May 2025); a small illustration of the finite-partition capacity follows below.
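As a purely illustrative numpy experiment (not drawn from the paper), one can count the distinct ReLU activation patterns a fixed-size network realizes: inputs sharing a pattern lie in the same linear region and are processed by the same affine map, so the pattern count, bounded by the architecture alone, caps how many input classes the network can distinguish. This is exactly the capacity that a language with infinitely many Myhill–Nerode classes exhausts.

```python
import numpy as np

rng = np.random.default_rng(1)

def activation_patterns(W1, b1, W2, b2, X):
    """Collect the ReLU on/off patterns a depth-2 network realizes on inputs X.

    Inputs with identical patterns lie in the same linear region, so the size of
    the returned set bounds how many input classes this fixed network can
    possibly tell apart."""
    patterns = set()
    for x in X:
        a1 = W1 @ x + b1
        a2 = W2 @ np.maximum(a1, 0.0) + b2
        patterns.add((tuple(a1 > 0), tuple(a2 > 0)))
    return patterns

# A fixed architecture: input dimension 4, two hidden layers of width 6.
d_in, width = 4, 6
W1, b1 = rng.normal(size=(width, d_in)), rng.normal(size=width)
W2, b2 = rng.normal(size=(width, width)), rng.normal(size=width)

for n in (100, 1_000, 10_000):
    X = rng.normal(size=(n, d_in))
    count = len(activation_patterns(W1, b1, W2, b2, X))
    print(f"{n:6d} inputs -> {count} activation patterns")
# The count saturates far below the number of inputs: the partition is finite, so
# infinitely many equivalence classes (e.g. those of a^n b^n) cannot all be separated.
```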
6. Synthesis of Results and Architectural Trade-Offs
| Emulation Mode | Depth | Width | State Width |
|---|---|---|---|
| One-hot encoding + ReLU | $2T+1$ | $O(\lvert Q\rvert \cdot \lvert\Sigma\rvert)$ | $\lvert Q\rvert$ |
| Binary encoding + threshold | $2T+1$ | $O(\lvert Q\rvert \cdot \lvert\Sigma\rvert)$ | $\lceil \log_2 \lvert Q\rvert \rceil$ |
| Myhill–Nerode embedding | $2T+1$ | $O(\lvert Q\rvert \cdot \lvert\Sigma\rvert)$ | $O(\log \lvert Q\rvert)$ |
- State Compression: Exponential compression from the $|Q|$-dimensional one-hot encoding to $\lceil \log_2 |Q| \rceil$-bit binary codes without sacrificing expressivity for finite-state computations.
- Latent Embeddings: Faithful, linearly-separable vectorial mapping of equivalence classes, with further dimension reduction via random projection possible.
- Expressivity Boundary: The constructive simulation and the impossibility result together delineate regular languages as exactly the class N-FSMs can recognize with a fixed architecture.
- Bridging Symbolic and Neural Computation: These results rigorously instantiate a blueprint for realizing symbolic algorithms within neural architectures (Dhayalkar, 16 May 2025).
7. Context and Significance
The established equivalence between neural finite-state machines and DFAs provides a mathematically precise characterization of neural network capacity in symbolic sequence processing, automata simulation, and neural-symbolic integration. The constructive nature of the methods contrasts with prior heuristic or probing-based analyses, supplying explicit network weights, architectures, and representations. This formalization enables principled design of neural models for tasks where regular language structure is fundamental, while also identifying exact limits for problems involving unbounded memory or non-regular languages. As such, universal FSA emulation serves as a foundational result, bridging disciplines and informing further research on the correspondence between discrete automata and continuous neural computation (Dhayalkar, 16 May 2025).