Extend F2 to classically tractable subroutines beyond free fermions (e.g., tensor networks)

Determine how to extend the F2 offline reinforcement learning paradigm, which compiles Trotter-based Hamiltonian simulation circuits by exploiting their free-fermionic substructures, to other classically tractable subroutines, including those efficiently simulable with tensor networks, while preserving compilation efficiency and keeping accuracy within prescribed error tolerances.

Background

F2 targets aggressive optimization of Trotter-based Hamiltonian simulation by isolating classically simulatable free-fermionic subcircuits and compiling them in an offline reinforcement learning environment. The approach leverages the polynomial-size classical representations of these subcircuits, together with their time-reversibility, to generate abundant successful trajectories for training, achieving sizable gate-count and depth reductions without sacrificing accuracy.
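The role of polynomial-size representations and time-reversibility can be illustrated with a minimal sketch. It is not the F2 environment. It assumes that a free-fermionic circuit on n modes is tracked as a 2n x 2n orthogonal matrix acting on Majorana operators, and that the action set consists of Givens rotations, which is a simplification chosen here for illustration. The helper names (`apply_givens`, `sample_reversed_trajectory`, `replay`, `distance_to_identity`) are hypothetical. A random forward rollout from the identity produces a target, and undoing the rollout in reverse order gives a trajectory that is guaranteed to reach the identity from that target.

```python
import math
import random


def identity(dim):
    """Return the dim x dim identity as a list of rows."""
    return [[1.0 if r == c else 0.0 for c in range(dim)] for r in range(dim)]


def apply_givens(state, i, j, theta):
    """Left-multiply `state` by a rotation of angle theta in the (i, j) plane.

    Only rows i and j of the matrix change.
    """
    c, s = math.cos(theta), math.sin(theta)
    new_state = [row[:] for row in state]
    new_state[i] = [c * a - s * b for a, b in zip(state[i], state[j])]
    new_state[j] = [s * a + c * b for a, b in zip(state[i], state[j])]
    return new_state


def sample_reversed_trajectory(n_modes, n_steps, rng):
    """Roll out random actions from the identity, then return the final
    matrix (the target) and the action list that undoes the rollout."""
    dim = 2 * n_modes  # assumption: 2n Majorana operators for n modes
    state = identity(dim)
    forward = []
    for _ in range(n_steps):
        i, j = rng.sample(range(dim), 2)  # two distinct row indices
        theta = rng.uniform(-math.pi, math.pi)
        state = apply_givens(state, i, j, theta)
        forward.append((i, j, theta))
    # Time reversal: undo the last action first, each with the opposite angle.
    undo = [(i, j, -theta) for i, j, theta in reversed(forward)]
    return state, undo


def replay(state, actions):
    """Apply a sequence of (i, j, theta) actions to `state`."""
    for i, j, theta in actions:
        state = apply_givens(state, i, j, theta)
    return state


def distance_to_identity(state):
    """Largest entrywise deviation of `state` from the identity matrix."""
    return max(
        abs(value - (1.0 if r == c else 0.0))
        for r, row in enumerate(state)
        for c, value in enumerate(row)
    )


rng = random.Random(0)
target, undo = sample_reversed_trajectory(n_modes=3, n_steps=10, rng=rng)
# The reversed trajectory should return the target to the identity,
# up to floating-point rounding.
print(distance_to_identity(replay(target, undo)))
```

Each state here is a matrix whose size grows polynomially with the number of modes, so trajectories of this kind can be generated classically without simulating the full quantum state. Any extension to another tractable class would need an analogous compact state and an invertible action set, which is the open question posed here.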

Expanding this paradigm to other tractable classes—such as those efficiently handled by tensor networks—could broaden applicability across many-body systems where free-fermionic structure is absent or limited. However, it is unclear what environment states, action sets, and learning biases are needed to support such subroutines while maintaining the same levels of performance and fidelity reported for free-fermionic kernels.

References

While this progress is promising, multiple research questions are still unanswered. These questions are as follows. How could this paradigm be extended to other classically tractable subroutines, such as those efficiently tractable by tensor networks?

F2: Offline Reinforcement Learning for Hamiltonian Simulation via Free-Fermionic Subroutine Compilation (2512.08023 - Decker et al., 8 Dec 2025) in Section 7 (Conclusion)