Incorporate continuous parameters into the F2 action space

Investigate whether incorporating continuous rotation parameters into the action space of the F2 offline reinforcement learning framework for compiling free-fermionic subroutines can achieve equal or improved gate-count and depth reductions relative to the current discretized-angle move set, while maintaining target accuracy.

Background

F2 currently operates on a discrete set of parameterized Pauli-string exponentials whose rotation angles are discretized (e.g., Θ includes ±π/2k). This design stabilizes learning in what would otherwise be a large hybrid discrete-continuous action space and simplifies offline trajectory generation, but it can initially inflate depth until post-pass optimizations merge and commute gates.
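The depth-inflation effect can be illustrated with a small sketch. It is not F2's actual procedure: the angle set (read here as ±π/2^k for k = 1..K), the greedy selection rule, the tolerance, and all function names are assumptions made for illustration. The sketch approximates one target rotation about a fixed Pauli string using only discretized angles, then applies a merge step that relies on the fact that rotations about the same Pauli string add.

```python
import math

def angle_set(K):
    # Hypothetical discretized angle set: +/- pi / 2**k for k = 1..K (an assumption).
    return [s * math.pi / 2**k for k in range(1, K + 1) for s in (1, -1)]

def greedy_decompose(target, angles, tol=1e-2, max_steps=64):
    """Greedily pick discrete angles whose sum approaches `target`."""
    seq, residual = [], target
    for _ in range(max_steps):
        if abs(residual) <= tol:
            break
        # Choose the discrete angle that leaves the smallest remaining error.
        best = min(angles, key=lambda a: abs(residual - a))
        if abs(residual - best) >= abs(residual):
            break  # no discrete step improves the approximation
        seq.append(best)
        residual -= best
    return seq, residual

def merge_same_axis(seq):
    """Post-pass: exp(-i a P) exp(-i b P) = exp(-i (a + b) P), so the steps merge."""
    return [sum(seq)] if seq else []

angles = angle_set(K=10)
seq, residual = greedy_decompose(1.0, angles)
merged = merge_same_axis(seq)
print(f"discrete steps: {len(seq)}, after merge: {len(merged)}, error: {abs(residual):.4f}")
```

In this toy setting the discretized move set needs several steps where a continuous-angle move would need a single rotation by the target angle, and the merged gate carries an angle that lies outside the discrete set. Whether a learned policy can exploit continuous angles directly without losing training stability is the open question.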

The authors explicitly ask whether allowing continuous parameters directly in the move space could match or surpass the performance achieved with discretized angles, potentially mitigating depth inflation and offering finer-grained synthesis, while maintaining the low error tolerances demonstrated across benchmarks.

References

While this progress is promising, multiple research questions are still unanswered. These questions are as follows. Can the same performance be achieved or improved upon through the incorporation of continuous parameters into the move space?

F2: Offline Reinforcement Learning for Hamiltonian Simulation via Free-Fermionic Subroutine Compilation (2512.08023 - Decker et al., 8 Dec 2025) in Section 7 (Conclusion)