Stable Hadamard Memory (SHM)
- Stable Hadamard Memory (SHM) is a memory framework that employs the Hadamard product to update and stabilize both quantum snapshots and reinforcement learning memories.
- It leverages a non-destructive guess-and-check protocol with SWAP tests to iteratively refine and capture quantum states without collapsing the information.
- In reinforcement learning, SHM uses dynamically calibrated matrix updates to maintain long-term memory stability and prevent exploding or vanishing gradients.
Stable Hadamard Memory (SHM) encompasses a set of memory models and protocols unified by the deployment of the Hadamard product as their core mechanism for memory update, calibration, or quantum state reconstruction. Two domains articulate distinct but conceptually related instantiations: SHM as a classically stored, machine-learning-guided quantum snapshot protocol specialized for the Hadamard state (Kundu et al., 20 Apr 2025), and SHM as a dynamically calibrated matrix memory for reinforcement learning with robustness and stability guarantees (Le et al., 2024). Both leverage the structural and computational properties of the Hadamard (element-wise) product to enable non-destructive, scalable information retention, whether of quantum amplitudes or agent histories.
1. SHM for Quantum State Snapshots
In the context of quantum information, SHM denotes a hardware-agnostic protocol for capturing, storing, and reconstructing the single-qubit Hadamard state using a non-destructive “guess-and-check” methodology. The essential procedure departs fundamentally from standard projective measurement, which irreversibly collapses the quantum state, and also from exhaustive quantum state tomography, which is destructive and exponentially scaling.
The SHM protocol proceeds by:
- Iteratively refining a classically parameterized ansatz for the unknown quantum state using SWAP tests to non-destructively estimate fidelity.
- Encoding the final state amplitudes in classical memory after high-fidelity convergence, enabling persistent, non-volatile storage and future re-preparation on quantum hardware.
This removes the need for persistent physical quantum memories dependent on long coherence times, offloading long-term quantum information retention to ordinary classical RAM (Kundu et al., 20 Apr 2025).
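The guess-and-check loop can be sketched classically. The real-valued single-qubit ansatz, learning rate, and finite-difference optimizer below are illustrative stand-ins for the paper's neural and evolutionary generators:

```python
import numpy as np

def swap_test_fidelity(psi, phi):
    """Overlap |<psi|phi>|^2 that an ideal SWAP test estimates."""
    return abs(np.vdot(psi, phi)) ** 2

def guess_and_check(target, steps=200, lr=0.5, seed=0):
    """Refine a real single-qubit ansatz cos(t)|0> + sin(t)|1> by
    finite-difference gradient ascent on the SWAP-test fidelity."""
    rng = np.random.default_rng(seed)
    theta, eps = rng.uniform(0.0, np.pi), 1e-4
    ansatz = lambda t: np.array([np.cos(t), np.sin(t)])
    for _ in range(steps):
        grad = (swap_test_fidelity(target, ansatz(theta + eps))
                - swap_test_fidelity(target, ansatz(theta - eps))) / (2 * eps)
        theta += lr * grad          # ascend the fidelity surface
    return ansatz(theta)            # amplitudes kept in classical memory

hadamard = np.array([1.0, 1.0]) / np.sqrt(2)   # the |+> state
snapshot = guess_and_check(hadamard)
```

Once converged, `snapshot` is ordinary classical data; re-preparation on hardware amounts to re-running the state-preparation circuit the generator parameterizes.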
2. Theoretical Foundations and Machine Learning Formulation
The quantum SHM technique is grounded in the SWAP test as a fidelity estimator. Given an unknown target state $|\psi\rangle$ (e.g., the Hadamard state $|+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$) and a current ansatz $|\phi\rangle$ generated by either a deep neural network or an evolutionary strategy, the SWAP test yields the overlap fidelity
$$F = |\langle \psi | \phi \rangle|^2 = 2P(0) - 1,$$
where $P(0)$ is the measured probability of the ancilla outcome 0. Gradient-based optimization (Adam, with finite differences on the SWAP-test output) or population-based, gradient-free algorithms (QESwap) update the generator parameters exclusively by maximizing $F$.
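In ideal statistics the SWAP-test ancilla satisfies $P(0) = \tfrac{1}{2} + \tfrac{1}{2}|\langle\psi|\phi\rangle|^2$, so the fidelity inverts to $F = 2P(0) - 1$. A minimal numeric check, with an arbitrarily chosen ansatz angle:

```python
import numpy as np

# SWAP test ancilla statistics: P(0) = 1/2 + 1/2 * |<psi|phi>|^2,
# so the overlap fidelity is recovered as F = 2 * P(0) - 1.
psi = np.array([1.0, 1.0]) / np.sqrt(2)        # target: the Hadamard state |+>
phi = np.array([np.cos(0.7), np.sin(0.7)])     # an arbitrary current ansatz
p0 = 0.5 + 0.5 * abs(np.vdot(psi, phi)) ** 2   # ideal ancilla-0 probability
F = 2.0 * p0 - 1.0                             # F ≈ 0.9927
```

On hardware, `p0` would be estimated from repeated ancilla measurements rather than computed exactly, so `F` carries shot noise.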
The reconstructed amplitudes are stored classically as the pair $(\alpha, \beta)$ defining $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$. High-fidelity reconstructions ($F > 0.99$) are demonstrated both in simulation and on superconducting hardware.
The Hadamard product, although not explicit in quantum evolution, underlies the iterative process by ensuring state updates act locally and preserve the non-destructive character of the protocol (Kundu et al., 20 Apr 2025).
3. SHM as a Reinforcement Learning Memory Architecture
A separate development line conceptualizes SHM as a matrix memory model for reinforcement learning agents operating in partially observable, long-horizon environments (Le et al., 2024). Here, the memory is a matrix $M_t$ updated via
$$M_t = M_{t-1} \odot C_t + v_t k_t^\top,$$
where:
- $\odot$ denotes the Hadamard product (element-wise multiplication).
- $C_t$ is a learned, input-dependent calibration matrix controlling dynamic erasure and reinforcement of prior memory elements.
- $v_t k_t^\top$ is a rank-1 update term built from value and key embeddings of the current input.
The SHM update ensures that each element of $M_t$ is adaptively weakened or reinforced based on the current input, enabling persistent memory without the numerical instability endemic to conventional recurrent or MANN models. Boundedness of the calibration entries ($0 \le C_t \le 2$ elementwise, with mean $1$) ensures that cumulative memory products neither vanish nor explode even over long episodes.
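A minimal numpy sketch of this update; the sigmoid calibration and the weight shapes here are simplified stand-ins for the paper's pool-based calibration network:

```python
import numpy as np

d_in, d_k, d_v = 5, 4, 3
rng = np.random.default_rng(0)
Wc = rng.normal(size=(d_v * d_k, d_in)) * 0.1   # calibration net (stand-in)
Wk = rng.normal(size=(d_k, d_in)) * 0.1         # key embedding
Wv = rng.normal(size=(d_v, d_in)) * 0.1         # value embedding

def shm_step(M, x):
    """One SHM update: M_t = M_{t-1} * C_t + outer(v_t, k_t).
    C_t entries lie in (0, 2) with mean near 1, so repeated Hadamard
    calibration neither erases nor blows up the memory on average."""
    C = (2.0 / (1.0 + np.exp(-(Wc @ x)))).reshape(d_v, d_k)
    k, v = Wk @ x, Wv @ x
    return M * C + np.outer(v, k)

M = np.zeros((d_v, d_k))
for _ in range(500):                            # long rollout stays finite
    M = shm_step(M, rng.normal(size=d_in))
```

Each call decays or reinforces every memory cell elementwise, then writes the current observation as a rank-1 outer product.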
4. Numerical Stability and Memory Capacity Analysis
A central property of RL-oriented SHM is the stabilizing effect of random, input-dependent calibration matrices. For each memory cell $(i, j)$, the expected product of calibrations across $t$ steps, $\mathbb{E}\big[\prod_{\tau=1}^{t} C_\tau^{(i,j)}\big]$, remains stable, preserving both forward activations and backward gradients even as $t \to \infty$, provided mild Gaussianity and independence conditions on the inputs.
In contrast, a fixed or purely neural calibration (without random pool sampling) inevitably leads to exploding or vanishing gradients due to the cumulative product structure. The matrix formulation ($O(d^2)$ memory elements) substantially extends capacity relative to vector-based RNN memories ($O(d)$) (Le et al., 2024).
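The stabilizing effect of mean-1 random calibration can be checked numerically; the uniform distribution, horizon, and fixed-gate values below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
t, n_samples = 50, 100_000

# Mean-1 random calibrations: the *expected* cumulative product stays at 1,
# so neither the forward memory trace nor its gradient is systematically
# driven to zero or infinity.
c = rng.uniform(0.5, 1.5, size=(n_samples, t))
expected_product = c.prod(axis=1).mean()        # Monte Carlo estimate, near 1

# A fixed calibration strictly below (or above) 1 decays (or blows up)
# geometrically over the same horizon.
fixed_decay = 0.9 ** t                          # about 5e-3
fixed_growth = 1.1 ** t                         # about 117
```

Individual realizations of the random product still fluctuate; it is the expectation, not every sample path, that the mean-1 property pins down.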
5. Computational Complexity and Implementation
Both quantum and RL SHM emphasize efficient implementation:
- The guess-and-check quantum SHM uses a small, fixed number of SWAP-test circuit calls per optimization step (a SWAP test on $n$-qubit states requires $2n + 1$ qubits); resource overhead is linear in $n$.
- RL SHM with the naive sequential update costs $O(d^2)$ per time step but admits $O(\log T)$ depth for multi-step products and updates via parallel scan.
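The recurrence $M_t = M_{t-1} \odot C_t + U_t$ is an affine scan with an associative combiner, which is what makes a parallel prefix computation possible; a sketch (run sequentially here for clarity, with hypothetical helper names):

```python
import numpy as np

def combine(a, b):
    """Associative combiner for the affine recurrence M -> M * C + U:
    applying step a then step b equals one step with (Ca*Cb, Ua*Cb + Ub)."""
    Ca, Ua = a
    Cb, Ub = b
    return (Ca * Cb, Ua * Cb + Ub)

def scan_memory(Cs, Us):
    """Prefix-fold the (C_t, U_t) pairs; because `combine` is associative,
    the same fold can be evaluated in O(log T) depth by a parallel scan."""
    acc = (np.ones_like(Cs[0]), np.zeros_like(Us[0]))   # identity element
    out = []
    for C, U in zip(Cs, Us):
        acc = combine(acc, (C, U))
        out.append(acc[1])          # U-component of acc is M_t (from M_0 = 0)
    return out
```

The fold reproduces the step-by-step recurrence exactly; swapping the loop for a tree-structured reduction changes depth, not results.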
In RL applications, a typical SHM cell maintains a memory dimension of up to $128$, with a fixed-size calibration pool and shallow MLPs for each embedding network. The approach is agnostic to the choice of RL algorithm (SAC, PPO) and benchmark suite (Le et al., 2024).
Quantum SHM is realized on superconducting hardware supporting mid-circuit CSWAP and state reset; snapshot storage and retrieval require no special-purpose quantum memory.
6. Empirical Results and Limitations
Quantum SHM
| Experiment Type | Fidelity (Hadamard, n=1) | Epochs to Convergence |
|---|---|---|
| Noiseless Sim (NN) | $0.9999$ | 5 |
| Noisy Sim (NN) | $0.9927$ | 27 |
| Noiseless Sim (ES) | $0.9955$ | 5 |
| Noisy Sim (ES) | $0.9963$ | 5 |
| Real IBM Hardware | $0.99$–$1.0$ | 3 |
- Under noise ($T_1 \approx 272\,\mu\text{s}$, $T_2 \approx 188\,\mu\text{s}$, 2% depolarizing error), fidelity remains near $0.99$ over a $100\,\text{ms}$ runtime, with Bloch-sphere drift remaining small across 50 snapshots (Kundu et al., 20 Apr 2025).
RL SHM
- Meta-RL tasks (“Wind”, “Point Robot”): SHM achieves 90–100% success rate, compared to 40–75% for LSTM/NTM baselines.
- Long-horizon credit assignment (“Visual Match”, “Key-to-Door”): SHM solves the 250- and 500-step variants perfectly, versus 25–75% for the best alternative.
- POPGym hardest games: SHM attains the highest average return across 12 environments, outperforming the FFM and GRU baselines.
Ablation studies confirm the necessity of random calibration for stability, and of a controlled memory size for improved performance.
Limitations include unreliable behavior of quantum SHM for mixed states (the SWAP test measures only the overlap $\mathrm{Tr}(\rho\sigma)$, which does not coincide with fidelity for mixed states), and RL SHM’s Gaussian input assumption, which is only approximately met in general RL settings. Quantum SHM faces scalability bottlenecks in gradient estimation for high-dimensional (multi-qubit) Hadamard states; RL SHM’s update is inherently sequential, with parallel scan a potential optimization path (Kundu et al., 20 Apr 2025, Le et al., 2024).
7. Extensions and Practical Considerations
Both quantum and RL instantiations of SHM are hardware-agnostic and compatible with a range of platforms and algorithmic frameworks. Quantum SHM can be adapted, in principle, to other basis states (Pauli, Fourier), given corresponding generator architectures. RL SHM invites extensions such as multi-head or block-structured calibration, parallel prefix computation, and application to meta-RL and continual learning, where selective erasure and reinforcement confer unique advantages.
Prospective applications comprise domains requiring persistent, robust, and high-capacity memory under resource or hardware constraints: quantum debugging, quantum random access memory (QRAM), dynamic quantum circuit design (Kundu et al., 20 Apr 2025); memory-intensive RL tasks with partial observability, long-term sequential dependencies, and evolving task structure (Le et al., 2024).
For comprehensive technical details and experimental data, refer to "Guess, SWAP, Repeat: Capturing Quantum Snapshots in Classical Memory" (Kundu et al., 20 Apr 2025) and "Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning" (Le et al., 2024).