Retentive Neural Quantum States: Efficient Ansätze for Ab Initio Quantum Chemistry (2411.03900v1)

Published 6 Nov 2024 in cs.LG, cs.CE, and quant-ph

Abstract: Neural-network quantum states (NQS) has emerged as a powerful application of quantum-inspired deep learning for variational Monte Carlo methods, offering a competitive alternative to existing techniques for identifying ground states of quantum problems. A significant advancement toward improving the practical scalability of NQS has been the incorporation of autoregressive models, most recently transformers, as variational ansatze. Transformers learn sequence information with greater expressiveness than recurrent models, but at the cost of increased time complexity with respect to sequence length. We explore the use of the retentive network (RetNet), a recurrent alternative to transformers, as an ansatz for solving electronic ground state problems in $\textit{ab initio}$ quantum chemistry. Unlike transformers, RetNets overcome this time complexity bottleneck by processing data in parallel during training, and recurrently during inference. We give a simple computational cost estimate of the RetNet and directly compare it with similar estimates for transformers, establishing a clear threshold ratio of problem-to-model size past which the RetNet's time complexity outperforms that of the transformer. Though this efficiency can come at the expense of decreased expressiveness relative to the transformer, we overcome this gap through training strategies that leverage the autoregressive structure of the model -- namely, variational neural annealing. Our findings support the RetNet as a means of improving the time complexity of NQS without sacrificing accuracy. We provide further evidence that the ablative improvements of neural annealing extend beyond the RetNet architecture, suggesting it would serve as an effective general training strategy for autoregressive NQS.

Summary

  • The paper demonstrates that implementing a RetNet reduces the computational scaling in neural network quantum states for ab initio quantum chemistry.
  • It employs variational neural annealing alongside RetNet to optimize the wavefunction ansatz with linear inference complexity.
  • Experimental results show that RetNet achieves energy calculation accuracies comparable to coupled cluster methods while lowering FLOP counts.

Analyzing Retentive Neural Quantum States for Efficient Quantum Chemistry Applications

The paper "Retentive Neural Quantum States: Efficient Ansatze for Ab Initio Quantum Chemistry" presents a significant paper on the use of deep learning architectures for solving quantum chemical problems. Specifically, the paper explores the retentive network (RetNet) as an ansatz for neural network quantum states (NQS), positioning it as a resource-efficient alternative to transformers for calculating electronic ground states in ab initio quantum chemistry.

Overview of Neural Network Quantum States

Neural network quantum states (NQS) offer a versatile framework for solving quantum many-body problems, particularly through variational Monte Carlo (VMC) methodologies. In essence, NQS leverages a neural network ansatz to approximate ground states of quantum systems without the need to explicitly store the exponentially large state vectors or Hamiltonian matrices. This quality has made NQS applicable to a variety of quantum problems, including electronic structure calculations in quantum chemistry.
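
To make the VMC setting concrete, the sketch below shows how the variational energy is estimated from samples without ever forming the full state vector. It is a minimal illustration, not the paper's implementation: `connected_configs`, `matrix_elements`, and `log_psi` are hypothetical callables standing in for the Hamiltonian's sparse connectivity and the ansatz's log-amplitude.

```python
import numpy as np

def local_energy(x, connected_configs, matrix_elements, log_psi):
    # E_loc(x) = sum_{x'} H_{x,x'} * psi(x') / psi(x), written with log-amplitudes
    # so the exponentially large state vector is never stored.
    xs_prime = connected_configs(x)          # configurations x' with H_{x,x'} != 0
    h_vals = np.asarray(matrix_elements(x))  # the corresponding H_{x,x'}
    ratios = np.exp([log_psi(xp) - log_psi(x) for xp in xs_prime])
    return np.dot(h_vals, ratios)

def vmc_energy(samples, connected_configs, matrix_elements, log_psi):
    # Monte Carlo estimate of <psi|H|psi> / <psi|psi> from samples x ~ |psi(x)|^2.
    return np.mean([local_energy(x, connected_configs, matrix_elements, log_psi)
                    for x in samples])
```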

Autoregressive models have been identified as strong candidates for NQS due to their ability to represent normalized wavefunctions and to facilitate exact sampling from the Born probability distribution. However, transformers in particular carry a computational burden: their time complexity scales quadratically with sequence length during both training and inference.
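
As an illustration of why autoregressive ansätze permit exact sampling, the following sketch draws configurations directly from the Born distribution one qubit at a time via the chain rule. The `conditional_prob` callable is a hypothetical stand-in for the model's normalized conditionals, not an interface from the paper.

```python
import numpy as np

def sample_configuration(conditional_prob, n_qubits, rng=None):
    # Draw one configuration x ~ |psi(x)|^2 exactly, one qubit at a time,
    # using the chain rule p(x) = prod_i p(x_i | x_1, ..., x_{i-1}).
    rng = rng or np.random.default_rng()
    config = []
    for _ in range(n_qubits):
        p_one = conditional_prob(config)            # p(x_i = 1 | sampled prefix)
        config.append(int(rng.random() < p_one))    # no Markov chain, no burn-in
    return config
```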

Retentive Networks as a Solution

The paper introduces the RetNet, a recurrent architecture that serves as an alternative to transformers. The RetNet retains the benefits of autoregressive models while addressing the quadratic scaling bottleneck: it processes sequences in parallel during training and recurrently during inference, achieving inference time complexity that is linear in the number of qubits. This characteristic presents a substantial advantage in computational efficiency, especially for large-scale quantum systems.
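
A minimal single-head NumPy sketch of the retention mechanism (omitting RetNet's multi-scale heads, normalization, and gating) illustrates the dual forms: the parallel form costs roughly O(N^2 d) like attention, while the mathematically equivalent recurrent form carries a fixed d x d state and costs roughly O(N d^2) per sequence.

```python
import numpy as np

def retention_parallel(Q, K, V, gamma):
    # Parallel (training-time) form: like causal attention with an exponential
    # decay mask D_{nm} = gamma^(n-m) for n >= m; cost ~ O(N^2 d).
    N = Q.shape[0]
    n, m = np.arange(N)[:, None], np.arange(N)[None, :]
    D = np.where(n >= m, float(gamma) ** (n - m), 0.0)
    return (Q @ K.T * D) @ V

def retention_recurrent(Q, K, V, gamma):
    # Recurrent (inference-time) form: a fixed d_k x d_v state is updated each
    # step, producing the same outputs at cost ~ O(N d^2).
    S = np.zeros((K.shape[1], V.shape[1]))
    out = np.zeros((Q.shape[0], V.shape[1]))
    for t in range(Q.shape[0]):
        S = gamma * S + np.outer(K[t], V[t])   # decayed running summary of the past
        out[t] = Q[t] @ S
    return out
```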

The authors propose a framework wherein RetNets are used in combination with a variational neural annealing (VNA) technique, which has shown potential to increase the robustness of NQS performance. VNA regularizes the entropy of the ansatz distribution, promoting exploration during optimization and thereby avoiding suboptimal local minima.
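
A hedged sketch of that idea, assuming the usual pseudo free-energy formulation F(T) = <E_loc> - T * S with the entropy estimated from the same batch of exact samples; the linear schedule shown is illustrative rather than necessarily the one used in the paper.

```python
import numpy as np

def free_energy_objective(log_probs, local_energies, temperature):
    # Pseudo free energy F(T) = <E_loc> - T * S, where the Shannon entropy
    # S = -<log p> is estimated from the sampled batch.
    log_probs = np.asarray(log_probs)            # log |psi(x)|^2 for each sample
    local_energies = np.asarray(local_energies)  # E_loc(x) for each sample
    return local_energies.mean() - temperature * (-log_probs.mean())

def linear_temperature(step, total_steps, t_initial=1.0):
    # Anneal T from t_initial down to 0; the paper's actual schedule may differ.
    return t_initial * max(0.0, 1.0 - step / total_steps)
```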

Implications and Experimental Results

This research has several practical implications. By providing computational estimates demonstrating how RetNets can achieve lower FLOP counts than transformers in certain configurations, the paper delineates the potential for RetNet-based NQS to serve as more scalable electronic structure solvers. The paper's experimental results on various small molecule systems corroborate the theoretical findings, demonstrating the RetNet's capability to achieve energy calculation accuracies that match or surpass those of more traditional methods such as coupled cluster (CCSD) and previous NQS implementations.
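
The flavor of that comparison can be reproduced with back-of-the-envelope estimates like the ones below. The constants and per-layer details are deliberately simplified, so the paper's own FLOP derivation should be consulted for the precise threshold ratio.

```python
def attention_mixing_flops(n_qubits, d_model):
    # Sequence-mixing cost of cached autoregressive attention: step n attends
    # over n cached keys/values, so summing over N steps gives ~ N^2 * d.
    return n_qubits ** 2 * d_model

def retention_mixing_flops(n_qubits, d_model):
    # Sequence-mixing cost of recurrent retention: every one of the N steps
    # updates and reads a d x d state, giving ~ N * d^2.
    return n_qubits * d_model ** 2

# Under these crude estimates the recurrent form pulls ahead roughly once the
# number of qubits exceeds the model dimension (N / d > 1); the paper's
# per-layer FLOP counts pin down the exact problem-to-model-size threshold.
for n_qubits in (32, 128, 512):
    print(n_qubits,
          attention_mixing_flops(n_qubits, 128),
          retention_mixing_flops(n_qubits, 128))
```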

Moreover, the successful integration of VNA to improve training accuracy for different NQS ansatze, including MADE and transformer models, underscores its general applicability and benefit across diverse neural architectures utilized in quantum chemistry simulations.

Theoretical and Practical Significance

From a theoretical perspective, the paper contributes to the understanding of recurrent networks' role in quantum-inspired computations and their advantages in terms of parallelization and scaling. Practically, the RetNet's suitability as a drop-in replacement for transformers signals a shift toward more efficient quantum chemical computations, extending the reach of NQS to applications previously hindered by computational constraints.

Future Directions and Conclusion

The research opens avenues for further exploring autoregressive NQS models beyond electronic structure problems. Additionally, the authors underscore the need for a more comprehensive study of different neural annealing schedules to optimize the application of VNA. Future research may also extend this work to larger quantum systems and other complex quantum chemical computations.

In conclusion, the paper presents a compelling case for the retentive network's utility in quantum chemistry, particularly in enhancing the scalability and efficiency of NQS frameworks. Through empirical validation and theoretical insights, it delivers a substantial foundation for future advancements in neural network-based quantum chemistry solvers. The combination of RetNet architectures with variational neural annealing strategies demonstrates promising potential for advancing practical applications in computational chemistry.
