Scaling laws versus individual network dimensions in the Psi-Solid architecture
Develop a detailed scaling theory relating accuracy and convergence to the individual architectural dimensions of the Psi-Solid self-attention neural-network wavefunction for variational Monte Carlo of interacting electrons, namely the number of attention heads, the number of layers, the attention width, and the perceptron width, and determine how these dimensions govern the parameter count required to reach convergence.
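Such a scaling study would typically fit, along each architectural axis, a power law of the form E(N) ≈ A · N^(−α) between a convergence metric E (e.g. variational energy error) and a dimension or parameter count N. The sketch below shows one way to extract (A, α) by least squares on log-log data; the function name and the synthetic noiseless data are illustrative assumptions, not results from the paper.

```python
import math

def fit_power_law(n_params, errors):
    """Least-squares fit of log E = log A - alpha * log N.

    Returns the prefactor A and exponent alpha of E(N) = A * N**(-alpha).
    """
    xs = [math.log(n) for n in n_params]
    ys = [math.log(e) for e in errors]
    k = len(xs)
    xbar = sum(xs) / k
    ybar = sum(ys) / k
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
            sum((x - xbar) ** 2 for x in xs)
    intercept = ybar - slope * xbar
    return math.exp(intercept), -slope

# Synthetic data obeying E = 2.0 * N**(-0.5) exactly (illustrative only;
# a real study would use measured energy errors vs. a network dimension).
N = [10**3, 10**4, 10**5, 10**6]
E = [2.0 * n ** -0.5 for n in N]
A, alpha = fit_power_law(N, E)
print(A, alpha)  # → 2.0 0.5 (up to floating-point rounding)
```

Repeating such a fit separately for heads, layers, attention width, and perceptron width, at fixed values of the other dimensions, would disentangle which axis dominates the parameter count needed for convergence.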
References
We leave a more detailed analysis of scaling laws as a function of individual network dimensions for future work.
                — "Is attention all you need to solve the correlated electron problem?" (Geier et al., arXiv:2502.05383, 7 Feb 2025), Section 5.1 (Convergence and scaling with system size)