Robustness of quadratic vs. Gaussian kernels to symmetry breaking and bandwidth effects in RFM

Determine whether the Mahalanobis quadratic kernel within the Recursive Feature Machine (RFM) algorithm is more robust than the Mahalanobis Gaussian kernel to symmetry-breaking perturbations of the training data, and ascertain whether the kernel bandwidth parameter in either kernel modulates this robustness.

Background

The paper studies grokking and generalization in Recursive Feature Machines (RFM) on algebraic tasks such as modular arithmetic and Abelian group addition. It shows that specific symmetry-preserving train–test partitions (e.g., withholding fixed points under a reflection) can inhibit generalization, and that breaking this symmetry by moving random training points to the test set recovers generalization.
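The partition construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the task is addition on pairs (a, b) with entries in Z_p, and it assumes, purely for illustration, that the reflection is the operand swap (a, b) -> (b, a), whose fixed points are the diagonal pairs (a, a). The function names `symmetric_partition`, `break_symmetry`, and `is_swap_symmetric` are hypothetical.

```python
import random


def symmetric_partition(p):
    """Withhold the fixed points of the assumed reflection (a, b) -> (b, a).

    Assumption: inputs are all pairs in Z_p x Z_p; the test set is the
    diagonal (a, a), and the training set is everything else.
    """
    pairs = [(a, b) for a in range(p) for b in range(p)]
    train = [(a, b) for (a, b) in pairs if a != b]
    test = [(a, b) for (a, b) in pairs if a == b]
    return train, test


def break_symmetry(train, test, k, seed=0):
    """Move k randomly chosen training points into the test set."""
    rng = random.Random(seed)
    moved = set(rng.sample(train, k))
    new_train = [pt for pt in train if pt not in moved]
    new_test = test + [pt for pt in train if pt in moved]
    return new_train, new_test


def is_swap_symmetric(points):
    """True if the set of points is closed under (a, b) -> (b, a)."""
    s = set(points)
    return all((b, a) in s for (a, b) in s)


train, test = symmetric_partition(5)
broken_train, broken_test = break_symmetry(train, test, k=3)
```

In this sketch the original training set is closed under the swap, while removing an odd number of off-diagonal points necessarily leaves at least one point without its mirror image, so the broken training set is no longer symmetric.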

Empirically, the authors observe that the quadratic kernel needs more random points moved from train to test than the Gaussian kernel does to reach the same improvement in test accuracy. Motivated by this, they explicitly leave to future work the question of whether the quadratic kernel is inherently more "robust" to symmetry breaking and whether bandwidth choices in either kernel influence this robustness.
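To make the two kernels and the bandwidth parameter concrete, here is a minimal sketch. The text above does not state the kernel formulas, so the forms used here are assumptions: a quadratic kernel (x^T M z / bandwidth)^2 and a Gaussian kernel exp(-(x - z)^T M (x - z) / (2 * bandwidth^2)), with M a square feature matrix. The paper's exact parameterization, including where the bandwidth enters, may differ. The function names are hypothetical.

```python
import math


def bilinear(x, M, z):
    """Return x^T M z, with M given as a list of rows."""
    return sum(
        x[i] * M[i][j] * z[j]
        for i in range(len(x))
        for j in range(len(z))
    )


def quadratic_kernel(x, z, M, bandwidth=1.0):
    # Assumed form: (x^T M z / bandwidth)^2.
    return (bilinear(x, M, z) / bandwidth) ** 2


def gaussian_kernel(x, z, M, bandwidth=1.0):
    # Assumed form: exp(-(x - z)^T M (x - z) / (2 * bandwidth^2)).
    d = [xi - zi for xi, zi in zip(x, z)]
    return math.exp(-bilinear(d, M, d) / (2.0 * bandwidth ** 2))


identity = [[1.0, 0.0], [0.0, 1.0]]
x, z = [1.0, 2.0], [3.0, 4.0]
k_quad = quadratic_kernel(x, z, identity)
k_gauss = gaussian_kernel(x, z, identity)
```

Under these assumed forms, the bandwidth only rescales the quadratic kernel by the constant factor 1 / bandwidth^2, whereas it changes the shape of the Gaussian kernel. Whether that difference bears on robustness to symmetry breaking is exactly the open question.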

References

We leave it for future work to explore whether the quadratic kernel is more "robust" to symmetry breaking, or whether bandwidth choices in the kernel (whether Gaussian or quadratic) can affect the robustness of the model to symmetry breaking.

Breaking Data Symmetry is Needed For Generalization in Feature Learning Kernels (2604.00316 - Bernal et al., 31 Mar 2026) in Section 3, Partitions that inhibit generalization (Removing points symmetrically)