Cross-domain generalization of symbolic surrogates for LLM MLP layers
Determine how well symbolic surrogate models that replace transformer MLP layers (constructed via SymTorch with PCA-based dimensionality reduction and PySR-fitted analytic expressions) generalize when evaluated on distributions different from their training distribution. The aim is to ascertain whether domain-agnostic symbolic approximations are feasible, or whether task- and domain-specific surrogates are necessary.
Cross-Domain Generalization: Existing symbolic surrogates for LLM components are trained and evaluated on the same distribution, leaving open the question of how well such surrogates transfer across domains. Evaluating surrogate fidelity on held-out domains would settle whether a single domain-agnostic set of analytic expressions can stand in for an MLP layer, or whether each task and domain requires its own fitted surrogate.
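The evaluation loop described above can be sketched end to end with simple stand-ins: SVD-based PCA for the dimensionality reduction, and a closed-form least-squares polynomial fit in place of PySR's symbolic regression search. Everything here is an illustrative assumption, not the actual SymTorch/PySR pipeline: the toy tanh "MLP layer," the 8-dimensional Gaussian training distribution, the mean-shifted out-of-distribution set, and all function names are hypothetical.

```python
import numpy as np

# Illustrative stand-in for an MLP layer: a fixed random tanh network.
# (Toy setup -- not the project's actual SymTorch/PySR pipeline.)
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 32))
W2 = rng.normal(size=(32, 8))

def mlp(x):
    """The 'teacher' layer the surrogate tries to approximate."""
    return np.tanh(x @ W1) @ W2

# In-distribution training data and a mean-shifted OOD evaluation set.
X_train = rng.normal(size=(2000, 8))
Y_train = mlp(X_train)
X_ood = rng.normal(loc=1.5, size=(500, 8))

# PCA via SVD: project inputs onto the top-k principal directions.
k = 4
mu = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
P = Vt[:k].T  # (8, k) projection matrix

def features(X):
    """Degree-2 polynomial features of the PCA coordinates
    (a cheap stand-in for PySR-discovered analytic expressions)."""
    Z = (X - mu) @ P
    return np.hstack([np.ones((len(Z), 1)), Z, Z ** 2])

# Closed-form least-squares fit of the surrogate in feature space.
coef, *_ = np.linalg.lstsq(features(X_train), Y_train, rcond=None)

def surrogate(X):
    return features(X) @ coef

def pooled_r2(X):
    """Pooled R^2 of the surrogate against the teacher layer on X."""
    Y = mlp(X)
    return 1.0 - np.var(Y - surrogate(X)) / np.var(Y)

in_dist_r2 = pooled_r2(rng.normal(size=(500, 8)))
ood_r2 = pooled_r2(X_ood)
print(f"in-dist R^2={in_dist_r2:.3f}  OOD R^2={ood_r2:.3f}")
```

Comparing the two R^2 values is the core of the proposed evaluation: a large in-distribution/OOD fidelity gap would argue for domain-specific surrogates, while a small gap would support domain-agnostic ones.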