Learning Single-Index Models via Harmonic Decomposition
This paper investigates the learning of single-index models (SIMs) in high dimensions, where the output depends on the input only through a one-dimensional projection: the label is produced by an unknown link function applied to ⟨w, x⟩, and the goal is to recover the hidden direction w. The difficulty of the problem is governed by how the link function decomposes in a suitable basis of functions. Prior work, focused largely on Gaussian inputs, expanded the link in Hermite polynomials, and this Hermite representation dictated the statistical and computational complexity of learning.
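To fix ideas, here is a minimal sketch of the data-generating process in Python; the cubic link, the Gaussian inputs, and the noise level are illustrative placeholders, not choices made by the paper.

```python
import numpy as np

def sample_sim(n, d, w, link, noise=0.1, rng=None):
    """Draw n samples from a single-index model y = link(<w, x>) + noise."""
    rng = np.random.default_rng(rng)
    X = rng.standard_normal((n, d))                  # isotropic Gaussian inputs (illustrative)
    y = link(X @ w) + noise * rng.standard_normal(n) # label depends on x only via <w, x>
    return X, y

# Example: cubic link along a random hidden direction on the unit sphere.
rng = np.random.default_rng(0)
d = 50
w = rng.standard_normal(d)
w /= np.linalg.norm(w)
X, y = sample_sim(1000, d, w, link=lambda t: t**3)
```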
The authors propose a new approach based on spherical harmonics, motivated by the rotational symmetry inherent in the problem. They argue that spherical harmonics, rather than Hermite polynomials, are the natural basis in which the complexity of learning SIMs is expressed, especially when the input distribution is spherically symmetric. This widens the focus beyond Gaussian inputs and yields a theoretical framework for learning under arbitrary spherically symmetric distributions.
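Concretely, the link between SIMs and spherical harmonics is a classical fact of harmonic analysis, recalled here for orientation rather than quoted from the paper. Writing x = r u with r = ‖x‖ and u on the unit sphere, a square-integrable link decomposes across harmonic degrees:

$$
f(\langle w, x \rangle) = \sum_{k \ge 0} f_k(r)\, P_{k,d}(\langle w, u \rangle),
\qquad
P_{k,d}(\langle w, u \rangle) = \frac{1}{N_{d,k}} \sum_{m=1}^{N_{d,k}} Y_{k,m}(w)\, Y_{k,m}(u),
$$

where $P_{k,d}$ is the degree-$k$ Legendre (Gegenbauer) polynomial in dimension $d$, $\{Y_{k,m}\}$ is a basis of degree-$k$ spherical harmonics orthonormal under the uniform probability measure on the sphere, and $N_{d,k}$ is the dimension of that subspace. Each summand isolates the part of the signal carried by one harmonic degree, which is the partition the complexity analysis below exploits.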
Key Theoretical Contributions
- Complexity Characterization: The authors introduce new measures of the complexity of learning SIMs, formulated in terms of spherical harmonics. The complexity is shown to decompose across harmonic subspaces, one for each degree of spherical harmonics.
- Estimator Families: Two distinct families of estimators are presented: one optimized for sample complexity, based on tensor unfolding, and one optimized for runtime, based on online stochastic gradient descent (SGD); see the sketch after this list. Together they form a framework that makes explicit the potential trade-off between computational cost and sample efficiency.
- Gaussian Special Case: For Gaussian inputs, the paper verifies that the harmonic approach recovers previously established results while also revealing structure overlooked in prior analyses. In particular, it identifies the optimal harmonic degree, either 1 or 2 depending on the parity of the generative exponent (recalled below), at which maximal learning efficiency is achieved.
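For orientation, the Gaussian-case quantities invoked above are standard in this literature, recalled here from general knowledge rather than quoted from the paper. With $h_k$ the normalized Hermite polynomials, a square-integrable link expands as

$$
f(t) = \sum_{k \ge 0} \alpha_k\, h_k(t), \qquad \alpha_k = \mathbb{E}_{t \sim \mathcal{N}(0,1)}\big[f(t)\, h_k(t)\big],
$$

the information exponent is $k_\star = \min\{k \ge 1 : \alpha_k \neq 0\}$, and the generative exponent is the smallest exponent reachable after transforming the labels,

$$
k^{\mathrm{gen}}_\star = \min_{T}\, \min\big\{k \ge 1 : \mathbb{E}\big[T(y)\, h_k(\langle w, x \rangle)\big] \neq 0\big\},
$$

which is the quantity whose parity selects the optimal harmonic degree above.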
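As a concrete, non-authoritative illustration of the second estimator family, here is a minimal spherical online-SGD sketch in Python. It uses a plain correlation loss, which only extracts degree-1 (linear) signal; the paper's estimator instead runs SGD on transformed harmonic features, so the loss, step size, and step count below are all placeholders.

```python
import numpy as np

def online_sgd_direction(stream, d, eta=0.01, steps=10_000, seed=0):
    """Spherical online SGD for recovering a single-index direction.

    Minimal sketch using the correlation loss -y * <v, x>; not the
    paper's estimator, which operates at the optimal harmonic degree.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)           # uniform random initialization on the sphere
    for _ in range(steps):
        x, y = next(stream)          # one fresh sample per step (online setting)
        v += eta * y * x             # descent step on the loss -y * <v, x>
        v /= np.linalg.norm(v)       # retract the iterate back onto the unit sphere
    return v

# Usage with the generator sketched earlier:
#   X, y = sample_sim(10_000, d, w, link=lambda t: t**3)
#   v_hat = online_sgd_direction(iter(zip(X, y)), d, steps=10_000)
```

Renormalizing after each step is the standard projected (Riemannian) way to keep the iterate on the unit sphere.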
Discussion of Results
Spherical harmonics provide a symmetry-consistent framework that applies to spherically symmetric input distributions well beyond the Gaussian, including distributions with nontrivial radial structure. The resulting methods adapt to varying degrees of symmetry in the inputs, covering a range of practical scenarios in high-dimensional data analysis. The paper further explains why conventional SGD can remain suboptimal: it fails to exploit the radial component of the input distribution, which spherical-harmonic features accommodate naturally; a concrete illustration follows.
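To make the radial point concrete, the following hypothetical statistic couples a radial weight phi(r) with a degree-k angular (Gegenbauer) feature of a candidate direction; plain SGD on raw projections ⟨v, x⟩ has no analogue of the phi(r) factor. The specific choices of g and phi are placeholders, not the paper's.

```python
import numpy as np
from scipy.special import eval_gegenbauer

def harmonic_correlation(X, y, v, k, g=np.sign, phi=lambda r: r):
    """Correlate a label transform g(y) with a degree-k angular feature of v.

    Hypothetical illustration: g (label transform) and phi (radial weight)
    are placeholders; the point is that the statistic couples the radial
    part r with the angular part u, unlike a raw-projection method.
    """
    r = np.linalg.norm(X, axis=1)             # radial part of x = r * u
    u = X / r[:, None]                        # angular part, on the unit sphere
    alpha = X.shape[1] / 2 - 1                # Gegenbauer index for S^{d-1}
    angular = eval_gegenbauer(k, alpha, u @ v)
    return np.mean(g(y) * phi(r) * angular)   # joint radial-angular statistic
```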
Implications and Future Directions
Practically, the results bear on representation learning in neural networks, where SIMs serve as a tractable model of how a network extracts a simple low-dimensional feature during training. Theoretically, the harmonic decomposition offers fertile ground for extending these learning methods to other symmetric data structures and group symmetries.
The paper provides strong guarantees for sample efficiency and for runtime separately, but notes that optimizing both simultaneously may be impossible for some non-Gaussian distributions, pointing to future work on this computational-statistical trade-off and on symmetries beyond the sphere and the orthogonal group. The framework may also extend to multi-index models, where the output depends on a low-dimensional projection rather than a single direction, suggesting exciting possibilities in broader multi-dimensional settings.
In summary, this work significantly advances our understanding of the statistical and computational complexity of learning single-index models across varied high-dimensional input distributions. The proposed harmonic decomposition promises greater adaptability and efficiency, opening previously unexplored avenues for model learning and analysis.