Consistent model selection in the spiked Wigner model via AIC-type criteria (2307.12982v2)
Abstract: Consider the spiked Wigner model [ X = \sum_{i = 1}k \lambda_i u_i u_i\top + \sigma G, ] where $G$ is an $N \times N$ GOE random matrix, and the eigenvalues $\lambda_i$ are all spiked, i.e. above the Baik-Ben Arous-P\'ech\'e (BBP) threshold $\sigma$. We consider AIC-type model selection criteria of the form [ -2 \, (\text{maximised log-likelihood}) + \gamma \, (\text{number of parameters}) ] for estimating the number $k$ of spikes. For $\gamma > 2$, the above criterion is strongly consistent provided $\lambda_k > \lambda_{\gamma}$, where $\lambda_{\gamma}$ is a threshold strictly above the BBP threshold, whereas for $\gamma < 2$, it almost surely overestimates $k$. Although AIC (which corresponds to $\gamma = 2$) is not strongly consistent, we show that taking $\gamma = 2 + \delta_N$, where $\delta_N \to 0$ and $\delta_N \gg N{-2/3}$, results in a weakly consistent estimator of $k$. We further show that a soft minimiser of AIC, where one chooses the least complex model whose AIC score is close to the minimum AIC score, is strongly consistent. Based on a spiked (generalised) Wigner representation, we also develop similar model selection criteria for consistently estimating the number of communities in a balanced stochastic block model under some sparsity restrictions.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.