- The paper introduces a non-asymptotic Gaussian approximation for the last iterate in federated linear stochastic approximation, addressing critical inference challenges.
- It provides precise error and high-order moment bounds that quantify trade-offs among step size decay, local updates, and data heterogeneity in distributed settings.
- The work validates an online multiplier bootstrap procedure, enabling practical confidence interval construction without explicit asymptotic covariance estimation.
Gaussian Approximation and Multiplier Bootstrap for Federated Linear Stochastic Approximation
The paper "Gaussian Approximation and Multiplier Bootstrap for Federated Linear Stochastic Approximation" (2605.19629) addresses fundamental statistical inference questions for federated learning, focusing on the federated linear stochastic approximation (FedLSA) framework. FedLSA generalizes widely used local SGD-type algorithms in a distributed or cross-silo federated setting, where $N$ agents with heterogeneous data sources collectively solve a linear system $A\theta = \bar{b}$ via local stochastic updates and periodic synchronization.
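The local-update-then-synchronize pattern described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each agent `c` holds its own pair `(A_c, b_c)`, that the global system is the average of the local ones, and that observations are corrupted by additive Gaussian noise. The function name `fedlsa` and all parameters are hypothetical.

```python
import numpy as np

def fedlsa(A_list, b_list, theta0, eta, H, T, noise_std=0.1, seed=0):
    """Run T communication rounds of a FedLSA-style recursion (illustrative).

    Assumptions (not taken from the paper): additive Gaussian observation
    noise, a constant step size eta, and a constant number H of local steps.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    d = theta.size
    for _ in range(T):
        local_iterates = []
        for A_c, b_c in zip(A_list, b_list):
            theta_c = theta.copy()  # every agent starts from the shared iterate
            for _ in range(H):
                # Noisy observation of the local residual A_c @ theta - b_c.
                noise = noise_std * rng.standard_normal(d)
                theta_c = theta_c - eta * (A_c @ theta_c - b_c + noise)
            local_iterates.append(theta_c)
        # Synchronization: the server averages the agents' local iterates.
        theta = np.mean(local_iterates, axis=0)
    return theta

# Two heterogeneous agents in dimension 2 (toy data).
A_list = [np.diag([1.0, 2.0]), np.diag([2.0, 1.0])]
b_list = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(fedlsa(A_list, b_list, np.zeros(2), eta=0.05, H=4, T=200))
```

With `H > 1` and heterogeneous agents, the returned iterate generally does not coincide with the solution of the averaged system even without noise; this is the heterogeneity bias that the analysis tracks.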
Despite substantial advances in understanding convergence rates for federated optimization, uncertainty quantification and statistical inference remain insufficiently developed for federated protocols, particularly for the last iterate of linear stochastic approximation with non-constant step sizes and increasing numbers of local updates. Previous works offer only asymptotic (CLT-type) results or bounds for special cases such as Polyak-Ruppert averaging or constant local steps, lacking non-asymptotic distributional guarantees and practical confidence interval construction.
This paper closes these gaps by providing, under general conditions and in a fully non-asymptotic regime, a Berry-Esseen-type Gaussian approximation for the last iterate of FedLSA that tracks the interplay between communication-computation tradeoffs, local update schedules, and system/data heterogeneity. The analysis yields new $p$-th moment error bounds, non-asymptotic Gaussian approximation rates, and formal finite-time guarantees for an online multiplier bootstrap procedure for inference that does not require explicit estimation of the asymptotic covariance.
Moment and Error Bounds for Federated LSA
A central technical contribution is an exact error decomposition for the last iterate in FedLSA under arbitrary non-increasing step size schedules $\eta_t$ and non-decreasing sequences of local updates $H_t$. The authors provide, for the first time, high-order ($p \geq 2$) moment bounds for the error $\|\theta_t - \theta^*\|$ in this generalized setting. The error is decomposed into: (i) transient terms accounting for initialization, (ii) bias terms arising from heterogeneity, and (iii) fluctuation terms from both observation noise and agent heterogeneity.
Two regimes are analyzed:
- Constant step size and local updates: Explicit MSE bounds generalize and sharpen prior analyses, establishing necessary scaling between η and H to achieve prescribed MSE levels in the presence of non-vanishing variance.
- Polynomially decreasing step size and increasing local updates: The moment bounds quantify conditions under which the effect of initialization vanishes and the variance decays at the desired rate. Critical tradeoffs are highlighted between the decay rate of the step size ($\eta_t \sim (1+t)^{-\gamma_\eta}$) and the growth rate of the number of local updates $H_t$; notably, statistical efficiency and communication reduction can be tuned simultaneously, but only when the two rates are suitably balanced.
Key bounds demonstrate that to maintain vanishing error and achieve fast concentration, the step size must decay sufficiently faster than the growth in local updates, and agents’ heterogeneity must be controlled as measured by the dispersion of local solutions and system matrices.
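To make the schedule tradeoff concrete, the sketch below tabulates a polynomially decaying step size together with a polynomially growing number of local steps. The growth law for $H_t$, the exponent name `gamma_h`, and all default values are assumptions chosen for illustration; only the form of $\eta_t$ comes from the text above.

```python
import numpy as np

def schedules(T, eta0=0.5, gamma_eta=0.75, h0=1.0, gamma_h=0.25):
    """Return (eta, H) for rounds t = 0, ..., T-1.

    eta_t = eta0 * (1 + t)^(-gamma_eta)       (form stated in the text)
    H_t   = floor(h0 * (1 + t)^gamma_h) >= 1   (hypothetical growth law)
    """
    t = np.arange(T)
    eta = eta0 * (1.0 + t) ** (-gamma_eta)
    H = np.maximum(1, np.floor(h0 * (1.0 + t) ** gamma_h)).astype(int)
    return eta, H

eta, H = schedules(1000)
print("local steps per round, first/last:", H[0], H[-1])
print("eta_t * H_t, first/last:", eta[0] * H[0], eta[-1] * H[-1])
print("total local steps:", H.sum(), "over", len(H), "communication rounds")
```

With `gamma_eta > gamma_h`, the per-round product `eta_t * H_t` shrinks over time even though each round performs more local work, which is one simple way to read the requirement that the step size decay faster than the local updates grow.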
Non-Asymptotic Gaussian Approximation for the Last Iterate
The core statistical result is a non-asymptotic convex distance bound (in the sense of Berry-Esseen theory) for the distribution of the self-normalized last iterate of FedLSA. The analysis proceeds by extracting the dominant linear term (martingale) in the error, separating it from higher-order nonlinear remainders, and applying sharp multivariate central limit theorems for nonlinear statistics.
The main theorem shows that, under standard stability and regularity conditions, the convex distance between the law of the normalized last iterate and a Gaussian law admits an explicit finite-time bound. This rate quantifies, for the first time, how federated system heterogeneity, the number of agents, and the communication/computation schedule affect non-asymptotic normality, rigorously justifying empirical observations and providing a framework for principled finite-sample inference.
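A rough numerical feel for this kind of statement can be obtained by simulation. The sketch below runs many independent scalar FedLSA-style trajectories with skewed noise and measures the Kolmogorov distance between the empirically standardized last iterate and a standard normal. This is only a one-dimensional proxy for the convex distance, with empirical rather than theoretical normalization; the noise model, the constants, and both function names are assumptions.

```python
import math
import numpy as np

def last_iterate_samples(R=2000, T=300, H=2, n_agents=4,
                         eta0=0.5, gamma=0.6, seed=0):
    """Simulate R independent scalar runs and return their last iterates."""
    rng = np.random.default_rng(seed)
    a = np.linspace(0.8, 1.2, n_agents)  # heterogeneous local coefficients
    b = np.linspace(0.5, 1.5, n_agents)  # heterogeneous local targets
    theta = np.zeros(R)
    for t in range(T):
        eta = eta0 * (1.0 + t) ** (-gamma)
        local = np.tile(theta, (n_agents, 1))  # shape (n_agents, R)
        for _ in range(H):
            # Centered, unit-variance, skewed noise: Exp(1) minus its mean.
            noise = rng.exponential(1.0, size=(n_agents, R)) - 1.0
            local = local - eta * (a[:, None] * local - b[:, None] + noise)
        theta = local.mean(axis=0)  # synchronization by averaging
    return theta

def kolmogorov_distance_to_normal(x):
    """Sup-distance between the empirical CDF of standardized x and N(0, 1)."""
    z = np.sort((x - x.mean()) / x.std())
    n = z.size
    cdf = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])
    upper = np.arange(1, n + 1) / n - cdf
    lower = cdf - np.arange(0, n) / n
    return float(max(upper.max(), lower.max()))

samples = last_iterate_samples()
print("Kolmogorov distance:", kolmogorov_distance_to_normal(samples))
```

The reported distance mixes the genuine non-Gaussianity of the iterate with Monte Carlo error of order $R^{-1/2}$, so it should be read qualitatively rather than as an estimate of the bound in the theorem.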
A detailed analysis of covariance stabilization reveals an intrinsic limitation: normalizing by the true asymptotic covariance can lead to much slower convergence in distribution, because the finite-time covariance of the last iterate stabilizes slowly. This phenomenon is quantified via lower bounds in the single-agent case. It underscores the necessity of alternative bootstrap-based uncertainty quantification methods in finite-sample and non-asymptotic regimes.
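The slow stabilization can be seen in a toy case. For a scalar, single-agent recursion $\theta_{t+1} = \theta_t - \eta_t (a\theta_t - b + \varepsilon_t)$ with deterministic $a > 0$ and noise variance $\sigma^2$, the variance obeys $v_{t+1} = (1 - \eta_t a)^2 v_t + \eta_t^2 \sigma^2$, and for $\eta_t \sim (1+t)^{-\gamma}$ with $\gamma < 1$ the ratio $v_t / \eta_t$ tends to $\sigma^2 / (2a)$. The sketch below iterates this recursion exactly; the setting and the name `variance_ratio` are illustrative assumptions, not the paper's lower-bound construction.

```python
import numpy as np

def variance_ratio(T, a=1.0, sigma2=1.0, eta0=0.5, gamma=0.75):
    """Ratio of the exact last-iterate variance to its asymptotic proxy.

    Returns r with r[t] = v / (eta_t * sigma2 / (2 a)), where v is the
    variance after the update that uses step size eta_t.
    """
    v = 0.0  # deterministic initialization, so the initial variance is zero
    limit = sigma2 / (2.0 * a)
    ratios = np.empty(T)
    for t in range(T):
        eta = eta0 * (1.0 + t) ** (-gamma)
        v = (1.0 - eta * a) ** 2 * v + eta ** 2 * sigma2
        ratios[t] = v / (eta * limit)
    return ratios

r = variance_ratio(10_000)
for t in (100, 1_000, 10_000):
    print(f"t = {t:6d}: variance / asymptotic proxy = {r[t - 1]:.3f}")
```

In this toy model the ratio approaches one only at a polynomial rate in $t$, so a Gaussian reference law built from the limiting variance remains visibly miscalibrated for a long time.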
Multiplier Bootstrap for Federated Inference
To practically construct confidence intervals without explicit asymptotic covariance estimation, the authors propose and analyze an online multiplier bootstrap for the last iterate of FedLSA. In this procedure, agents’ local updates are re-weighted by i.i.d. mean-one, variance-one random weights at each communication round, and the sequence of bootstrapped global iterates forms the reference distribution for uncertainty quantification. The procedure is fully online, requires no explicit covariance estimation, and directly mirrors the stochasticity in the original optimization process.
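A minimal sketch of such a procedure is given below. It propagates `B` perturbed trajectories alongside the main one, reusing the same noisy observations and multiplying each agent's local update by a random weight drawn once per communication round. The Exp(1) weight distribution (mean one, variance one), the additive Gaussian noise, the constant schedule, and the function name are all assumptions; the paper's exact weighting scheme may differ.

```python
import numpy as np

def fedlsa_with_bootstrap(A_list, b_list, theta0, eta, H, T, B=200,
                          noise_std=0.5, seed=0):
    """Return (theta, theta_boot) after T rounds; theta_boot has shape (B, d)."""
    rng = np.random.default_rng(seed)
    n_agents = len(A_list)
    theta = np.asarray(theta0, dtype=float).copy()
    d = theta.size
    theta_boot = np.tile(theta, (B, 1))
    for _ in range(T):
        # One mean-one, variance-one weight per replicate and agent per round.
        w = rng.exponential(1.0, size=(B, n_agents))
        new_theta = np.zeros(d)
        new_boot = np.zeros((B, d))
        for c, (A_c, b_c) in enumerate(zip(A_list, b_list)):
            th = theta.copy()
            th_b = theta_boot.copy()
            for _ in range(H):
                noise = noise_std * rng.standard_normal(d)
                th = th - eta * (A_c @ th - b_c + noise)
                # Same observation as the main run, with a re-weighted update.
                th_b = th_b - eta * w[:, [c]] * (th_b @ A_c.T - b_c + noise)
            new_theta += th / n_agents
            new_boot += th_b / n_agents
        theta, theta_boot = new_theta, new_boot
    return theta, theta_boot

A = np.diag([1.0, 2.0])
b = np.array([1.0, 1.0])
theta, boot = fedlsa_with_bootstrap([A, A], [b, b], np.zeros(2),
                                    eta=0.02, H=2, T=500)
print("last iterate:", theta)
print("bootstrap spread per coordinate:", boot.std(axis=0))
```

Because the replicates are updated in the same pass as the main iterate, no trajectory or covariance matrix needs to be stored; the extra cost is a factor of `B` in computation per local step.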
The main theoretical guarantee is a non-asymptotic bound, uniform over convex sets, on the difference between the law of the normalized bootstrapped error and that of the original normalized error. The bootstrap validity theorem proves that, with high probability, this difference decays at a rate matching the Gaussian approximation rate. The validity holds even with growing local update counts and non-vanishing heterogeneity, formally proving previously conjectured results (see [bonnerjee2025sharp] for related conjectures). Notably, the bound indicates that the rate is not bottlenecked by the stabilization of the limiting covariance, but is instead determined by the core martingale CLT scaling.
Empirical coverage results demonstrate that bootstrap-based confidence intervals achieve better control of coverage in non-asymptotic regimes relative to plug-in Gaussian methods, particularly in early or intermediate training stages and under significant heterogeneity.
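Given a last iterate and a set of bootstrapped iterates, per-coordinate intervals can be formed from the quantiles of the bootstrap deviations. The sketch below uses the standard "basic" bootstrap interval; this particular construction and the helper name `bootstrap_intervals` are assumptions for illustration and need not match the confidence sets used in the paper's experiments.

```python
import numpy as np

def bootstrap_intervals(theta, theta_boot, alpha=0.05):
    """Per-coordinate basic bootstrap intervals at nominal level 1 - alpha.

    theta:      array of shape (d,), the last iterate of the main run.
    theta_boot: array of shape (B, d), the bootstrapped last iterates.
    The law of (theta_boot - theta) serves as a proxy for the law of
    (theta - theta_star), so its quantiles are reflected around theta.
    """
    theta = np.asarray(theta, dtype=float)
    dev = np.asarray(theta_boot, dtype=float) - theta
    q_lo = np.quantile(dev, alpha / 2.0, axis=0)
    q_hi = np.quantile(dev, 1.0 - alpha / 2.0, axis=0)
    return theta - q_hi, theta - q_lo

# Toy usage with synthetic bootstrap draws around a point estimate.
rng = np.random.default_rng(0)
theta_hat = np.array([1.0, 0.5])
draws = theta_hat + 0.1 * rng.standard_normal((500, 2))
lower, upper = bootstrap_intervals(theta_hat, draws)
print("lower:", lower)
print("upper:", upper)
```

No covariance estimate enters this computation; the interval widths are determined entirely by the spread of the bootstrapped iterates.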
Practical and Theoretical Implications
The results have broad implications for federated learning and distributed stochastic approximation:
- They establish, for the first time, theoretically rigorous and non-asymptotic distributional approximations for the last iterate in inhomogeneous, communication-efficient federated settings, accommodating decaying step sizes and growing local computation.
- The moment and normal approximation bounds provide precise quantitative tools for algorithmic design, suggesting how to tune schedules to balance convergence, communication, and inferential accuracy.
- The multiplier bootstrap framework enables fully online, practical uncertainty quantification, obviating the need for consistent asymptotic covariance estimation and allowing for rigorous distributed inference with formal finite-sample guarantees.
- The technical analysis generalizes to other structured distributed SA/SGD paradigms and suggests directions for further research on non-asymptotic statistics in distributed, possibly non-linear or non-convex, learning regimes.
Future Directions
Possible future research avenues include:
- Extension to non-linear stochastic approximation and federated reinforcement learning with function approximation, where similar distributional issues arise.
- Analysis under non-i.i.d. or dependent data distributions, and adaptive or data-dependent step sizes and local schedules.
- Integration with privacy-preserving mechanisms and their effect on statistical efficiency and inferential guarantees.
- Investigation of analogous distributional results for other decentralized consensus protocols or federated optimization variants.
Conclusion
This work provides a comprehensive framework for non-asymptotic normal approximation and fully online inference for federated linear stochastic approximation (2605.19629). By quantifying the interplay of computation, communication, and heterogeneity on distributional convergence, and by establishing the validity of multiplier bootstrap confidence sets, the results substantively advance theoretical understanding and practical methodology for reliable uncertainty quantification in federated and distributed learning.