Origin of weaker crackling statistics in ResNets

Determine whether the weaker power-law behavior and crackling-noise scaling observed in the avalanche statistics of Residual Networks (ResNets), relative to Gaussian-initialized networks, stem from limitations of the avalanche measurement methodology (including thresholding and event definition) or instead reflect properties of convolutional architectures, in which case the remedy would be to design networks with clearer power-law statistics.

Background

The paper extends crackling-noise analysis from Gaussian-initialized fully connected networks to engineered convolutional architectures (ResNets). Avalanche definitions are adapted to accommodate heterogeneous operations (convolutions, batch normalization, activations) and layer-dependent thresholds.
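
To make the role of thresholding and event definition concrete, the following is a minimal sketch of a generic threshold-based avalanche detector. It is not the paper's definition: the function name `detect_avalanches`, the one-dimensional `activity` sequence, and the choice of size as summed excess over threshold are all assumptions made for illustration. Passing a separate threshold per position stands in for layer-dependent thresholds.

```python
from typing import List, Sequence, Tuple


def detect_avalanches(
    activity: Sequence[float], thresholds: Sequence[float]
) -> List[Tuple[float, int]]:
    """Return (size, duration) for each contiguous supra-threshold run.

    Hypothetical convention (an assumption, not taken from the paper):
    an avalanche is a maximal run of positions where activity exceeds the
    local threshold; its size is the summed excess over threshold and its
    duration is the run length.
    """
    if len(activity) != len(thresholds):
        raise ValueError("activity and thresholds must have equal length")

    avalanches: List[Tuple[float, int]] = []
    size, duration = 0.0, 0
    for value, threshold in zip(activity, thresholds):
        if value > threshold:
            # Still inside an avalanche: accumulate excess activity.
            size += value - threshold
            duration += 1
        elif duration > 0:
            # Activity dropped to or below threshold: close the avalanche.
            avalanches.append((size, duration))
            size, duration = 0.0, 0
    if duration > 0:
        # Close an avalanche that runs to the end of the sequence.
        avalanches.append((size, duration))
    return avalanches


if __name__ == "__main__":
    signal = [0, 2, 3, 0, 0, 5, 0]
    print(detect_avalanches(signal, [1] * len(signal)))
    # Raising the threshold at one position splits or shortens events.
    print(detect_avalanches(signal, [1, 1, 4, 1, 1, 1, 1]))
```

Even in this toy setting, changing a single threshold alters which events are counted and how large they are, which is one reason the measurement methodology is a plausible source of weaker statistics.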

While avalanche size and duration distributions in ResNets appear power-law distributed and exhibit scaling, these signatures are noticeably weaker than in Gaussian-initialized networks. The authors explicitly note uncertainty about whether this reflects measurement limitations or architectural factors, calling for further investigation.
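
One way to quantify how clean a power law is would be to estimate its exponent and uncertainty from the measured avalanche sizes or durations. The sketch below uses the standard continuous maximum-likelihood estimator, alpha = 1 + n / sum(ln(x_i / x_min)); it is an illustrative assumption, not the fitting procedure used in the paper. The function name `powerlaw_mle` and the cutoff `x_min` are hypothetical, and the continuous form is only an approximation when avalanche sizes are discrete.

```python
import math
from typing import Sequence, Tuple


def powerlaw_mle(samples: Sequence[float], x_min: float) -> Tuple[float, float]:
    """Estimate the exponent of p(x) ~ x^(-alpha) for x >= x_min.

    Returns (alpha, standard_error) from the continuous maximum-likelihood
    estimator. Samples below x_min are ignored.
    """
    if x_min <= 0:
        raise ValueError("x_min must be positive")

    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0:
        raise ValueError("need at least one sample strictly above x_min")

    n = len(tail)
    alpha = 1.0 + n / log_sum
    # Asymptotic standard error of the estimator.
    standard_error = (alpha - 1.0) / math.sqrt(n)
    return alpha, standard_error


if __name__ == "__main__":
    # Hypothetical avalanche sizes, for illustration only.
    sizes = [1.0, 1.5, 2.0, 3.0, 4.5, 8.0, 20.0]
    alpha, err = powerlaw_mle(sizes, x_min=1.0)
    print(f"alpha = {alpha:.3f} +/- {err:.3f}")
```

Comparing such estimates, and their sensitivity to `x_min`, between ResNets and Gaussian-initialized networks would be one possible starting point for separating measurement effects from architectural ones.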

References

Whether it shows that there is room for improvement in terms of avalanche measurement, or in designing convolutional networks with clearer power-law statistics, is unclear and requires future investigation.

— Toward a Physics of Deep Learning and Brains (2509.22649 - Ghavasieh et al., 26 Sep 2025) in Section “Beyond Gaussian networks,” paragraph following Figure 4 (page 9)
