Mean-Field Microcanonical Gradient Descent (2403.08362v2)
Abstract: Microcanonical gradient descent is a sampling procedure for energy-based models allowing for efficient sampling of distributions in high dimension. It works by transporting samples from a high-entropy distribution, such as Gaussian white noise, to a low-energy region using gradient descent. We put this model in the framework of normalizing flows, showing how it can often overfit by losing an unnecessary amount of entropy in the descent. As a remedy, we propose a mean-field microcanonical gradient descent that samples several weakly coupled data points simultaneously, allowing for better control of the entropy loss while paying little in terms of likelihood fit. We study these models in the context of financial time series, illustrating the improvements on both synthetic and real data.
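The contrast drawn in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the scalar feature `second_moment`, the squared feature-matching energy, the step size, and the choice to couple samples only through a batch-averaged feature are all assumptions made here for illustration.

```python
import random


def second_moment(x):
    """Hypothetical feature Phi(x): empirical second moment of one series."""
    return sum(v * v for v in x) / len(x)


def descend(samples, target, coupled, step=8.0, n_steps=1000):
    """Gradient descent on a squared feature-matching energy (toy assumption).

    coupled=False: E = sum_i 0.5 * (Phi(x_i) - target)^2, each sample alone.
    coupled=True:  E = 0.5 * (mean_i Phi(x_i) - target)^2, so samples interact
                   only through the batch-averaged feature.
    The step is large because the gradient of Phi carries a 1/d factor.
    """
    xs = [list(x) for x in samples]
    n = len(xs)
    for _ in range(n_steps):
        feats = [second_moment(x) for x in xs]
        mean_feat = sum(feats) / n
        for i, x in enumerate(xs):
            d = len(x)
            # Residual multiplying dPhi/dx in the energy gradient.
            resid = (mean_feat - target) / n if coupled else feats[i] - target
            # dPhi/dx_t = 2 * x_t / d
            xs[i] = [v - step * resid * 2.0 * v / d for v in x]
    return xs


# Start from Gaussian white noise, the high-entropy initial distribution.
rng = random.Random(0)
init = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(8)]
target = 0.5

independent = [second_moment(x) for x in descend(init, target, coupled=False)]
mean_field = [second_moment(x) for x in descend(init, target, coupled=True)]

print("independent spread:", max(independent) - min(independent))
print("mean-field spread: ", max(mean_field) - min(mean_field))
print("mean-field average:", sum(mean_field) / len(mean_field))
```

In this toy setting, independent descent drives every sample's feature to the target, so the spread across samples collapses. The coupled variant only constrains the batch average: every sample is rescaled by the same factor at each step, so the sample-to-sample variability of the feature is retained. This is meant only as an intuition for the entropy-control argument, under the assumptions stated above.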