Solving Large-scale Spatial Problems with Convolutional Neural Networks (2306.08191v2)
Abstract: Over the past decade, deep learning research has been accelerated by increasingly powerful hardware, which has facilitated rapid growth in model complexity and in the amount of data ingested. This trend is becoming unsustainable, so a renewed focus on efficiency is necessary. In this paper, we employ transfer learning to improve training efficiency for large-scale spatial problems. We propose that a convolutional neural network (CNN) can be trained on small windows of signals but evaluated on arbitrarily large signals with little to no performance degradation, and we provide a theoretical bound on the resulting generalization error. Our proof leverages the shift-equivariance of CNNs, a property that is underexploited in transfer learning. The theoretical results are supported experimentally in the context of mobile infrastructure on demand (MID). The proposed approach can tackle MID at large scales, with hundreds of agents, which was computationally intractable prior to this work.
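The train-small, evaluate-large transfer described in the abstract rests on the network being fully convolutional: every layer commutes with shifts (for a shift operator T_s and convolutional layer Phi, Phi(T_s x) = T_s Phi(x)), so the same weights define a valid map at any input size. Below is a minimal PyTorch sketch of this idea, not the paper's actual architecture; the layer widths, kernel size, and signal lengths are illustrative assumptions.

```python
import torch
import torch.nn as nn

# A fully convolutional 1-D network: no flatten or fully connected
# layers, so it accepts inputs of any spatial length. Hypothetical
# layer sizes for illustration only.
class FullyConvNet(nn.Module):
    def __init__(self, channels=32, kernel_size=5):
        super().__init__()
        pad = kernel_size // 2  # "same" padding preserves signal length
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size, padding=pad),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
            nn.ReLU(),
            nn.Conv1d(channels, 1, kernel_size, padding=pad),
        )

    def forward(self, x):
        return self.net(x)

model = FullyConvNet()

# Train on small windows (length 64)...
small = torch.randn(8, 1, 64)   # batch of short training windows
out_small = model(small)        # shape: (8, 1, 64)

# ...then evaluate the same weights on a much larger signal (length 4096).
# Because every layer is a convolution, the network transfers with no
# architectural change or retraining; zero padding breaks equivariance
# only near the boundaries, consistent with the small degradation the
# paper bounds.
large = torch.randn(1, 1, 4096)
out_large = model(large)        # shape: (1, 1, 4096)
print(out_small.shape, out_large.shape)
```

In practice one would train such a model on windowed crops of the large spatial signal and run inference on the full signal directly, which is the efficiency gain the abstract claims.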