
Solving Large-scale Spatial Problems with Convolutional Neural Networks (2306.08191v2)

Published 14 Jun 2023 in cs.LG and eess.SP

Abstract: Over the past decade, deep learning research has been accelerated by increasingly powerful hardware, which has facilitated rapid growth in model complexity and in the amount of data ingested. This is becoming unsustainable, and a renewed focus on efficiency is therefore necessary. In this paper, we employ transfer learning to improve training efficiency for large-scale spatial problems. We propose that a convolutional neural network (CNN) can be trained on small windows of signals but evaluated on arbitrarily large signals with little to no performance degradation, and we provide a theoretical bound on the resulting generalization error. Our proof leverages the shift-equivariance of CNNs, a property that is underexploited in transfer learning. The theoretical results are experimentally supported in the context of mobile infrastructure on demand (MID). The proposed approach is able to tackle MID at large scales with hundreds of agents, which was computationally intractable prior to this work.
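
The train-small, evaluate-large idea can be illustrated with a minimal sketch (not the authors' implementation): a fully convolutional network has no fixed-size dense layers, so weights learned on short training windows can be applied directly to much longer signals at evaluation time, relying on the shift-equivariance of convolution. The layer widths, kernel sizes, and signal lengths below are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal sketch: a fully convolutional 1-D network. Because every layer is a
# convolution, the same weights accept inputs of any length, so a model trained
# on short windows can be evaluated on arbitrarily long signals.
model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.Conv1d(16, 16, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.Conv1d(16, 1, kernel_size=5, padding=2),
)

small = torch.randn(8, 1, 64)     # training-sized windows: (batch, channels, length)
large = torch.randn(1, 1, 4096)   # much longer signal at evaluation time

print(model(small).shape)  # torch.Size([8, 1, 64])
print(model(large).shape)  # torch.Size([1, 1, 4096]) -- same weights, larger input
```

Because every layer is convolutional, each output sample depends only on a fixed local neighborhood of the input, which is the property that the training-on-small-windows strategy exploits.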

