
Role of Locality and Weight Sharing in Image-Based Tasks: A Sample Complexity Separation between CNNs, LCNs, and FCNs (2403.15707v1)

Published 23 Mar 2024 in cs.LG, cs.AI, stat.ML, and cs.CV

Abstract: Vision tasks are characterized by the properties of locality and translation invariance. The superior performance of convolutional neural networks (CNNs) on these tasks is widely attributed to the inductive bias of locality and weight sharing baked into their architecture. Existing attempts to quantify the statistical benefits of these biases in CNNs over locally connected neural networks (LCNs) and fully connected neural networks (FCNs) fall into one of the following categories: either they disregard the optimizer and only provide uniform convergence upper bounds with no separating lower bounds, or they consider simplistic tasks that do not truly mirror the locality and translation invariance found in real-world vision tasks. To address these deficiencies, we introduce the Dynamic Signal Distribution (DSD) classification task, which models an image as consisting of $k$ patches, each of dimension $d$, where the label is determined by a $d$-sparse signal vector that can freely appear in any one of the $k$ patches. On this task, for any orthogonally equivariant algorithm like gradient descent, we prove that CNNs require $\tilde{O}(k+d)$ samples, whereas LCNs require $\Omega(kd)$ samples, establishing the statistical advantages of weight sharing in translation-invariant tasks. Furthermore, LCNs need $\tilde{O}(k(k+d))$ samples, compared to $\Omega(k^2 d)$ samples for FCNs, showcasing the benefits of locality in local tasks. Additionally, we develop information-theoretic tools for analyzing randomized algorithms, which may be of interest for statistical research.
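
To make the DSD setup concrete, the following is a minimal sketch of a data generator for one possible instantiation: Gaussian background noise in every patch, balanced ±1 labels, and the label-carrying signal planted additively in a uniformly random patch. The function name `sample_dsd`, the noise model, and the choice of label rule are our assumptions for illustration; the paper's exact distributional details may differ.

```python
import numpy as np

def sample_dsd(n, k, d, signal, noise_std=1.0, rng=None):
    """Draw n samples from a DSD-style task (illustrative instantiation only).

    Each image consists of k patches of dimension d. One uniformly chosen
    patch carries the d-dimensional signal scaled by the +/-1 label; all
    other patches contain pure Gaussian noise.
    """
    rng = np.random.default_rng(rng)
    X = noise_std * rng.standard_normal((n, k, d))   # background noise in every patch
    y = rng.choice([-1.0, 1.0], size=n)              # balanced binary labels
    pos = rng.integers(0, k, size=n)                 # index of the patch holding the signal
    X[np.arange(n), pos] += y[:, None] * signal      # plant the signed signal in that patch
    return X.reshape(n, k * d), y

# Example: k = 8 patches of dimension d = 16; the signal is sparse within its patch.
k, d = 8, 16
signal = np.zeros(d)
signal[:4] = 2.0
X, y = sample_dsd(n=1000, k=k, d=d, signal=signal, rng=0)
```

Under this sketch, the signal vector occupies only $d$ of the $kd$ image coordinates, and the patch it occupies changes from sample to sample, which is the translation-invariant structure the abstract refers to.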

