Papers
Topics
Authors
Recent
Search
2000 character limit reached

The Hessian perspective into the Nature of Convolutional Neural Networks

Published 16 May 2023 in cs.LG and stat.ML | (2305.09088v1)

Abstract: While Convolutional Neural Networks (CNNs) have long been investigated and applied, as well as theorized, we aim to provide a slightly different perspective into their nature -- through the perspective of their Hessian maps. The reason is that the loss Hessian captures the pairwise interaction of parameters and therefore forms a natural ground to probe how the architectural aspects of CNN get manifested in its structure and properties. We develop a framework relying on Toeplitz representation of CNNs, and then utilize it to reveal the Hessian structure and, in particular, its rank. We prove tight upper bounds (with linear activations), which closely follow the empirical trend of the Hessian rank and hold in practice in more general settings. Overall, our work generalizes and establishes the key insight that, even in CNNs, the Hessian rank grows as the square root of the number of parameters.

Citations (7)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.