The Outer Product Structure of Neural Network Derivatives (1810.03798v1)

Published 9 Oct 2018 in cs.LG and stat.ML

Abstract: In this paper, we show that feedforward and recurrent neural networks exhibit an outer product derivative structure but that convolutional neural networks do not. This structure makes it possible to use higher-order information without needing approximations or infeasibly large amounts of memory, and it may also provide insights into the geometry of neural network optima. The ability to easily access these derivatives also suggests a new, geometric approach to regularization. We then discuss how this structure could be used to improve training methods, increase network robustness and generalizability, and inform network compression methods.
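To make the phrase "outer product derivative structure" concrete, here is a minimal sketch of the simplest first-order case only. It assumes a single bias-free linear layer y = Wx with a squared-error loss, which is an illustrative setup and not the paper's own derivation or notation. In that setting the gradient of the loss with respect to W is the outer product of the output error and the input, so it can be stored as two vectors of sizes m and n rather than as a full m-by-n matrix. All function and variable names are hypothetical.

```python
# Illustrative assumption: one linear layer y = W x, loss = 0.5 * ||y - t||^2.
# Then dL/dW = (y - t) x^T, an outer product of two vectors.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def loss(W, x, t):
    """Squared-error loss of the linear layer against target t."""
    y = matvec(W, x)
    return 0.5 * sum((y_i - t_i) ** 2 for y_i, t_i in zip(y, t))

def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[u_i * v_j for v_j in v] for u_i in u]

def grad_outer_product(W, x, t):
    """Analytic gradient dL/dW, built from the error vector and the input."""
    delta = [y_i - t_i for y_i, t_i in zip(matvec(W, x), t)]
    return outer(delta, x)

def grad_finite_difference(W, x, t, eps=1e-6):
    """Central-difference gradient, used only to check the analytic form."""
    grad = []
    for i, row in enumerate(W):
        grad_row = []
        for j in range(len(row)):
            W_plus = [r[:] for r in W]
            W_minus = [r[:] for r in W]
            W_plus[i][j] += eps
            W_minus[i][j] -= eps
            grad_row.append((loss(W_plus, x, t) - loss(W_minus, x, t)) / (2 * eps))
        grad.append(grad_row)
    return grad

W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
x = [1.0, 2.0, -1.0]
t = [0.0, 1.0]

analytic = grad_outer_product(W, x, t)
numeric = grad_finite_difference(W, x, t)
max_err = max(
    abs(a - n) for row_a, row_n in zip(analytic, numeric) for a, n in zip(row_a, row_n)
)
print(analytic)
print(max_err < 1e-6)
```

The finite-difference check is there only to confirm that the outer-product form matches the true gradient in this toy case. The higher-order derivatives and the recurrent and convolutional cases mentioned in the abstract are not covered by this sketch.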

Citations (3)



Authors (3)