Interpreting and Disentangling Feature Components of Various Complexity from DNNs (2006.15920v2)

Published 29 Jun 2020 in cs.LG, cs.AI, cs.CV, and stat.ML

Abstract: This paper aims to define, quantify, and analyze the feature complexity learned by a deep neural network (DNN). We propose a generic definition of feature complexity. Given the feature of a certain layer in the DNN, our method disentangles feature components of different complexity orders from that feature. We further design a set of metrics to evaluate the reliability, the effectiveness, and the significance of over-fitting of these feature components. Furthermore, we discover a close relationship between feature complexity and DNN performance. As a generic mathematical tool, the feature complexity and the proposed metrics can also be used to analyze the success of network compression and knowledge distillation.
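
The abstract describes the disentanglement only at a high level. Below is a minimal PyTorch sketch of one way to operationalize it, assuming the complexity order of a component is measured by the depth of a small "disentangler" network trained to regress the frozen target feature, so that the order-k component is what a k-layer net reconstructs beyond a (k-1)-layer net. The helper names (`make_disentangler`, `disentangle_components`) and training details are illustrative, not taken from the paper.

```python
# Minimal sketch (not the authors' released code): disentangle feature
# components by complexity order. Assumption: the order-k component is the
# part of a frozen target feature that a k-layer "disentangler" MLP can
# reconstruct beyond what a (k-1)-layer one can, i.e. c_k = g_k(x) - g_{k-1}(x).
import torch
import torch.nn as nn

def make_disentangler(in_dim, out_dim, depth, width=256):
    """An MLP with `depth` nonlinear (Linear + ReLU) layers, used to
    approximate the target feature; `depth` plays the role of the order."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

def disentangle_components(x, feature, max_order=3, epochs=200, lr=1e-3):
    """Fit disentanglers g_1..g_K to a frozen feature by regression and
    return the per-order components [c_1, ..., c_K]."""
    feature = feature.detach()            # the target feature is frozen
    approx = [torch.zeros_like(feature)]  # g_0 := 0 (no order-0 content)
    for k in range(1, max_order + 1):
        g = make_disentangler(x.shape[1], feature.shape[1], depth=k)
        opt = torch.optim.Adam(g.parameters(), lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            nn.functional.mse_loss(g(x), feature).backward()
            opt.step()
        with torch.no_grad():
            approx.append(g(x))
    # Order-k component: what k layers capture beyond k-1 layers.
    return [approx[k] - approx[k - 1] for k in range(1, max_order + 1)]
```

Under this reading, the components telescope: summing c_1 through c_K recovers the deepest approximation g_K(x), and the residual feature - g_K(x) would hold whatever exceeds order K.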
