Information Plane Analysis Visualization in Deep Learning via Transfer Entropy (2404.01364v1)
Abstract: In a feedforward network, Transfer Entropy (TE) can be used to measure the influence that one layer has on another by quantifying the information transfer between them during training. According to the Information Bottleneck principle, a neural model's internal representation should compress the input data as much as possible while still retaining sufficient information about the output. Information Plane analysis is a visualization technique for studying this trade-off between compression and information preservation: it plots the mutual information between the input and the internal representation against the mutual information between the representation and the output. The claim that there is a causal link between information-theoretic compression, as measured by mutual information, and generalization is plausible, but results from different studies conflict. In contrast to mutual information, TE can capture temporal relationships between variables. To explore such links, our novel approach uses TE to quantify the information transfer between neural layers and to perform Information Plane analysis. We obtained encouraging experimental results, opening the way for further investigation.
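For context, the quantity the abstract refers to is Schreiber's transfer entropy from a source process $Y$ to a target process $X$; we restate the standard definition here in our own notation (the paper may use a different parameterization):

$$
T_{Y \to X} \;=\; \sum p\big(x_{t+1},\, x_t^{(k)},\, y_t^{(l)}\big) \,\log \frac{p\big(x_{t+1} \mid x_t^{(k)},\, y_t^{(l)}\big)}{p\big(x_{t+1} \mid x_t^{(k)}\big)},
$$

where $x_t^{(k)}$ denotes the $k$ most recent values of $X$ and $y_t^{(l)}$ the $l$ most recent values of $Y$. In the layer-to-layer setting described above, $X$ and $Y$ would be discretized activation traces of two layers recorded over training steps; the specific discretization is an assumption on our part, not stated in the abstract.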
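Under those assumptions, a minimal plug-in estimator makes the idea concrete. The sketch below is not the authors' implementation; all names such as `te_binary` are hypothetical. It estimates TE between two binarized activation traces with history lengths $k = l = 1$:

```python
import numpy as np
from collections import Counter

def te_binary(src, dst, eps=1e-12):
    """Plug-in estimate of transfer entropy src -> dst (in bits), k = l = 1."""
    n = len(dst) - 1
    # Empirical counts for (x_{t+1}, x_t, y_t), (x_t, y_t), (x_{t+1}, x_t), and x_t.
    triples = Counter(zip(dst[1:], dst[:-1], src[:-1]))
    pairs_xy = Counter(zip(dst[:-1], src[:-1]))
    pairs_xx = Counter(zip(dst[1:], dst[:-1]))
    singles_x = Counter(dst[:-1])
    te = 0.0
    for (x1, x0, y0), c in triples.items():
        p_joint = c / n                              # p(x_{t+1}, x_t, y_t)
        p_full = c / pairs_xy[(x0, y0)]              # p(x_{t+1} | x_t, y_t)
        p_self = pairs_xx[(x1, x0)] / singles_x[x0]  # p(x_{t+1} | x_t)
        te += p_joint * np.log2((p_full + eps) / (p_self + eps))
    return te

# Toy usage: trace `b` copies trace `a` with a one-step lag, so information
# flows a -> b; TE should be high in that direction and near zero in reverse.
rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=5000)
b = np.roll(a, 1)
b[0] = 0
print(te_binary(a, b))   # close to 1 bit
print(te_binary(b, a))   # close to 0
```

In practice, the binarization threshold (e.g., the mean activation per neuron) and the history lengths are free choices that strongly affect the estimate; the paper's actual estimator may differ.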