Closed-Form Interpretation of Neural Network Classifiers with Symbolic Gradients (2401.04978v2)
Abstract: I introduce a unified framework for finding a closed-form interpretation of any single neuron in an artificial neural network. Using this framework, I demonstrate how to interpret neural network classifiers and reveal closed-form expressions for the concepts encoded in their decision boundaries. In contrast to neural-network-based regression, in classification it is in general impossible to express the neural network itself as a symbolic equation, even when the network bases its classification on a quantity that can be written in closed form. The interpretation framework embeds a trained neural network into an equivalence class of functions that encode the same concept; the network is then interpreted by finding an intersection between this equivalence class and the human-readable equations defined by a symbolic search space. The approach is not limited to classifiers or full neural networks and can be applied to arbitrary neurons in hidden layers or latent spaces.
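Below is a minimal, self-contained sketch of the core idea, under assumptions of my own: the equivalence class is taken to be "functions whose normalized input-gradient field matches the classifier's" (since any monotone relabeling of the underlying concept yields the same decision boundary), a toy brute-force candidate list stands in for a real symbolic search space, and finite-difference gradients of an off-the-shelf scikit-learn MLP stand in for exact network gradients. This is an illustration of the gradient-matching idea, not the paper's implementation.

```python
# Sketch: recover the closed-form concept behind a classifier's decision boundary
# by matching normalized input gradients against a small symbolic candidate set.
# Assumptions (mine, not from the paper): brute-force candidates, finite differences.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic task: the hidden concept is r(x) = x1^2 + x2^2; labels split at r = 1.
X = rng.uniform(-2, 2, size=(4000, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)

clf = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
clf.fit(X, y)

def model_grad(x, eps=1e-4):
    """Finite-difference gradient of P(class = 1) with respect to the input x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        xp, xm = x.copy(), x.copy()
        xp[i] += eps
        xm[i] -= eps
        g[i] = (clf.predict_proba(xp[None])[0, 1]
                - clf.predict_proba(xm[None])[0, 1]) / (2 * eps)
    return g

# Tiny stand-in for a symbolic search space: candidate closed-form concepts g(x)
# paired with their analytic gradients.
candidates = {
    "x1 + x2":     lambda x: np.array([1.0, 1.0]),
    "x1 * x2":     lambda x: np.array([x[1], x[0]]),
    "x1^2 + x2^2": lambda x: np.array([2 * x[0], 2 * x[1]]),
    "x1^2 - x2^2": lambda x: np.array([2 * x[0], -2 * x[1]]),
}

# Score each candidate by how well its gradient *direction* matches the classifier's
# gradient direction (sign- and scale-insensitive, because any monotone relabeling
# encodes the same concept and may flip or rescale gradients).
probe = rng.uniform(-2, 2, size=(200, 2))
for name, grad in candidates.items():
    sims = []
    for x in probe:
        gm, gc = model_grad(x), grad(x)
        nm, nc = np.linalg.norm(gm), np.linalg.norm(gc)
        if nm < 1e-8 or nc < 1e-8:
            continue  # skip points where the network's gradient is numerically flat
        sims.append(abs(np.dot(gm, gc)) / (nm * nc))
    print(f"{name:12s} mean |cos| = {np.mean(sims):.3f}")
# On this toy task, x1^2 + x2^2 should score highest: its gradient field aligns
# with the boundary the network actually learned.
```

Matching gradient directions rather than function values is what makes such a search well-posed for classifiers: only the decision boundary, and hence only the direction of the gradient, is identifiable from the classification itself. This is one concrete way to read the abstract's point that the classifier cannot simply be fitted by a symbolic equation, even when the concept it encodes has a closed form.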