Full Bayesian Significance Testing for Neural Networks (2401.13335v1)

Published 24 Jan 2024 in stat.ML, cs.AI, and cs.LG

Abstract: Significance testing aims to determine whether a proposition about the population distribution holds, given observations. However, traditional significance testing usually requires deriving the distribution of a test statistic, and therefore fails to handle complex nonlinear relationships. In this paper, we propose Full Bayesian Significance Testing for neural networks, called nFBST, to overcome the limited relationship-characterization ability of traditional approaches. A Bayesian neural network is used to fit nonlinear, multi-dimensional relationships with small error, and computing an evidence value avoids the difficult theoretical derivation of a test-statistic distribution. In addition, nFBST can test not only global significance but also local and instance-wise significance, which previous testing methods do not address. Moreover, nFBST is a general framework that can be instantiated with different importance measures, such as Grad-nFBST, LRP-nFBST, DeepLIFT-nFBST, and LIME-nFBST. A range of experiments on both simulated and real data demonstrate the advantages of our method.
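
This page does not include the paper's implementation, but the evidence-value computation the abstract refers to is the classical Pereira-Stern Full Bayesian Significance Test, which can be sketched as follows. Given posterior samples of a feature-importance statistic (here, as a hypothetical stand-in for the paper's measures, an average input gradient under a Bayesian neural network's posterior), the e-value is the posterior mass lying outside the tangential set of the sharp null. All function and variable names below are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the FBST e-value at the core of nFBST, assuming
# posterior samples of a scalar feature-importance statistic are
# already available (e.g., drawn from a Bayesian neural network fitted
# by variational inference; how those samples are produced is not
# shown here and is an assumption of this sketch).
import numpy as np
from scipy.stats import gaussian_kde

def fbst_evalue(posterior_samples: np.ndarray, theta0: float = 0.0) -> float:
    """Pereira-Stern e-value for the sharp null H0: theta = theta0.

    ev(H0) = 1 - P(theta in T | data), where the tangential set
    T = {theta : p(theta | data) > p(theta0 | data)} collects all
    points with higher posterior density than the null value.
    A small e-value is evidence against H0, i.e. the feature matters.
    """
    kde = gaussian_kde(posterior_samples)           # posterior density estimate
    density_at_null = kde(np.asarray([theta0]))[0]  # p(theta0 | data)
    density_at_samples = kde(posterior_samples)     # density at each draw
    # Monte Carlo estimate of the tangential-set probability:
    in_tangential_set = density_at_samples > density_at_null
    return 1.0 - in_tangential_set.mean()

# Toy usage with hypothetical posterior draws of a feature's average
# input gradient (the "Grad" style of measure named in the abstract).
rng = np.random.default_rng(0)
relevant = rng.normal(loc=0.8, scale=0.1, size=2000)    # mass far from 0
irrelevant = rng.normal(loc=0.0, scale=0.1, size=2000)  # mass near 0
print(fbst_evalue(relevant))    # close to 0: reject H0, feature significant
print(fbst_evalue(irrelevant))  # close to 1: no evidence against H0
```

Under this framing, global, local, and instance-wise tests would differ only in which posterior statistic is fed in: an importance averaged over the whole data set, over a subregion, or evaluated at a single input.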
