
Adversarially-Robust Inference on Trees via Belief Propagation (2404.00768v1)

Published 31 Mar 2024 in cs.DS, math.PR, math.ST, stat.ML, and stat.TH

Abstract: We introduce and study the problem of posterior inference on tree-structured graphical models in the presence of a malicious adversary who can corrupt some observed nodes. In the well-studied broadcasting on trees model, corresponding to the ferromagnetic Ising model on a $d$-regular tree with zero external field, when a natural signal-to-noise ratio exceeds one (the celebrated Kesten-Stigum threshold), the posterior distribution of the root given the leaves is bounded away from $\mathrm{Ber}(1/2)$, and carries nontrivial information about the sign of the root. This posterior distribution can be computed exactly via dynamic programming, also known as belief propagation. We first confirm a folklore belief that a malicious adversary who can corrupt an inverse-polynomial fraction of the leaves of their choosing makes this inference impossible. Our main result is that accurate posterior inference about the root vertex given the leaves is possible when the adversary is constrained to make corruptions at a $\rho$-fraction of randomly-chosen leaf vertices, so long as the signal-to-noise ratio exceeds $O(\log d)$ and $\rho \leq c \varepsilon$ for some universal $c > 0$. Since inference becomes information-theoretically impossible when $\rho \gg \varepsilon$, this amounts to an information-theoretically optimal fraction of corruptions, up to a constant multiplicative factor. Furthermore, we show that the canonical belief propagation algorithm performs this inference.
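The exact dynamic program mentioned in the abstract can be illustrated with a short sketch. The code below simulates broadcasting on a complete d-ary tree (each child copies its parent's spin, flipping independently with some probability), computes the root posterior by bottom-up belief propagation, and flips a random fraction of leaves to mimic the constrained adversary. This is only an illustrative sketch under standard model conventions (uniform prior on the root, i.i.d. edge noise); the function names and the specific corruption routine are not from the paper.

```python
import random

def broadcast(depth, d, flip_p, rng=random):
    """Simulate broadcasting on a complete d-ary tree of the given depth.

    Returns (root_spin, leaf_spins): each child copies its parent's +/-1 spin,
    flipping independently with probability flip_p.
    """
    def spread(spin, level):
        if level == depth:
            return [spin]
        leaves = []
        for _ in range(d):
            child = spin if rng.random() > flip_p else -spin
            leaves.extend(spread(child, level + 1))
        return leaves

    root = rng.choice([-1, 1])
    return root, spread(root, 0)

def root_posterior(leaves, d, flip_p, depth):
    """Exact posterior P(root = +1 | leaves) via bottom-up belief propagation.

    Each message is a pair (P(subtree leaves | node = +1),
                            P(subtree leaves | node = -1)).
    """
    msgs = [(1.0, 0.0) if s == +1 else (0.0, 1.0) for s in leaves]
    for _ in range(depth):
        nxt = []
        for i in range(0, len(msgs), d):  # group siblings under one parent
            lp = lm = 1.0
            for (cp, cm) in msgs[i:i + d]:
                # Marginalize the child's spin given each parent spin.
                lp *= (1 - flip_p) * cp + flip_p * cm
                lm *= flip_p * cp + (1 - flip_p) * cm
            nxt.append((lp, lm))
        msgs = nxt
    (lp, lm), = msgs
    return lp / (lp + lm)  # uniform prior on the root

def corrupt_random_fraction(leaves, rho, rng=random):
    """Flip a rho-fraction of uniformly random leaves (illustrative adversary)."""
    out = list(leaves)
    for i in rng.sample(range(len(out)), int(rho * len(out))):
        out[i] = -out[i]
    return out
```

With zero edge noise the posterior concentrates entirely on the true root, and flipping all observed leaves flips the posterior by symmetry; running `root_posterior` on corrupted leaves shows how a small random corruption fraction degrades, but need not destroy, the inference.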

