
Prospects for inconsistency detection using large language models and sheaves (2401.16713v1)

Published 30 Jan 2024 in cs.CY, cs.CL, and math.AT

Abstract: We demonstrate that LLMs can produce reasonable numerical ratings of the logical consistency of claims. We also outline a mathematical approach based on sheaf theory for lifting such ratings to hypertexts such as laws, jurisprudence, and social media and evaluating their consistency globally. This approach is a promising avenue to increasing consistency in and of government, as well as to combating mis- and disinformation and related ills.
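The approach outlined in the abstract — obtaining local (e.g., pairwise) consistency ratings from an LLM and lifting them to a global consistency judgment over a hypertext — can be illustrated with a toy sketch. The snippet below is not the paper's algorithm; it assumes a hypothetical set of claims with made-up pairwise ratings in [0, 1], and brute-forces truth assignments (candidate "global sections") to find the one that disagrees least with the local ratings, in the spirit of sheaf-theoretic consistency measures.

```python
from itertools import product

# Hypothetical hypertext: claims (nodes) with made-up pairwise
# LLM-style consistency ratings in [0, 1] (1 = fully consistent).
claims = ["A", "B", "C"]
ratings = {("A", "B"): 0.9, ("B", "C"): 0.8, ("A", "C"): 0.1}

def global_inconsistency(claims, ratings):
    """Search over boolean truth assignments (candidate 'global
    sections') and return the smallest total disagreement between
    the assignment and the local pairwise ratings."""
    best = float("inf")
    for bits in product([0, 1], repeat=len(claims)):
        truth = dict(zip(claims, bits))
        # A pair 'agrees' when both claims get the same truth value;
        # penalize the gap between the rating and that agreement.
        penalty = sum(
            abs(r - (1.0 if truth[u] == truth[v] else 0.0))
            for (u, v), r in ratings.items()
        )
        best = min(best, penalty)
    return best

print(global_inconsistency(claims, ratings))  # prints 1.0
```

A residual of 0 would mean some global truth assignment is perfectly compatible with every local rating; here no assignment can satisfy all three ratings at once, so the positive residual flags a global inconsistency that no single pair reveals on its own. Brute force is exponential in the number of claims and serves only to make the "local-to-global" lifting concrete.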

