Prospects for inconsistency detection using large language models and sheaves (2401.16713v1)
Abstract: We demonstrate that LLMs can produce reasonable numerical ratings of the logical consistency of claims. We also outline a mathematical approach based on sheaf theory for lifting such ratings to hypertexts such as laws, jurisprudence, and social media and evaluating their consistency globally. This approach is a promising avenue to increasing consistency in and of government, as well as to combating mis- and disinformation and related ills.