LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools and Self-Explanations (2401.12576v2)
Abstract: Interpretability tools that offer explanations in the form of a dialogue have demonstrated their efficacy in enhancing users' understanding (Slack et al., 2023; Shen et al., 2023), as one-off explanations may fall short in providing sufficient information to the user. Current solutions for dialogue-based explanations, however, often require external tools and modules and are not easily transferable to tasks they were not designed for. With LLMCheckup, we present an easily accessible tool that allows users to chat with any state-of-the-art LLM about its behavior. We enable LLMs to generate explanations and perform user intent recognition without fine-tuning, by connecting them with a broad spectrum of Explainable AI (XAI) methods, including white-box explainability tools such as feature attributions, and self-explanations (e.g., for rationale generation). LLM-based (self-)explanations are presented as an interactive dialogue that supports follow-up questions and generates suggestions. LLMCheckup provides tutorials for operations available in the system, catering to individuals with varying levels of expertise in XAI and supporting multiple input modalities. We introduce a new parsing strategy that substantially enhances the user intent recognition accuracy of the LLM. Finally, we showcase LLMCheckup for the tasks of fact checking and commonsense question answering.
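The abstract describes mapping free-text user questions to system operations (user intent recognition) as a core step of the dialogue loop. As a rough illustration of the idea, not of the paper's actual parsing strategy, the sketch below maps an utterance to the closest of a few hypothetical XAI operations using simple string similarity; the operation names and phrasings are invented for this example.

```python
from difflib import SequenceMatcher

# Hypothetical operation catalogue: each XAI operation is described by a few
# canonical phrasings; a user utterance is mapped to the best-matching one.
OPERATIONS = {
    "feature_attribution": ["which words mattered most", "token importance"],
    "rationale": ["explain your answer", "why did you predict this"],
    "counterfactual": ["what would change the prediction", "flip the label"],
}

def parse_intent(utterance: str) -> str:
    """Return the operation whose canonical phrasings best match the utterance."""
    def score(op: str) -> float:
        # Similarity of the utterance to the closest canonical phrasing.
        return max(SequenceMatcher(None, utterance.lower(), phrase).ratio()
                   for phrase in OPERATIONS[op])
    return max(OPERATIONS, key=score)

print(parse_intent("why did you predict this label?"))
```

In LLMCheckup the recognition is performed by the LLM itself rather than by surface matching; this snippet only illustrates the input/output contract of an intent parser in such a system.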
- Explanations for CommonsenseQA: New Dataset and Models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 3050–3065, Online. Association for Computational Linguistics.
- CrossCheck: Rapid, reproducible, and interpretable model evaluation. In Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances, pages 79–85, Online. Association for Computational Linguistics.
- A tale of pronouns: Interpretability informs gender bias mitigation for fairer instruction-tuned machine translation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 3996–4014, Singapore. Association for Computational Linguistics.
- PromptSource: An integrated development environment and repository for natural language prompts. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 93–104, Dublin, Ireland. Association for Computational Linguistics.
- Neural machine translation by jointly learning to align and translate. In 3rd International Conference on Learning Representations (ICLR 2015).
- Pythia: A suite for analyzing large language models across training and scaling. In Proceedings of the 40th International Conference on Machine Learning, ICML’23. JMLR.org.
- Petals: Collaborative inference and fine-tuning of large models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 558–568, Toronto, Canada. Association for Computational Linguistics.
- Follow the successful herd: Towards explanations for improved use and mental models of natural language systems. In Proceedings of the 28th International Conference on Intelligent User Interfaces, IUI ’23, page 220–239, New York, NY, USA. Association for Computing Machinery.
- Language models are few-shot learners. In Advances in Neural Information Processing Systems, volume 33, pages 1877–1901. Curran Associates, Inc.
- e-SNLI: Natural language inference with natural language explanations. In Advances in Neural Information Processing Systems, volume 31. Curran Associates, Inc.
- A conversational interface for interacting with machine learning models. In XAILA @ ICAIL.
- SemEval-2017 task 1: Semantic textual similarity multilingual and crosslingual focused evaluation. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 1–14, Vancouver, Canada. Association for Computational Linguistics.
- DISCO: Distilling counterfactuals with large language models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5514–5528, Toronto, Canada. Association for Computational Linguistics.
- AugGPT: Leveraging chatGPT for text data augmentation. arXiv, abs/2302.13007.
- 8-bit optimizers via block-wise quantization. In International Conference on Learning Representations.
- Joseph Enguehard. 2023. Sequential integrated gradients: a simple but effective method for explaining language models. In Findings of the Association for Computational Linguistics: ACL 2023, pages 7555–7565, Toronto, Canada. Association for Computational Linguistics.
- InterroLang: Exploring NLP models and datasets through dialogue-based explanations. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 5399–5421, Singapore. Association for Computational Linguistics.
- Juliana J. Ferreira and Mateus S. Monteiro. 2020. What are people doing about XAI user experience? a survey on AI explainability research and practice. In Design, User Experience, and Usability. Design for Contemporary Interactive Environments, pages 56–73, Cham. Springer International Publishing.
- OPTQ: Accurate quantization for generative pre-trained transformers. In The Eleventh International Conference on Learning Representations.
- A survey on automated fact-checking. Transactions of the Association for Computational Linguistics, 10:178–206.
- Diagnosing AI explanation methods with folk concepts of behavior. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’23, page 247, New York, NY, USA. Association for Computing Machinery.
- Mistral 7B. arXiv, abs/2310.06825.
- Large language models are zero-shot reasoners. In Advances in Neural Information Processing Systems, volume 35, pages 22199–22213. Curran Associates, Inc.
- Rethinking explainability as a dialogue: A practitioner’s perspective. HCAI @ NeurIPS 2022.
- XMD: An end-to-end framework for interactive explanation-based debugging of NLP models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 264–273, Toronto, Canada. Association for Computational Linguistics.
- Retrieval-augmented generation for knowledge-intensive NLP tasks. In Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS’20, Red Hook, NY, USA. Curran Associates Inc.
- DAIL: Data augmentation for in-context learning via self-paraphrase. arXiv, abs/2311.03319.
- In-context learning with many demonstration examples. arXiv, abs/2302.04931.
- MultiViz: Towards visualizing and understanding multimodal models. In The Eleventh International Conference on Learning Representations.
- Post-hoc interpretability for neural NLP: A survey. ACM Comput. Surv., 55(8).
- ConvXAI: a system for multimodal interaction with any black-box explainer. Cognitive Computation, 15(2):613–644.
- Few-shot self-rationalization with natural language prompts. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 410–424, Seattle, United States. Association for Computational Linguistics.
- A survey on deep learning and explainability for automatic report generation from medical images. ACM Comput. Surv., 54(10s).
- George A. Miller. 1995. WordNet: A lexical database for English. Commun. ACM, 38(11):39–41.
- IFAN: An explainability-focused interaction framework for humans and NLP models. arXiv, abs/2303.03124.
- Orca: Progressive learning from complex explanation traces of GPT-4. arXiv, abs/2306.02707.
- SemEval-2018 task 11: Machine comprehension using commonsense knowledge. In Proceedings of the 12th International Workshop on Semantic Evaluation, pages 747–757, New Orleans, Louisiana. Association for Computational Linguistics.
- The RefinedWeb dataset for Falcon LLM: Outperforming curated corpora with web data only. In Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track.
- Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982–3992, Hong Kong, China. Association for Computational Linguistics.
- "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, page 1135–1144, New York, NY, USA. Association for Computing Machinery.
- Explaining NLP models via minimal contrastive editing (MiCE). In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 3840–3852, Online. Association for Computational Linguistics.
- Tailor: Generating and perturbing text with semantic controls. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3194–3213, Dublin, Ireland. Association for Computational Linguistics.
- COVID-fact: Fact extraction and verification of real-world claims on COVID-19 pandemic. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 2116–2129, Online. Association for Computational Linguistics.
- Inseq: An interpretability toolkit for sequence generation models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 421–435, Toronto, Canada. Association for Computational Linguistics.
- ConvXAI: Delivering heterogeneous AI explanations via conversations to support human-AI scientific writing. In Computer Supported Cooperative Work and Social Computing, CSCW ’23 Companion, page 384–387, New York, NY, USA. Association for Computing Machinery.
- Constrained language models yield few-shot semantic parsers. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7699–7715, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Deep inside convolutional networks: Visualising image classification models and saliency maps. In Workshop at International Conference on Learning Representations.
- Explaining machine learning models with interactive natural language conversations using TalkToModel. Nature Machine Intelligence.
- Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 3319–3328. PMLR.
- Evaluating semantic parsing against a simple web-based question answering model. In Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017), pages 161–167, Vancouver, Canada. Association for Computational Linguistics.
- The language interpretability tool: Extensible, interactive visualizations and analysis for NLP models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 107–118, Online. Association for Computational Linguistics.
- FEVER: a large-scale dataset for fact extraction and VERification. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 809–819, New Orleans, Louisiana. Association for Computational Linguistics.
- Llama 2: Open foundation and fine-tuned chat models. arXiv, abs/2307.09288.
- Fairseq S2T: Fast speech-to-text modeling with fairseq. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: System Demonstrations, pages 33–39, Suzhou, China. Association for Computational Linguistics.
- Plan-and-solve prompting: Improving zero-shot chain-of-thought reasoning by large language models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2609–2634, Toronto, Canada. Association for Computational Linguistics.
- Self-consistency improves chain of thought reasoning in language models. In The Eleventh International Conference on Learning Representations.
- Chain of thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems.
- Neural text generation with unlikelihood training. In International Conference on Learning Representations.
- Reframing human-AI collaboration for generating free-text explanations. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 632–658, Seattle, United States. Association for Computational Linguistics.
- iSee: Intelligent sharing of explanation experience by users for users. In Companion Proceedings of the 28th International Conference on Intelligent User Interfaces, IUI ’23 Companion, page 79–82, New York, NY, USA. Association for Computing Machinery.
- Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online. Association for Computational Linguistics.
- Polyjuice: Generating counterfactuals for explaining, evaluating, and improving models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 6707–6723, Online. Association for Computational Linguistics.
- Large language models as optimizers. arXiv, abs/2309.03409.
- Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3911–3921, Brussels, Belgium. Association for Computational Linguistics.
- Matthew D. Zeiler and Rob Fergus. 2014. Visualizing and understanding convolutional networks. In Computer Vision – ECCV 2014, pages 818–833, Cham. Springer International Publishing.
- May I ask a follow-up question? Understanding the benefits of conversations in neural network explainability. arXiv, abs/2309.13965.
Authors: Qianli Wang, Tatiana Anikina, Nils Feldhus, Josef van Genabith, Leonhard Hennig, Sebastian Möller