
Interactive Prompt Debugging with Sequence Salience (2404.07498v1)

Published 11 Apr 2024 in cs.CL, cs.AI, cs.HC, and cs.LG

Abstract: We present Sequence Salience, a visual tool for interactive prompt debugging with input salience methods. Sequence Salience builds on widely used salience methods for text classification and single-token prediction, and extends this to a system tailored for debugging complex LLM prompts. Our system is well-suited for long texts, and expands on previous work by 1) providing controllable aggregation of token-level salience to the word, sentence, or paragraph level, making salience over long inputs tractable; and 2) supporting rapid iteration where practitioners can act on salience results, refine prompts, and run salience on the new output. We include case studies showing how Sequence Salience can help practitioners work with several complex prompting strategies, including few-shot, chain-of-thought, and constitutional principles. Sequence Salience is built on the Learning Interpretability Tool, an open-source platform for ML model visualizations, and code, notebooks, and tutorials are available at http://goo.gle/sequence-salience.
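To make the aggregation idea in the abstract concrete, the sketch below shows one way token-level salience scores could be collapsed to word-level scores. This is an illustrative simplification, not the Sequence Salience implementation: it assumes SentencePiece-style tokens (a leading "▁", U+2581, marks a word boundary) and assumes simple summation as the aggregation rule, whereas the tool described in the paper supports controllable aggregation up to the sentence and paragraph level.

```python
# Minimal sketch: aggregate token-level salience scores to word level.
# Assumes SentencePiece-style tokens where a leading "▁" starts a new word,
# and sums scores within each word. Not the paper's actual implementation.

def aggregate_to_words(tokens, scores):
    """Return (words, word_scores) by summing token scores within each word."""
    words, word_scores = [], []
    for tok, score in zip(tokens, scores):
        if tok.startswith("▁") or not words:
            # New word: strip the boundary marker and open a new bucket.
            words.append(tok.lstrip("▁"))
            word_scores.append(score)
        else:
            # Continuation piece: extend the current word and accumulate its score.
            words[-1] += tok
            word_scores[-1] += score
    return words, word_scores


if __name__ == "__main__":
    # Hypothetical token-level salience scores for a short prompt fragment.
    tokens = ["▁The", "▁quick", "▁tokeni", "zation", "▁example"]
    scores = [0.05, 0.10, 0.40, 0.25, 0.20]
    for word, s in zip(*aggregate_to_words(tokens, scores)):
        print(f"{word}\t{s:.2f}")
```

The same bucketing step could be repeated over sentence or paragraph boundaries to reproduce the coarser views described in the abstract, which is what makes salience over long prompts tractable to inspect.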
