KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts (2403.04758v1)

Published 7 Mar 2024 in cs.HC, cs.AI, cs.CY, and cs.LG

Abstract: Recent growth in the popularity of LLMs has led to their increased usage for summarizing, predicting, and generating text, making it vital to help researchers and engineers understand how and why they work. We present KnowledgeVis, a human-in-the-loop visual analytics system for interpreting LLMs using fill-in-the-blank sentences as prompts. By comparing predictions between sentences, KnowledgeVis reveals learned associations that intuitively connect what LLMs learn during training to natural language tasks downstream, helping users create and test multiple prompt variations, analyze predicted words using a novel semantic clustering technique, and discover insights using interactive visualizations. Collectively, these visualizations help users identify the likelihood and uniqueness of individual predictions, compare sets of predictions between prompts, and summarize patterns and relationships between predictions across all prompts. We demonstrate the capabilities of KnowledgeVis with feedback from six NLP experts as well as three different use cases: (1) probing biomedical knowledge in two domain-adapted models; and (2) evaluating harmful identity stereotypes and (3) discovering facts and relationships between three general-purpose models.


Summary

  • The paper introduces KnowledgeVIS, a visual analytics system that interprets BERT-based models using dynamic fill-in-the-blank prompts.
  • It employs semantic clustering and coordinated visualizations like heat maps and scatter plots to reveal underlying model associations.
  • Evaluations in biomedical, bias, and knowledge-probing contexts demonstrate how KnowledgeVIS deepens understanding of model behavior and supports more responsible deployment.

Interpreting BERT-Based LLMs with KnowledgeVIS

The paper "KnowledgeVIS: Interpreting LLMs by Comparing Fill-in-the-Blank Prompts" presents a human-in-the-loop visual analytics system designed to scrutinize LLMs, specifically focusing on BERT-based models. This system, KnowledgeVIS, aims to bridge the gap in understanding what these models have learned and how they apply this knowledge to downstream tasks. Utilizing a fill-in-the-blank approach, KnowledgeVIS provides researchers and engineers with interactive visualizations that illuminate the associations these models have ingrained during their training processes.

Overview of KnowledgeVIS

KnowledgeVIS integrates multi-dimensional interaction techniques with visual analytics to enhance the interpretability of LLMs like BERT. Key features include:

  1. Prompt Interface: Users can create and modify fill-in-the-blank prompts. The system allows flexible subject substitution within templates to generate diverse prompt variations, which is essential for probing different kinds of relationships and associations within the model.
  2. Prediction Clustering: A novel semantic clustering technique groups predicted words by taxonomic similarity, giving users a clear structure for analyzing relationships between predictions (both steps are sketched in code after this list).
  3. Visualization Tools: The system employs multiple coordinated views:
    • A Heat Map giving an at-a-glance view of prediction probabilities.
    • A Set View that compares set membership and rank across prompts via parallel tag clouds.
    • A Scatter Plot using a dust-and-magnet metaphor to examine relationships among predictions across multiple prompts.
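
The first two features can be approximated with off-the-shelf tooling. The sketch below, which assumes the Hugging Face transformers pipeline, the sentence-transformers package, and scikit-learn are installed, queries a masked model with prompt variations and then groups the predicted words with Ward-linkage agglomerative clustering. It substitutes general-purpose sentence embeddings for the paper's combination of WordNet taxonomic similarity and word embeddings, so it is illustrative rather than a reimplementation; the prompts and cluster count are placeholders.

```python
# Illustrative sketch, not the paper's implementation: query a masked LM with
# fill-in-the-blank prompt variations, then group the predicted words.
from transformers import pipeline
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Prompt variations generated from one template, as in the prompt interface.
prompts = [
    "The nurse said that [MASK] would help.",
    "The doctor said that [MASK] would help.",
]

# Top-k predicted words and their probabilities for each prompt.
predictions = {
    p: [(r["token_str"], r["score"]) for r in fill_mask(p, top_k=10)]
    for p in prompts
}

# Cluster the union of predicted words by embedding similarity (a stand-in for
# the paper's taxonomic, WordNet-based similarity). The paper selects the
# number of clusters automatically; here it is fixed for brevity.
words = sorted({w for preds in predictions.values() for w, _ in preds})
embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(words)
labels = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(embeddings)

for cluster in sorted(set(labels)):
    print(cluster, [w for w, lab in zip(words, labels) if lab == cluster])
```

In the full system, these raw predictions and clusters feed the heat map, set view, and scatter plot rather than being printed.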

Evaluative Use Cases

KnowledgeVIS's utility is demonstrated through several use cases with different models and tasks:

  • Biomedical Knowledge: By evaluating domain-specific models like SciBERT and PubMedBERT, the system identifies how grammar and phrasing impact the models' understanding and association capabilities, particularly in sensitive domains like healthcare.
  • Identity Stereotypes: It uncovers biases in general-purpose models such as BERT and RoBERTa, revealing contextual gender, racial, and political stereotypes that standard evaluation benchmarks can miss (a minimal comparison sketch follows this list).
  • Knowledge Probing: The system compares larger and smaller models (e.g., BERT vs. DistilBERT) on complex reasoning, revealing differences in how they handle verb-based versus noun-based associations.
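
As a hedged illustration of the stereotype-probing workflow (the model, template, and subjects below are placeholders, not the paper's evaluation set), one can difference the top-k prediction sets of two otherwise identical prompts to surface completions unique to each identity group:

```python
# Illustrative sketch: compare which completions are unique to each
# identity-varying prompt by differencing the top-k prediction sets.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

template = "The {} worked as a [MASK]."  # hypothetical probe template
subjects = ["man", "woman"]

top = {
    s: {r["token_str"] for r in fill_mask(template.format(s), top_k=15)}
    for s in subjects
}

shared = set.intersection(*top.values())
for s in subjects:
    print(f"unique to '{s}':", sorted(top[s] - shared))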

Implications and Future Work

KnowledgeVIS underscores the potential for human-in-the-loop systems to complement quantitative benchmarks with nuanced, qualitative insight into LLM behavior. By surfacing and visualizing predictive associations, it lets practitioners evaluate and iteratively improve models, especially for applications that demand high reliability and careful ethical scrutiny.

Future research could expand this approach to include other transformer-based models, explore automated prompt generation, and tackle multi-modal inputs for comprehensive interpretability. Additionally, embedding the system within model training processes could further elucidate model learning dynamics, potentially guiding the development of less biased and more robust LLMs.

In conclusion, KnowledgeVIS represents a significant step towards demystifying the logic and learning within BERT-based LLMs. Its focus on interpretability could inform better engineering practices, enhancing model transparency and trustworthiness, especially in complex, real-world applications where understanding emergent model behaviors is crucial.
