Revealing the Unwritten: Visual Investigation of Beam Search Trees to Address Language Model Prompting Challenges (2310.11252v1)
Abstract: The growing popularity of generative LLMs has amplified interest in interactive methods to guide model outputs. Among these methods, prompt refinement is considered one of the most effective means of influencing output. We identify several challenges associated with prompting LLMs, categorized into data- and model-specific, linguistic, and socio-linguistic challenges. Addressing these issues requires a comprehensive examination of model outputs, including runner-up candidates and their corresponding probabilities. Beam search, the prevalent algorithm for sampling model outputs, inherently produces a tree that can supply this information. Consequently, we introduce an interactive visual method for investigating the beam search tree, facilitating analysis of the decisions made by the model during generation. We quantitatively show the value of exposing the beam search tree and present five detailed analysis scenarios addressing the identified challenges. Our methodology validates existing results and offers additional insights.
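To illustrate the information the abstract refers to, the following is a minimal, self-contained sketch of beam search over a toy next-token model. It is not the paper's implementation; the toy vocabulary, log-probabilities, and function names are illustrative. The point is that every expansion step naturally yields a tree of candidates with scores, including the runner-ups that pruning discards, and a visual tool can expose exactly this structure.

```python
# Toy next-token log-probability model: maps a context (tuple of tokens)
# to candidate continuations. All tokens and values are illustrative.
TOY_LM = {
    (): {"the": -0.4, "a": -1.2},
    ("the",): {"cat": -0.7, "dog": -0.9},
    ("a",): {"cat": -1.0, "dog": -0.5},
    ("the", "cat"): {"sat": -0.3, "ran": -1.5},
    ("the", "dog"): {"ran": -0.4, "sat": -1.1},
    ("a", "cat"): {"sat": -0.6, "ran": -0.8},
    ("a", "dog"): {"ran": -0.2, "sat": -1.3},
}

def beam_search(steps, beam_width=2):
    """Run beam search, returning both the surviving beams and the full
    tree of expanded candidates, including pruned runner-ups."""
    beams = [((), 0.0)]  # (token sequence, cumulative log-probability)
    tree = []            # every candidate considered, grouped per step
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for tok, logp in TOY_LM.get(seq, {}).items():
                candidates.append((seq + (tok,), score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        tree.append(candidates)          # runner-ups stay visible here...
        beams = candidates[:beam_width]  # ...but only the top-k survive
    return beams, tree

beams, tree = beam_search(steps=3)
for seq, score in beams:
    print(" ".join(seq), round(score, 2))  # prints the two surviving beams
```

Here the returned `tree` retains the near-miss candidates and their cumulative log-probabilities at each step, which is the raw material a beam-search-tree visualization can draw on.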