Which questions should I answer? Salience Prediction of Inquisitive Questions (2404.10917v2)

Published 16 Apr 2024 in cs.CL

Abstract: Inquisitive questions -- open-ended, curiosity-driven questions people ask as they read -- are an integral part of discourse processing (Kehler and Rohde, 2017; Onea, 2016) and comprehension (Prince, 2004). Recent work in NLP has taken advantage of question generation capabilities of LLMs to enhance a wide range of applications. But the space of inquisitive questions is vast: many questions can be evoked from a given context. So which of those should be prioritized to find answers? Linguistic theories, unfortunately, have not yet provided an answer to this question. This paper presents QSALIENCE, a salience predictor of inquisitive questions. QSALIENCE is instruction-tuned over our dataset of linguist-annotated salience scores of 1,766 (context, question) pairs. A question scores high on salience if answering it would greatly enhance the understanding of the text (Van Rooy, 2003). We show that highly salient questions are empirically more likely to be answered in the same article, bridging potential questions (Onea, 2016) with Questions Under Discussion (Roberts, 2012). We further validate our findings by showing that answering salient questions is an indicator of summarization quality in news.

References (44)
  1. Gerry Altmann and Mark Steedman. 1988. Interaction with context during human sentence processing. Cognition, 30(3):191–238.
  2. Ron Artstein and Massimo Poesio. 2008. Survey article: Inter-coder agreement for computational linguistics. Computational Linguistics, 34(4):555–596.
  3. Anton Benz and Katja Jasinskaja. 2017. Questions under discussion: From sentence to discourse. Discourse Processes, 54(3):177–186.
  4. Children’s questions: A mechanism for cognitive development. Monographs of the Society for Research in Child Development, pages i–129.
  5. Scaling instruction-finetuned language models. Journal of Machine Learning Research, 25(70):1–53.
  6. DiffQG: Generating questions to summarize factual changes. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 3088–3101, Dubrovnik, Croatia. Association for Computational Linguistics.
  7. Dan Cristea and Bonnie Webber. 1997. Expectations in incremental discourse processing. In 35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics, pages 88–95, Madrid, Spain. Association for Computational Linguistics.
  8. Beth Davey and Susan McBride. 1986. Effects of question-generation training on reading comprehension. Journal of Educational Psychology, 78(4):256.
  9. QLoRA: Efficient finetuning of quantized LLMs. Advances in Neural Information Processing Systems, 36.
  10. Qlarify: Bridging scholarly abstracts and papers with recursively expandable summaries. arXiv preprint arXiv:2310.07581.
  11. “What makes a question inquisitive?” A study on type-controlled inquisitive question generation. In Proceedings of the 11th Joint Conference on Lexical and Computational Semantics, pages 240–257, Seattle, Washington. Association for Computational Linguistics.
  12. News summarization and evaluation in the era of GPT-3. arXiv preprint arXiv:2209.12356.
  13. Embrace divergence for richer insights: A multi-document summarization benchmark and a case study on summarizing diverse information from news articles. arXiv preprint arXiv:2309.09369.
  14. Mistral 7B. arXiv preprint arXiv:2310.06825.
  15. Andrew Kehler and Hannah Rohde. 2017. Evaluating an expectation-driven question-under-discussion model of discourse interpretation. Discourse Processes, 54(3):219–238.
  16. Inquisitive question generation for high level text comprehension. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6544–6555, Online. Association for Computational Linguistics.
  17. Discourse comprehension: A question answering framework to represent sentence connections. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 11752–11764, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
  18. Klaus Krippendorff. 2011. Computing Krippendorff’s alpha-reliability.
  19. Discord questions: A computational approach to diversity analysis in news coverage. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 5180–5194, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
  20. What makes good in-context examples for GPT-3? Preprint, arXiv:2101.06804.
  21. Lost in the middle: How language models use long contexts. Preprint, arXiv:2307.03172.
  22. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
  23. Ilya Loshchilov and Frank Hutter. 2018. Decoupled weight decay regularization. In International Conference on Learning Representations.
  24. FollowupQG: Towards information-seeking follow-up question generation. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 252–271, Nusa Dua, Bali. Association for Computational Linguistics.
  25. Conditional generation with a question-answering blueprint. Transactions of the Association for Computational Linguistics, 11:974–996.
  26. A question answering framework for decontextualizing user-facing snippets from scientific documents. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 3194–3212, Singapore. Association for Computational Linguistics.
  27. Edgar Onea. 2016. Potential questions at the semantics-pragmatics interface. In Potential Questions at the Semantics-Pragmatics Interface. Brill.
  28. Background summarization of event timelines. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 8111–8136, Singapore. Association for Computational Linguistics.
  29. Michael Prince. 2004. Does active learning work? A review of the research. Journal of Engineering Education, 93(3):223–231.
  30. Sudha Rao and Hal Daumé III. 2018. Learning to ask good questions: Ranking clarification questions using neural expected value of perfect information. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2737–2746, Melbourne, Australia. Association for Computational Linguistics.
  31. Annotation guidelines for questions under discussion and information structure. Information structure in lesser-described languages. Studies in prosody and syntax, pages 403–443.
  32. Craige Roberts. 2012. Information structure: Towards an integrated formal theory of pragmatics. Semantics and Pragmatics, 5(6):1–69.
  33. Learning to retrieve prompts for in-context learning. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2655–2671, Seattle, United States. Association for Computational Linguistics.
  34. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.
  35. InfoLossQA: Characterizing and recovering information loss in text simplification. arXiv preprint arXiv:2401.16475.
  36. Jan Van Kuppevelt. 1995. Discourse structure, topicality and questioning. Journal of Linguistics, 31(1):109–147.
  37. Robert Van Rooy. 2003. Questioning to resolve decision problems. Linguistics and Philosophy, 26:727–763.
  38. Alex Warstadt. 2020. “Just” don’t ask: Exclusives and potential questions. In Proceedings of Sinn und Bedeutung, volume 24, pages 373–390.
  39. Chain-of-thought prompting elicits reasoning in large language models. Preprint, arXiv:2201.11903.
  40. TED-Q: TED talks and the questions they evoke. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 1118–1127, Marseille, France. European Language Resources Association.
  41. QUDeval: The evaluation of questions under discussion discourse parsing. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 5344–5363, Singapore. Association for Computational Linguistics.
  42. Elaborative simplification as implicit questions under discussion. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 5525–5537, Singapore. Association for Computational Linguistics.
  43. TinyLlama: An open-source small language model. Preprint, arXiv:2401.02385.
  44. Benchmarking large language models for news summarization. Preprint, arXiv:2301.13848.

Summary

  • The paper introduces QSALIENCE, a salience prediction model that identifies inquisitive questions whose answers most enhance text comprehension.
  • It builds on a dataset of 1,766 linguist-annotated (context, question) pairs from news articles and TED talks; instruction-tuned LLMs such as Flan-T5 outperform zero-shot and few-shot baselines, including GPT-4.
  • Empirical results show that highly salient questions are more likely to be answered within the same article, with implications for summarization and discourse processing.

Enhancing Context Understanding through Salient Inquisitive Question Prediction

Introduction to Inquisitive Question Prediction

This paper focuses on characterizing and predicting salient inquisitive questions, those whose answers most enhance text comprehension. Prior NLP work has generated inquisitive questions to serve a range of applications, but typically without prioritizing question relevance or utility. The paper introduces a salience prediction model, QSALIENCE, trained to identify inquisitive questions whose answers substantially enhance understanding of the text.

Theoretical Background and Prior Work

Inquisitive questions arise naturally as readers seek to satisfy their curiosity about the content of a text. Because many questions can be evoked from a given context, the challenge is to distinguish the most relevant, or “salient,” ones. Linguistically, this connects potential questions (Onea, 2016) with Questions Under Discussion (Roberts, 2012), both of which structure discourse progression. The authors review previous models that, while effective at generating many valid questions, lack mechanisms for assessing their relative importance or salience.

Data and Methodology

The core of this research is a dataset of 1,766 inquisitive questions sourced from English news articles and TED talks, each annotated for salience based on contextual utility: a question scores high if answering it would substantially enhance comprehension of the text. Using these linguist-annotated salience scores, the authors instruction-tune LLMs such as Flan-T5; the resulting models significantly outperform zero-shot and few-shot prompting approaches, including GPT-4.
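
To make the setup concrete, below is a minimal sketch of how one annotated (context, question) pair could be rendered as an instruction-tuning example. The prompt wording, the 1–5 scale, and all field names are illustrative assumptions made for this summary, not the paper's exact template.

```python
# Hypothetical formatting of a linguist-annotated (context, question) pair
# into an (input, target) example for instruction-tuning a salience predictor.
from dataclasses import dataclass

@dataclass
class SalienceExample:
    context: str    # the article text read so far
    question: str   # the inquisitive question evoked at this point
    salience: int   # linguist-annotated score, assumed here to be on a 1-5 scale

def to_instruction_pair(ex: SalienceExample) -> dict:
    """Render one annotated pair as an (input, target) training example."""
    prompt = (
        "Rate how useful answering the question would be for understanding "
        "the text, on a scale from 1 (not useful) to 5 (essential).\n\n"
        f"Context: {ex.context}\n"
        f"Question: {ex.question}\n"
        "Salience:"
    )
    return {"input": prompt, "target": str(ex.salience)}

example = SalienceExample(
    context="The city council approved the budget after months of debate.",
    question="Why did the debate take months?",
    salience=4,
)
print(to_instruction_pair(example))
```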

Empirical Findings and Model Evaluation

The empirical analysis shows a strong correlation between a question's predicted salience and its likelihood of being answered later in the same article, suggesting that salience aligns with discourse progression. Detailed performance metrics further show that the instruction-tuned model predicts salience more accurately than established LLM baselines.
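
As a hedged illustration of this analysis (not the paper's exact statistics), one simple way to test the salience/answerability link is to correlate predicted scores with binary answered-in-article labels; the values below are toy data purely so the snippet runs.

```python
# Toy check that higher predicted salience tracks being answered later in the
# same article; Spearman correlation is one reasonable choice of statistic.
from scipy.stats import spearmanr

predicted_salience = [4.5, 2.0, 3.5, 1.0, 5.0, 2.5]   # model outputs, assumed 1-5 scale
answered_in_article = [1, 0, 1, 0, 1, 0]              # 1 = question answered downstream

rho, p_value = spearmanr(predicted_salience, answered_in_article)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```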

Practical Applications and Future Implications

The practical implications are most evident in news summarization: answering salient questions is an indicator of summary quality, so focusing on salient questions can make summarization more targeted and informative, potentially improving reader engagement. Looking forward, the salience prediction model holds promise across domains where information extraction and contextual interpretation are critical.
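
The summarization connection can be sketched as a simple coverage metric: score a summary by the fraction of high-salience questions it answers. Everything in this sketch is a hypothetical stand-in; `is_answered` would in practice be a QA or entailment model, and the salience threshold of 4 is an assumption.

```python
# Hypothetical summary-quality probe: fraction of high-salience questions
# that a candidate summary answers.
def is_answered(question: str, summary: str) -> bool:
    """Placeholder answerability check; a real system would use a QA model."""
    content_words = [w for w in question.lower().split() if len(w) > 4]
    return any(w in summary.lower() for w in content_words)

def salient_question_coverage(questions, saliences, summary, threshold=4):
    """Fraction of questions with salience >= threshold answered by the summary."""
    salient = [q for q, s in zip(questions, saliences) if s >= threshold]
    if not salient:
        return 0.0
    return sum(is_answered(q, summary) for q in salient) / len(salient)

questions = ["Why was the budget delayed?", "Who opposed the plan?"]
saliences = [5, 4]
summary = "The budget was delayed by disputes over school funding."
print(salient_question_coverage(questions, saliences, summary))  # -> 0.5
```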

Conclusion

This paper marks a significant step toward automating the identification of salient inquisitive questions in text. By bridging linguistic theory with practical NLP applications, it presents a robust, validated approach to predicting question salience, opening avenues for future work in discourse processing, question generation, summarization, and contextual information retrieval.
