Improving Reading Comprehension Question Generation with Data Augmentation and Overgenerate-and-rank (2306.08847v1)
Abstract: Reading comprehension is a crucial skill in many aspects of education, including language learning, cognitive development, and fostering early literacy skills in children. Automated answer-aware reading comprehension question generation has significant potential to scale up learner support in educational activities. One key technical challenge in this setting is that there can be multiple questions, sometimes very different from each other, with the same answer; a trained question generation method may not necessarily know which question human educators would prefer. To address this challenge, we propose 1) a data augmentation method that enriches the training dataset with diverse questions given the same context and answer and 2) an overgenerate-and-rank method to select the best question from a pool of candidates. We evaluate our method on the FairytaleQA dataset, showing a 5% absolute improvement in ROUGE-L over the best existing method. We also demonstrate the effectiveness of our method in generating harder, "implicit" questions, where the answers are not contained in the context as text spans.
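The overgenerate-and-rank idea described in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical implementation, not the authors' actual code: it assumes a fine-tuned Hugging Face seq2seq question-generation checkpoint (placeholder name "my-qg-model"), an "answer: ... context: ..." input format, nucleus sampling to overgenerate a diverse candidate pool, and ranking by length-normalized log-likelihood under the generator; the paper's own ranking criterion would replace that scoring step.

```python
# Sketch of answer-aware overgenerate-and-rank question generation.
# All model names, the prompt format, and the ranking score are
# illustrative assumptions, not taken from the paper.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "my-qg-model"  # hypothetical fine-tuned QG checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME).eval()

def overgenerate_and_rank(context: str, answer: str, pool_size: int = 10) -> str:
    """Sample a pool of candidate questions for one (context, answer)
    pair, then return the top-ranked candidate."""
    prompt = f"answer: {answer} context: {context}"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)

    # Overgenerate: nucleus sampling yields a diverse pool of questions
    # that all target the same answer span.
    with torch.no_grad():
        sequences = model.generate(
            **inputs,
            do_sample=True,
            top_p=0.95,
            num_return_sequences=pool_size,
            max_new_tokens=64,
        )
    candidates = tokenizer.batch_decode(sequences, skip_special_tokens=True)

    # Rank: score each candidate by its average token log-likelihood
    # (negative cross-entropy loss) under the generator.
    def score(question: str) -> float:
        labels = tokenizer(question, return_tensors="pt").input_ids
        with torch.no_grad():
            out = model(**inputs, labels=labels)
        return -out.loss.item()

    return max(candidates, key=score)
```

A learned ranker, for example one trained on educator preferences over question pairs, could be substituted for the `score` function without changing the overgenerate step.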
- Nischal Ashok Kumar
- Nigel Fernandez
- Zichao Wang
- Andrew Lan