
Exploring Answer Information Methods for Question Generation with Transformers (2312.03483v1)

Published 6 Dec 2023 in cs.CL and cs.LG

Abstract: There has been a lot of work on question generation in which different methods of providing the target answer as input have been employed. This experimentation has mostly been carried out for RNN-based models. We use three different methods and their combinations for incorporating answer information and explore their effect on several automatic evaluation metrics. The methods used are answer prompting, a custom product method that combines answer embeddings with encoder outputs, selecting sentences from the input paragraph that contain answer-related information, and a separate cross-attention block in the decoder that attends to the answer. We observe that answer prompting without any additional methods obtains the best ROUGE and METEOR scores. Additionally, we use a custom metric to calculate how many of the generated questions have the same answer as the answer used to generate them.


Summary

  • The paper demonstrates that direct answer prompting outperforms other integration methods for question generation.
  • It evaluates transformer-based models, using BART on the SQuAD dataset, to compare answer prompting (AP), an answer-embedding product method (CP), and answer-aware attention (AA) techniques.
  • Findings indicate that the simpler answer prompting approach yields strong question quality, guiding future improvements in QG models.

Introduction

Question generation (QG) is a critical task in numerous domains, such as educational assessment, information retrieval, and conversational AI. It involves creating questions from various input types, including passages of text, images, or structured data. While recent work has focused on text-based QG, the central challenge remains understanding the input's context well enough to formulate coherent and relevant questions. Here, transformer-based models, particularly for generating questions from text, have shown promising results.

Background and Problem Definition

Previous works have primarily utilized recurrent neural network (RNN) models, including LSTM and GRU, as well as transformer models like BART and T5 for text-based QG. Innovations include leveraging linguistic features and external knowledge bases to refine the generated questions. However, the way answer information is integrated into the question generation process varies. The paper examines the impact of different answer integration methods on the quality of questions produced by a specific transformer model, BART. Furthermore, it explores question generation in two contexts: one that is based on provided answers (answer-aware) and one without explicit answer cues (answer-agnostic).

Methodology and Experimental Design

The paper builds on the SQuAD dataset, a collection of question-answer pairs used to train and benchmark QG models. The authors experiment with three main techniques to embed answer information into the question generation process:

  1. Answer Prompting (AP), where the answer is directly provided to the model as part of the input sequence (see the sketch after this list).
  2. Answer Embeddings and Encoder Output Products (CP), where an encoder's output is modulated by the answer through a product operation and subsequently used to inform the decoder.
  3. Answer-Aware Attention Mechanisms (AA), where a separate decoder attention block is dedicated to the answer embeddings.
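
For illustration, the snippet below sketches how answer prompting (item 1) could be set up with a pretrained BART model from the Hugging Face transformers library. The separator-based input format and the facebook/bart-base checkpoint are assumptions made for this sketch rather than the paper's exact configuration, and in practice a checkpoint fine-tuned for question generation would be needed to obtain sensible questions.

```python
from transformers import BartForConditionalGeneration, BartTokenizerFast

# Illustrative checkpoint; a QG fine-tuned BART would be used in practice.
tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

context = (
    "The Amazon rainforest covers much of the Amazon basin of South America, "
    "spanning an area of 5,500,000 square kilometres."
)
answer = "5,500,000 square kilometres"

# Answer prompting (AP): prepend the target answer to the context, separated by
# the tokenizer's separator token, so the encoder sees the answer explicitly.
source = f"{answer} {tokenizer.sep_token} {context}"
inputs = tokenizer(source, return_tensors="pt", truncation=True, max_length=512)

# Decode a question conditioned on the answer-prompted input.
output_ids = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```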

Additionally, these strategies are tested in combination with one another and with a 'related sentences' approach (RS), in which only the sentences containing the answer are fed to the model. The performance of each approach is measured with automatic evaluation metrics, namely ROUGE-L and METEOR, complemented by an accuracy check in which a question answering model verifies whether the generated questions indeed lead back to the original answers; a sketch of such a check follows.
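
The answer-consistency check can be approximated as follows: an extractive QA model answers each generated question against its source paragraph, and the predicted span is compared with the answer that was used to generate the question. The QA checkpoint and the normalised exact-match rule below are illustrative assumptions; the specific QA model and matching rule used in the paper may differ.

```python
import re
import string

from transformers import pipeline

# Illustrative extractive QA model for checking generated questions.
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

def normalize(text: str) -> str:
    """SQuAD-style normalisation: lowercase, drop punctuation and articles."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def answer_consistency(examples) -> float:
    """Fraction of generated questions whose QA-predicted answer matches the
    answer used to generate them. Each example is a dict with keys
    'context', 'question' (generated), and 'answer' (the target answer)."""
    hits = 0
    for ex in examples:
        pred = qa(question=ex["question"], context=ex["context"])["answer"]
        hits += int(normalize(pred) == normalize(ex["answer"]))
    return hits / len(examples)
```

A stricter variant could replace exact match with token-level F1 between the predicted and target answers.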

Results and Analysis

The findings reveal that the Answer Prompting method outperformed the other strategies on the chosen metrics. Combining methods yielded mixed results: in some cases a combination slightly improved over a single method, but notably the combination of AP and CP, with or without RS, showed a minor decrease in performance compared to AP alone. These results highlight the importance of answer representation and positioning in generating high-quality questions, especially in transformer-based architectures, which are sensitive to the input structure.

In conclusion, this paper offers valuable insights into the optimal utilization of answer information within question generation models. The indication that straightforward answer prompting provides the best results simplifies the process and allows future research to build upon a more refined baseline. Future work could extend these findings to diverse transformer models and investigate the relationship between model architecture and the efficacy of answer information techniques.
