Leaf: Multiple-Choice Question Generation (2201.09012v1)

Published 22 Jan 2022 in cs.CL and cs.AI

Abstract: Testing with quiz questions has proven to be an effective way to assess and improve the educational process. However, manually creating quizzes is tedious and time-consuming. To address this challenge, we present Leaf, a system for generating multiple-choice questions from factual text. In addition to being very well suited for the classroom, Leaf could also be used in an industrial setting, e.g., to facilitate onboarding and knowledge sharing, or as a component of chatbots, question answering systems, or Massive Open Online Courses (MOOCs). The code and the demo are available on https://github.com/KristiyanVachev/Leaf-Question-Generation.

Authors (6)

Kristiyan Vachev
Momchil Hardalov
Georgi Karadzhov
Georgi Georgiev
Ivan Koychev
Preslav Nakov


Summary

The paper introduces an automated system that generates MCQs from educational text, significantly reducing the manual workload for educators.
It employs a multi-module design, including a REST API and a dedicated MCQ generator, to produce question-answer pairs with contextually relevant distractors.
The underlying T5 models are fine-tuned on SQuAD1.1 and RACE; the paper reports a validation cross-entropy loss of 1.17 for question generation and BLEU-1 scores of 46.37, 32.19, and 34.47 for the three generated distractors.

Leaf: Multiple-Choice Question Generation

The paper "Leaf: Multiple-Choice Question Generation" introduces an automated system that generates multiple-choice questions (MCQs) from educational text, addressing the substantial effort required to create quizzes by hand. Educators often spend a significant share of their time, sometimes up to 50%, writing assessment questions, especially in university settings where large question banks are needed so that students cannot simply memorize or share answers. Leaf aims to reduce this burden with a tool for question creation that can be used in Massive Open Online Courses (MOOCs) and traditional classrooms. The system also has potential applications in industrial contexts, such as employee onboarding and knowledge sharing.

System Components and Functionality

Leaf consists of three main modules: a client interface, a REST API, and a Multiple-Choice Question (MCQ) Generator Module. Together they take educational text as input and produce question-answer pairs along with distractors. The system builds on earlier question generation work by adopting a neural approach: it uses the T5 Transformer model, fine-tuned on datasets such as SQuAD1.1 and RACE, to generate both questions and distractors.
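The flow through the MCQ Generator Module can be sketched in a few lines of Python. This is a minimal illustration, not Leaf's actual code: the names `MultipleChoiceQuestion`, `generate_mcqs`, `qa_generator`, and `distractor_generator` are hypothetical, and the two generator callables stand in for the fine-tuned T5 models, which are not loaded here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class MultipleChoiceQuestion:
    """One generated item: a question, its correct answer, and wrong options."""
    question: str
    answer: str
    distractors: List[str] = field(default_factory=list)


# Hypothetical interfaces standing in for the fine-tuned models.
# qa_generator(text, count) -> list of (question, answer) pairs
QAGenerator = Callable[[str, int], List[Tuple[str, str]]]
# distractor_generator(text, question, answer) -> candidate distractors
DistractorGenerator = Callable[[str, str, str], List[str]]


def generate_mcqs(
    text: str,
    count: int,
    qa_generator: QAGenerator,
    distractor_generator: DistractorGenerator,
    max_distractors: int = 3,
) -> List[MultipleChoiceQuestion]:
    """Turn a passage into MCQs: first question-answer pairs, then distractors."""
    questions = []
    for question, answer in qa_generator(text, count):
        # Track normalized options so the answer and duplicates are excluded.
        seen = {answer.strip().lower()}
        distractors: List[str] = []
        for candidate in distractor_generator(text, question, answer):
            if len(distractors) >= max_distractors:
                break
            key = candidate.strip().lower()
            if key and key not in seen:
                seen.add(key)
                distractors.append(candidate.strip())
        questions.append(MultipleChoiceQuestion(question, answer, distractors))
    return questions
```

In a deployment like the one described, a REST endpoint would call a function of this kind and return the resulting items to the client as JSON. Keeping the two generators as injected callables makes it easy to swap models or to test the assembly logic with stubs.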

Question generation and answer generation are combined into a single multi-task model. The researchers fine-tuned the T5 Transformer model for this task over five epochs, reaching a validation cross-entropy loss of 1.17. For distractor generation, T5 was fine-tuned in a similar way, yielding BLEU-1 scores of 46.37, 32.19, and 34.47 for the first, second, and third distractors, respectively. In addition, the system uses sense2vec to propose semantically similar distractors, which adds variety to the answer options.
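To make the distractor metric concrete, the sketch below computes BLEU-1 for a single candidate against a single reference: clipped unigram precision multiplied by the brevity penalty, scaled to 0-100. It assumes lowercased whitespace tokenization and one reference per candidate; the paper's exact tokenization and evaluation script may differ, and the function name `bleu1` is illustrative.

```python
import math
from collections import Counter


def bleu1(candidate: str, reference: str) -> float:
    """Single-reference BLEU-1 on a 0-100 scale (assumed whitespace tokens)."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens:
        return 0.0

    # Clipped unigram precision: each candidate token is credited at most
    # as many times as it appears in the reference.
    ref_counts = Counter(ref_tokens)
    cand_counts = Counter(cand_tokens)
    clipped = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    precision = clipped / len(cand_tokens)

    # Brevity penalty: candidates shorter than the reference are penalized.
    c, r = len(cand_tokens), len(ref_tokens)
    brevity_penalty = 1.0 if c >= r else math.exp(1 - r / c)

    return 100.0 * brevity_penalty * precision


# A generated distractor that shares two of three words with the reference.
score = bleu1("the blue planet", "the red planet")
```

Here `score` is two thirds of the maximum, because two of the three candidate unigrams match and the lengths are equal. Corpus-level scores such as those reported above aggregate counts over many candidate-reference pairs rather than averaging per-sentence values.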

Implications and Future Directions

Leaf's ability to generate MCQs automatically has clear implications for education, where it can be used to assess student learning outcomes, support self-assessment, and identify knowledge gaps. Producing assessments without substantial educator input fits the scalability requirements of MOOCs. Furthermore, because Leaf is open source, researchers and educators can adapt and extend the tool for their own data and pedagogical aims.

Looking forward, potential enhancements include experimenting with larger pre-trained Transformer models to improve question generation quality. Acquiring more diverse training data, particularly datasets that reflect the complexity and multilingual nature of MOOC content, could further validate and extend the system's capabilities. Given the lack of specialized datasets for question generation, creating and curating new datasets from real-world educational applications is a crucial area for future research and development.

The research presents a practical framework for automating quiz question design and illustrates how machine learning can support educational technology. By making assessment creation more efficient, Leaf changes how quizzes can be produced at scale rather than merely speeding up existing manual processes.

GitHub

GitHub - KristiyanVachev/Leaf-Question-Generation: Easy to use and understand multiple-choice question generation algorithm using T5 Transformers. (119 stars)