
CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model (2403.10326v1)

Published 15 Mar 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Manually designing cloze tests consumes enormous time and effort. The major challenge lies in selecting the wrong options (distractors): carefully designed distractors improve the effectiveness of learner ability assessment. This motivates the idea of automatically generating cloze distractors. In this paper, we investigate cloze distractor generation by exploring the use of pre-trained language models (PLMs) as an alternative for candidate distractor generation. Experiments show that the PLM-enhanced model brings a substantial performance improvement. Our best-performing model advances the state-of-the-art result from 14.94 to 34.17 (NDCG@10 score). Our code and dataset are available at https://github.com/AndyChiangSH/CDGP.
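The headline numbers (14.94 vs. 34.17) are NDCG@10 scores over ranked lists of candidate distractors. As a point of reference, here is a minimal stdlib-only sketch of how NDCG@k is conventionally computed; the relevance labels below are purely illustrative (1 = a gold distractor appears at that rank, 0 = not), not data from the paper.

```python
import math

def dcg_at_k(relevances, k=10):
    # Discounted cumulative gain: relevance discounted by log2 of rank (1-indexed).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    # Normalize by the DCG of the ideal (descending-relevance) ordering.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Toy ranked list of 10 candidate distractors for one cloze blank.
ranked = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
print(ndcg_at_k(ranked, 10))
```

A corpus-level score is then the mean of this quantity over all cloze items, so higher values indicate that gold distractors are ranked nearer the top of the candidate list.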

