Enhancing the Learning Experience: Using Vision-Language Models to Generate Questions for Educational Videos (2505.01790v1)
Abstract: Web-based educational videos offer flexible learning opportunities and are becoming increasingly popular. However, improving user engagement and knowledge retention remains a challenge. Automatically generated questions can activate learners and support their knowledge acquisition. Further, they can help teachers and learners assess their understanding. While large language and vision-language models have been employed in various tasks, their application to question generation for educational videos remains underexplored. In this paper, we investigate the capabilities of current vision-language models for generating learning-oriented questions for educational video content. We assess (1) out-of-the-box models' performance; (2) fine-tuning effects on content-specific question generation; (3) the impact of different video modalities on question quality; and (4) in a qualitative study, question relevance, answerability, and difficulty levels of generated questions. Our findings delineate the capabilities of current vision-language models, highlighting the need for fine-tuning and addressing challenges in question diversity and relevance. We identify requirements for future multimodal datasets and outline promising research directions.
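The abstract names an "out-of-the-box" setting without showing what it looks like in practice. The sketch below is a minimal illustration of that setting: prompting an off-the-shelf vision-language model with a sampled video frame and a transcript excerpt to produce one learning-oriented question. The checkpoint (`llava-hf/llava-1.5-7b-hf`), the single-frame sampling strategy, and the prompt wording are assumptions chosen for illustration only; they are not the paper's actual pipeline, models, or prompts.

```python
# Illustrative sketch, not the paper's method: zero-shot question
# generation with an off-the-shelf vision-language model (LLaVA-1.5
# via Hugging Face transformers) from one video frame + transcript.
import cv2  # pip install opencv-python
from transformers import AutoProcessor, LlavaForConditionalGeneration

MODEL_ID = "llava-hf/llava-1.5-7b-hf"  # assumed checkpoint; any chat VLM works similarly


def sample_middle_frame(video_path: str):
    """Return the middle frame of the video as an RGB array (naive sampling)."""
    cap = cv2.VideoCapture(video_path)
    n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.set(cv2.CAP_PROP_POS_FRAMES, n_frames // 2)
    ok, frame_bgr = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError(f"could not read a frame from {video_path}")
    return cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)


def generate_question(video_path: str, transcript_excerpt: str) -> str:
    """Prompt the VLM to write one quiz question about the video content."""
    frame = sample_middle_frame(video_path)
    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = LlavaForConditionalGeneration.from_pretrained(MODEL_ID)
    # LLaVA-1.5 chat format: the <image> token marks where the frame is inserted.
    prompt = (
        "USER: <image>\n"
        "This frame and transcript excerpt come from an educational video.\n"
        f"Transcript excerpt: {transcript_excerpt}\n"
        "Write one quiz question that tests understanding of the content. "
        "ASSISTANT:"
    )
    inputs = processor(text=prompt, images=frame, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=80)
    return processor.decode(output_ids[0], skip_special_tokens=True)


# Example usage (hypothetical inputs):
# question = generate_question("lecture.mp4", "Photosynthesis converts light energy into ...")
```

Fine-tuning (point 2 of the abstract) or feeding additional modalities such as full transcripts or multiple frames (point 3) would replace this naive zero-shot prompt with task-specific training data and richer inputs.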