
QACP: An Annotated Question Answering Dataset for Assisting Chinese Python Programming Learners (2402.07913v2)

Published 30 Jan 2024 in cs.CL, cs.AI, and cs.HC

Abstract: On online learning platforms, particularly in rapidly growing computer programming courses, answering thousands of student queries incurs considerable human cost. Building intelligent assistant LLMs tailored to programming education requires dedicated data support, but in real application scenarios the data resources for training such LLMs are relatively scarce. To address this data scarcity in intelligent educational systems for programming, this paper proposes a new Chinese question-and-answer dataset for Python learners. To ensure the authenticity and reliability of the sources, we collected the questions from real student queries and categorized them along several dimensions, such as the type of question and the type of learner. This annotation principle is designed to enhance the effectiveness and quality of online programming education, providing a solid data foundation for developing programming teaching assistants (TAs). Furthermore, we conducted comprehensive evaluations of various LLMs proficient in processing and generating Chinese content, highlighting the potential limitations of general-purpose LLMs as intelligent teaching assistants in computer programming courses.

Authors (6)
  1. Rui Xiao
  2. Lu Han
  3. Xiaoying Zhou
  4. Jiong Wang
  5. Na Zong
  6. Pengyu Zhang
