
Using Generative Text Models to Create Qualitative Codebooks for Student Evaluations of Teaching (2403.11984v1)

Published 18 Mar 2024 in cs.CL, cs.AI, and cs.HC

Abstract: Feedback is a critical aspect of improvement. Unfortunately, when there is a lot of feedback from multiple sources, it can be difficult to distill the information into actionable insights. Consider student evaluations of teaching (SETs), which are important sources of feedback for educators. They can give instructors insights into what worked during a semester. A collection of SETs can also be useful to administrators as signals for courses or entire programs. However, on a large scale as in high-enrollment courses or administrative records over several years, the volume of SETs can render them difficult to analyze. In this paper, we discuss a novel method for analyzing SETs using NLP and LLMs. We demonstrate the method by applying it to a corpus of 5,000 SETs from a large public university. We show that the method can be used to extract, embed, cluster, and summarize the SETs to identify the themes they express. More generally, this work illustrates how to use the combination of NLP techniques and LLMs to generate a codebook for SETs. We conclude by discussing the implications of this method for analyzing SETs and other types of student writing in teaching and research settings.
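
To make the extract, embed, cluster, and summarize sequence named in the abstract more concrete, here is a minimal Python sketch of that flow. It is not the paper's implementation. The abstract does not specify the embedding model, clustering algorithm, or prompts, so every step is a deliberately simple stand-in: a bag-of-words count replaces a learned text embedding, a greedy cosine-similarity grouping replaces a real clustering algorithm, and a list of frequent terms replaces an LLM-written theme summary. The sample comments, the stopword list, the 0.3 threshold, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, chosen only for this toy example.
STOPWORDS = {"the", "a", "an", "and", "was", "were", "is", "to", "of", "too"}


def embed(text):
    """Toy bag-of-words vector; a stand-in for a learned text embedding."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors, threshold=0.3):
    """Greedy grouping: a comment joins the first cluster whose first
    member is similar enough, otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of comment indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def summarize(vectors, members, k=2):
    """Stand-in for an LLM summary: the k most frequent terms in a cluster."""
    total = Counter()
    for i in members:
        total.update(vectors[i])
    return [term for term, _ in total.most_common(k)]


# "Extract" step: in practice the comments would be pulled from SET records.
comments = [
    "The lectures were clear and organized.",
    "Clear lectures, well organized slides.",
    "Homework took too long every week.",
    "Too much homework each week.",
]

vectors = [embed(c) for c in comments]
clusters = cluster(vectors)

for members in clusters:
    print(members, summarize(vectors, members))
```

In this sketch each cluster's term list plays the role of a candidate codebook entry. A real pipeline would presumably have an LLM name and describe each theme, with a researcher reviewing the result.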
