Multimodal Classification of Teaching Activities from University Lecture Recordings (2312.17262v1)
Abstract: The worldwide pandemic has greatly changed how online higher education is understood. Teaching is undertaken remotely, and faculty incorporate lecture audio recordings into the teaching material. This new online teaching-learning setting has had a large impact on university classes. While technology that enriches virtual classrooms for teaching has been abundant over the past two years, comparable support for students during online learning has not appeared. To overcome this limitation, our aim is to let students easily access the part of a lesson recording in which the teacher explains a theoretical concept, solves an exercise, or comments on organizational issues of the course. To that end, we present a multimodal classification algorithm that identifies the type of activity being carried out at any time during the lesson, using a transformer-based LLM that exploits features from the audio file and from the automated lecture transcription. The experimental results show that some academic activities are more easily identified from the audio signal, while others require the text transcription. All in all, our contribution aims to recognize the academic activities of a teacher during a lesson.
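To make the idea of combining the two modalities concrete, here is a minimal sketch of one common fusion strategy: concatenating an audio feature vector with a text feature vector and scoring the three activity types named in the abstract. The abstract does not say how the modalities are fused, so the concatenation step, the function names, the feature sizes, and the toy weights below are all assumptions made for illustration, not the authors' method.

```python
import math
from typing import List, Sequence

# The three activity types come from the abstract; all names, sizes,
# and weights below are hypothetical and chosen only for illustration.
ACTIVITIES = ["theory", "exercise", "organization"]


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_segment(
    audio_feats: Sequence[float],
    text_feats: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> str:
    """Fuse both modalities by concatenation and pick the most likely activity."""
    # Assumed fusion step: place audio and text features side by side.
    fused = list(audio_feats) + list(text_feats)
    # One linear score per activity over the fused vector.
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    return ACTIVITIES[probs.index(max(probs))]


# Toy inputs: two audio features and two text features for one lecture segment.
audio = [0.2, 0.9]
text = [0.1, 0.7]
# One weight row per activity, each of length len(audio) + len(text).
toy_weights = [
    [1.0, 0.0, 0.0, 0.0],  # theory
    [0.0, 1.0, 0.0, 1.0],  # exercise
    [0.0, 0.0, 1.0, 0.0],  # organization
]
toy_biases = [0.0, 0.0, 0.0]

print(classify_segment(audio, text, toy_weights, toy_biases))  # exercise
```

In a real system the hand-written vectors would be replaced by embeddings from pretrained audio and text encoders, and the weights would be learned from labelled lecture segments.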