Enhancing Higher Education with Generative AI: A Multimodal Approach for Personalised Learning (2502.07401v1)

Published 11 Feb 2025 in cs.HC and cs.AI

Abstract: This research explores the opportunities of Generative AI (GenAI) in the realm of higher education through the design and development of a multimodal chatbot for an undergraduate course. Leveraging the ChatGPT API for nuanced text-based interactions and Google Bard for advanced image analysis and diagram-to-code conversions, we showcase the potential of GenAI in addressing a broad spectrum of educational queries. Additionally, the chatbot presents a file-based analyser designed for educators, offering deep insights into student feedback via sentiment and emotion analysis, and summarising course evaluations with key metrics. These combinations highlight the crucial role of multimodal conversational AI in enhancing teaching and learning processes, promising significant advancements in educational adaptability, engagement, and feedback analysis. By demonstrating a practical web application, this research underlines the imperative for integrating GenAI technologies to foster more dynamic and responsive educational environments, ultimately contributing to improved educational outcomes and pedagogical strategies.

Enhancing Higher Education with Generative AI: A Multimodal Approach for Personalised Learning

This paper presents a focused exploration of the application of Generative AI (GenAI) within higher education through the development of a multimodal chatbot. The central aim of the research is to enrich personalized learning experiences by combining text, image, and file inputs. The multimodal chatbot is designed to address a wide spectrum of educational queries, helping to bridge gaps left by conventional educational technologies.

The authors integrate state-of-the-art GenAI technologies, using the ChatGPT API for text-based interactions and Google Bard for image analysis and diagram-to-code conversions. The integration of multimodal input capabilities is the primary contribution of this research, enabling the chatbot to process and respond to complex educational queries. This is particularly relevant in disciplines that rely heavily on visual information, such as STEM fields.
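As a rough illustration of how the text-based module might wrap the ChatGPT API, consider the minimal sketch below. The paper does not publish its code, so the model name, system prompt, and function name here are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a course-aware wrapper around the OpenAI Chat Completions API.
# Model name, system prompt, and function names are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

COURSE_SYSTEM_PROMPT = (
    "You are a teaching assistant for an undergraduate course. "
    "Answer student questions clearly and relate them to course concepts."
)

def answer_student_query(question: str, history: list[dict] | None = None) -> str:
    """Send a student question (plus optional chat history) to the model."""
    messages = [{"role": "system", "content": COURSE_SYSTEM_PROMPT}]
    messages += history or []
    messages.append({"role": "user", "content": question})
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # exact model used by the authors is not specified
        messages=messages,
        temperature=0.3,
    )
    return response.choices[0].message.content
```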

The paper further introduces a file-based analyzer to support the teaching process. This component accepts uploaded coursework-related documents and performs sentiment and emotion analysis on them, giving educators comprehensive insight into student feedback and course evaluations. Key metrics such as sentiment scores and keyword summaries offer a substantive tool for pedagogical assessment and improvement.
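A minimal sketch of the sentiment-and-keyword side of such an analyzer is shown below, using NLTK's VADER model as a stand-in. The paper does not name its sentiment library, so the library choice and the function name summarise_feedback are assumptions.

```python
# Rough sketch of the sentiment/keyword analysis for uploaded feedback.
# NLTK's VADER is used as a stand-in; the paper's exact method is unspecified.
import re
from collections import Counter

import nltk
from nltk.corpus import stopwords
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
nltk.download("stopwords", quiet=True)

def summarise_feedback(comments: list[str], top_k: int = 10) -> dict:
    """Return a mean sentiment score and the most frequent keywords."""
    sia = SentimentIntensityAnalyzer()
    scores = [sia.polarity_scores(c)["compound"] for c in comments]

    stop = set(stopwords.words("english"))
    words = [
        w
        for c in comments
        for w in re.findall(r"[a-z']+", c.lower())
        if w not in stop and len(w) > 2
    ]
    return {
        "mean_sentiment": round(sum(scores) / len(scores), 3) if scores else 0.0,
        "top_keywords": Counter(words).most_common(top_k),
    }
```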

The methodology outlined in the paper involves the design of three primary modules: a text-based, an image-based, and a file-based component. The text-based module applies fine-tuning principles to adapt the ChatGPT API to specific educational contexts. The image-based module employs Google Bard's capabilities for interpreting diagrammatic content and converting it into executable code, a notable step given the difficulty of such conversions. The file-based analyzer draws on NLP methods and Plutchik's emotion wheel to generate detailed analyses of feedback data.
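To make the image-based module concrete, here is a hedged sketch of a diagram-to-code call. The paper uses Google Bard, which has no public API, so this sketch substitutes the google-generativeai (Gemini) SDK as a stand-in; the model name, prompt wording, and function name diagram_to_code are assumptions for illustration only.

```python
# Sketch of a diagram-to-code conversion using the google-generativeai SDK
# as a stand-in for Google Bard. Model name and prompt are assumptions.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_GOOGLE_API_KEY")  # placeholder
model = genai.GenerativeModel("gemini-1.5-flash")

def diagram_to_code(image_path: str, target_language: str = "Python") -> str:
    """Ask the vision model to translate a diagram (e.g. a UML class diagram)
    into source code in the requested language."""
    diagram = Image.open(image_path)
    prompt = (
        f"Convert this diagram into runnable {target_language} code. "
        "Preserve the class names, attributes, and relationships shown."
    )
    response = model.generate_content([prompt, diagram])
    return response.text
```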

The authors demonstrate the proof-of-concept with Gradio, an open-source Python library for building web interfaces around machine learning models, showcasing each module's functionality. The demonstration underscores the feasibility of an integrated GenAI system that addresses complex educational requirements within a user-friendly and scalable web application.
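A minimal Gradio sketch of how the three modules could be exposed as a single tabbed web app is given below. It reuses the hypothetical helpers from the earlier sketches (answer_student_query, diagram_to_code, summarise_feedback); the tab names and layout are assumptions rather than the authors' actual interface, and it assumes a recent Gradio version where file components return file paths.

```python
# Hypothetical Gradio app wiring the three sketched modules into one interface.
# answer_student_query, diagram_to_code, summarise_feedback come from the
# earlier sketches above; they are illustrative, not the authors' code.
import gradio as gr

def analyse_uploaded_file(path: str) -> str:
    # Glue code: treat each non-empty line of the uploaded .txt as one comment.
    with open(path, encoding="utf-8") as f:
        comments = [line.strip() for line in f if line.strip()]
    return str(summarise_feedback(comments))

with gr.Blocks(title="Course Chatbot") as demo:
    with gr.Tab("Text Q&A"):
        question = gr.Textbox(label="Ask a course question")
        answer = gr.Textbox(label="Answer")
        question.submit(answer_student_query, inputs=question, outputs=answer)
    with gr.Tab("Diagram to code"):
        image = gr.Image(type="filepath", label="Upload a diagram")
        code = gr.Code(label="Generated code")
        image.upload(diagram_to_code, inputs=image, outputs=code)
    with gr.Tab("Feedback analyser"):
        upload = gr.File(label="Upload feedback file (.txt)", type="filepath")
        report = gr.Textbox(label="Summary")
        upload.upload(analyse_uploaded_file, inputs=upload, outputs=report)

demo.launch()
```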

The implications of this research are substantial, heralding a shift toward more dynamic and responsive educational environments facilitated by GenAI technologies. The combination of multimodal input capabilities and scalable, granular analysis marks significant progress toward more personalized, adaptable, and efficient educational processes. For future research, the integration of additional modalities such as voice and haptics could be explored to further augment interactive capabilities.

In conclusion, while the exploration of multimodal conversational AI in education remains in its formative stages, the developments presented in this paper indicate a promising trajectory. The research provides a foundation for the practical application of GenAI in educational settings, with the potential to contribute to both theoretical advances and applied work in adaptive learning technologies.

Authors (2)
  1. Johnny Chan (2 papers)
  2. Yuming Li (14 papers)