Conversations as a Source for Teaching Scientific Concepts at Different Education Levels (2404.10475v1)
Abstract: Open conversations are among the most engaging forms of teaching. However, creating such conversations in educational software is a complex endeavor, especially when addressing the needs of different audiences. While LLMs hold great promise for educational applications, training them to engage in meaningful and effective conversational teaching across diverse audiences remains a substantial challenge, and no official datasets exist to support this task. This paper presents a novel source for conversational teaching of scientific concepts at various difficulty levels (from preschooler to expert): dialogues taken from video transcripts. We analyse this data source in several ways to show that it offers a diverse array of examples for generating contextually appropriate and natural responses on scientific topics for specific target audiences. It is a freely available and valuable resource for training and evaluating conversation models, as it encompasses organically occurring dialogues. While the raw data is available online, we provide additional metadata for the conversational analysis of dialogues at each level in all available videos.
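The abstract describes dialogues annotated by difficulty level with accompanying metadata. The sketch below shows one way such a resource might be explored per level; it is a minimal illustration only, assuming a hypothetical CSV file (`dialogues.csv`) with `level` and `utterance` columns, none of which are specified by the paper.

```python
# Minimal sketch: per-level dialogue statistics for a transcript dataset.
# Assumes a hypothetical CSV with columns "level" (e.g. "preschooler",
# "expert") and "utterance" (one line of dialogue); the file name and
# column names are assumptions, not part of the released resource.
import csv
from collections import defaultdict

def level_statistics(path: str) -> dict:
    """Compute utterance count, average length, and vocabulary size per level."""
    lengths = defaultdict(list)   # level -> list of utterance lengths (in words)
    vocab = defaultdict(set)      # level -> set of word types

    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            tokens = row["utterance"].lower().split()
            lengths[row["level"]].append(len(tokens))
            vocab[row["level"]].update(tokens)

    return {
        level: {
            "utterances": len(counts),
            "avg_words_per_utterance": sum(counts) / len(counts),
            "vocabulary_size": len(vocab[level]),
        }
        for level, counts in lengths.items()
    }

if __name__ == "__main__":
    for level, stats in level_statistics("dialogues.csv").items():
        print(level, stats)
```

Such surface statistics (utterance length, vocabulary size) are one simple way to check whether the language actually differs across the audience levels described in the paper.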
Authors: Donya Rooein, Dirk Hovy