My Science Tutor (MyST) -- A Large Corpus of Children's Conversational Speech (2309.13347v1)

Published 23 Sep 2023 in cs.CL, cs.SD, and eess.AS

Abstract: This article describes the MyST corpus, developed as part of the My Science Tutor project. It is one of the largest collections of children's conversational speech, comprising approximately 400 hours of audio and some 230K utterances across about 10.5K virtual tutor sessions with around 1.3K third, fourth, and fifth grade students. Of these utterances, 100K have been transcribed thus far. The corpus is freely available (https://myst.cemantix.org) for non-commercial use under a Creative Commons license. It is also available for commercial use (https://boulderlearning.com/resources/myst-corpus/). To date, ten organizations have licensed the corpus for commercial use, and approximately 40 university and other not-for-profit research groups have downloaded it. We hope the corpus will be used to improve automatic speech recognition algorithms, to build and evaluate conversational AI agents for education, and, together, to help accelerate the development of multimodal applications that improve children's excitement about and learning of science and help them learn remotely.

