Mathemyths: Leveraging Large Language Models to Teach Mathematical Language through Child-AI Co-Creative Storytelling (2402.01927v2)
Abstract: Mathematical language is a cornerstone of a child's mathematical development, and children can effectively acquire this language through storytelling with a knowledgeable and engaging partner. In this study, we leverage recent advances in large language models (LLMs) to conduct free-form, creative conversations with children. Consequently, we developed Mathemyths, a joint storytelling agent that takes turns co-creating stories with children while integrating mathematical terms into the evolving narrative. This paper details our development process, illustrating how prompt engineering can optimize LLMs for educational contexts. Through a user study involving 35 children aged 4-8 years, our results suggest that when children interacted with Mathemyths, their learning of mathematical language was comparable to that of children who co-created stories with a human partner. However, we observed differences in how children engaged with co-creation partners of different natures. Overall, we believe that LLM applications, like Mathemyths, offer children a unique conversational experience pertaining to focused learning objectives.