Interactions with Prompt Problems: A New Way to Teach Programming with Large Language Models (2401.10759v1)
Abstract: Large language models (LLMs) have upended decades of pedagogy in computing education. Students previously learned to code by writing solutions to many small problems, with less emphasis on code reading and comprehension. Recent research has shown that free code generation tools powered by LLMs can easily solve introductory programming problems posed in natural language. In this paper, we propose a new way to teach programming with Prompt Problems. Students receive a problem visually, as an illustration of how input should be transformed to output, and must translate it into a prompt for an LLM to decipher. A problem is considered solved when the code generated from the student's prompt passes all test cases. We present the design of this tool, discuss student interactions with it as they learn, and provide insights into this new class of programming problems as well as the design of tools that integrate LLMs.
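The check described in the abstract (a prompt counts as correct when the code generated from it passes every test case) can be illustrated with a minimal sketch. This is not the paper's tool: `fake_llm`, `check_prompt`, `passes_all_tests`, the function name `solve`, and the list-reversal task are all hypothetical stand-ins, and the canned `fake_llm` simply replaces a real model call so the example runs offline.

```python
from typing import Any, Callable, List, Tuple

# Each test case pairs positional arguments with the expected return value.
TestCase = Tuple[Tuple[Any, ...], Any]


def passes_all_tests(source: str, func_name: str, tests: List[TestCase]) -> bool:
    """Run generated source and check the named function against every test."""
    namespace: dict = {}
    try:
        # Assumption: the code is trusted enough to exec here. A real tool
        # would run generated code in a sandbox instead.
        exec(source, namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in tests)
    except Exception:
        # Syntax errors, a missing function, or runtime errors count as failure.
        return False


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns canned code."""
    if "reverse" in prompt.lower():
        return "def solve(xs):\n    return xs[::-1]\n"
    # A vague prompt yields code that does not perform the transformation.
    return "def solve(xs):\n    return xs\n"


def check_prompt(prompt: str,
                 generate: Callable[[str], str],
                 tests: List[TestCase],
                 func_name: str = "solve") -> bool:
    """A prompt is accepted only if the generated code passes all tests."""
    code = generate(prompt)
    return passes_all_tests(code, func_name, tests)


# Hypothetical problem: the visual shows a list being reversed.
TESTS: List[TestCase] = [
    (([1, 2, 3],), [3, 2, 1]),
    (([],), []),
    ((["a"],), ["a"]),
]

if __name__ == "__main__":
    print(check_prompt("Write a function that reverses a list", fake_llm, TESTS))
    print(check_prompt("Do something with the list", fake_llm, TESTS))
```

In this sketch the student never sees or edits the code directly; only the wording of the prompt determines whether the tests pass, which is the interaction the abstract describes.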
Authors:
- James Prather
- Paul Denny
- Juho Leinonen
- David H. Smith IV
- Brent N. Reeves
- Stephen MacNeil
- Brett A. Becker
- Andrew Luxton-Reilly
- Thezyrie Amarouche
- Bailey Kimmel