
Conversational Challenges in AI-Powered Data Science: Obstacles, Needs, and Design Opportunities (2310.16164v1)

Published 24 Oct 2023 in cs.HC

Abstract: Large language models (LLMs) are increasingly employed in data science for tasks like data preprocessing and analytics. However, data scientists encounter substantial obstacles when conversing with LLM-powered chatbots and acting on their suggestions and answers. We conducted a mixed-methods study, including contextual observations, semi-structured interviews (n=14), and a survey (n=114), to identify these challenges. Our findings highlight key issues faced by data scientists, including contextual data retrieval, formulating prompts for complex tasks, adapting generated code to local environments, and refining prompts iteratively. Based on these insights, we propose actionable design recommendations, such as data brushing to support context selection and inquisitive feedback loops to improve communication with AI-based assistants in data-science tools.

Authors (7)
  1. Bhavya Chopra (5 papers)
  2. Ananya Singha (6 papers)
  3. Anna Fariha (12 papers)
  4. Sumit Gulwani (55 papers)
  5. Chris Parnin (19 papers)
  6. Ashish Tiwari (44 papers)
  7. Austin Z. Henley (12 papers)