
How far are AI-powered programming assistants from meeting developers' needs? (2404.12000v2)

Published 18 Apr 2024 in cs.SE

Abstract: Recent in-IDE AI coding assistant tools (ACATs) like GitHub Copilot have significantly impacted developers' coding habits. While some studies have examined their effectiveness, in-depth investigation of the actual assistance process is lacking. To bridge this gap, we simulate real development scenarios encompassing three typical types of software development tasks and recruit 27 computer science students to investigate their behavior with three popular ACATs. Our goal is to comprehensively assess ACATs' effectiveness, explore characteristics of recommended code, identify reasons for modifications, and understand users' challenges and expectations. To facilitate the study, we develop an experimental platform that includes a data collection plugin for the VSCode IDE and provides functions for screen recording, code evaluation, and automatic generation of personalized interview and survey questions. Through analysis of the collected data, we find that ACATs generally enhance task completion rates, reduce completion time, improve code quality, and increase self-perceived productivity. However, the improvement is influenced by both the nature of the coding tasks and users' experience level. Notably, for experienced participants, the use of ACATs may even increase completion time. We observe that "edited line completion" is the most frequent recommendation type, while "comments completion" and "string completion" have the lowest acceptance rates. The primary reasons for modifying recommended code are disparities between output formats and requirements, flawed logic, and inconsistent code styles. Regarding challenges and expectations, beyond functionality and performance, participants also raised concerns about service access and help documentation. Our study provides valuable insights into the effectiveness and usability of ACATs, informing further improvements in their design and implementation.

Authors (6)
  1. Xin Tan
  2. Xiao Long
  3. Xianjun Ni
  4. Yinghao Zhu
  5. Jing Jiang
  6. Li Zhang
Citations (2)