Mining for Cost Awareness in the Infrastructure as Code Artifacts of Cloud-based Applications: an Exploratory Study (2304.07531v3)

Published 15 Apr 2023 in cs.SE

Abstract: Context: The popularity of cloud computing as the primary platform for developing, deploying, and delivering software is largely driven by the promise of cost savings. Therefore, it is surprising that no empirical evidence has been collected to determine whether cost awareness permeates the development process and how it manifests in practice. Objective: This study aims to provide empirical evidence of cost awareness by mining open source repositories of cloud-based applications. The focus is on Infrastructure as Code artifacts that automate software (re)deployment on the cloud. Methods: A systematic search through 152,735 repositories resulted in the selection of 2,010 relevant ones. We then analyzed 538 relevant commits and 208 relevant issues using a combination of inductive and deductive coding. Results: The findings indicate that developers are not only concerned with the cost of their application deployments but also take actions to reduce these costs beyond selecting cheaper cloud services. We also identify research areas for future consideration. Conclusion: Although we focus on a particular Infrastructure as Code technology (Terraform), the findings may apply to cloud-based application development in general. The provided empirical grounding can serve developers seeking to reduce costs through service selection, resource allocation, deployment optimization, and other techniques.
