
An Exploratory Study of V-Model in Building ML-Enabled Software: A Systems Engineering Perspective (2308.05381v4)

Published 10 Aug 2023 in cs.SE

Abstract: Machine learning (ML) components are being added to more and more critical and impactful software systems, but developing real-world production systems from prototyped ML models remains challenging because of the additional complexity and the need for interdisciplinary collaboration. This makes it difficult to apply traditional software lifecycle models such as waterfall, spiral, or agile when building ML-enabled systems. In this research, we apply a Systems Engineering lens to investigate the use of the V-Model in addressing the interdisciplinary collaboration challenges that arise when building ML-enabled systems. By interviewing practitioners from software companies, we established a set of 8 propositions for using the V-Model to manage interdisciplinary collaboration when building products with ML components. Based on these propositions, we found that, despite requiring additional effort, the characteristics of the V-Model align effectively with several collaboration challenges encountered by practitioners when building ML-enabled systems. We recommend that future research investigate new process models, frameworks, and tools that leverage characteristics of the V-Model, such as system decomposition, clear system boundaries, and consistency of Validation & Verification (V&V), for building ML-enabled systems.

Authors (1)
  1. Jie JW Wu (8 papers)