Towards Safe Automated Refactoring of Imperative Deep Learning Programs to Graph Execution (2308.11785v2)
Abstract: Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred-execution-style DL code, which supports symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development tends to produce code that is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, less error-prone imperative DL frameworks encouraging eager execution have emerged, at the expense of run-time performance. Though hybrid approaches aim for the "best of both worlds," using them effectively requires subtle considerations to make code amenable to safe, accurate, and efficient graph execution, avoiding performance bottlenecks and semantically inequivalent results. We present our ongoing work on an automated refactoring approach that assists developers in specifying whether and how their otherwise eagerly-executed imperative DL code could be reliably and efficiently executed as graphs at run-time in a semantics-preserving fashion. The approach, based on a novel tensor analysis specifically for imperative DL code, consists of refactoring preconditions for automatically determining when it is safe and potentially advantageous to migrate imperative DL code to graph execution, to modify decorator parameters, or to eagerly execute code already running as graphs. The approach is being implemented as a PyDev Eclipse IDE plug-in and uses the WALA Ariadne analysis framework. We discuss our ongoing work towards optimizing imperative DL code to its full potential.
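To make the idea of a refactoring precondition concrete, the following is a minimal sketch, not the paper's tensor analysis (which is built on WALA Ariadne). It assumes a purely syntactic stand-in: a function with no obvious Python side effects is treated as a candidate for graph execution, and a function already decorated with `tf.function` that does have such side effects is treated as a candidate for returning to eager execution. The helper names (`has_python_side_effects`, `is_hybrid`, `classify`), the list of side-effecting calls, and the sample source are all hypothetical. The sketch also ignores the "potentially advantageous" half of the precondition, such as whether a function actually receives tensor arguments.

```python
import ast

# Assumed, non-exhaustive list of built-ins with Python-level side effects.
SIDE_EFFECT_CALLS = {"print", "open", "input"}


def has_python_side_effects(func):
    """Syntactic approximation: flag writes to outer state and I/O calls."""
    for node in ast.walk(func):
        # global/nonlocal declarations signal rebinding of outer variables.
        if isinstance(node, (ast.Global, ast.Nonlocal)):
            return True
        # Assignments to attributes or subscripts mutate existing objects.
        if isinstance(node, (ast.Assign, ast.AugAssign, ast.AnnAssign)):
            targets = node.targets if isinstance(node, ast.Assign) else [node.target]
            if any(isinstance(t, (ast.Attribute, ast.Subscript)) for t in targets):
                return True
        # Direct calls to known side-effecting built-ins.
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in SIDE_EFFECT_CALLS):
            return True
    return False


def is_hybrid(func):
    """True if the function is decorated with tf.function or tf.function(...)."""
    for dec in func.decorator_list:
        target = dec.func if isinstance(dec, ast.Call) else dec
        if (isinstance(target, ast.Attribute)
                and target.attr == "function"
                and isinstance(target.value, ast.Name)
                and target.value.id == "tf"):
            return True
    return False


def classify(source):
    """Map each function name in `source` to a suggested (hypothetical) action."""
    actions = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            hybrid, impure = is_hybrid(node), has_python_side_effects(node)
            if not hybrid and not impure:
                actions[node.name] = "convert to graph"
            elif hybrid and impure:
                actions[node.name] = "convert to eager"
            else:
                actions[node.name] = "leave as is"
    return actions


# Hypothetical input; it is only parsed, so TensorFlow need not be installed.
SAMPLE = '''
import tensorflow as tf

def dense(x, w):
    return tf.nn.relu(tf.matmul(x, w))

@tf.function
def step(self, x):
    self.calls += 1
    return x * 2

@tf.function(reduce_retracing=True)
def scale(x):
    return x * 2
'''

if __name__ == "__main__":
    for name, action in classify(SAMPLE).items():
        print(f"{name}: {action}")
```

Here `dense` is side-effect free and undecorated, so the sketch suggests migrating it to graph execution. `step` mutates `self.calls`, which under graph execution would typically run only while the function is being traced, so the sketch suggests running it eagerly. A real analysis would need to track tensor types and side effects interprocedurally rather than relying on syntax alone.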