Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation (2001.08743v1)

Published 23 Jan 2020 in cs.LG and stat.ML

Abstract: Achieving faster execution with shorter compilation time can foster further diversity and innovation in neural networks. However, the current paradigm of executing neural networks either relies on hand-optimized libraries, traditional compilation heuristics, or very recently genetic algorithms and other stochastic methods. These methods suffer from frequent costly hardware measurements rendering them not only too time consuming but also suboptimal. As such, we devise a solution that can learn to quickly adapt to a previously unseen design space for code optimization, both accelerating the search and improving the output performance. This solution dubbed Chameleon leverages reinforcement learning whose solution takes fewer steps to converge, and develops an adaptive sampling algorithm that not only focuses on the costly samples (real hardware measurements) on representative points but also uses a domain-knowledge inspired logic to improve the samples itself. Experimentation with real hardware shows that Chameleon provides 4.45x speed up in optimization time over AutoTVM, while also improving inference time of the modern deep networks by 5.6%.

Citations (74)

Summary

  • The paper presents Chameleon, an adaptive framework that employs reinforcement learning to significantly expedite deep neural network code optimization and improve execution performance.
  • Chameleon uses adaptive exploration and sampling strategies to efficiently navigate the optimization design space, reducing costly hardware measurements and improving performance over traditional methods like AutoTVM.
  • The framework enables faster deployment cycles for deep neural networks across various hardware platforms by automating and accelerating the optimization process.

Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation

The computational demands of deep neural networks (DNNs) necessitate compilation strategies that optimize execution performance without excessive compilation time. Traditional approaches rely on hand-optimized libraries or heuristic-based compilation techniques, but these can be slow and suboptimal. This paper presents Chameleon, a framework that employs reinforcement learning (RL) to accelerate and improve the code-optimization stage of DNN compilation.

Overview

Chameleon introduces an adaptive strategy to expedite the compilation process for DNNs, substantially reducing both search and optimization times while enhancing execution performance. By leveraging RL, Chameleon rapidly adapts to new design spaces for code optimization, making the search process more efficient and potentially leading to superior performance outcomes. The framework is designed to minimize the dependency on traditional hand-optimized methods, thus increasing the diversity of tensor operations that can be automated and optimized.
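To make this search loop concrete, the sketch below shows the general shape of measurement-driven autotuning that an RL agent like Chameleon's plugs into. It is a minimal illustration under stated assumptions, not the paper's implementation: the knob space, the toy cost function standing in for real hardware measurements, and the random proposer (where the learned policy would go) are all hypothetical.

```python
# Minimal sketch of a measurement-driven autotuning loop (illustrative only).
# The knob space, the toy cost function, and all names are assumptions;
# Chameleon replaces the random proposer below with a learned RL policy.
import random

# A tiny discrete design space: each knob picks one value, e.g. tiling
# factors or an unroll flag for a single tensor operation.
KNOB_SPACE = {
    "tile_x": [1, 2, 4, 8, 16],
    "tile_y": [1, 2, 4, 8, 16],
    "unroll": [0, 1],
}

def propose_config(space):
    """Propose one configuration. A random choice here; an RL agent
    would instead sample from a policy updated by past rewards."""
    return {knob: random.choice(values) for knob, values in space.items()}

def measure(config):
    """Stand-in for a costly real-hardware measurement: compile the
    kernel with these knobs and time it. A toy analytic cost is used
    here so the sketch runs end to end."""
    return 1.0 / (config["tile_x"] * config["tile_y"] + config["unroll"] + 1)

def autotune(space, budget=200):
    """Search loop: propose, measure, keep the best. An RL agent would
    also turn each (config, -runtime) pair into a policy update, which
    is what lets it converge in fewer measured steps."""
    best_cfg, best_time = None, float("inf")
    for _ in range(budget):
        cfg = propose_config(space)
        runtime = measure(cfg)
        # reward = -runtime; policy.update(cfg, reward)  # RL step (omitted)
        if runtime < best_time:
            best_cfg, best_time = cfg, runtime
    return best_cfg, best_time

print(autotune(KNOB_SPACE))
```

The expensive step is `measure`: every call corresponds to a compile-and-run on the target device, which is why both contributions below aim to reduce how many such calls are needed.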

Key Contributions

  1. Adaptive Exploration: Chameleon employs an RL-based agent for adaptive exploration, capable of quickly navigating the design space of new neural network configurations. This reduces the number of search steps needed to converge to a high-performance solution.
  2. Adaptive Sampling: An adaptive sampling algorithm strategically reduces the number of costly hardware measurements. Using domain-knowledge-inspired logic, it identifies representative configurations so that the design space is explored efficiently (see the sketch after this list).
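The summary does not spell out the sampling mechanism, so the sketch below uses k-means clustering as one plausible concrete reading: cluster the candidate configurations proposed by the exploration step, then measure only the configuration nearest each centroid on real hardware. The clustering choice and all names here are illustrative assumptions, not the paper's API.

```python
# Illustrative sketch of adaptive sampling via clustering (an assumption;
# the summary above only says "representative configurations").
import numpy as np
from sklearn.cluster import KMeans

def select_representatives(candidates, k=8):
    """Cluster candidate knob vectors and return the one configuration
    closest to each centroid, so only k of n candidates need a costly
    hardware measurement."""
    km = KMeans(n_clusters=k, n_init=10).fit(candidates)
    reps = []
    for center in km.cluster_centers_:
        idx = int(np.argmin(np.linalg.norm(candidates - center, axis=1)))
        reps.append(candidates[idx])
    return np.stack(reps)

# Usage: the exploration step proposes many configurations cheaply;
# only the representatives are compiled and timed on the device.
rng = np.random.default_rng(0)
candidates = rng.integers(0, 16, size=(256, 3)).astype(float)
to_measure = select_representatives(candidates, k=8)
print(to_measure.shape)  # (8, 3)
```

Measuring one representative per cluster turns n candidate measurements into k, which is where the reduction in costly hardware runs comes from.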

The experiments validate that Chameleon provides significant improvements over existing solutions such as AutoTVM: it reduces optimization time by a factor of 4.45x and improves inference time by 5.6% on modern deep networks such as AlexNet, VGG-16, and ResNet-18 running on a high-end GPU.

Experimental Results

  • Efficiency Gains: Compared to AutoTVM, Chameleon achieves an average 4.45x speed-up in optimization time while also improving the inference time of the resulting DNNs by 5.6%. It sidesteps the inherent inefficiencies of genetic and random-search algorithms by concentrating exploration on promising regions of the design space.
  • Practical Applications: The framework has immediate applications in shortening the deployment cycle of neural networks across hardware platforms. Because automatic optimization passes complete more quickly, it creates room for faster iteration and increased innovation in DNN design.

Implications and Future Work

Chameleon's approach presents both theoretical and practical implications. Theoretically, it showcases the potential of reinforcement learning in optimizing neural network code, which could inspire further exploration of machine learning methods in compiler technology. Practically, the reduction in compilation time aids faster deployment of DNNs, contributing to the responsiveness and scalability of AI applications.

Future research could explore integrating other machine learning techniques and expanding Chameleon to support a wider variety of hardware architectures. Additionally, evaluating the impact of evolving neural network topologies and operations on Chameleon's framework would be beneficial.

Overall, Chameleon demonstrates a significant advancement in compiler technology for neural networks, positioning itself as a valuable tool for both academic research and industrial applications in AI.
