- The paper presents a unified framework that integrates neural architecture search with program transformations, achieving over 3x speedup in DNN inference.
- It employs Fisher Potential as a compile-time metric to ensure the legality of transformations without extensive retraining.
- Implementation in TVM demonstrates reduced search times and notable performance gains across various DNN models and hardware platforms.
The paper "Neural Architecture Search as Program Transformation Exploration" by Jack Turner, Elliot J. Crowley, and Michael F.P. O'Boyle investigates a novel approach to enhancing the efficiency of deep neural networks (DNNs) by integrating concepts from both the compiler community and the domain of neural architecture search (NAS). The authors propose a unified framework that views neural architecture operations as program transformations, thus allowing the merging of traditional compiler optimizations with architecture-based transformations for improving DNN performance.
Key Contributions
- Unified Framework: The paper introduces a methodology that integrates NAS techniques with compiler optimizations by expressing neural architecture operations as program transformations. This unification streamlines deployment by interleaving program and neural architecture transformations, unlocking optimization opportunities that are unavailable when the two are treated separately.
- Transformation Legality via Fisher Potential: A significant challenge the authors address is deciding when a neural transformation is legal. Unlike traditional program transformations, whose legality is bound by data dependences, NAS transformations are judged by whether the resulting network retains sufficient representational capacity. Fisher Potential is employed as a compile-time, low-cost proxy for this capacity, allowing damaging network changes to be filtered out without extensive retraining (a sketch of the underlying saliency computation follows this list).
- Implementation in TVM: The framework is prototyped in the TVM optimizing compiler, where architecture-level transformations such as grouping and bottlenecking are explored alongside conventional schedule transformations (a minimal schedule example also follows this list). The authors show that applying these combined transformations yields significant reductions in inference time across a range of DNN architectures and platforms.
- Empirical Results: The evaluation reports speedups of more than 3x in inference time on major DNN models, including ResNet, ResNeXt, and DenseNet, across four hardware platforms. In addition, the unified framework substantially reduces neural architecture search time by discarding unfit configurations early in the process.
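To give a sense of how such a capacity check can be computed, here is a minimal PyTorch sketch of a Fisher-style channel-saliency aggregate obtained from a single forward/backward pass at initialization. The function name, the restriction to Conv2d layers, and the exact normalization are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

def fisher_potential(model, loss_fn, batch):
    """Hedged sketch: aggregate Fisher-style channel saliency from one
    forward/backward pass at initialization (no training involved).

    For each Conv2d activation, the per-channel saliency is
    (sum over spatial positions of activation * gradient)^2, averaged over
    the batch; the network's potential is the sum over all channels.
    The exact normalization used in the paper may differ."""
    activations = []

    def hook(_module, _inputs, output):
        output.retain_grad()            # keep .grad on this intermediate tensor
        activations.append(output)

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.Conv2d)]

    inputs, targets = batch
    model.zero_grad()
    loss_fn(model(inputs), targets).backward()

    total = 0.0
    for act in activations:             # act has shape (N, C, H, W)
        per_sample = (act * act.grad).sum(dim=(2, 3))        # (N, C)
        total += per_sample.pow(2).mean(dim=0).sum().item()  # sum over channels
    for h in handles:
        h.remove()
    return total
```

In the paper's framing, a candidate whose potential collapses relative to the original network can be rejected at compile time, which is how retraining is avoided for configurations that would only be discarded anyway.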
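On the compiler side, the kind of schedule-level transformation TVM exposes can be sketched as below, assuming a TVM release that still provides the classic te scheduling API (newer releases favour TensorIR). The matmul workload and split factor are illustrative, not taken from the paper.

```python
import tvm
from tvm import te

# Illustrative workload: a 1024x1024 matrix multiply declared in TVM's
# tensor-expression language.
N = 1024
A = te.placeholder((N, N), name="A")
B = te.placeholder((N, N), name="B")
k = te.reduce_axis((0, N), name="k")
C = te.compute((N, N), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")

# A classic program transformation: split the outer loop into a 32-wide tile.
s = te.create_schedule(C.op)
io, ii = s[C].split(C.op.axis[0], factor=32)

# Inspect the transformed loop nest.
print(tvm.lower(s, [A, B, C], simple_mode=True))
```

The paper's contribution is to let architectural rewrites such as grouping or bottlenecking sit in the same search space as splits, reorders, and fusions like the one above.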
Background and Methodology
The authors highlight two distinct yet related communities involved in optimizing neural network performance: the neural architecture search community, which seeks efficient network designs, and compiler developers, who optimize a fixed network to exploit hardware capabilities. By viewing neural architecture operations through the lens of program transformations, the paper proposes a systematic exploration of both kinds of transformation as part of a single optimization strategy.
The polyhedral model, widely used in compilers to optimize loop nests and memory access patterns, is extended to encompass NAS transformations: bottlenecking, grouping, and depthwise convolution are expressed as affine transformations within this model, bridging compiler optimizations and structural changes to the network. To decide which transformed networks are worth keeping, the authors use Fisher Potential, a measure rooted in Fisher information, to assess the legality of transformations without retraining the network.
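To make the affine view concrete, here is a deliberately naive NumPy sketch contrasting a direct convolution loop nest with a grouped variant: grouping amounts to an affine restriction of the input-channel loop's iteration domain (bottlenecking would similarly shrink the channel extents). The shapes, stride-1/no-padding setting, and the assumption that channel counts divide evenly by the group count are for illustration only; this is not the paper's polyhedral encoding.

```python
import numpy as np

def conv2d_direct(x, w):
    """Direct convolution: x is (C_in, H, W), w is (C_out, C_in, K, K)."""
    c_out, c_in, kk, _ = w.shape
    _, h, wd = x.shape
    y = np.zeros((c_out, h - kk + 1, wd - kk + 1))
    for o in range(c_out):
        for c in range(c_in):                       # full reduction over input channels
            for i in range(y.shape[1]):
                for j in range(y.shape[2]):
                    y[o, i, j] += np.sum(x[c, i:i + kk, j:j + kk] * w[o, c])
    return y

def conv2d_grouped(x, w, groups):
    """Same loop nest with grouping: output channel o only reads input channels
    in its own group, i.e. the affine constraint
        floor(o / (C_out / groups)) == floor(c / (C_in / groups))."""
    c_out, c_in, kk, _ = w.shape
    _, h, wd = x.shape
    y = np.zeros((c_out, h - kk + 1, wd - kk + 1))
    go, gi = c_out // groups, c_in // groups        # assumes exact divisibility
    for o in range(c_out):
        g = o // go
        for c in range(g * gi, (g + 1) * gi):       # restricted reduction domain
            for i in range(y.shape[1]):
                for j in range(y.shape[2]):
                    y[o, i, j] += np.sum(x[c, i:i + kk, j:j + kk] * w[o, c])
    return y
```

Because the restriction is expressed purely on the iteration domain, it composes naturally with standard loop transformations such as tiling, fusion, and vectorization, which is what makes the joint exploration possible.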
Implications and Future Research
This approach has substantial implications for both the practical deployment and the theoretical development of AI systems. The unified framework provides a pathway toward more agile and efficient neural models that conserve resources while preserving accuracy. Future work might extend the framework to classes of neural architectures beyond convolutional networks and explore alternative proxies for representational capacity that make the enlarged search space more tractable to explore.
Moreover, this research invites further refinement of Fisher Potential or similar metrics to become more predictive of post-training accuracies in varied neural architectures. The integration of such measures could make NAS substantially more efficient, contributing to the growing trend towards automating network design and optimization processes in deep learning frameworks.
In summary, the paper advances the idea that systematic transformation exploration unifies two traditionally siloed areas, providing a comprehensive strategy for DNN optimization and a foundation for future AI development and deployment.