TF-NAS: Rethinking Three Search Freedoms of Latency-Constrained Differentiable Neural Architecture Search (2008.05314v1)

Published 12 Aug 2020 in cs.CV

Abstract: With the flourish of differentiable neural architecture search (NAS), automatically searching latency-constrained architectures gives a new perspective to reduce human labor and expertise. However, the searched architectures are usually suboptimal in accuracy and may have large jitters around the target latency. In this paper, we rethink three freedoms of differentiable NAS, i.e. operation-level, depth-level and width-level, and propose a novel method, named Three-Freedom NAS (TF-NAS), to achieve both good classification accuracy and precise latency constraint. For the operation-level, we present a bi-sampling search algorithm to moderate the operation collapse. For the depth-level, we introduce a sink-connecting search space to ensure the mutual exclusion between skip and other candidate operations, as well as eliminate the architecture redundancy. For the width-level, we propose an elasticity-scaling strategy that achieves precise latency constraint in a progressively fine-grained manner. Experiments on ImageNet demonstrate the effectiveness of TF-NAS. Particularly, our searched TF-NAS-A obtains 76.9% top-1 accuracy, achieving state-of-the-art results with less latency. The total search time is only 1.8 days on 1 Titan RTX GPU. Code is available at https://github.com/AberHu/TF-NAS.

Authors (3)
  1. Yibo Hu (34 papers)
  2. Xiang Wu (37 papers)
  3. Ran He (172 papers)
Citations (41)

Summary

Overview of TF-NAS: Rethinking Three Search Freedoms of Latency-Constrained Differentiable Neural Architecture Search

The paper presents a differentiable neural architecture search (NAS) method for latency-constrained scenarios, the Three-Freedom NAS (TF-NAS) framework. It addresses a central challenge in NAS: the trade-off between achieving high classification accuracy and precisely meeting a latency budget on a target device such as a GPU or CPU. To this end, the method rethinks three search freedoms within the differentiable NAS paradigm: operation-level, depth-level, and width-level.

Core Contributions and Methodological Advances

  1. Bi-Sampling Algorithm for Operation-Level Search: The authors identify a phenomenon called operation collapse, in which a few operations are disproportionately favored during search, leading to suboptimal architectures. To mitigate this, the paper introduces a bi-sampling search algorithm that samples two independent paths in each search iteration, so operations not currently selected by the differentiable relaxation still receive training signal and remain competitive. Drawing the second path uniformly at random was found to be the most effective choice, improving search stability and training accuracy (see the first sketch after this list).
  2. Sink-Connecting Search Space for Depth-Level Search: Conventional approaches place the skip operation directly among the operation candidates, which can destabilize the search. TF-NAS instead enforces mutual exclusion between skip and the other candidate operations through a sink-connecting search space, which also eliminates architecture redundancy and makes the depth-level search more efficient and consistent (see the second sketch after this list).
  3. Elasticity-Scaling Strategy for Width-Level Search: Because a coarse-grained width search space cannot hit a precise latency target, the paper introduces an elasticity-scaling strategy that progressively shrinks and expands channel widths in an increasingly fine-grained manner, giving tight control over latency without additional memory overhead (see the third sketch after this list).
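
To make the bi-sampling idea concrete, the following is a minimal sketch in PyTorch-style Python, assuming a hypothetical `supernet(x, path)` interface and per-block architecture logits; it illustrates the sampling scheme rather than reproducing the authors' implementation. One path is drawn from the learned architecture distribution and a second path uniformly at random, and the supernet weights are updated on both.

```python
import torch
import torch.nn.functional as F

def sample_path(arch_logits, uniform=False):
    """Pick one candidate-operation index per block.

    arch_logits: list of per-block architecture logits (1-D tensors).
    With uniform=True the path is drawn uniformly at random instead of
    from the learned architecture distribution.
    """
    path = []
    for logits in arch_logits:
        if uniform:
            idx = torch.randint(len(logits), (1,)).item()
        else:
            idx = torch.multinomial(F.softmax(logits, dim=-1), 1).item()
        path.append(idx)
    return path

def bi_sampling_weight_step(supernet, arch_logits, x, y, criterion, w_optimizer):
    """One weight update on two independent paths: one sampled from the
    architecture distribution and one sampled uniformly, so currently
    unfavored operations keep receiving gradients (moderating collapse)."""
    w_optimizer.zero_grad()
    loss = (criterion(supernet(x, sample_path(arch_logits)), y)
            + criterion(supernet(x, sample_path(arch_logits, uniform=True)), y))
    loss.backward()
    w_optimizer.step()
    return loss.item()
```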
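
For the depth-level freedom, a sink-connecting space can be pictured as every candidate depth of a stage feeding a shared "sink" output weighted by softmaxed depth parameters, so ending the stage early (the role a skip connection would otherwise play) competes directly with going deeper rather than with the per-block operations. The snippet below is a schematic sketch under the simplifying assumption that all candidate outputs share the same shape; `beta` denotes hypothetical depth logits.

```python
import torch.nn.functional as F

def sink_connect(block_outputs, beta):
    """Aggregate a stage's candidate depths into one 'sink' output.

    block_outputs: list of tensors, the features after 1, 2, ..., D blocks
                   (assumed here to share the same shape).
    beta:          1-D tensor of depth logits for this stage.
    """
    weights = F.softmax(beta, dim=-1)
    # Stopping at depth d is weighted against all deeper options, so
    # "skip the rest of the stage" is mutually exclusive with the
    # per-block operation choices instead of competing inside them.
    return sum(w * out for w, out in zip(weights, block_outputs))
```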
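
The elasticity-scaling strategy can likewise be sketched as a coarse-to-fine width adjustment driven by a latency estimate. The code below is a simplified illustration rather than the paper's exact procedure: the hypothetical `predict_latency` stands in for a latency lookup table or predictor, and widths are rescaled first for all blocks together, then for smaller groups, until the predicted latency falls within a tolerance of the target.

```python
def elasticity_scaling(widths, target_ms, predict_latency,
                       group_sizes=(None, 4, 1), step=0.05,
                       tol=0.01, max_iters=50):
    """Progressively shrink or expand channel widths toward a target latency.

    widths          : list of per-block channel counts
    target_ms       : target latency in milliseconds
    predict_latency : callable mapping a width list to estimated latency (ms),
                      e.g. backed by a latency lookup table (hypothetical here)
    group_sizes     : coarse-to-fine granularity; None scales all blocks at once
    """
    widths = list(widths)
    for size in group_sizes:
        groups = ([range(len(widths))] if size is None else
                  [range(i, min(i + size, len(widths)))
                   for i in range(0, len(widths), size)])
        for group in groups:
            for _ in range(max_iters):
                lat = predict_latency(widths)
                if abs(lat - target_ms) / target_ms < tol:
                    return widths  # within tolerance of the target latency
                scale = 1.0 - step if lat > target_ms else 1.0 + step
                for i in group:
                    widths[i] = max(8, int(round(widths[i] * scale)))
    return widths
```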

Experimental Results and Implications

TF-NAS is validated on ImageNet, reaching state-of-the-art accuracy under several target latencies (18 ms, 15 ms, 12 ms, and 10 ms). In particular, the searched TF-NAS-A architecture achieves 76.9% top-1 accuracy at 18.03 ms latency, surpassing competitive models such as NASNet-A, MixNet-S, and EfficientNet-B0 in both accuracy and latency, with a total search cost of only 1.8 days on a single Titan RTX GPU. The searched networks also transfer well, remaining competitive on additional benchmarks including CIFAR-10 and CIFAR-100.

Future Research Directions

The research opens several avenues for future exploration. The TF-NAS framework could be extended to other architectural paradigms or to tasks beyond image classification while still honoring latency constraints. Integrating more semantically structured search spaces could further reduce search time and improve generalization. Finally, making the search process more interpretable would help practitioners understand how latency constraints shape the resulting architecture choices.

In summary, this paper offers a substantial contribution to the domain of NAS, particularly for resource-constrained environments, by proposing innovative methodological enhancements across operation-level, depth-level, and width-level search freedoms. The TF-NAS framework paves the way for more effective, efficient, and precise neural architecture search processes in latency-critical applications.