Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation (1901.02985v2)

Published 10 Jan 2019 in cs.CV and cs.LG

Abstract: Recently, Neural Architecture Search (NAS) has successfully identified neural network architectures that exceed human designed ones on large-scale image classification. In this paper, we study NAS for semantic image segmentation. Existing works often focus on searching the repeatable cell structure, while hand-designing the outer network structure that controls the spatial resolution changes. This choice simplifies the search space, but becomes increasingly problematic for dense image prediction which exhibits a lot more network level architectural variations. Therefore, we propose to search the network level structure in addition to the cell level structure, which forms a hierarchical architecture search space. We present a network level search space that includes many popular designs, and develop a formulation that allows efficient gradient-based architecture search (3 P100 GPU days on Cityscapes images). We demonstrate the effectiveness of the proposed method on the challenging Cityscapes, PASCAL VOC 2012, and ADE20K datasets. Auto-DeepLab, our architecture searched specifically for semantic image segmentation, attains state-of-the-art performance without any ImageNet pretraining.

Authors (7)
  1. Chenxi Liu (84 papers)
  2. Liang-Chieh Chen (66 papers)
  3. Florian Schroff (21 papers)
  4. Hartwig Adam (49 papers)
  5. Wei Hua (35 papers)
  6. Alan Yuille (294 papers)
  7. Li Fei-Fei (199 papers)
Citations (947)

Summary

Hierarchical Neural Architecture Search for Semantic Image Segmentation

The paper "Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation" by Chenxi Liu et al. extends the domain of Neural Architecture Search (NAS) from image classification to semantic image segmentation. The proposed method, Auto-DeepLab, introduces a hierarchical architecture search space encompassing both the cell level and network level structures, diverging from existing methods focused primarily on cell level search spaces. By implementing a continuous relaxation for the architecture search with differentiable processes, this work aims at both capturing the architectural variations required by high-resolution tasks and reducing computational costs.

Key Contributions

The paper makes several distinct contributions:

  1. Extension of NAS Beyond Image Classification: This work is among the first to apply NAS to the domain of dense image prediction, specifically semantic image segmentation.
  2. Hierarchical Architecture Search Space: The integration of a trellis-like network level search space adds to the more commonly used cell level search space, forming a comprehensive hierarchical search space.
  3. Differentiable Formulation for Efficient Search: Employing a gradient-based approach significantly accelerates the search process, enabling it to be completed in just 3 days on a single P100 GPU.
  4. State-of-the-Art Performance Without Pretraining: Auto-DeepLab achieves state-of-the-art performance on multiple datasets without the need for ImageNet pretraining, showcasing its efficacy and efficiency.

Methodology

Hierarchical Search Space

The hierarchical search space comprises two levels:

  • Cell Level: Each cell is a directed acyclic graph of several blocks, each defined by a two-branch structure. The set of candidate operations includes depthwise-separable convolutions and atrous convolutions, among others, which help capture richer contextual information.
  • Network Level: The network level is a trellis whose transitions control the spatial resolution changes. This admits many architectural variations, accommodating both high-resolution and low-resolution pathways; a sketch of the continuous relaxation of both levels follows this list.
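
To make the relaxation concrete, here is a minimal PyTorch-style sketch of both levels: a mixed operation that softmax-weights the candidate ops on one cell edge, and a trellis node that softmax-weights the three allowed resolution transitions. The candidate-op set and the `trellis_combine` helper are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-ins for the paper's candidate operations; a faithful version would
# wrap these with ReLU/BatchNorm and include the full operation set.
OPS = {
    "sep_conv_3x3": lambda C: nn.Conv2d(C, C, 3, padding=1, groups=C),
    "atrous_conv_3x3": lambda C: nn.Conv2d(C, C, 3, padding=2, dilation=2),
    "skip_connect": lambda C: nn.Identity(),
}

class MixedOp(nn.Module):
    """Cell level: softmax-weighted sum of every candidate op on one DAG edge."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList(make(channels) for make in OPS.values())

    def forward(self, x, alpha):
        # alpha holds the raw architecture parameters for this edge (one per op).
        weights = F.softmax(alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

def trellis_combine(from_down, from_same, from_up, beta):
    """Network level: a trellis node mixes features arriving from the previous
    layer at half, the same, and twice its resolution (inputs are assumed to be
    already resized to this node's resolution)."""
    w = F.softmax(beta, dim=0)
    return w[0] * from_down + w[1] * from_same + w[2] * from_up
```

After the search converges, discrete architectures are decoded from these weights: an argmax over the alphas at the cell level, and a dynamic-programming (Viterbi-style) decode over the betas for the network-level path.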

Continuous Relaxation and Optimization

To handle the large search space, the authors employ a continuous relaxation of the discrete architectures: softmax weights over the cell's connections and candidate operations, and over the network-level resolution transitions, are optimized with stochastic gradient descent alongside the ordinary model weights. This bypasses the high computational costs associated with the reinforcement learning or evolutionary algorithms typically used in NAS; a sketch of the resulting alternating updates follows.
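
As a rough illustration, the sketch below alternates first-order gradient updates between the model weights and the architecture parameters on two disjoint training splits, in the spirit of the DARTS-style optimization the paper builds on; the function and variable names are assumptions, not the authors' code.

```python
def search_step(model, w_optimizer, arch_optimizer, batch_a, batch_b, loss_fn):
    """One alternating update of weights w and architecture params (alpha, beta).

    `w_optimizer` is assumed to hold only the ordinary model weights and
    `arch_optimizer` only the architecture parameters.
    """
    # 1) Update the network weights on one training split.
    x, y = batch_a
    w_optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    w_optimizer.step()

    # 2) Update the cell alphas and trellis betas on the other split,
    #    leaving the weights untouched for this half-step.
    x2, y2 = batch_b
    arch_optimizer.zero_grad()
    loss_fn(model(x2), y2).backward()
    arch_optimizer.step()
```

In the paper, these two batches come from disjoint halves of the training set, which keeps the architecture parameters from overfitting the same data that trains the weights.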

Experimental Validation

The proposed Auto-DeepLab is subjected to rigorous testing on several benchmark datasets:

  • Cityscapes: Among models trained without ImageNet pretraining, Auto-DeepLab improves over the previous best result by 8.6% mIOU. Moreover, its architecture search finishes in 3 GPU days, compared with the roughly 2600 GPU days reported for the search behind DPC.
  • PASCAL VOC 2012 and ADE20K: Without ImageNet pretraining, Auto-DeepLab outperforms several state-of-the-art models, demonstrating its ability to generalize well across tasks and datasets.

Implications

Practical Implications:

The ability to search for optimal architectures efficiently opens up new avenues for deploying high-performance vision models in resource-constrained environments. The success of Auto-DeepLab suggests that NAS can be effectively extended to more complex vision tasks beyond classification.

Theoretical Implications:

The hierarchical search space and continuous relaxation formulations contribute to a broader understanding of how neural architectures can be systematically optimized. This approach could redefine the architectural design paradigms for dense prediction models.

Future Directions

Several promising future directions are suggested by the authors, including the extension of the current framework to related tasks like object detection and exploring more generalized network level search spaces that can incorporate structures like U-nets. Such advancements could further validate the effectiveness and versatility of hierarchical NAS.

In conclusion, Auto-DeepLab marks a significant step in applying NAS to semantic segmentation, delivering both efficiency and performance gains while pointing the way for future research in automated architecture design.
