
MnasNet: Platform-Aware Neural Architecture Search for Mobile (1807.11626v3)

Published 31 Jul 2018 in cs.CV and cs.LG

Abstract: Designing convolutional neural networks (CNN) for mobile devices is challenging because mobile models need to be small and fast, yet still accurate. Although significant efforts have been dedicated to design and improve mobile CNNs on all dimensions, it is very difficult to manually balance these trade-offs when there are so many architectural possibilities to consider. In this paper, we propose an automated mobile neural architecture search (MNAS) approach, which explicitly incorporates model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency. Unlike previous work, where latency is considered via another, often inaccurate proxy (e.g., FLOPS), our approach directly measures real-world inference latency by executing the model on mobile phones. To further strike the right balance between flexibility and search space size, we propose a novel factorized hierarchical search space that encourages layer diversity throughout the network. Experimental results show that our approach consistently outperforms state-of-the-art mobile CNN models across multiple vision tasks. On the ImageNet classification task, our MnasNet achieves 75.2% top-1 accuracy with 78ms latency on a Pixel phone, which is 1.8x faster than MobileNetV2 [29] with 0.5% higher accuracy and 2.3x faster than NASNet [36] with 1.2% higher accuracy. Our MnasNet also achieves better mAP quality than MobileNets for COCO object detection. Code is at https://github.com/tensorflow/tpu/tree/master/models/official/mnasnet

MnasNet: Platform-Aware Neural Architecture Search for Mobile

The paper "MnasNet: Platform-Aware Neural Architecture Search for Mobile" introduces a neural architecture search (NAS) approach that optimizes convolutional neural networks (CNNs) explicitly for mobile platforms. Its key contribution is incorporating real-world mobile device constraints into the search itself, balancing accuracy against measured on-device latency, a constraint that previous NAS methods only approximated through indirect proxies such as FLOPS.

Methodology and Contributions

The authors propose a twofold methodology: formulating the NAS as a multi-objective optimization problem and introducing a novel factorized hierarchical search space. These methods allow for a direct measurement of inference latency on actual mobile devices, rather than relying on computational estimates.
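The on-device measurement the authors rely on amounts to timing repeated forward passes after a warm-up phase. A minimal sketch of such a timing harness, using a generic zero-argument callable as a hypothetical stand-in for the real mobile inference runner (the paper executes models on actual Pixel phones):

```python
import time

def measure_latency_ms(run_inference, num_warmup=10, num_runs=50):
    """Estimate average inference latency in milliseconds.

    `run_inference` is any zero-argument callable that performs one
    forward pass; warm-up iterations are discarded so caches and
    one-time initialization do not skew the average.
    """
    for _ in range(num_warmup):
        run_inference()
    start = time.perf_counter()
    for _ in range(num_runs):
        run_inference()
    elapsed = time.perf_counter() - start
    return elapsed / num_runs * 1000.0

# Dummy workload standing in for a model's forward pass:
lat = measure_latency_ms(lambda: sum(i * i for i in range(10_000)))
```

Averaging over many runs matters because single-shot mobile timings are noisy (thermal throttling, scheduler jitter), which is part of why proxies like FLOPS correlate poorly with real latency.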

  1. Multi-Objective Optimization:
    • The authors design the search problem to maximize accuracy while minimizing real-world latency.
    • They define a reward function incorporating both accuracy and latency, emphasizing the need for Pareto-optimal solutions that provide a balance of both metrics.
  2. Factorized Hierarchical Search Space:
    • Instead of repeating the same cell structure throughout the network, as done in previous NAS approaches, this paper allows for varied layer architectures tailored to different stages of the network.
    • This partitioning into blocks helps manage the size of the search space while facilitating architectural diversity, crucial for computational efficiency.
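The soft-constraint reward described above can be written compactly. The paper defines it as ACC(m) x (LAT(m)/T)^w, where T is the target latency and the exponent w = alpha = beta = -0.07 in the soft-constraint setting; the sketch below uses an illustrative 75 ms target:

```python
def mnasnet_reward(acc, latency_ms, target_ms=75.0, w=-0.07):
    """Soft-constraint reward: ACC(m) * (LAT(m) / T) ** w.

    With w negative, models slower than the target are penalized and
    models faster than the target are mildly rewarded, so the search
    trades accuracy against latency rather than hard-filtering models.
    """
    return acc * (latency_ms / target_ms) ** w
```

For example, a model at exactly the target latency keeps its raw accuracy as reward, while doubling the latency scales the reward by 2^(-0.07), roughly a 5% penalty, which is what lets the controller explore the accuracy-latency Pareto front.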

Experimental Results

The proposed MnasNet models were evaluated against state-of-the-art mobile CNNs on various benchmarks, including ImageNet classification and COCO object detection.

  • ImageNet Classification:
    • MnasNet exhibited a compelling performance, with the MnasNet-A1 achieving 75.2% top-1 accuracy and 92.5% top-5 accuracy at 78ms latency on a Pixel phone.
    • Compared to MobileNetV2, MnasNet-A1 is 1.8 times faster with a 0.5% increase in accuracy.
    • When compared to NASNet, MnasNet-A1 is 2.3 times faster with a 1.2% increase in accuracy. This represents a clear advantage in terms of inference efficiency.
  • COCO Object Detection:
    • When integrated with the SSDLite framework, MnasNet-A1 outperformed MobileNetV2-based models, achieving a mean Average Precision (mAP) of 23.0 with a significant reduction in multiply-add operations.
    • This model achieved competitive mAP with conventional SSD300 at a fraction of the computational cost, reinforcing the efficiency and applicability of MnasNet for real-world tasks.

Theoretical and Practical Implications

The primary theoretical contribution of this work lies in demonstrating the importance of real-world constraints in NAS, as opposed to relying on approximations like FLOPS. Practically, the results indicate that it is feasible to achieve high accuracy while significantly improving latency, making it viable to deploy sophisticated CNN models on mobile devices without substantial performance trade-offs.

The factorized hierarchical search space also presents a new direction for NAS research, highlighting the effectiveness of architectural diversity within CNNs for resource-constrained environments.
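The structure of that search space can be illustrated with a small sampler: each block independently draws its own operator and hyperparameters, so diversity emerges across network stages instead of one cell being repeated everywhere. The option lists below are illustrative, in the spirit of the paper's per-block choices (convolution type, kernel size, squeeze-and-excitation ratio, expansion ratio, layer count), not its exact search space:

```python
import random

# Illustrative per-block choices (not the paper's exact option lists).
BLOCK_CHOICES = {
    "conv_op": ["conv", "dwconv", "mbconv"],
    "kernel_size": [3, 5],
    "se_ratio": [0.0, 0.25],
    "expansion": [1, 3, 6],
    "num_layers": [1, 2, 3, 4],
}

def sample_architecture(num_blocks=7, seed=None):
    """Sample one candidate architecture: every block draws its own
    choices, so layer structure can vary stage by stage."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in BLOCK_CHOICES.items()}
            for _ in range(num_blocks)]

arch = sample_architecture(seed=0)
```

Because choices are made per block rather than per layer, the search space stays tractable (the paper factorizes the network into a handful of blocks) while still permitting the heterogeneity that a single repeated cell cannot express.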

Future Directions

There are several avenues for future research and development inspired by this work:

  1. Extended Search Space Exploration:
    • Further refining and expanding the factorized hierarchical search space could uncover even more efficient architectures, potentially yielding models that are even faster and more accurate.
  2. Hybrid Search Algorithms:
    • Combining reinforcement learning with other optimization algorithms, such as evolutionary strategies or gradient-based methods, could accelerate the NAS process.
  3. Domain-Specific Adaptations:
    • Tailoring NAS to other domains beyond image classification and object detection, for instance, video processing or natural language processing on mobile devices, could broaden the applicability of these techniques.

Conclusion

The MnasNet approach introduced in this paper marks a significant step forward in the domain of neural architecture search for mobile platforms. By directly incorporating real-world latency measurements and employing a hierarchical, factorized search space, the authors demonstrate a method that balances accuracy and efficiency. This approach sets a new benchmark for mobile CNNs, offering a pathway to more sophisticated yet resource-efficient models that are suitable for deployment on a variety of edge devices.

Authors (7)
  1. Mingxing Tan (45 papers)
  2. Bo Chen (309 papers)
  3. Ruoming Pang (59 papers)
  4. Vijay Vasudevan (24 papers)
  5. Mark Sandler (66 papers)
  6. Andrew Howard (59 papers)
  7. Quoc V. Le (128 papers)
Citations (2,841)