HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark (2103.10584v1)

Published 19 Mar 2021 in cs.LG

Abstract: HardWare-aware Neural Architecture Search (HW-NAS) has recently gained tremendous attention by automating the design of DNNs deployed in more resource-constrained daily life devices. Despite its promising performance, developing optimal HW-NAS solutions can be prohibitively challenging as it requires cross-disciplinary knowledge in the algorithm, micro-architecture, and device-specific compilation. First, to determine the hardware-cost to be incorporated into the NAS process, existing works mostly adopt either pre-collected hardware-cost look-up tables or device-specific hardware-cost models. Both of them limit the development of HW-NAS innovations and impose a barrier-to-entry to non-hardware experts. Second, similar to generic NAS, it can be notoriously difficult to benchmark HW-NAS algorithms due to their significant required computational resources and the differences in adopted search spaces, hyperparameters, and hardware devices. To this end, we develop HW-NAS-Bench, the first public dataset for HW-NAS research which aims to democratize HW-NAS research to non-hardware experts and make HW-NAS research more reproducible and accessible. To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance of all the networks in the search spaces of both NAS-Bench-201 and FBNet, on six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC). Furthermore, we provide a comprehensive analysis of the collected measurements in HW-NAS-Bench to provide insights for HW-NAS research. Finally, we demonstrate exemplary user cases to (1) show that HW-NAS-Bench allows non-hardware experts to perform HW-NAS by simply querying it and (2) verify that dedicated device-specific HW-NAS can indeed lead to optimal accuracy-cost trade-offs. The codes and all collected data are available at https://github.com/RICE-EIC/HW-NAS-Bench.

HW-NAS-Bench: A Comprehensive Dataset for Democratizing Hardware-Aware Neural Architecture Search

The development of Hardware-aware Neural Architecture Search (HW-NAS) has sought to optimize neural network architectures for deployment on resource-constrained devices, such as commercial edge devices, FPGAs, and ASICs. However, advancing HW-NAS poses significant challenges due to the interdisciplinary expertise required in algorithms, micro-architecture, and device-specific compilation. To address these difficulties and democratize the field of HW-NAS, the authors of the paper introduce HW-NAS-Bench, a publicly available dataset designed to facilitate HW-NAS research by providing measured and estimated hardware performance data.

Contributions of HW-NAS-Bench

The primary contributions of HW-NAS-Bench focus on removing barriers to entry for non-hardware experts and creating a benchmarking standard within the HW-NAS community. Specifically, the dataset:

  • Incorporates measured/estimated metrics such as energy cost and latency across six devices classified into three categories: commercial edge devices (Edge GPU, Raspberry Pi 4, Edge TPU, and Pixel 3), FPGA, and ASIC.
  • Covers two state-of-the-art (SOTA) NAS search spaces: NAS-Bench-201 and FBNet, the latter designed specifically for hardware-efficient mobile deployment.

These contributions collectively aim to make HW-NAS more reproducible and accessible, enabling developers to achieve an optimal balance between network accuracy and hardware efficiency without requiring in-depth hardware-specific knowledge.
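
Because every architecture's hardware cost is pre-measured, a search algorithm can retrieve latency or energy with a simple lookup rather than deploying to a physical device. The sketch below follows the query interface shown in the project's GitHub README; the `HWNASBenchAPI` class, `query_by_index` method, and pickle filename are taken from that documentation and should be treated as assumptions if the release has changed.

```python
# Minimal sketch of querying HW-NAS-Bench, based on the interface shown in
# the project README (https://github.com/RICE-EIC/HW-NAS-Bench). The class,
# method, and filename below are assumptions drawn from that documentation.
from hw_nas_bench_api import HWNASBenchAPI as HWAPI

# Load the pre-collected measurements for the NAS-Bench-201 search space.
hw_api = HWAPI("HW-NAS-Bench-v1_0.pickle", search_space="nasbench201")

# Query all hardware metrics (e.g., latency, energy) for one architecture
# on one dataset; no device deployment or on-device profiling is needed.
arch_index = 123
hw_metrics = hw_api.query_by_index(arch_index, "cifar10")
for metric_name, value in hw_metrics.items():
    print(f"{metric_name}: {value}")
```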

Analysis and Insights

Through HW-NAS-Bench, the paper offers a comprehensive analysis of how well theoretical hardware-cost metrics, such as FLOPs, correlate with actual hardware measurements. Across devices and datasets, it shows that these theoretical proxies often align poorly with real-world hardware costs, revealing a fundamental gap in current HW-NAS methodologies. The paper further shows that the same architecture's hardware cost varies significantly across devices, underscoring the necessity of device-specific optimization in the HW-NAS process.
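
This misalignment can be quantified with a rank correlation: if FLOPs ordered architectures the same way a device does, Kendall's tau between the two rankings would be close to 1. A minimal sketch, where the two lists are hypothetical placeholders standing in for values queried from the dataset for the same set of architectures:

```python
# Sketch: measure how well a theoretical proxy (FLOPs) predicts measured
# latency via Kendall's rank correlation. The lists are hypothetical
# placeholders for values queried from HW-NAS-Bench.
from scipy.stats import kendalltau

flops_m    = [12.5, 40.1, 88.3, 150.7, 210.2]  # proxy cost per architecture (MFLOPs)
latency_ms = [3.1, 2.8, 9.6, 7.4, 15.0]        # measured device latency (ms)

tau, p_value = kendalltau(flops_m, latency_ms)
print(f"Kendall tau = {tau:.3f} (p = {p_value:.3f})")
# A tau well below 1 means ranking architectures by FLOPs would misorder
# them by actual on-device cost, which is the gap the paper documents.
```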

Implications and Future Directions

HW-NAS-Bench serves not only as a tool for performance evaluation but also as a foundation for future research directions in HW-NAS. By covering diverse hardware platforms, the dataset offers a unique opportunity to develop unified methodologies that adapt across devices, reducing the current fragmentation in HW-NAS techniques. It also invites further exploration into co-optimizing neural network architectures together with their deployment hardware, potentially informing new ASIC and FPGA design workflows driven by cutting-edge architectures.
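
The case for device-specific search can be made concrete with a toy selection rule: under the same latency budget, the accuracy-optimal architecture can differ from device to device. All numbers below are hypothetical placeholders, not values from the dataset.

```python
# Sketch: device-specific architecture selection under a shared latency
# budget. Accuracies and latencies are hypothetical; in practice they would
# be queried from HW-NAS-Bench per device.
candidates = {
    "arch_A": {"acc": 93.1, "latency": {"edge_gpu": 4.0, "fpga": 9.5}},
    "arch_B": {"acc": 94.0, "latency": {"edge_gpu": 8.2, "fpga": 5.1}},
    "arch_C": {"acc": 92.3, "latency": {"edge_gpu": 3.5, "fpga": 4.8}},
}

BUDGET_MS = 6.0  # same latency budget enforced on every device

for device in ("edge_gpu", "fpga"):
    # Keep only architectures that meet the budget on this device,
    # then pick the most accurate among them.
    feasible = {name: info for name, info in candidates.items()
                if info["latency"][device] <= BUDGET_MS}
    best = max(feasible, key=lambda name: feasible[name]["acc"])
    print(f"{device}: best = {best} ({feasible[best]['acc']}% accuracy)")
# Prints arch_A for edge_gpu but arch_B for fpga: the accuracy-cost
# optimum is device-dependent, motivating dedicated device-specific HW-NAS.
```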

Conclusion

HW-NAS-Bench represents a significant step forward in HW-NAS research, aiming to democratize access to this field for a broader audience and laying the groundwork for a more standardized benchmarking process. By addressing critical challenges in measuring and modeling hardware costs across multiple platforms, HW-NAS-Bench provides a valuable resource for both NAS researchers and hardware designers, facilitating more efficient and effective design of neural networks tailored to specific hardware configurations. As such, future developments in AI are likely to benefit from the meaningful insights offered by this dataset, driving innovations that more closely integrate computational efficiencies with state-of-the-art neural architectures.

Authors (9)
  1. Chaojian Li (34 papers)
  2. Zhongzhi Yu (25 papers)
  3. Yonggan Fu (49 papers)
  4. Yongan Zhang (24 papers)
  5. Yang Zhao (382 papers)
  6. Haoran You (33 papers)
  7. Qixuan Yu (8 papers)
  8. Yue Wang (675 papers)
  9. Yingyan Lin (67 papers)
Citations (100)