SiGeo: Sub-One-Shot NAS via Information Theory and Geometry of Loss Landscape (2311.13169v1)

Published 22 Nov 2023 in cs.LG and cs.AI

Abstract: Neural Architecture Search (NAS) has become a widely used tool for automating neural network design. While one-shot NAS methods have successfully reduced computational requirements, they often require extensive training. On the other hand, zero-shot NAS utilizes training-free proxies to evaluate a candidate architecture's test performance but has two limitations: (1) inability to use the information gained as a network improves with training and (2) unreliable performance, particularly in complex domains like RecSys, due to the multi-modal data inputs and complex architecture configurations. To synthesize the benefits of both methods, we introduce a "sub-one-shot" paradigm that serves as a bridge between zero-shot and one-shot NAS. In sub-one-shot NAS, the supernet is trained using only a small subset of the training data, a phase we refer to as "warm-up." Within this framework, we present SiGeo, a proxy founded on a novel theoretical framework that connects the supernet warm-up with the efficacy of the proxy. Extensive experiments have shown that SiGeo, with the benefit of warm-up, consistently outperforms state-of-the-art NAS proxies on various established NAS benchmarks. When a supernet is warmed up, it can achieve comparable performance to weight-sharing one-shot NAS methods, but with a significant reduction (~60%) in computational costs.

Authors (6)

Hua Zheng
Kuang-Hung Liu
Igor Fedorov
Xin Zhang
Wen-Yen Chen
Wei Wen
