
ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation (2011.11233v2)

Published 23 Nov 2020 in cs.LG, cs.AI, and cs.CV

Abstract: Although a prevalent architecture search approach, differentiable architecture search (DARTS) is largely hindered by its substantial memory cost, since the entire supernet resides in memory. Single-path DARTS addresses this by choosing only a single-path submodel at each step, which makes it both memory-friendly and computationally cheap. Nonetheless, we identify a critical issue of single-path DARTS that has so far gone largely unnoticed: like DARTS itself, it suffers from severe performance collapse, in which too many parameter-free operations such as skip connections are derived. In this paper, we propose a new algorithm, RObustifying Memory-Efficient NAS (ROME), to remedy this. First, we disentangle the topology search from the operation search to make searching and evaluation consistent. We then adopt Gumbel-Top2 reparameterization and gradient accumulation to robustify the unwieldy bi-level optimization. We verify ROME extensively across 15 benchmarks to demonstrate its effectiveness and robustness.
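The Gumbel-Top2 reparameterization named in the abstract is an instance of the generic Gumbel-Top-k trick: perturb each candidate's logit with independent Gumbel(0, 1) noise and keep the indices of the k largest perturbed values, which yields a sample of k distinct candidates without replacement. A minimal NumPy sketch with k = 2 follows; the function name and setup are illustrative, not taken from the paper's implementation.

```python
import numpy as np

def gumbel_top2(logits, rng):
    """Sample two distinct candidate indices via the Gumbel-Top-k trick (k = 2).

    Adds i.i.d. Gumbel(0, 1) noise to each logit; the indices of the two
    largest perturbed logits form a sample of two distinct candidates
    drawn without replacement from the softmax distribution over logits.
    """
    u = rng.uniform(size=logits.shape)          # Uniform(0, 1) draws
    gumbel = -np.log(-np.log(u))                # Gumbel(0, 1) noise
    perturbed = logits + gumbel
    return np.argsort(perturbed)[-2:][::-1]     # top-2 indices, best first

# Example: pick 2 of 4 hypothetical operation candidates
rng = np.random.default_rng(0)
logits = np.array([2.0, 0.5, -1.0, 0.0])
picked = gumbel_top2(logits, rng)
```

During search this gives the sampled single-path (or here, two-path) submodel a stochastic but differentiable-friendly selection rule, which is what makes the memory-efficient training possible.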

Authors (7)
  1. Xiaoxing Wang (11 papers)
  2. Xiangxiang Chu (62 papers)
  3. Yuda Fan (3 papers)
  4. Zhexi Zhang (5 papers)
  5. Bo Zhang (633 papers)
  6. Xiaokang Yang (210 papers)
  7. Junchi Yan (241 papers)
Citations (3)
