
DARTS-: Robustly Stepping out of Performance Collapse Without Indicators (2009.01027v2)

Published 2 Sep 2020 in cs.LG, cs.AI, cs.CV, and stat.ML

Abstract: Despite the fast development of differentiable architecture search (DARTS), it suffers from long-standing performance instability, which extremely limits its application. Existing robustifying methods draw clues from the resulting deteriorated behavior instead of finding out its causing factor. Various indicators such as Hessian eigenvalues are proposed as a signal to stop searching before the performance collapses. However, these indicator-based methods tend to easily reject good architectures if the thresholds are inappropriately set, let alone the searching is intrinsically noisy. In this paper, we undertake a more subtle and direct approach to resolve the collapse. We first demonstrate that skip connections have a clear advantage over other candidate operations, where it can easily recover from a disadvantageous state and become dominant. We conjecture that this privilege is causing degenerated performance. Therefore, we propose to factor out this benefit with an auxiliary skip connection, ensuring a fairer competition for all operations. We call this approach DARTS-. Extensive experiments on various datasets verify that it can substantially improve robustness. Our code is available at https://github.com/Meituan-AutoML/DARTS- .

Citations (146)

Summary

  • The paper introduces an auxiliary skip connection to balance operation competition and mitigate performance collapse without reliance on handcrafted indicators.
  • Extensive experiments validate DARTS-'s effectiveness, achieving 97.5% accuracy on CIFAR-10 with lower computational overhead than comparable methods.
  • The approach demonstrates versatility by delivering competitive results across various datasets and search spaces, offering a cost-effective NAS solution.

Overview of DARTS-: Robustly Stepping out of Performance Collapse Without Indicators

Differentiable architecture search (DARTS) has been a significant advancement in neural architecture search, enabling efficient search through gradient-based optimization. However, it has been consistently plagued by performance instability: a collapse, often induced by a bias toward skip connections, that yields overly simplistic architectures with poor performance. The paper "DARTS-: Robustly Stepping out of Performance Collapse Without Indicators" proposes a methodology intended to address these issues without relying on handcrafted indicators or additional hyperparameter tuning.

The authors identify the intrinsic bias that skip connections introduce during the DARTS search process. Skip connections, while effective at mitigating vanishing gradients in deep networks, tend to crowd out other operations during the architecture search, leading to suboptimal, collapsed performance. Traditional approaches often employ indicators based on metrics such as Hessian eigenvalues to preemptively abort search runs, which introduces its own challenges, including threshold sensitivity and additional computational overhead.
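The bias described above arises inside the DARTS mixed operation, where each edge outputs a softmax-weighted sum over candidate operations and the identity (skip) path passes gradients through unchanged. A minimal NumPy sketch of that continuous relaxation, with illustrative stand-in operations rather than the paper's actual search space:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

# Illustrative stand-ins for an edge's candidate operations; the real
# DARTS search space uses separable convolutions and pooling layers.
ops = {
    "skip_connect": lambda x: x,                  # identity: gradients pass through unchanged
    "sep_conv_3x3": lambda x: 0.5 * x,            # stand-in for a parametric op
    "max_pool_3x3": lambda x: np.maximum(x, 0.0), # stand-in for pooling
}

def mixed_op(x, alpha):
    """DARTS continuous relaxation: softmax-weighted sum of candidate ops."""
    w = softmax(alpha)
    return sum(wi * op(x) for wi, op in zip(w, ops.values()))

x = np.array([1.0, -2.0, 3.0])
alpha = np.zeros(len(ops))   # equal architecture weights at initialization
y = mixed_op(x, alpha)       # each op contributes one third of the output
```

Because the skip path is the only candidate whose gradient signal is undamped, its architecture weight can recover quickly from a low value and come to dominate the softmax, which is the recovery advantage the authors document.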

The paper introduces DARTS-, a refined DARTS approach that adds an auxiliary skip connection to equalize the competition among candidate operations. This auxiliary skip connection decouples the optimization-stabilizing effect of skip connections during training from their role as candidate operations, allowing the search to reflect each operation's actual utility within candidate architectures.
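The mechanism can be sketched as follows: the auxiliary skip path contributes `beta * x` outside the softmax competition, with `beta` decayed to zero over the search so that the final architecture depends only on the learned weights. This is a minimal NumPy sketch under stated assumptions; the operation set is a toy stand-in, and the linear decay is shown as one simple choice of the decaying schedule the paper uses.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def darts_minus_edge(x, alpha, ops, beta):
    """Edge output with the DARTS- auxiliary skip connection.

    The beta * x term carries the optimization-stabilizing identity path
    outside the architecture competition, so the softmax over alpha can
    reflect each candidate operation's own utility.
    """
    mixed = sum(w * op(x) for w, op in zip(softmax(alpha), ops))
    return mixed + beta * x

def beta_schedule(epoch, total_epochs):
    # Decay beta from 1 to 0 over the search; linear decay is one
    # simple realization of the paper's decaying schedule.
    return 1.0 - epoch / total_epochs

# Toy usage with stand-in operations (not the paper's search space).
ops = [lambda x: x, lambda x: 0.5 * x]
x = np.array([2.0, -4.0])
alpha = np.zeros(len(ops))
early = darts_minus_edge(x, alpha, ops, beta_schedule(0, 50))   # auxiliary path active
final = darts_minus_edge(x, alpha, ops, beta_schedule(50, 50))  # auxiliary path gone
```

At the end of the search the auxiliary path has vanished, so discretizing the architecture from `alpha` requires no indicator, threshold, or early stopping.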

Numerical Results and Validation

Extensive experimentation substantiates the claimed improvements of DARTS-. The paper demonstrates the efficacy of the proposed approach across multiple datasets and search spaces. On CIFAR-10, DARTS- achieves a test accuracy of 97.5% with substantially fewer computational resources than contemporaries such as R-DARTS. The robustness of the methodology is further highlighted by consistently high performance across repeated trials, illustrating low variability in outcomes.

When evaluated on a different search space, the NAS-Bench-201 benchmark, DARTS- continues to outperform existing methods, reaffirming its robustness and general applicability. Moreover, in tasks such as object detection on the MS COCO dataset, the architectures identified by DARTS- exhibit competitive performance, indicating its efficacy beyond image classification.

Practical and Theoretical Implications

Practically, DARTS- provides a more reliable and cost-effective architecture search strategy, reducing the burden of tuning additional hyperparameters and avoiding the pitfalls of indicator-based approaches. Theoretically, the finding that an auxiliary connection can stabilize training while balancing the competition among operations offers insight into how neural architecture search can be further refined. This decoupling of training stabilization from architecture selection can influence future research in neural architecture design, potentially leading to more nuanced and stable search algorithms that do not compromise exploration quality.

Future Developments

While DARTS- presents a significant step forward, there is still room to refine and adapt this approach to broader neural architecture search domains, including those applicable to varied modalities like language and multi-task settings. Further exploration could involve the integration of auxiliary connections with other differentiable search methodologies, potentially hybridizing approaches across different architecture search paradigms.

As researchers continue to optimize and innovate upon NAS methodologies, DARTS- stands out for reducing complexity while improving the robustness of the search process, informing future methods that seek balance and efficiency in architectural exploration.