- The paper introduces Progressive DARTS (P-DARTS), which incrementally deepens the searched network across stages to close the depth gap between search and evaluation.
- It employs search space approximation and operation-level dropout to keep the computational load of deeper search networks manageable and to mitigate the bias toward parameter-free operations such as skip connections.
- P-DARTS achieves a top-1 error of 24.4% on ImageNet (mobile setting) with a search cost of only about 0.3 GPU-days, a substantial efficiency and accuracy gain over earlier NAS methods.
Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation
The paper "Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation," introduces a novel approach to neural architecture search (NAS) that addresses a pivotal issue within differentiable search methods. Traditional differentiable NAS approaches, such as DARTS, have shown reduced accuracy when transitioning from the search to evaluation phase, primarily due to a disparity between the architectural depth utilized during these phases.
Key Contributions
The authors propose a method termed Progressive DARTS (P-DARTS) that divides the search into multiple stages and increases the depth of the search network at the end of each stage, aiming to close this depth gap. Because the final architecture is discovered at a depth close to the one used for evaluation, it transfers to deep networks far better than architectures found by a uniformly shallow search.
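A minimal PyTorch-style sketch may make the progressive schedule concrete. It is an illustration, not the authors' implementation: each cell is collapsed into a single mixed operation, the channel count and the elided bi-level optimization step are placeholders, and only the stage depths (5, 11, and 17 cells) are taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy candidate set; `conv_3x3` is a stand-in for the separable convs in DARTS.
OPS = {
    "conv_3x3": lambda c: nn.Conv2d(c, c, 3, padding=1),
    "max_pool_3x3": lambda c: nn.MaxPool2d(3, stride=1, padding=1),
    "skip_connect": lambda c: nn.Identity(),
}

class MixedOp(nn.Module):
    """Softmax-weighted sum of candidate operations, as in DARTS."""
    def __init__(self, channels, op_names):
        super().__init__()
        self.ops = nn.ModuleList([OPS[name](channels) for name in op_names])
        self.alpha = nn.Parameter(torch.zeros(len(op_names)))  # architecture weights

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

def build_supernet(num_cells, channels, op_names):
    return nn.Sequential(*[MixedOp(channels, op_names) for _ in range(num_cells)])

# Progressive schedule: rebuild the search network deeper at every stage,
# approaching the ~20-cell networks used at evaluation time.
for num_cells in (5, 11, 17):
    supernet = build_supernet(num_cells, channels=8, op_names=list(OPS))
    # ... run DARTS-style bi-level optimization on `supernet` here ...
    out = supernet(torch.randn(2, 8, 32, 32))  # sanity check: forward pass works
```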
To counter the increased computational demands of deeper architectures and maintain search stability, the paper introduces two main techniques:
- Search Space Approximation: As the search network deepens, the set of candidate operations on each edge is pruned according to the architecture weights learned in the previous stage, keeping the computational overhead manageable (see the first sketch after this list).
- Search Space Regularization: Operation-level Dropout is applied after skip connections, which otherwise tend to dominate the bi-level optimization because they ease gradient flow while contributing no learnable capacity. Together with a cap on the number of skip connections retained in the final cell, this regularization ensures a balanced exploration of the operation space (see the second sketch after this list).
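The first sketch illustrates search space approximation. The operation names and the 14 edges per cell match the DARTS search space, and the 8 → 5 → 3 pruning schedule follows the paper; the architecture weights here are random stand-ins for learned ones.

```python
import torch

# Candidate operations of the DARTS search space.
OP_NAMES = ["none", "skip_connect", "max_pool_3x3", "avg_pool_3x3",
            "sep_conv_3x3", "sep_conv_5x5", "dil_conv_3x3", "dil_conv_5x5"]

def prune_operations(alpha, op_names, keep):
    """Keep the `keep` highest-weighted candidate operations on each edge.

    alpha: (num_edges, num_ops) architecture weights from the previous stage.
    Returns a per-edge shortlist defining the next stage's search space.
    """
    weights = torch.softmax(alpha, dim=-1)
    shortlists = []
    for edge_weights in weights:
        top = edge_weights.topk(keep).indices.tolist()
        shortlists.append([op_names[i] for i in top])
    return shortlists

num_edges = 14                                 # edges in a DARTS cell
alpha = torch.randn(num_edges, len(OP_NAMES))  # stand-in for learned weights
stage2_space = prune_operations(alpha, OP_NAMES, keep=5)  # 8 -> 5 ops per edge
print(stage2_space[0])                         # the 5 survivors on edge 0
```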
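The second sketch shows one plausible form of the operation-level Dropout. It is a hedged reading of the paper: we assume the dropout sits directly on the skip-connect path and zeroes the whole path per sample (element-wise dropout would serve the same purpose), while the paper additionally decays the drop rate toward zero within each stage.

```python
import torch
import torch.nn as nn

class SkipWithDropout(nn.Module):
    """Skip connection followed by operation-level dropout (illustrative).

    During search, the skip path is zeroed per sample with probability
    `drop_prob`, forcing gradients through the learnable operations instead.
    """
    def __init__(self, drop_prob=0.3):
        super().__init__()
        self.drop_prob = drop_prob  # the paper decays this toward 0 in each stage

    def forward(self, x):
        if self.training and self.drop_prob > 0.0:
            keep_prob = 1.0 - self.drop_prob
            mask = x.new_empty(x.size(0), 1, 1, 1).bernoulli_(keep_prob)
            return x * mask / keep_prob  # rescale so the expectation is unchanged
        return x

skip = SkipWithDropout(drop_prob=0.3)
out = skip(torch.randn(4, 8, 32, 32))  # some skip paths in the batch are zeroed
```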
Performance and Results
The authors demonstrate the effectiveness of P-DARTS with state-of-the-art results on both CIFAR10 and ImageNet. The method reaches a top-1 error of 24.4% on ImageNet under the mobile setting, a clear improvement over standard DARTS and other contemporary approaches.
Furthermore, the search is remarkably efficient. At roughly 0.3 GPU-days, it is orders of magnitude cheaper than earlier methods such as AmoebaNet, which required thousands of GPU-days. This acceleration is particularly noteworthy given the competitive accuracy.
Implications and Future Directions
The improvement in search efficiency and accuracy underlines the potential of P-DARTS for advancing automatic model design in deep learning. Practically, this method can be deployed across diverse datasets with minimal resource consumption. Theoretically, it prompts further investigation into progressive search strategies and their application to other neural architectures beyond image classification.
Looking ahead, future research could explore integrating more sophisticated regularization schemes or leveraging larger operation spaces. Moreover, adapting this methodology to handle more complex datasets and tasks could further establish its relevance in generalizing NAS applications.
By focusing on the critical "depth gap," this paper provides both a practical and theoretical contribution to the domain of NAS, representing a meaningful step forward in automated architecture optimization.