BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search (2103.12424v3)

Published 23 Mar 2021 in cs.CV and cs.LG

Abstract: A myriad of recent breakthroughs in hand-crafted neural architectures for visual recognition have highlighted the urgent need to explore hybrid architectures consisting of diversified building blocks. Meanwhile, neural architecture search methods are surging with an expectation to reduce human efforts. However, whether NAS methods can efficiently and effectively handle diversified search spaces with disparate candidates (e.g. CNNs and transformers) is still an open question. In this work, we present Block-wisely Self-supervised Neural Architecture Search (BossNAS), an unsupervised NAS method that addresses the problem of inaccurate architecture rating caused by large weight-sharing space and biased supervision in previous methods. More specifically, we factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately before searching them as a whole towards the population center. Additionally, we present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions. On this challenging search space, our searched model, BossNet-T, achieves up to 82.5% accuracy on ImageNet, surpassing EfficientNet by 2.4% with comparable compute time. Moreover, our method achieves superior architecture rating accuracy with 0.78 and 0.76 Spearman correlation on the canonical MBConv search space with ImageNet and on NATS-Bench size search space with CIFAR-100, respectively, surpassing state-of-the-art NAS methods. Code: https://github.com/changlin31/BossNAS

An Analysis of BossNAS: Hybrid CNN-Transformers with Self-supervised NAS

The paper "BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search" proposes an innovative approach to neural architecture search (NAS) focusing on hybrid CNN-transformer architectures. Specifically, it tackles the challenge of efficiently and effectively conducting NAS within a search space that includes disparate architectural elements like convolutional neural networks (CNNs) and transformers.

Key Contributions

  1. Architecture Search Methodology: The authors introduce Block-wisely Self-supervised Neural Architecture Search (BossNAS), an unsupervised NAS technique. Prior weight-sharing NAS methods suffer from inaccurate architecture rating caused by large weight-sharing spaces and biased supervision. BossNAS reduces these problems through block-wise factorization and a self-supervised ensemble bootstrapping scheme while maintaining high architecture rating accuracy.
  2. Hybrid Search Space: The paper presents HyTra, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions. Its design places CNN and transformer building blocks in parallel across the network's stages, so the space covers architectures resembling a range of vision models at different computational and spatial scales (a toy sketch of such a space follows this list).
  3. Practical and Numerical Validation: The BossNAS method yields BossNet-T architectures, which show improvements over previous architectures like EfficientNet and BoTNet in terms of accuracy and computational efficiency. Notably, BossNet-T achieved up to 82.5% accuracy on ImageNet, surpassing EfficientNet by 2.4%, with comparable computational requirements.
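
To make the fabric-like structure concrete, the toy sketch below enumerates a hybrid space in which each stage independently chooses a convolutional or a self-attention block and whether to down-sample. The block names and four-stage layout are illustrative assumptions, not the authors' implementation.

```python
import itertools

# Toy, illustrative version of a fabric-like hybrid search space: at each
# stage we pick a block type (CNN-style or transformer-style) and whether
# that stage down-samples, so the down-sampling positions are searchable.
# Names and stage count are assumptions for illustration only.
BLOCK_CHOICES = ("res_conv", "res_attn")   # CNN block vs. transformer block
DOWNSAMPLE_CHOICES = (False, True)         # searchable down-sampling position

def enumerate_candidates(num_stages=4):
    """Return every candidate path as a tuple of (block_type, downsample) per stage."""
    stage_options = list(itertools.product(BLOCK_CHOICES, DOWNSAMPLE_CHOICES))
    return list(itertools.product(stage_options, repeat=num_stages))

if __name__ == "__main__":
    candidates = enumerate_candidates()
    print(len(candidates))   # 4 options per stage, 4 stages -> 256 toy candidates
    print(candidates[0])     # e.g. an all-convolution path with no down-sampling
```

Even this toy space grows exponentially with depth, which motivates the block-wise factorization discussed next.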

Exploration of the Method

The central premise of BossNAS is a divide-and-conquer treatment of NAS: the search space is factorized into blocks, and each block is trained with an unsupervised bootstrapping scheme before the blocks are searched as a whole. Search efficiency comes from rating candidates block by block rather than training and ranking the entire weight-sharing super-network at once, which sharply reduces the effective search complexity.
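
A minimal sketch of this block-wise control flow is given below. It assumes hypothetical `train_block` and `rate_candidate` helpers standing in for the paper's actual self-supervised training and evaluation code, and it simplifies the final selection to picking the best candidate per block.

```python
# Sketch of a divide-and-conquer, block-wise search loop. train_block() and
# rate_candidate() are hypothetical stand-ins, not BossNAS's actual API.

def blockwise_search(block_candidates, train_block, rate_candidate):
    """block_candidates[b] lists the options for block b; return the best option per block."""
    best_per_block = []
    for b, candidates in enumerate(block_candidates):
        train_block(b, candidates)                    # weights shared only within block b
        scores = {c: rate_candidate(b, c) for c in candidates}
        best_per_block.append(max(scores, key=scores.get))
    return best_per_block                             # compose the searched architecture

if __name__ == "__main__":
    # Tiny toy run with made-up scores, only to show the control flow.
    space = [["conv", "attn"], ["conv", "attn"], ["conv", "attn"]]
    train = lambda b, cands: None
    rate = lambda b, c: {"conv": 0.5, "attn": 0.6}[c]
    print(blockwise_search(space, train, rate))       # -> ['attn', 'attn', 'attn']
```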

  • Ensemble Bootstrapping: Key to BossNAS is self-supervised ensemble bootstrapping, in which sampled architectures are trained to predict the probability ensemble of their sampled siblings (the population center), stabilizing convergence and improving rating accuracy during the search. This lets each sampled architecture learn generalized representations and removes the biases typically introduced by single-path sampling techniques (a loss sketch follows this list).
  • Architectural Robustness: At an analytical level, BossNAS addresses two pervasive issues in weight-sharing NAS: candidate preference and teacher preference. By excluding supervised distillation, BossNAS avoids the architectural bias that often skews rating results.
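
The loss below sketches the ensemble-bootstrapping idea in PyTorch under stated assumptions: each sampled student path regresses, via a normalized cosine objective, onto the averaged output of sibling paths taken from a momentum target network. The function name and the BYOL-style cosine form are illustrative assumptions; the paper's exact formulation may differ in detail.

```python
import torch
import torch.nn.functional as F

def ensemble_bootstrapping_loss(student_outputs, teacher_outputs):
    """Regress each sampled student path onto the ensemble (mean) of normalized
    teacher-path outputs, i.e. the 'population center'.

    student_outputs: list of (batch, dim) tensors from sampled online paths.
    teacher_outputs: list of (batch, dim) tensors from sampled target (EMA) paths.
    """
    # Population center: average of L2-normalized teacher outputs, no gradient.
    center = torch.stack([F.normalize(t, dim=-1) for t in teacher_outputs]).mean(dim=0)
    center = F.normalize(center, dim=-1).detach()

    loss = 0.0
    for s in student_outputs:
        s = F.normalize(s, dim=-1)
        # Equivalent to 2 - 2 * cosine similarity, averaged over the batch.
        loss = loss + (2 - 2 * (s * center).sum(dim=-1)).mean()
    return loss / len(student_outputs)

# Toy usage with random features (shapes only; not the paper's setup).
students = [torch.randn(8, 128) for _ in range(3)]
teachers = [torch.randn(8, 128) for _ in range(3)]
print(ensemble_bootstrapping_loss(students, teachers))
```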

Implications and Future Prospects

This work demonstrates a marked advance in tackling complex NAS problems by introducing an efficient methodology that marries CNN and transformer strengths. The architecture rating accuracy demonstrated by BossNAS, up to 0.78 Spearman correlation with true architecture performance on challenging benchmarks, indicates significant promise for producing optimized, task-specific neural architectures.
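
As a quick illustration of this rating-accuracy metric, Spearman correlation compares a search method's predicted ranking of candidates with their stand-alone accuracies; values near 1.0 indicate a reliable ranking. The numbers below are invented purely to show how the metric is computed and are not the paper's measurements.

```python
from scipy.stats import spearmanr

# Hypothetical ratings for six candidates versus their stand-alone accuracies.
supernet_rating = [0.61, 0.50, 0.72, 0.40, 0.68, 0.55]
standalone_acc = [75.2, 74.1, 76.8, 72.0, 76.1, 73.5]

rho, _ = spearmanr(supernet_rating, standalone_acc)
print(f"Spearman rho = {rho:.2f}")  # closer to 1.0 means a more consistent ranking
```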

The implications of BossNAS extend to a broader scope of machine learning applications. As hybrid architectures become increasingly important across diverse domains, the ability to search and optimize such configurations efficiently will be invaluable. Future research directions may include further reducing computational cost or adapting BossNAS to tasks beyond visual recognition, potentially generalizing its benefits to broader applications in artificial intelligence and computational neuroscience.

In conclusion, BossNAS marks a meaningful step forward in NAS methodology, particularly for hybrid architectures, by addressing key limitations of traditional approaches without relying on supervised learning. The work shows that self-supervised learning paradigms can unlock the potential of hybrid network structures while meeting the field's growing demands for scalability and precision.

Authors (7)
  1. Changlin Li (28 papers)
  2. Tao Tang (87 papers)
  3. Guangrun Wang (43 papers)
  4. Jiefeng Peng (8 papers)
  5. Bing Wang (246 papers)
  6. Xiaodan Liang (318 papers)
  7. Xiaojun Chang (148 papers)
Citations (99)