
OVO: One-shot Vision Transformer Search with Online distillation (2212.13766v2)

Published 28 Dec 2022 in cs.CV

Abstract: Pure transformers have recently shown great potential for vision tasks. However, their accuracy on small or medium-sized datasets is not satisfactory. Although some existing methods introduce a CNN as a teacher to guide training by distillation, the gap between the teacher and student networks leads to sub-optimal performance. In this work, we propose a new One-shot Vision transformer search framework with Online distillation, namely OVO. OVO samples sub-nets for both the teacher and student networks for better distillation results. Benefiting from the online distillation, thousands of sub-nets in the supernet are well trained without extra finetuning or retraining. In experiments, OVO-Ti achieves 73.32% top-1 accuracy on ImageNet and 75.2% on CIFAR-100.
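The core idea described in the abstract — sampling both a teacher sub-net and a student sub-net from a shared supernet and distilling between them each step — can be illustrated with a minimal sketch. This is not the authors' implementation; the search-space dimensions, the uniform sampling, and the softened-KL distillation loss are all illustrative assumptions.

```python
import math
import random

# Hypothetical ViT supernet search space (dimensions are assumptions,
# not the paper's actual search space).
SEARCH_SPACE = {
    "depth": [12, 13, 14],
    "embed_dim": [192, 216, 240],
    "num_heads": [3, 4],
}

def sample_subnet(rng):
    """Uniformly sample one sub-net configuration from the supernet."""
    return {name: rng.choice(opts) for name, opts in SEARCH_SPACE.items()}

def kd_loss(student_logits, teacher_logits, temperature=1.0):
    """KL divergence between temperature-softened teacher and student outputs."""
    def softmax(xs, t):
        m = max(xs)
        exps = [math.exp((x - m) / t) for x in xs]
        total = sum(exps)
        return [e / total for e in exps]
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

rng = random.Random(0)
# One online-distillation step: sample a teacher and a student sub-net,
# then distill the teacher's (softened) predictions into the student.
teacher_cfg = sample_subnet(rng)
student_cfg = sample_subnet(rng)
loss = kd_loss([1.0, 0.5, -0.2], [2.0, 0.1, -1.0], temperature=2.0)
```

Because the teacher is itself a sub-net of the same supernet rather than an external CNN, the teacher–student gap the abstract mentions is kept small, and every sampled sub-net receives distillation signal during the single supernet training run.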

Authors (4)
  1. Zimian Wei (11 papers)
  2. Hengyue Pan (19 papers)
  3. Xin Niu (14 papers)
  4. Dongsheng Li (240 papers)
Citations (1)
