Searching Intrinsic Dimensions of Vision Transformers (2204.07722v1)

Published 16 Apr 2022 in cs.CV and cs.LG

Abstract: Many researchers have shown that transformers perform as well as convolutional neural networks on many computer vision tasks. However, the large computational cost of the attention module hinders further study and deployment on edge devices. Several pruning methods have been developed to construct efficient vision transformers, but most consider only image classification. Motivated by these results, we propose SiDT, a method for pruning vision transformer backbones for more complex vision tasks such as object detection, based on a search over transformer dimensions. Experiments on the CIFAR-100 and COCO datasets show that backbones with 20% or 40% of dimensions/parameters pruned can match or even exceed the performance of the unpruned models. We also provide a complexity analysis and comparisons with previous pruning methods.
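The paper's dimension search itself is beyond a short snippet, but the core operation the abstract describes, keeping a subset of a transformer block's hidden dimensions and rebuilding the surrounding linear layers, can be sketched. The PyTorch example below is a hypothetical magnitude-based variant, not the authors' SiDT method: the scoring rule (L1 weight norm) and the `prune_mlp_hidden_dims` helper are illustrative assumptions, and the keep ratio corresponds to the "20% pruned" setting from the abstract.

```python
# Minimal sketch of dimension pruning in a transformer MLP block.
# NOT the authors' SiDT implementation; it only illustrates the idea of
# keeping a subset of hidden dimensions and rebuilding the linear layers.
# Assumption: importance is approximated by L1 weight magnitude.
import torch
import torch.nn as nn


def prune_mlp_hidden_dims(mlp: nn.Sequential, keep_ratio: float) -> nn.Sequential:
    """Keep the top-`keep_ratio` hidden units of an MLP
    (Linear -> activation -> Linear), scored by L1 weight magnitude."""
    fc1, act, fc2 = mlp[0], mlp[1], mlp[2]
    n_keep = max(1, int(fc1.out_features * keep_ratio))

    # Score each hidden unit by the L1 norm of its incoming weights.
    scores = fc1.weight.abs().sum(dim=1)              # shape: (hidden,)
    keep = scores.topk(n_keep).indices.sort().values  # indices to retain

    # Rebuild both linear layers around the retained dimensions.
    new_fc1 = nn.Linear(fc1.in_features, n_keep)
    new_fc2 = nn.Linear(n_keep, fc2.out_features)
    with torch.no_grad():
        new_fc1.weight.copy_(fc1.weight[keep])
        new_fc1.bias.copy_(fc1.bias[keep])
        new_fc2.weight.copy_(fc2.weight[:, keep])
        new_fc2.bias.copy_(fc2.bias)
    return nn.Sequential(new_fc1, act, new_fc2)


# Usage: prune 20% of the hidden dimensions of a toy ViT-style MLP.
mlp = nn.Sequential(nn.Linear(192, 768), nn.GELU(), nn.Linear(768, 192))
pruned = prune_mlp_hidden_dims(mlp, keep_ratio=0.8)
x = torch.randn(4, 197, 192)                          # (batch, tokens, dim)
print(pruned(x).shape)                                # torch.Size([4, 197, 192])
```

After pruning, the model would be fine-tuned to recover accuracy; the paper reports that at these pruning ratios the pruned backbones can match or exceed the unpruned ones on CIFAR-100 and COCO.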

Authors (4)
  1. Fanghui Xue
  2. Biao Yang
  3. Yingyong Qi
  4. Jack Xin
Citations (2)
