
RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer (2210.07124v1)

Published 13 Oct 2022 in cs.CV

Abstract: Recently, transformer-based networks have shown impressive results in semantic segmentation. Yet for real-time semantic segmentation, pure CNN-based approaches still dominate this field, due to the time-consuming computation mechanism of transformers. We propose RTFormer, an efficient dual-resolution transformer for real-time semantic segmentation, which achieves a better trade-off between performance and efficiency than CNN-based models. To achieve high inference efficiency on GPU-like devices, our RTFormer leverages GPU-Friendly Attention with linear complexity and discards the multi-head mechanism. Besides, we find that cross-resolution attention is more efficient at gathering global context information for the high-resolution branch by spreading the high-level knowledge learned from the low-resolution branch. Extensive experiments on mainstream benchmarks demonstrate the effectiveness of our proposed RTFormer: it achieves state-of-the-art results on Cityscapes, CamVid and COCOStuff, and shows promising results on ADE20K. Code is available at PaddleSeg: https://github.com/PaddlePaddle/PaddleSeg.
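To make the "linear complexity" claim concrete: the abstract contrasts standard self-attention, whose cost grows quadratically with the number of tokens, with an attention that attends to a small set of learnable external units, so cost grows linearly in the token count. The sketch below is a minimal NumPy illustration of that general idea (in the style of external attention with double normalization), not the authors' exact GPU-Friendly Attention; the function name, the memory-unit count `m`, and the normalization details are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def linear_external_attention(x, Mk, Mv):
    """Single-head attention against learnable external memories.

    x:  (n, d) input tokens (e.g. flattened feature-map pixels).
    Mk: (m, d) external key units, Mv: (m, d) external value units,
        with m fixed and small, so cost is O(n*m*d) instead of the
        O(n^2*d) of token-to-token self-attention.

    Normalization here is a double normalization (softmax over
    memory units, then an l1 rescaling over tokens), one common
    choice for this family of attentions; the paper's exact scheme
    may differ.
    """
    attn = x @ Mk.T                                    # (n, m) similarities
    attn = softmax(attn, axis=1)                       # normalize over units
    attn = attn / (1e-9 + attn.sum(axis=0, keepdims=True))  # rescale over tokens
    return attn @ Mv                                   # (n, d) output
```

In a cross-resolution setting like the one the abstract describes, the queries `x` would come from the high-resolution branch while the key/value side is derived from the low-resolution branch, letting high-resolution pixels cheaply absorb global context learned at low resolution.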

Authors (7)
  1. Jian Wang (969 papers)
  2. Chenhui Gou (12 papers)
  3. Qiman Wu (3 papers)
  4. Haocheng Feng (33 papers)
  5. Junyu Han (53 papers)
  6. Errui Ding (156 papers)
  7. Jingdong Wang (237 papers)
Citations (67)
