TDAF: Top-Down Attention Framework for Vision Tasks (2012.07248v1)

Published 14 Dec 2020 in cs.CV

Abstract: Human attention mechanisms often work in a top-down manner, yet they are not well explored in vision research. Here, we propose the Top-Down Attention Framework (TDAF) to capture top-down attention, which can be easily adopted in most existing models. Its Recursive Dual-Directional Nested Structure forms two sets of orthogonal paths, recursive and structural, along which bottom-up spatial features and top-down attention features are extracted respectively. These spatial and attention features are deeply nested, so the framework works in a mixed top-down and bottom-up manner. Empirical evidence shows that TDAF captures effective stratified attention information and boosts performance: ResNet with TDAF achieves a 2.0% improvement on ImageNet; for object detection, performance improves by 2.7% AP over FCOS; for pose estimation, TDAF improves the baseline by 1.6%; and for action recognition, a 3D-ResNet adopting TDAF improves accuracy by 1.7%.
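The abstract describes two orthogonal passes: a bottom-up path that extracts stratified spatial features, and a top-down path in which attention derived from higher levels re-weights the features below. The following is a minimal NumPy sketch of that mixing scheme, not the authors' implementation; all function names, the gating form, and the use of linear maps in place of convolutional blocks are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def bottom_up(x, w):
    # Stand-in for a bottom-up spatial feature block (linear map + ReLU
    # instead of a real convolutional stage; assumption, not the paper's design).
    return np.maximum(x @ w, 0.0)

def top_down_gate(h, v):
    # Hypothetical top-down attention: a sigmoid gate computed from a
    # higher-level feature, used to modulate the level below it.
    return 1.0 / (1.0 + np.exp(-(h @ v)))

def tdaf_sketch(x, ws, vs):
    # Recursive bottom-up pass: collect a stack of stratified features.
    feats = [x]
    for w in ws:
        feats.append(bottom_up(feats[-1], w))
    # Top-down pass: each lower-level feature is re-weighted by an
    # attention gate derived from the level above, so spatial and
    # attention information end up nested across levels.
    out = feats[-1]
    for f, v in zip(reversed(feats[:-1]), vs):
        out = top_down_gate(out, v) * f
    return out

d = 8
x = rng.standard_normal((4, d))
ws = [rng.standard_normal((d, d)) for _ in range(3)]
vs = [rng.standard_normal((d, d)) for _ in range(3)]
y = tdaf_sketch(x, ws, vs)
print(y.shape)  # (4, 8)
```

The sketch keeps every level at the same width so the gates apply elementwise; the actual framework nests these paths inside standard backbones such as ResNet, FCOS, and 3D-ResNet.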

Authors (6)
  1. Bo Pang (77 papers)
  2. Yizhuo Li (21 papers)
  3. Jiefeng Li (22 papers)
  4. Muchen Li (9 papers)
  5. Hanwen Cao (13 papers)
  6. Cewu Lu (203 papers)
Citations (7)
