
Exploring Stronger Feature for Temporal Action Localization (2106.13014v1)

Published 24 Jun 2021 in cs.CV

Abstract: Temporal action localization aims to localize the start and end times of each action instance and to predict its category. Limited by GPU memory, mainstream methods pre-extract features for each video, so feature quality determines the upper bound of detection performance. In this technical report, we explore classic convolution-based backbones and the recent surge of transformer-based backbones. We find that transformer-based methods achieve better classification performance than convolution-based ones, but they cannot generate accurate action proposals. In addition, extracting features at a larger frame resolution, which reduces the loss of spatial information, also effectively improves temporal action localization performance. We achieve 42.42% mAP on the validation set with a single SlowFast feature and a simple combination, BMN+TCANet, which is 1.87% higher than the result of 2020's multi-model ensemble. Finally, we rank 1st in the CVPR2021 HACS supervised Temporal Action Localization Challenge.
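To make the task definition concrete, the following is a minimal sketch of how a predicted segment is typically compared against a ground-truth segment using temporal intersection-over-union (tIoU), the overlap measure that mAP-style localization metrics build on. It is not taken from the report: the function names, the `(start, end, label)` tuple layout, and the 0.5 threshold are illustrative assumptions.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments, in seconds."""
    # Overlap is zero when the segments are disjoint.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, threshold=0.5):
    """Hypothetical matching rule: same label and tIoU at or above threshold.

    `pred` and `gt` are (start, end, label) tuples; the threshold value
    is an assumption chosen for illustration.
    """
    same_label = pred[2] == gt[2]
    return same_label and temporal_iou(pred[:2], gt[:2]) >= threshold


if __name__ == "__main__":
    prediction = (2.0, 10.0, "high_jump")
    ground_truth = (4.0, 12.0, "high_jump")
    print(temporal_iou(prediction[:2], ground_truth[:2]))
    print(is_match(prediction, ground_truth))
```

A prediction counts as correct only when both the category and the temporal extent agree, which is why a backbone can classify well yet still score poorly if its proposals have imprecise boundaries.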

Authors (9)
  1. Zhiwu Qing (29 papers)
  2. Xiang Wang (279 papers)
  3. Ziyuan Huang (43 papers)
  4. Yutong Feng (33 papers)
  5. Shiwei Zhang (180 papers)
  6. Mingqian Tang (23 papers)
  7. Changxin Gao (77 papers)
  8. Nong Sang (87 papers)
  9. Jianwen Jiang (25 papers)
Citations (4)
