Complementary Boundary Generator with Scale-Invariant Relation Modeling for Temporal Action Localization: Submission to ActivityNet Challenge 2020 (2007.09883v2)

Published 20 Jul 2020 in cs.CV

Abstract: This technical report presents an overview of our solution used in the submission to ActivityNet Challenge 2020 Task 1 (temporal action localization/detection). Temporal action localization requires not only precisely locating the temporal boundaries of action instances, but also accurately classifying the untrimmed videos into specific categories. In this paper, we decouple the temporal action localization task into two stages (i.e., proposal generation and classification) and enrich the proposal diversity by exhaustively exploring the influence of multiple components from different but complementary perspectives. Specifically, in order to generate high-quality proposals, we consider several factors, including the video feature encoder, the proposal generator, the proposal-proposal relations, the scale imbalance, and the ensemble strategy. Finally, in order to obtain accurate detections, we further train an optimal video classifier to recognize the generated proposals. Our proposed scheme achieves state-of-the-art performance on the temporal action localization task, with 42.26 average mAP on the challenge testing set.
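
The two-stage decoupling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names `temporal_iou` and `label_proposals`, the `(start, end, confidence)` proposal layout, and the choice to multiply proposal confidence by the video-level class probability are all assumptions made for illustration. Temporal IoU is included because localization quality is commonly measured by the overlap between predicted and ground-truth segments.

```python
from typing import Dict, List, Tuple

Segment = Tuple[float, float]            # (start_sec, end_sec)
Proposal = Tuple[float, float, float]    # (start_sec, end_sec, confidence), assumed layout
Detection = Tuple[float, float, str, float]


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two 1-D time segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def label_proposals(
    proposals: List[Proposal],
    class_scores: Dict[str, float],
    top_k: int = 1,
) -> List[Detection]:
    """Stage 2 (hypothetical): attach video-level labels to class-agnostic proposals.

    Assumption: the detection score is proposal confidence times class probability.
    """
    # Keep the top_k most probable video-level classes.
    top = sorted(class_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    detections = [
        (start, end, label, conf * prob)
        for (start, end, conf) in proposals
        for (label, prob) in top
    ]
    # Rank detections by fused score, highest first.
    return sorted(detections, key=lambda d: d[3], reverse=True)


# Stage 1 output would come from a proposal generator; these values are made up.
example_proposals = [(0.0, 10.0, 0.9), (5.0, 15.0, 0.5)]
example_scores = {"run": 0.8, "jump": 0.2}
example_detections = label_proposals(example_proposals, example_scores)
```

In this sketch the proposal generator and the classifier stay independent, so either one can be swapped or ensembled without touching the other, which is the practical appeal of a decoupled design.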

Authors (9)
  1. Haisheng Su
  2. Jinyuan Feng
  3. Hao Shao
  4. Zhenyu Jiang
  5. Manyuan Zhang
  6. Wei Wu
  7. Yu Liu
  8. Hongsheng Li
  9. Junjie Yan
