Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
91 tokens/sec
GPT-4o
12 tokens/sec
Gemini 2.5 Pro Pro
o3 Pro
5 tokens/sec
GPT-4.1 Pro
15 tokens/sec
DeepSeek R1 via Azure Pro
33 tokens/sec
Gemini 2.5 Flash Deprecated
12 tokens/sec
2000 character limit reached

Technical Report for CVPR 2022 LOVEU AQTC Challenge (2206.14555v1)

Published 29 Jun 2022 in cs.CV and cs.AI

Abstract: This technical report presents the 2nd winning model for AQTC, a task newly introduced in CVPR 2022 LOng-form VidEo Understanding (LOVEU) challenges. This challenge faces difficulties with multi-step answers, multi-modal, and diverse and changing button representations in video. We address this problem by proposing a new context ground module attention mechanism for more effective feature mapping. In addition, we also perform the analysis over the number of buttons and ablation study of different step networks and video features. As a result, we achieved the overall 2nd place in LOVEU competition track 3, specifically the 1st place in two out of four evaluation metrics. Our code is available at https://github.com/jaykim9870/ CVPR-22_LOVEU_unipyler.

Summary

We haven't generated a summary for this paper yet.