
Technical Report for Ego4D Long Term Action Anticipation Challenge 2023 (2307.01467v1)

Published 4 Jul 2023 in cs.CV

Abstract: In this report, we describe the technical details of our approach for the Ego4D Long-Term Action Anticipation Challenge 2023. The task is to predict, from an input video, the sequence of future actions that take place from an arbitrary point in time onward. To accomplish this, we introduce three improvements to the baseline model, which consists of an encoder that generates clip-level features from the video, an aggregator that integrates multiple clip-level features, and a decoder that outputs Z future actions: (1) a model ensemble of SlowFast and SlowFast-CLIP; (2) label smoothing to relax order constraints for future actions; and (3) constraining the predicted action class (verb, noun) based on word co-occurrence. Our method outperformed the baseline and ranked second on the public leaderboard.
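
As a rough illustration of improvements (1) and (3), the sketch below averages class probabilities from two models and then picks the highest-scoring (verb, noun) pair among pairs assumed to co-occur. The abstract does not give the exact formulation, so the equal ensemble weights, the product scoring rule, the function names `average_ensemble` and `best_cooccurring_pair`, and the toy numbers are all assumptions made for this example, not the authors' implementation.

```python
from itertools import product


def average_ensemble(prob_lists):
    """Average class-probability vectors from several models.

    Equal weights are an assumption of this sketch.
    """
    n_models = len(prob_lists)
    return [sum(ps) / n_models for ps in zip(*prob_lists)]


def best_cooccurring_pair(verb_probs, noun_probs, allowed_pairs):
    """Return the (verb, noun) index pair with the highest joint score,
    considering only pairs listed in allowed_pairs.

    The joint score is the product of the two probabilities (an assumption).
    Returns None if no allowed pair exists.
    """
    best, best_score = None, -1.0
    for v, n in product(range(len(verb_probs)), range(len(noun_probs))):
        if (v, n) not in allowed_pairs:
            continue  # skip pairs never observed together
        score = verb_probs[v] * noun_probs[n]
        if score > best_score:
            best, best_score = (v, n), score
    return best


# Toy example: two hypothetical models, two verb classes, two noun classes.
verb_probs = average_ensemble([[0.7, 0.3], [0.5, 0.5]])  # roughly [0.6, 0.4]
noun_probs = [0.2, 0.8]
allowed = {(0, 0), (1, 1)}  # hypothetical co-occurrence set
pair = best_cooccurring_pair(verb_probs, noun_probs, allowed)
```

Without the constraint, the top verb (index 0) and top noun (index 1) would be chosen independently, even though that pair is not in the hypothetical co-occurrence set; with the constraint, the prediction falls back to the best pair that is.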

Authors (4)
  1. Tatsuya Ishibashi (3 papers)
  2. Kosuke Ono (2 papers)
  3. Noriyuki Kugo (4 papers)
  4. Yuji Sato (4 papers)
Citations (3)
