Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments (2207.10785v1)

Published 21 Jul 2022 in cs.CV

Abstract: We present a novel method for few-shot video classification, which performs appearance and temporal alignments. In particular, given a pair of query and support videos, we conduct appearance alignment via frame-level feature matching to achieve the appearance similarity score between the videos, while utilizing temporal order-preserving priors for obtaining the temporal similarity score between the videos. Moreover, we introduce a few-shot video classification framework that leverages the above appearance and temporal similarity scores across multiple steps, namely prototype-based training and testing as well as inductive and transductive prototype refinement. To the best of our knowledge, our work is the first to explore transductive few-shot video classification. Extensive experiments on both Kinetics and Something-Something V2 datasets show that both appearance and temporal alignments are crucial for datasets with temporal order sensitivity such as Something-Something V2. Our approach achieves similar or better results than previous methods on both datasets. Our code is available at https://github.com/VinAIResearch/fsvc-ata.
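The abstract does not give the exact formulas, so the following is only a minimal sketch of how the two scores might be combined for prototype-based classification. It assumes cosine similarity between frame features for the appearance score and a Gaussian prior around the temporal diagonal as one possible order-preserving prior. The function names, the `sigma` and `alpha` parameters, and the additive combination are all illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each frame feature to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def frame_similarity(query, support):
    """Cosine similarity between every query frame and every support frame.

    query: (Tq, D) array, support: (Ts, D) array. Returns a (Tq, Ts) matrix.
    """
    return l2_normalize(query) @ l2_normalize(support).T


def appearance_similarity(query, support):
    """Assumed appearance score: match each frame to its best counterpart
    in the other video, ignoring frame order, and average both directions."""
    sim = frame_similarity(query, support)
    return 0.5 * (sim.max(axis=1).mean() + sim.max(axis=0).mean())


def order_prior(tq, ts, sigma=0.25):
    """Assumed order-preserving prior: a Gaussian weight that is largest
    where the relative positions of the two frames agree."""
    i = (np.arange(tq) + 0.5) / tq
    j = (np.arange(ts) + 0.5) / ts
    prior = np.exp(-((i[:, None] - j[None, :]) ** 2) / (2 * sigma ** 2))
    return prior / prior.sum()


def temporal_similarity(query, support, sigma=0.25):
    """Assumed temporal score: frame similarities weighted by the prior,
    so matches that respect temporal order count more."""
    sim = frame_similarity(query, support)
    return float((order_prior(*sim.shape, sigma=sigma) * sim).sum())


def classify(query, prototypes, alpha=1.0):
    """Return the class whose prototype has the highest combined score.

    prototypes: dict mapping class name to a (Ts, D) prototype video.
    alpha: hypothetical weight on the temporal score.
    """
    scores = {
        name: appearance_similarity(query, proto)
        + alpha * temporal_similarity(query, proto)
        for name, proto in prototypes.items()
    }
    return max(scores, key=scores.get)


# Toy example: two classes that contain the same frames in opposite order.
forward = np.eye(4)
backward = forward[::-1]
prototypes = {"forward": forward, "backward": backward}
predicted = classify(forward, prototypes)
```

In this toy setup the two prototypes are indistinguishable by appearance alone, since they contain the same frames, and only the temporal term separates them. That mirrors the abstract's point that temporal alignment matters for order-sensitive datasets such as Something-Something V2.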

Authors (5)
  1. Khoi D. Nguyen (3 papers)
  2. Quoc-Huy Tran (18 papers)
  3. Khoi Nguyen (35 papers)
  4. Binh-Son Hua (47 papers)
  5. Rang Nguyen (13 papers)
Citations (23)
