Channel-Temporal Attention for First-Person Video Domain Adaptation (2108.07846v2)

Published 17 Aug 2021 in cs.CV and cs.AI

Abstract: Unsupervised Domain Adaptation (UDA) can transfer knowledge from labeled source data to unlabeled target data of the same categories. However, UDA for first-person action recognition is an under-explored problem, with a lack of datasets and limited consideration of first-person video characteristics. This paper focuses on addressing this problem. Firstly, we propose two small-scale first-person video domain adaptation datasets: ADL$_{small}$ and GTEA-KITCHEN. Secondly, we introduce channel-temporal attention blocks to capture the channel-wise and temporal-wise relationships and model their inter-dependencies important to first-person vision. Finally, we propose a Channel-Temporal Attention Network (CTAN) to integrate these blocks into existing architectures. CTAN outperforms baselines on the two proposed datasets and one existing dataset, EPIC$_{cvpr20}$.
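
The abstract gives no implementation details, so the sketch below is only a rough, hypothetical illustration of what a channel-temporal attention block could look like in PyTorch: channel re-weighting via global pooling and a small bottleneck MLP, followed by per-frame temporal re-weighting. The class name, reduction ratio, and pooling choices are assumptions for illustration and are not taken from the paper.

# Hypothetical sketch of a channel-temporal attention block (not the paper's
# exact formulation): channel attention from globally pooled descriptors,
# followed by temporal attention over the frame axis.
import torch
import torch.nn as nn


class ChannelTemporalAttention(nn.Module):
    """Re-weights video features along the channel and temporal dimensions.

    Expects input of shape (batch, channels, time, height, width).
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze spatial/temporal dims, excite channels.
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )
        # Temporal attention: score each frame from its pooled descriptor.
        self.temporal_fc = nn.Sequential(
            nn.Linear(channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, t, h, w = x.shape
        # Channel weights: (b, c), broadcast over (t, h, w).
        channel_desc = x.mean(dim=(2, 3, 4))                  # (b, c)
        channel_w = self.channel_fc(channel_desc)             # (b, c)
        x = x * channel_w.view(b, c, 1, 1, 1)
        # Temporal weights: (b, t), broadcast over (c, h, w).
        temporal_desc = x.mean(dim=(3, 4)).permute(0, 2, 1)   # (b, t, c)
        temporal_w = self.temporal_fc(temporal_desc)          # (b, t, 1)
        x = x * temporal_w.permute(0, 2, 1).view(b, 1, t, 1, 1)
        return x


if __name__ == "__main__":
    block = ChannelTemporalAttention(channels=64)
    clip = torch.randn(2, 64, 8, 14, 14)    # (batch, channels, frames, H, W)
    print(block(clip).shape)                # torch.Size([2, 64, 8, 14, 14])

Such a block is drop-in: it preserves the input shape, so it can be inserted between stages of an existing video backbone, which is in the spirit of CTAN's stated goal of integrating attention blocks into existing architectures.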

Authors (4)
  1. Xianyuan Liu (12 papers)
  2. Shuo Zhou (28 papers)
  3. Tao Lei (51 papers)
  4. Haiping Lu (38 papers)
