
Channel Recurrent Attention Networks for Video Pedestrian Retrieval (2010.03108v1)

Published 7 Oct 2020 in cs.CV and cs.LG

Abstract: Full attention, which generates an attention value per element of the input feature maps, has been shown to benefit visual tasks. In this work, we propose a fully attentional network, termed the channel recurrent attention network, for the task of video pedestrian retrieval. The main attention unit, channel recurrent attention, identifies attention maps at the frame level by jointly leveraging spatial and channel patterns via a recurrent neural network. This unit is designed to build a global receptive field by recurrently receiving and learning the spatial vectors. A set aggregation cell is then employed to generate a compact video representation. Experimental results demonstrate the superior performance of the proposed network, which outperforms current state-of-the-art results across standard video person retrieval benchmarks, and a thorough ablation study shows the effectiveness of the proposed units.
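
The data flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the recurrent cell, layer sizes, or the form of the set aggregation cell, so the plain tanh recurrence, the weight names (`w_in`, `w_rec`, `w_out`), the dimensions, and the mean-pooling stand-in for set aggregation are all assumptions made purely for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_recurrent_attention(feat, w_in, w_rec, w_out):
    """Attend over one frame's feature map of shape (C, H, W).

    Assumption: each channel is flattened into a spatial vector and the
    vectors are fed one by one to a simple tanh RNN, which emits one
    attention value per element (full attention).
    """
    c, h, w = feat.shape
    seq = feat.reshape(c, h * w)            # one spatial vector per channel
    hidden = np.zeros(w_rec.shape[0])       # recurrent state carried across channels
    attn = np.empty_like(seq)
    for i in range(c):
        hidden = np.tanh(w_in @ seq[i] + w_rec @ hidden)
        attn[i] = sigmoid(w_out @ hidden)   # values in (0, 1), one per spatial position
    return (seq * attn).reshape(c, h, w)

def aggregate_frames(frame_feats):
    """Hypothetical stand-in for the set aggregation cell:
    global average pooling per frame, then a mean over frames."""
    return np.mean([f.mean(axis=(1, 2)) for f in frame_feats], axis=0)

# Toy sizes and random weights, chosen only to make the sketch runnable.
rng = np.random.default_rng(0)
C, H, W, D, T = 8, 4, 2, 16, 3
w_in = rng.normal(scale=0.1, size=(D, H * W))
w_rec = rng.normal(scale=0.1, size=(D, D))
w_out = rng.normal(scale=0.1, size=(H * W, D))

frames = [rng.normal(size=(C, H, W)) for _ in range(T)]
attended = [channel_recurrent_attention(f, w_in, w_rec, w_out) for f in frames]
video_vec = aggregate_frames(attended)      # compact video representation, shape (C,)
```

Because the hidden state is carried from one channel's spatial vector to the next, each attention value can depend on all spatial positions of the channels seen so far, which is one way to read the "global receptive field" claim.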

Authors (5)
  1. Pengfei Fang (29 papers)
  2. Pan Ji (53 papers)
  3. Jieming Zhou (7 papers)
  4. Lars Petersson (88 papers)
  5. Mehrtash Harandi (108 papers)
Citations (1)
