Papers
Topics
Authors
Recent
Search
2000 character limit reached

YoTube: Searching Action Proposal via Recurrent and Static Regression Networks

Published 26 Jun 2017 in cs.CV | (1706.08218v1)

Abstract: In this paper, we present YoTube-a novel network fusion framework for searching action proposals in untrimmed videos, where each action proposal corresponds to a spatialtemporal video tube that potentially locates one human action. Our method consists of a recurrent YoTube detector and a static YoTube detector, where the recurrent YoTube explores the regression capability of RNN for candidate bounding boxes predictions using learnt temporal dynamics and the static YoTube produces the bounding boxes using rich appearance cues in a single frame. Both networks are trained using rgb and optical flow in order to fully exploit the rich appearance, motion and temporal context, and their outputs are fused to produce accurate and robust proposal boxes. Action proposals are finally constructed by linking these boxes using dynamic programming with a novel trimming method to handle the untrimmed video effectively and efficiently. Extensive experiments on the challenging UCF-101 and UCF-Sports datasets show that our proposed technique obtains superior performance compared with the state-of-the-art.

Citations (45)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.