Papers
Topics
Authors
Recent
2000 character limit reached

PV-NAS: Practical Neural Architecture Search for Video Recognition

Published 2 Nov 2020 in cs.CV | (2011.00826v2)

Abstract: Recently, deep learning has been utilized to solve video recognition problem due to its prominent representation ability. Deep neural networks for video tasks is highly customized and the design of such networks requires domain experts and costly trial and error tests. Recent advance in network architecture search has boosted the image recognition performance in a large margin. However, automatic designing of video recognition network is less explored. In this study, we propose a practical solution, namely Practical Video Neural Architecture Search (PV-NAS).Our PV-NAS can efficiently search across tremendous large scale of architectures in a novel spatial-temporal network search space using the gradient based search methods. To avoid sticking into sub-optimal solutions, we propose a novel learning rate scheduler to encourage sufficient network diversity of the searched models. Extensive empirical evaluations show that the proposed PV-NAS achieves state-of-the-art performance with much fewer computational resources. 1) Within light-weight models, our PV-NAS-L achieves 78.7% and 62.5% Top-1 accuracy on Kinetics-400 and Something-Something V2, which are better than previous state-of-the-art methods (i.e., TSM) with a large margin (4.6% and 3.4% on each dataset, respectively), and 2) among median-weight models, our PV-NAS-M achieves the best performance (also a new record)in the Something-Something V2 dataset.

Citations (7)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.