Long-short Term Motion Feature for Action Classification and Retrieval
Abstract: We propose a method for representing motion information for video classification and retrieval. We improve upon local-descriptor-based methods, which have been among the most popular and successful models for representing videos. Useful local descriptors must satisfy two requirements: 1) they must be representative, and 2) they must be discriminative. That is, they need to occur frequently enough in the videos and to be able to distinguish different types of motion. To generate such local descriptors, the video blocks they are computed from must contain just the right amount of motion information. However, current state-of-the-art local descriptor methods use video blocks of a single fixed size, which cannot cover actions with varying speeds. In this paper, we introduce a long-short term motion feature that generates descriptors from video blocks of multiple lengths, thus covering motions with large speed variance. Experimental results show that, albeit simple, our model achieves state-of-the-art results on several benchmark datasets.
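To make the core idea concrete, below is a minimal sketch of descriptor extraction over video blocks of multiple temporal lengths, so that short blocks capture fast motions and long blocks capture slow ones. The block lengths, stride, and the `describe_block` function are illustrative assumptions standing in for whatever local descriptor the pipeline actually uses; this is not the paper's implementation.

```python
import numpy as np

def extract_multilength_descriptors(video, block_lengths=(5, 10, 20), stride=5):
    """Sample video blocks at several temporal lengths and describe each.

    video: array of shape (T, H, W, C).
    block_lengths and stride are hypothetical choices for illustration.
    """
    descriptors = []
    num_frames = video.shape[0]
    for length in block_lengths:  # short lengths -> fast motion, long -> slow
        for start in range(0, num_frames - length + 1, stride):
            block = video[start:start + length]
            descriptors.append(describe_block(block))
    return np.stack(descriptors)

def describe_block(block):
    # Placeholder descriptor: mean absolute frame difference per channel.
    # A real system would compute e.g. HOF or dense-trajectory features here.
    diffs = np.abs(np.diff(block.astype(np.float32), axis=0))
    return diffs.mean(axis=(0, 1, 2))
```

The resulting pool of descriptors, drawn from multiple temporal scales, could then be encoded (e.g., with a bag-of-words or Fisher vector model) for classification or retrieval, as is common in local-descriptor pipelines.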