Multimodal Approach for Video Surveillance Indexing and Retrieval

Published 6 Aug 2013 in cs.MM and cs.CV | (1308.1150v1)

Abstract: In this paper, we present an overview of a multimodal system for indexing and searching video sequences by content, developed within the REGIMVid project. A large part of the system was developed as part of the TRECVideo evaluation. The MAVSIR platform provides high-level feature extraction from audio-visual content and concept/event-based video retrieval. We illustrate the architecture of the system and provide an overview of the descriptors supported to date. We then demonstrate the usefulness of the toolbox for feature extraction, concept/event learning, and retrieval in large collections of video surveillance data. The results are encouraging: we obtain good results on several event categories, and for all events we have gained valuable insights and experience.