SLVideo: A Sign Language Video Moment Retrieval Framework (2407.15668v2)

Published 22 Jul 2024 in cs.CV and cs.AI

Abstract: SLVideo is a video moment retrieval system for sign language videos that incorporates facial expressions, addressing a gap in existing technology. The system extracts embedding representations for the hand and face signs from video frames to capture the signs in their entirety, enabling users to search for a specific sign language video segment with text queries. The dataset is a collection of eight hours of annotated Portuguese Sign Language videos, and a CLIP model generates the embeddings. The initial results are promising in a zero-shot setting. In addition, SLVideo incorporates a thesaurus that lets users search for signs similar to those retrieved, using the video segment embeddings, and it supports editing and creating sign language video annotations. Project web page: https://novasearch.github.io/SLVideo/
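
The retrieval idea in the abstract (rank video segments against a text query in a shared embedding space, then reuse the segment embeddings to find similar signs) can be sketched roughly as follows. This is an illustration only, not SLVideo's actual code: the three-dimensional vectors are toy stand-ins for real CLIP embeddings, and the segment IDs and the function names `cosine`, `rank_segments`, and `similar_signs` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_segments(query_vec, segment_vecs, top_k=3):
    """Return the top_k (segment_id, score) pairs, best match first."""
    scored = [(seg_id, cosine(query_vec, vec)) for seg_id, vec in segment_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

def similar_signs(seg_id, segment_vecs, top_k=2):
    """Thesaurus-style lookup: segments closest to an already retrieved one."""
    others = {k: v for k, v in segment_vecs.items() if k != seg_id}
    return rank_segments(segment_vecs[seg_id], others, top_k)

# Toy vectors (assumption): a real system would store CLIP embeddings
# computed from the hand and face regions of each annotated segment.
segments = {
    "seg_hello": [0.9, 0.1, 0.0],
    "seg_thanks": [0.1, 0.9, 0.1],
    "seg_goodbye": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # stand-in for the embedding of a text query

top_matches = rank_segments(query, segments, top_k=2)
neighbours = similar_signs("seg_hello", segments, top_k=1)
```

In a zero-shot setting such as the one the abstract mentions, the query vector would come from the text encoder of a pretrained CLIP model, with no fine-tuning on the sign language data.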

Authors

Gonçalo Vinagre Martins
Afonso Quinaz
Carla Viegas
Sofia Cavaco
João Magalhães
