
SSCAP: Self-supervised Co-occurrence Action Parsing for Unsupervised Temporal Action Segmentation (2105.14158v3)

Published 29 May 2021 in cs.CV

Abstract: Temporal action segmentation is the task of classifying each frame of a video with an action label. However, annotating every frame in a large corpus of videos to construct a comprehensive supervised training dataset is expensive. In this work we therefore propose an unsupervised method, SSCAP, which operates on a corpus of unlabeled videos and predicts a likely set of temporal segments across the videos. SSCAP leverages self-supervised learning to extract distinguishable features and then applies a novel Co-occurrence Action Parsing algorithm that not only captures the correlation among sub-actions underlying the structure of activities, but also estimates the temporal path of the sub-actions in an accurate and general way. We evaluate on both classic datasets (Breakfast, 50Salads) and the emerging fine-grained action dataset FineGym, which features more complex activity structures and similar sub-actions. Results show that SSCAP achieves state-of-the-art performance on all datasets and can even outperform some weakly-supervised approaches, demonstrating its effectiveness and generalizability.
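As a minimal illustration of the task definition in the abstract (not the paper's method): a temporal action segmenter assigns one action label per frame, and contiguous runs of the same label form the predicted temporal segments. The labels and helper name below are purely hypothetical.

```python
from itertools import groupby

def frames_to_segments(frame_labels):
    """Collapse per-frame action labels into (label, start, end) segments,
    with end exclusive. This is just the generic task formulation, not SSCAP."""
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        segments.append((label, start, start + length))
        start += length
    return segments

# Example: a short clip whose frames were labeled with three sub-actions
labels = ["pour", "pour", "stir", "stir", "stir", "serve"]
print(frames_to_segments(labels))
# → [('pour', 0, 2), ('stir', 2, 5), ('serve', 5, 6)]
```

An unsupervised method such as SSCAP must produce these per-frame assignments without any ground-truth labels, which is why the feature quality and the parsing of sub-action order matter.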

Authors (7)
  1. Zhe Wang (574 papers)
  2. Hao Chen (1006 papers)
  3. Xinyu Li (136 papers)
  4. Chunhui Liu (23 papers)
  5. Yuanjun Xiong (52 papers)
  6. Joseph Tighe (30 papers)
  7. Charless Fowlkes (35 papers)
Citations (17)