
Semantic Decomposition and Recognition of Long and Complex Manipulation Action Sequences (1610.05693v1)

Published 18 Oct 2016 in cs.CV

Abstract: Understanding continuous human actions is a non-trivial but important problem in computer vision. Although there exists a large corpus of work in the recognition of action sequences, most approaches suffer from problems relating to vast variations in motions, action combinations, and scene contexts. In this paper, we introduce a novel method for semantic segmentation and recognition of long and complex manipulation action tasks, such as "preparing a breakfast" or "making a sandwich". We represent manipulations with our recently introduced "Semantic Event Chain" (SEC) concept, which captures the underlying spatiotemporal structure of an action invariant to motion, velocity, and scene context. Solely based on the spatiotemporal interactions between manipulated objects and hands in the extracted SEC, the framework automatically parses individual manipulation streams performed either sequentially or concurrently. Using event chains, our method further extracts basic primitive elements of each parsed manipulation. Without requiring any prior object knowledge, the proposed framework can also extract object-like scene entities that exhibit the same role in semantically similar manipulations. We conduct extensive experiments on various recent datasets to validate the robustness of the framework.
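As a rough illustration of the event-chain idea mentioned in the abstract, the sketch below compresses a per-frame record of pairwise relations into a short chain that keeps only the frames where some relation changes. This is not the authors' implementation: the relation labels ("T" for touching, "N" for not touching), the entity names, the function name `compress_to_event_chain`, and the toy frame data are all assumptions made for this example, and the abstract itself does not specify how relations are encoded.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # hypothetical (entity, entity) identifier


def compress_to_event_chain(frames: List[Dict[Pair, str]]) -> Dict[Pair, List[str]]:
    """Keep only the frames in which at least one pairwise relation changes.

    Each frame maps an entity pair to a relation label. The result maps each
    pair to its sequence of labels over the retained key frames.
    """
    if not frames:
        return {}
    pairs = sorted(frames[0])
    key_frames = [frames[0]]
    for frame in frames[1:]:
        # A new key frame starts whenever any relation differs from the last kept one.
        if any(frame[p] != key_frames[-1][p] for p in pairs):
            key_frames.append(frame)
    return {p: [f[p] for f in key_frames] for p in pairs}


# Toy data (assumed): a hand picks a cup up from a table and puts it back.
frames = [
    {("hand", "cup"): "N", ("cup", "table"): "T"},
    {("hand", "cup"): "N", ("cup", "table"): "T"},
    {("hand", "cup"): "T", ("cup", "table"): "T"},
    {("hand", "cup"): "T", ("cup", "table"): "N"},
    {("hand", "cup"): "T", ("cup", "table"): "N"},
    {("hand", "cup"): "T", ("cup", "table"): "T"},
    {("hand", "cup"): "N", ("cup", "table"): "T"},
]

chain = compress_to_event_chain(frames)
for pair, labels in chain.items():
    print(pair, labels)
```

Because only relation changes are kept, the chain does not depend on how long each phase lasts or on the trajectory followed, which is one way to read the abstract's claim of invariance to motion and velocity.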

Authors (3)
  1. Eren Erdal Aksoy (21 papers)
  2. Adil Orhan (1 paper)
  3. Florentin Woergoetter (2 papers)
Citations (27)
