
PaStaNet: Toward Human Activity Knowledge Engine (2004.00945v2)

Published 2 Apr 2020 in cs.CV, cs.AI, and cs.LG

Abstract: Existing image-based activity understanding methods mainly adopt direct mapping, i.e. from image to activity concepts, which may hit a performance bottleneck because of the large gap between the two. In light of this, we propose a new path: first infer human part states, then reason out the activities from part-level semantics. Human Body Part States (PaSta) are fine-grained action semantic tokens, e.g. <hand, hold, something>, which can compose activities and help us step toward a human activity knowledge engine. To fully utilize the power of PaSta, we build a large-scale knowledge base, PaStaNet, which contains 7M+ PaSta annotations. Two corresponding models are proposed: first, we design a model named Activity2Vec to extract PaSta features, which aim to be general representations for various activities. Second, we use a PaSta-based Reasoning method to infer activities. Promoted by PaStaNet, our method achieves significant improvements, e.g. 6.4 and 13.9 mAP on the full and one-shot sets of HICO in supervised learning, and 3.2 and 4.2 mAP on V-COCO and image-based AVA in transfer learning. Code and data are available at http://hake-mvig.cn/.
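
The two-stage idea in the abstract (part states first, activities second) can be illustrated with a toy sketch. This is not the paper's Activity2Vec model or its PaSta-based Reasoning method, which are learned; it swaps in a hand-written overlap score purely to show how part-level tokens could compose into activities. Only the triplet <hand, hold, something> comes from the abstract. The other triplets, the `PaSta` type, the `ACTIVITY_RULES` table, and `infer_activities` are hypothetical names invented for illustration.

```python
from typing import NamedTuple


class PaSta(NamedTuple):
    """A part-state token of the form <part, verb, object>."""
    part: str
    verb: str
    obj: str


# Hypothetical rule table: each activity is described by the part states
# assumed to compose it. Only <hand, hold, something> appears in the abstract;
# every other entry here is made up for illustration.
ACTIVITY_RULES = {
    "ride bicycle": {
        PaSta("hand", "hold", "something"),
        PaSta("hip", "sit_on", "something"),
        PaSta("foot", "step_on", "something"),
    },
    "drink with cup": {
        PaSta("hand", "hold", "something"),
        PaSta("head", "drink_with", "something"),
    },
}


def infer_activities(part_states, rules=ACTIVITY_RULES):
    """Rank activities by the fraction of their required part states observed.

    A stand-in for the learned reasoning stage: it only counts set overlap.
    """
    observed = set(part_states)
    scores = {
        activity: len(required & observed) / len(required)
        for activity, required in rules.items()
    }
    # Highest-scoring activity first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Stage 1 would normally come from a part-state recognizer; here it is given.
observed_states = [
    PaSta("hand", "hold", "something"),
    PaSta("hip", "sit_on", "something"),
    PaSta("foot", "step_on", "something"),
]
ranking = infer_activities(observed_states)
```

In this toy ranking, the shared token <hand, hold, something> contributes to both activities, and the remaining part states disambiguate them, which is the intuition behind reasoning from part-level semantics rather than mapping the image to an activity directly.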

Authors (10)
  1. Yong-Lu Li (47 papers)
  2. Liang Xu (117 papers)
  3. Xinpeng Liu (19 papers)
  4. Xijie Huang (26 papers)
  5. Yue Xu (79 papers)
  6. Shiyi Wang (27 papers)
  7. Hao-Shu Fang (38 papers)
  8. Ze Ma (8 papers)
  9. Mingyang Chen (45 papers)
  10. Cewu Lu (203 papers)
Citations (142)
