
ExpertAF: Expert Actionable Feedback from Video (2408.00672v2)

Published 1 Aug 2024 in cs.CV

Abstract: Feedback is essential for learning a new skill or improving one's current skill level. However, current methods for skill assessment from video only provide scores or compare demonstrations, leaving the burden of knowing what to do differently on the user. We introduce a novel method to generate actionable feedback from video of a person doing a physical activity, such as basketball or soccer. Our method takes a video demonstration and its accompanying 3D body pose and generates (1) free-form expert commentary describing what the person is doing well and what they could improve, and (2) a visual expert demonstration that incorporates the required corrections. We show how to leverage Ego-Exo4D's videos of skilled activity and expert commentary together with a strong LLM to create a weakly-supervised training dataset for this task, and we devise a multimodal video-LLM to infer coaching feedback. Our method is able to reason across multimodal input combinations to output full-spectrum, actionable coaching -- expert commentary, expert video retrieval, and expert pose generation -- outperforming strong vision-LLMs on both established metrics and human preference studies. Code and data will be publicly released.

Authors (5)
  1. Kumar Ashutosh (17 papers)
  2. Tushar Nagarajan (33 papers)
  3. Georgios Pavlakos (45 papers)
  4. Kris Kitani (96 papers)
  5. Kristen Grauman (136 papers)
