Papers
Topics
Authors
Recent
Search
2000 character limit reached

GHOST: Grounded Human Motion Generation with Open Vocabulary Scene-and-Text Contexts

Published 8 Apr 2024 in cs.CV | (2405.18438v1)

Abstract: The connection between our 3D surroundings and the descriptive language that characterizes them would be well-suited for localizing and generating human motion in context but for one problem. The complexity introduced by multiple modalities makes capturing this connection challenging with a fixed set of descriptors. Specifically, closed vocabulary scene encoders, which require learning text-scene associations from scratch, have been favored in the literature, often resulting in inaccurate motion grounding. In this paper, we propose a method that integrates an open vocabulary scene encoder into the architecture, establishing a robust connection between text and scene. Our two-step approach starts with pretraining the scene encoder through knowledge distillation from an existing open vocabulary semantic image segmentation model, ensuring a shared text-scene feature space. Subsequently, the scene encoder is fine-tuned for conditional motion generation, incorporating two novel regularization losses that regress the category and size of the goal object. Our methodology achieves up to a 30% reduction in the goal object distance metric compared to the prior state-of-the-art baseline model on the HUMANISE dataset. This improvement is demonstrated through evaluations conducted using three implementations of our framework and a perceptual study. Additionally, our method is designed to seamlessly accommodate future 2D segmentation methods that provide per-pixel text-aligned features for distillation.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 3 tweets with 5 likes about this paper.