
Towards More Realistic Human-Robot Conversation: A Seq2Seq-based Body Gesture Interaction System (1905.01641v3)

Published 5 May 2019 in cs.CV

Abstract: This paper presents a novel system that enables intelligent robots to exhibit realistic body gestures while communicating with humans. The proposed system consists of a listening model and a speaking model, each used in the corresponding conversational phase. Both models are adapted from the sequence-to-sequence (seq2seq) architecture to synthesize body gestures represented by the movements of twelve upper-body keypoints. All extracted 2D keypoints are first transformed to 3D, then rotated and normalized to discard irrelevant information. A substantial number of videos of human conversations are collected from YouTube and preprocessed to train the listening and speaking models separately, after which the two models are evaluated on the test dataset using mean squared error (MSE) and cosine similarity. The tuned system is implemented to drive a virtual avatar as well as Pepper, a physical humanoid robot, to demonstrate in practice how our method improves conversational interaction abilities.
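
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the metrics are aggregated, so the choices here are assumptions: a gesture sequence is a list of frames, each frame holds twelve keypoints with three coordinates, MSE is averaged over every coordinate, and cosine similarity is computed per frame on the flattened keypoint vector and then averaged. The function names `flatten`, `mse`, and `mean_cosine_similarity` are hypothetical and not taken from the paper.

```python
import math

def flatten(frame):
    # frame: list of keypoints, each a list of coordinates (x, y, z)
    return [c for keypoint in frame for c in keypoint]

def mse(pred, target):
    # Mean squared error over all frames, keypoints, and coordinates
    # (assumed aggregation; the abstract does not specify it).
    total, count = 0.0, 0
    for frame_p, frame_t in zip(pred, target):
        for p, t in zip(flatten(frame_p), flatten(frame_t)):
            total += (p - t) ** 2
            count += 1
    return total / count

def mean_cosine_similarity(pred, target):
    # Cosine similarity between flattened keypoint vectors, averaged over frames
    # (assumed aggregation). A frame with a zero-norm vector contributes 0.0.
    sims = []
    for frame_p, frame_t in zip(pred, target):
        a, b = flatten(frame_p), flatten(frame_t)
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        sims.append(dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0)
    return sum(sims) / len(sims)

# Toy data: 2 frames x 12 keypoints x 3 coordinates (illustrative values only).
target = [[[float(f + k + c + 1) for c in range(3)] for k in range(12)]
          for f in range(2)]
# A "prediction" that is off by 0.1 in every coordinate.
pred = [[[v + 0.1 for v in keypoint] for keypoint in frame] for frame in target]

print("MSE:", mse(pred, target))
print("Mean cosine similarity:", mean_cosine_similarity(pred, target))
```

A lower MSE indicates that predicted keypoints lie closer to the ground truth in position, while a cosine similarity near 1 indicates that the overall pose direction matches even when the scale differs.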

Authors (6)
  1. Minjie Hua (7 papers)
  2. Fuyuan Shi (3 papers)
  3. Yibing Nan (6 papers)
  4. Kai Wang (624 papers)
  5. Hao Chen (1006 papers)
  6. Shiguo Lian (54 papers)
Citations (10)