
HITTER: A HumanoId Table TEnnis Robot via Hierarchical Planning and Learning (2508.21043v1)

Published 28 Aug 2025 in cs.RO

Abstract: Humanoid robots have recently achieved impressive progress in locomotion and whole-body control, yet they remain constrained in tasks that demand rapid interaction with dynamic environments through manipulation. Table tennis exemplifies such a challenge: with ball speeds exceeding 5 m/s, players must perceive, predict, and act within sub-second reaction times, requiring both agility and precision. To address this, we present a hierarchical framework for humanoid table tennis that integrates a model-based planner for ball trajectory prediction and racket target planning with a reinforcement learning-based whole-body controller. The planner determines striking position, velocity and timing, while the controller generates coordinated arm and leg motions that mimic human strikes and maintain stability and agility across consecutive rallies. Moreover, to encourage natural movements, human motion references are incorporated during training. We validate our system on a general-purpose humanoid robot, achieving up to 106 consecutive shots with a human opponent and sustained exchanges against another humanoid. These results demonstrate real-world humanoid table tennis with sub-second reactive control, marking a step toward agile and interactive humanoid behaviors.

Summary

  • The paper presents a hierarchical system that combines a model-based planner and a reinforcement learning-based whole-body controller for human-like table tennis play.
  • The approach estimates the ball state by polynomial fitting and predicts its trajectory with a hybrid dynamics model, bringing the predicted strike position within the racket radius (7.5 cm) and the strike timing within one control step (20 ms) shortly before impact.
  • Real-world experiments show a 96.2% hit rate on 26 thrown balls and rallies of up to 106 consecutive shots against a human opponent.

HITTER: Hierarchical Planning and Learning for Humanoid Table Tennis

Introduction

The paper presents HITTER, a hierarchical system for enabling general-purpose humanoid robots to play table tennis with high agility and human-like motion. The approach integrates a model-based planner for ball trajectory prediction and strike planning with a reinforcement learning (RL)–based whole-body controller (WBC) trained on human motion references. The system is validated on the Unitree G1 humanoid, achieving up to 106 consecutive shots against a human opponent and demonstrating sustained rallies in fully autonomous humanoid–humanoid matches. The work addresses the challenge of rapid perception-action loops, coordinated whole-body control, and naturalistic striking in highly dynamic environments.

Figure 1: System overview, showing the hardware setup, motion capture, model-based planning, and RL-based whole-body control pipeline.

Hierarchical System Architecture

The system is modularized into two principal components:

  1. Model-Based Planner: Operates at high frequency (360 Hz) using motion capture data to estimate ball position and velocity via polynomial fitting. It predicts the ball’s future trajectory using a hybrid dynamics model, accounting for aerodynamic drag and bounce restitution. The planner computes the desired racket striking position, velocity, and timing, as well as the robot base target position, which are passed to the WBC.
  2. Learning-Based Whole-Body Controller (WBC): Trained in Isaac Lab using PPO, the WBC receives planner outputs and proprioceptive observations, generating joint position commands for all 29 degrees of freedom at 50 Hz. The policy is trained with dense and sparse rewards for imitation and goal tracking, using human forehand and backhand reference motions retargeted to the robot via SMPL and GMR pipelines.

This separation of planning and control improves sample efficiency, robustness to perception errors, and adaptability to real-world conditions.
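
The two-rate structure described above can be sketched as a simple scheduler. This is a minimal illustration, not the authors' implementation: `PlannerCommand`, `run`, and the callback names are hypothetical, and the only values taken from the text are the 360 Hz planner rate and the 50 Hz controller rate. Integer ticks at 1800 per second (the least common multiple of the two rates) avoid floating-point drift in the schedule.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class PlannerCommand:
    """Hypothetical container for what the planner hands to the WBC."""
    strike_pos: Vec3                  # desired racket position at impact (m)
    strike_vel: Vec3                  # desired racket velocity at impact (m/s)
    time_to_strike: float             # seconds until the planned impact
    base_target: Tuple[float, float]  # desired base position on the floor (m)


PLANNER_HZ = 360    # planner rate stated in the text
CONTROL_HZ = 50     # controller rate stated in the text
TICKS_PER_S = 1800  # least common multiple of 360 and 50


def run(duration_s: float,
        plan: Callable[[float], PlannerCommand],
        act: Callable[[PlannerCommand], None]) -> Tuple[int, int]:
    """Run both loops on a shared integer clock; return (planner, controller) call counts."""
    latest: Optional[PlannerCommand] = None
    n_plan = n_act = 0
    for tick in range(int(duration_s * TICKS_PER_S)):
        if tick % (TICKS_PER_S // PLANNER_HZ) == 0:
            latest = plan(tick / TICKS_PER_S)  # planner refreshes the command
            n_plan += 1
        if tick % (TICKS_PER_S // CONTROL_HZ) == 0 and latest is not None:
            act(latest)                        # controller consumes the newest command
            n_act += 1
    return n_plan, n_act
```

In this sketch the controller always acts on the most recent planner output, so a late or noisy planner update affects at most one 20 ms control step.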

Model-Based Planning and Ball Prediction

The planner uses a second-order polynomial fit for velocity estimation and a hybrid flight-bounce model for trajectory prediction. Parameters for drag and restitution are empirically identified from recorded trajectories. The predicted strike position error falls below the racket radius (7.5 cm) about 0.5 s before the strike, and the timing error falls below one control step (20 ms) about 0.3 s before it, providing reliable commands for the WBC.

Figure 2: Prediction errors of the model-based planner for striking position and time, demonstrating sub-racket-radius accuracy within 0.5 s of impact.

Racket-ball interaction is modeled with a simplified restitution-based approach, neglecting spin and tangential friction. The desired outgoing ball velocity is computed to target the center of the opponent’s side, and the required racket velocity is derived analytically.
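
One way to carry out this derivation is sketched below, under assumptions that go beyond the text: the outgoing velocity is computed from a drag-free ballistic arc with a chosen flight time, the racket is treated as much heavier than the ball, and the restitution coefficient `e_r` is a placeholder value. With no tangential friction, the impact can only change the ball velocity along the racket normal, so the normal must point along the difference between the outgoing and incoming velocities. The function names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def outgoing_velocity(hit_pos, target_pos, flight_time):
    """Drag-free ballistic velocity that carries the ball from hit_pos to target_pos."""
    T = flight_time
    vx = (target_pos[0] - hit_pos[0]) / T
    vy = (target_pos[1] - hit_pos[1]) / T
    vz = (target_pos[2] - hit_pos[2]) / T + 0.5 * G * T
    return (vx, vy, vz)


def racket_command(v_in, v_out, e_r=0.8):
    """Racket normal and racket velocity for a frictionless restitution impact.

    Solves (v_out - v_r).n = -e_r * (v_in - v_r).n for a racket moving along n.
    """
    d = [o - i for o, i in zip(v_out, v_in)]
    norm = math.sqrt(sum(c * c for c in d))
    if norm == 0.0:
        raise ValueError("incoming and outgoing velocities are identical")
    n = [c / norm for c in d]  # impulse direction = racket normal
    vin_n = sum(a * b for a, b in zip(v_in, n))
    vout_n = sum(a * b for a, b in zip(v_out, n))
    speed_n = (vout_n + e_r * vin_n) / (1.0 + e_r)  # racket speed along n
    return n, [speed_n * c for c in n]
```

For example, `outgoing_velocity((0, 0, 0.3), (2, 0, 0), 0.5)` gives the velocity that lands the ball 2 m away after 0.5 s, and passing that result as `v_out` to `racket_command` yields the racket orientation and velocity to request at impact.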

Whole-Body Controller and Human-Like Motion

The WBC is trained with an asymmetric actor-critic architecture, where the critic receives privileged information (body poses, time left, reference motions) to improve return estimation. The policy tracks separate commands for base and racket, enabling rapid lateral movement and coordinated arm swings. Episodes are structured to allow consecutive strikes and randomization of swing type and target positions.

Human motion references are processed and retargeted to the robot, with interpolation and kinematic augmentation for accurate tracking. The reward function combines imitation, goal tracking, and regularization, with sparse activation for critical strike moments.
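
The structure of such a reward can be sketched as follows. This is an illustrative assumption rather than the paper's reward: the exponential kernels, the weights, the width of the strike window, and the function name `strike_reward` are all hypothetical. The sketch only shows how a dense imitation term, a goal term that is active only near the strike time, and an action-smoothness penalty can be combined.

```python
import math


def strike_reward(joint_pos, ref_joint_pos, racket_pos, target_pos,
                  time_to_strike, action, prev_action,
                  w_imit=1.0, w_goal=5.0, w_reg=0.01,
                  sigma_imit=0.5, sigma_goal=0.1, strike_window=0.02):
    """Dense imitation + sparse goal tracking + regularization (hypothetical weights)."""
    # Dense term: stay close to the retargeted human reference pose.
    imit_err = sum((q - r) ** 2 for q, r in zip(joint_pos, ref_joint_pos))
    r_imit = math.exp(-imit_err / sigma_imit ** 2)

    # Sparse term: racket position only matters inside the strike window.
    r_goal = 0.0
    if abs(time_to_strike) <= strike_window:
        goal_err = sum((p - g) ** 2 for p, g in zip(racket_pos, target_pos))
        r_goal = math.exp(-goal_err / sigma_goal ** 2)

    # Regularization: penalize abrupt changes between consecutive actions.
    r_reg = -sum((a - b) ** 2 for a, b in zip(action, prev_action))

    return w_imit * r_imit + w_goal * r_goal + w_reg * r_reg
```

Activating the goal term only near impact leaves the policy free to follow the human reference during the rest of the swing while still being scored on where the racket is at the moment that matters.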

Figure 3: Agility evaluation of the WBC policy, showing sub-0.8 s reaching times for initial distances within 0.75 m and a 94.3% success rate in simulation.

Figure 4: Real-world rapid reaching motion, illustrating the robot’s ability to transition swiftly across the table and maintain balance during strikes.

Figure 5: Real-world human-like striking motion, with coordinated waist rotation and arm movement mimicking human table tennis play.

Empirical Results

The integrated system demonstrates high performance in real-world experiments:

  • Prediction Accuracy: Position error falls below the racket radius threshold (7.5 cm) 0.5 s before the strike; timing error drops below one control step (20 ms) 0.3 s before the strike.
  • Agility: In simulation, the WBC reaches target base positions within 0.8 s for most cases, faster than the typical strike duration (0.86 s).
  • Return Performance: Out of 26 thrown balls, the robot achieves a 96.2% hit rate and a 92.3% return rate, with forehand/backhand selection matching human strategies.

    Figure 6: Return performance evaluation, showing spatial distribution of successful returns and misses on the virtual hit plane.

  • Rally Length: The robot sustains rallies of up to 106 consecutive shots against a human, exceeding casual human play. Reaction times to human smashes are as low as 0.42 s.
  • Autonomous Matches: Two humanoids equipped with the same policy can autonomously sustain rallies, demonstrating robustness and generality.
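
The figures above can be cross-checked with a few lines of arithmetic. The counts of 25 hits and 24 returns are inferred here from the reported percentages and the 26 trials; they are not stated in the summary. The 20 ms control step follows directly from the 50 Hz controller rate.

```python
def rate_percent(successes: int, trials: int) -> float:
    """Success rate as a percentage, rounded to one decimal place."""
    return round(100.0 * successes / trials, 1)


TRIALS = 26                           # thrown balls, from the text
hit_rate = rate_percent(25, TRIALS)   # inferred count: 25 hits
return_rate = rate_percent(24, TRIALS)  # inferred count: 24 returns

CONTROL_HZ = 50                       # controller rate, from the text
control_step_ms = 1000.0 / CONTROL_HZ  # one control step in milliseconds

print(hit_rate, return_rate, control_step_ms)  # prints: 96.2 92.3 20.0
```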

System Design Implications and Limitations

The hierarchical combination of model-based planning and learning-based control leverages the strengths of both paradigms. The planner provides reliable, interpretable commands, while the WBC enables agile, human-like motion. This modularity allows independent evaluation and improvement of each component.

Limitations include reliance on a fixed virtual hitting plane, external motion capture for ball and robot pose estimation, and neglect of spin and advanced stroke repertoire. These constraints limit table coverage, deployment flexibility, and performance against skilled opponents.

Future Directions

Potential avenues for advancement include:

  • Vision-Based Sensing: Replacing motion capture with onboard vision for ball and robot pose estimation.
  • Spin Perception and Stroke Diversity: Extending the system to handle spin and generate a wider range of strokes.
  • Multi-Agent Training: Joint training of policies for competitive humanoid–humanoid matches.
  • Autonomous Serving: Enabling robots to initiate rallies without human intervention.
  • Opponent Modeling: Integrating strategic and tactical learning to adapt to skilled human opponents.

Conclusion

HITTER demonstrates that hierarchical planning and learning can enable general-purpose humanoid robots to play table tennis with high agility, precision, and human-like motion. The system achieves robust real-world performance, including long rallies and autonomous matches, marking a significant step toward interactive, agile humanoid behaviors. Future work will focus on expanding perceptual capabilities, stroke repertoire, and competitive adaptation to approach championship-level play.
