
TACT: Humanoid Whole-body Contact Manipulation through Deep Imitation Learning with Tactile Modality (2506.15146v1)

Published 18 Jun 2025 in cs.RO

Abstract: Manipulation with whole-body contact by humanoid robots offers distinct advantages, including enhanced stability and reduced load. On the other hand, we need to address challenges such as the increased computational cost of motion generation and the difficulty of measuring broad-area contact. We therefore have developed a humanoid control system that allows a humanoid robot equipped with tactile sensors on its upper body to learn a policy for whole-body manipulation through imitation learning based on human teleoperation data. This policy, named tactile-modality extended ACT (TACT), has a feature to take multiple sensor modalities as input, including joint position, vision, and tactile measurements. Furthermore, by integrating this policy with retargeting and locomotion control based on a biped model, we demonstrate that the life-size humanoid robot RHP7 Kaleido is capable of achieving whole-body contact manipulation while maintaining balance and walking. Through detailed experimental verification, we show that inputting both vision and tactile modalities into the policy contributes to improving the robustness of manipulation involving broad and delicate contact.

Summary

  • The paper introduces the TACT policy that integrates tactile feedback with vision and joint data to enhance whole-body contact manipulation.
  • It leverages deep imitation learning using human teleoperation data to enable high-precision handling of various objects.
  • Experimental trials on a life-size humanoid robot demonstrate robust, adaptable performance that outperforms baseline models in unseen scenarios.

Humanoid Whole-body Contact Manipulation through Imitation Learning

This paper presents a novel approach to enabling humanoid robots to perform complex manipulation tasks using whole-body contact, a capability traditionally limited to the extremities, such as the hands and feet. The proposed method combines an advanced sensory-motor system with deep imitation learning, notably by integrating the tactile modality into the humanoid control loop.

The paper introduces the TACT (tactile-modality extended ACT) policy, which extends the widely used ACT model. It directly addresses the challenges posed by high-dimensional data from distributed tactile sensors, fusing multiple sensor modalities, namely vision, joint position, and tactile measurements, as policy inputs. Humanoid manipulation is refined through learning-based control, allowing a robot equipped with e-skin tactile sensors on its upper body to achieve stable whole-body loco-manipulation. Training data acquired through human teleoperation drives the learning, showing how tactile sensing is integral to adaptable and precise contact interaction.
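To make the multimodal fusion concrete, below is a minimal sketch of a TACT-style policy in PyTorch. The architecture, dimensions, and module names are illustrative assumptions rather than the paper's actual implementation; the key idea is that vision, joint-position, and tactile inputs are each encoded into a shared token space, fused by a transformer, and decoded into a chunk of future joint targets in the manner of ACT.

```python
import torch
import torch.nn as nn

class TACTStylePolicy(nn.Module):
    """Illustrative sketch of an ACT-style policy extended with a
    tactile modality. All sizes and module choices are assumptions,
    not the paper's architecture."""

    def __init__(self, n_joints=32, tactile_dim=512, d_model=256, chunk_len=20):
        super().__init__()
        # Per-modality encoders project each input to a shared token space.
        self.vision_enc = nn.Sequential(        # stand-in for a CNN backbone
            nn.Conv2d(3, 32, 5, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, d_model))
        self.joint_enc = nn.Linear(n_joints, d_model)
        self.tactile_enc = nn.Linear(tactile_dim, d_model)
        # A transformer fuses the modality tokens, as in ACT.
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=4)
        # Decode a chunk of future joint targets (action chunking).
        self.action_head = nn.Linear(d_model, n_joints * chunk_len)
        self.chunk_len, self.n_joints = chunk_len, n_joints

    def forward(self, image, joints, tactile):
        tokens = torch.stack([
            self.vision_enc(image),    # (B, d_model)
            self.joint_enc(joints),
            self.tactile_enc(tactile),
        ], dim=1)                      # (B, 3, d_model)
        fused = self.fusion(tokens).mean(dim=1)
        actions = self.action_head(fused)
        return actions.view(-1, self.chunk_len, self.n_joints)

policy = TACTStylePolicy()
act = policy(torch.randn(1, 3, 96, 96),   # camera image
             torch.randn(1, 32),          # joint positions
             torch.randn(1, 512))         # flattened tactile array
print(act.shape)  # torch.Size([1, 20, 32])
```

Because each modality becomes one token before fusion, an ablation that drops vision or tactile input (as in the paper's comparisons) amounts to removing a token, which is one reason this style of architecture suits modality studies.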

Key Numerical Results

The experimental setup, built around the life-size humanoid robot RHP7 Kaleido, demonstrates high success rates in complex contact manipulation tasks. TACT was successfully deployed to manipulate objects such as paper boxes of different sizes without dropping or crushing them. Notably, the proposed system outperformed baseline models even on unseen configurations, underscoring that dual-modality input is critical for dexterous handling.

Contributions and Implications

The contributions of this paper are multifaceted: extending an imitation learning framework to accommodate tactile inputs, integrating retargeting and locomotion control with manipulation control, and empirically validating performance through physical trials. The paper shows that incorporating tactile feedback alongside visual input measurably enhances the robot's manipulation capabilities, particularly in scenarios requiring nuanced contact management. As a result, the system is more robust across a wider array of tasks, potentially changing how robots operate in dynamic environments that resemble human interaction.
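As a rough illustration of how these components fit together, the sketch below shows one control tick under assumed interfaces: the learned policy proposes upper-body actions, a retargeting step maps them onto robot joint commands, and a biped-model locomotion controller stabilizes the result. Every function and field name here is a hypothetical placeholder, not the paper's actual API.

```python
def control_step(obs, policy, retargeter, walking_controller):
    """One tick of a hedged control-loop sketch; interfaces are
    placeholders illustrating the integration described above."""
    # Learned manipulation policy: multimodal observations -> action chunk.
    action_chunk = policy(obs["image"], obs["joints"], obs["tactile"])
    upper_body_targets = action_chunk[:, 0]   # execute first step of the chunk

    # Retargeting maps policy outputs onto feasible robot joint commands.
    joint_command = retargeter.map(upper_body_targets)

    # A locomotion controller based on a biped model compensates for the
    # contact forces that manipulation induces, maintaining balance.
    return walking_controller.stabilize(joint_command, obs["force"])
```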

The theoretical implications center on how integrated multimodal sensory processing can reframe approaches to robotic control. Practically, the work points toward humanoid robots that function in real-world environments, manipulating objects of varied sizes and textures with increased adaptability. The findings also suggest directions for future exploration, such as combining reinforcement learning with the current imitation learning framework to improve responsiveness and control under varied conditions, and integrating additional sensory modalities for even richer interaction.

Overall, this paper advances humanoid robotics toward more human-like manipulation, leveraging deep learning and multimodal sensory input to broaden the range of practical tasks such robots can perform.
