Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
98 tokens/sec
GPT-4o
11 tokens/sec
Gemini 2.5 Pro Pro
52 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
15 tokens/sec
DeepSeek R1 via Azure Pro
33 tokens/sec
Gemini 2.5 Flash Deprecated
12 tokens/sec
2000 character limit reached

Latent Object Characteristics Recognition with Visual to Haptic-Audio Cross-modal Transfer Learning (2403.10689v1)

Published 15 Mar 2024 in cs.RO, cs.CV, and cs.LG

Abstract: Recognising the characteristics of objects while a robot handles them is crucial for adjusting motions that ensure stable and efficient interactions with containers. Ahead of realising stable and efficient robot motions for handling/transferring the containers, this work aims to recognise the latent unobservable object characteristics. While vision is commonly used for object recognition by robots, it is ineffective for detecting hidden objects. However, recognising objects indirectly using other sensors is a challenging task. To address this challenge, we propose a cross-modal transfer learning approach from vision to haptic-audio. We initially train the model with vision, directly observing the target object. Subsequently, we transfer the latent space learned from vision to a second module, trained only with haptic-audio and motor data. This transfer learning framework facilitates the representation of object characteristics using indirect sensor data, thereby improving recognition accuracy. For evaluating the recognition accuracy of our proposed learning framework we selected shape, position, and orientation as the object characteristics. Finally, we demonstrate online recognition of both trained and untrained objects using the humanoid robot Nextage Open.

Citations (3)

Summary

We haven't generated a summary for this paper yet.