Adapting Skills to Novel Grasps: A Self-Supervised Approach
Abstract: In this paper, we study the problem of adapting manipulation trajectories involving grasped objects (e.g. tools) that are defined for a single grasp pose to novel grasp poses. A common approach is to explicitly define a new trajectory for each possible grasp, but this is highly inefficient. Instead, we propose a method that adapts such trajectories directly, requiring only a period of self-supervised data collection during which a camera observes the robot's end-effector moving with the object rigidly grasped. Importantly, our method requires no prior knowledge of the grasped object (such as a 3D CAD model), it can work with RGB images, depth images, or both, and it requires no camera calibration. Through a series of real-world experiments involving 1360 evaluations, we find that self-supervised RGB data consistently outperforms alternatives that rely on depth images, including several state-of-the-art pose estimation methods. Compared to the best-performing baseline, our method achieves a 28.5% higher success rate on average when adapting manipulation trajectories to novel grasps across several everyday tasks. Videos of the experiments are available on our webpage at https://www.robot-learning.uk/adapting-skills
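The abstract does not spell out how a trajectory is remapped once the relation between the reference grasp and the novel grasp is known, but the underlying rigid-body relation is standard. Below is a minimal sketch under the assumption that the self-supervised estimator provides the object-in-gripper pose (or, equivalently, the relative transform between the two grasps) for both the demonstration and the novel grasp; the function name `adapt_trajectory`, its arguments, and the frame conventions are illustrative and not taken from the paper.

```python
import numpy as np

def adapt_trajectory(demo_ee_poses, T_ee_obj_demo, T_ee_obj_new):
    """Remap a demonstrated end-effector trajectory so that the grasped
    object follows the same world-frame motion under a novel grasp.

    demo_ee_poses : iterable of 4x4 end-effector poses in the world frame,
                    recorded with the reference grasp.
    T_ee_obj_demo : 4x4 pose of the object in the end-effector frame at
                    demonstration time (reference grasp).
    T_ee_obj_new  : 4x4 pose of the object in the end-effector frame for
                    the novel grasp (e.g. estimated from images).
    """
    # Constant correction describing how the object has moved within the gripper.
    delta = T_ee_obj_demo @ np.linalg.inv(T_ee_obj_new)
    # Right-multiplying each demonstrated pose leaves the object's world-frame
    # trajectory unchanged:
    #   T_w_obj = T_w_ee @ T_ee_obj_demo = (T_w_ee @ delta) @ T_ee_obj_new
    return [T_w_ee @ delta for T_w_ee in demo_ee_poses]
```

Because the correction is a single constant right-multiplication in the end-effector frame, the choice of object frame is arbitrary: only the relative transform between the reference and novel grasps matters, which is compatible with the paper's stated setting of having no object model and no camera calibration.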