SKT-Hang: Hanging Everyday Objects via Object-Agnostic Semantic Keypoint Trajectory Generation (2312.04936v1)

Published 8 Dec 2023 in cs.RO

Abstract: We study the problem of hanging a wide range of grasped objects on diverse supporting items. Hanging objects is a ubiquitous task encountered in many aspects of everyday life. However, both the objects and the supporting items can vary substantially in shape and structure, which raises two challenges: (1) determining the task-relevant geometric structures across different objects and supporting items, and (2) identifying a robust action sequence that accommodates the shape variations of supporting items. To this end, we propose the Semantic Keypoint Trajectory (SKT), an object-agnostic representation that is highly versatile and applicable to various everyday objects. We also propose the Shape-conditioned Trajectory Deformation Network (SCTDN), a model that learns to generate an SKT by deforming a template trajectory based on the task-relevant geometric features of the supporting item. We conduct extensive experiments and demonstrate substantial improvements over existing robot hanging methods in both success rate and inference time. Finally, our simulation-trained framework shows promising hanging results in the real world. For videos and supplementary materials, please visit our project webpage: https://hcis-lab.github.io/SKT-Hang/.
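The abstract describes the pipeline at a high level: an encoder extracts task-relevant geometric features from the supporting item's shape, and SCTDN uses those features to deform a template semantic keypoint trajectory into an object-specific one. Below is a minimal PyTorch sketch of that idea. Everything here is an illustrative assumption, not the paper's actual architecture: the class name `SCTDNSketch`, the per-point MLP with max pooling (standing in for a learned point-cloud backbone), the layer sizes, and the simplification of waypoints to 3-D positions.

```python
import torch
import torch.nn as nn


class SCTDNSketch(nn.Module):
    """Hypothetical sketch of a shape-conditioned trajectory deformer.

    All architectural details are illustrative assumptions; the paper's
    SCTDN is not specified in the abstract above.
    """

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        # Stand-in for a point-cloud backbone: a per-point MLP followed
        # by max pooling into one global shape feature per item.
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, feat_dim),
        )
        # Offset head: predicts a 3-D displacement for each template
        # waypoint, conditioned on the supporting item's shape feature.
        self.offset_head = nn.Sequential(
            nn.Linear(feat_dim + 3, 128), nn.ReLU(),
            nn.Linear(128, 3),
        )

    def forward(self, points: torch.Tensor, template: torch.Tensor) -> torch.Tensor:
        # points:   (B, N, 3) point cloud of the supporting item
        # template: (B, T, 3) template semantic keypoint trajectory
        shape_feat = self.point_mlp(points).max(dim=1).values        # (B, feat_dim)
        shape_feat = shape_feat.unsqueeze(1).expand(-1, template.shape[1], -1)
        offsets = self.offset_head(torch.cat([shape_feat, template], dim=-1))
        return template + offsets                                    # deformed SKT


# Toy usage: deform a straight-line template given a random item scan.
model = SCTDNSketch()
cloud = torch.randn(1, 1024, 3)                        # supporting-item scan
template = torch.linspace(0, 1, 16).view(1, 16, 1).repeat(1, 1, 3)
skt = model(cloud, template)                           # (1, 16, 3)
```

The design choice the abstract emphasizes is that the network predicts deformations of a shared template rather than regressing a trajectory from scratch, which is what makes the representation object-agnostic across supporting items with varied shapes.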

