
WeightedPose: Generalizable Cross-Pose Estimation via Weighted SVD (2405.02241v2)

Published 3 May 2024 in cs.RO

Abstract: We introduce a new approach for robotic manipulation tasks in human settings that necessitate understanding the 3D geometric connections between a pair of objects. Conventional end-to-end training approaches, which convert pixel observations directly into robot actions, often fail to capture complex pose relationships and do not easily adapt to new object configurations. To overcome these issues, our method focuses on learning the 3D geometric relationships, particularly how critical parts of one object relate to those of another. We employ Weighted SVD in our standalone model to analyze pose relationships both in articulated parts and in free-floating objects. For instance, our model can comprehend the spatial relationship between an oven door and the oven body, as well as between a lasagna plate and the oven. By concentrating on the 3D geometric connections, our strategy empowers robots to carry out intricate manipulation tasks based on object-centric perspectives.
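
The abstract names Weighted SVD but gives no details, so the following is only a minimal sketch of the standard weighted Procrustes (Kabsch) solution that the term usually refers to, not the authors' implementation. It assumes that corresponding 3D points on the two objects and a non-negative weight per correspondence are already available; how those are obtained is outside this sketch. The function name `weighted_svd_transform` and its arguments are hypothetical.

```python
import numpy as np


def weighted_svd_transform(src, dst, weights):
    """Estimate a rigid transform (R, t) minimizing
    sum_i w_i * ||R @ src[i] + t - dst[i]||^2.

    src, dst: (N, 3) arrays of corresponding points (assumed given).
    weights:  (N,) non-negative importance weights (assumed given).
    """
    w = weights / weights.sum()          # normalize weights to sum to 1

    # Weighted centroids of both point sets.
    src_centroid = w @ src
    dst_centroid = w @ dst

    # Center the points and build the weighted 3x3 cross-covariance.
    src_c = src - src_centroid
    dst_c = dst - dst_centroid
    H = (src_c * w[:, None]).T @ dst_c

    # SVD of the cross-covariance gives the optimal rotation.
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that det(R) = +1 (no reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T

    # Translation aligns the weighted centroids.
    t = dst_centroid - R @ src_centroid
    return R, t
```

Given noise-free correspondences generated by a known rotation and translation, calling `weighted_svd_transform(src, dst, weights)` should recover that transform up to floating-point error. Points with larger weights pull the estimate more strongly, which is one way a model could emphasize the critical parts of an object.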

