OrbitGrasp: $SE(3)$-Equivariant Grasp Learning (2407.03531v3)
Abstract: While grasp detection is an important part of any robotic manipulation pipeline, reliable and accurate grasp detection in $SE(3)$ remains a research challenge. Many robotics applications in unstructured environments such as the home or warehouse would benefit substantially from better grasp performance. This paper proposes a novel framework for detecting $SE(3)$ grasp poses from point cloud input. Our main contribution is an $SE(3)$-equivariant model that maps each point in the cloud to a continuous grasp quality function over the 2-sphere $S^2$ using spherical harmonic basis functions. Compared with reasoning about a finite set of samples, this formulation improves the accuracy and efficiency of our model when a large number of samples would otherwise be needed. To accomplish this, we propose a novel variation on EquiFormerV2 that leverages a UNet-style encoder-decoder architecture to increase the number of points the model can handle. Our resulting method, which we name $\textit{OrbitGrasp}$, significantly outperforms baselines in both simulation and physical experiments.
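To make the idea of a continuous grasp quality function over $S^2$ concrete, the following is a minimal sketch, not the OrbitGrasp implementation. It assumes a single point whose quality function is expressed in real spherical harmonics up to degree 2, with hand-set coefficients; in the method described above, such coefficients would be produced per point by the equivariant network. The names `real_sh_basis`, `grasp_quality`, and `best_direction` are hypothetical and chosen only for illustration.

```python
import math


def real_sh_basis(direction):
    """Real spherical harmonics of degree 0..2 at a direction (x, y, z).

    The direction is normalized first, so any nonzero vector is accepted.
    Returns 9 values ordered by degree l, then order m = -l..l.
    """
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c0 = 0.5 * math.sqrt(1.0 / math.pi)
    c1 = math.sqrt(3.0 / (4.0 * math.pi))
    c2 = 0.5 * math.sqrt(15.0 / math.pi)
    c20 = 0.25 * math.sqrt(5.0 / math.pi)
    return [
        c0,                              # l=0
        c1 * y, c1 * z, c1 * x,          # l=1, m=-1,0,1
        c2 * x * y,                      # l=2, m=-2
        c2 * y * z,                      # l=2, m=-1
        c20 * (3.0 * z * z - 1.0),       # l=2, m=0
        c2 * x * z,                      # l=2, m=1
        0.5 * c2 * (x * x - y * y),      # l=2, m=2
    ]


def grasp_quality(coeffs, direction):
    """Evaluate the quality function: a weighted sum of basis functions."""
    return sum(c * b for c, b in zip(coeffs, real_sh_basis(direction)))


def best_direction(coeffs, candidates):
    """Pick the candidate direction with the highest predicted quality."""
    return max(candidates, key=lambda d: grasp_quality(coeffs, d))


# Assumed example: only the (l=1, m=0) coefficient is nonzero, so the
# quality is proportional to z and peaks along the +z direction.
coeffs = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
candidates = [
    (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
    (0.0, 0.0, 1.0), (0.0, 0.0, -1.0),
]
best = best_direction(coeffs, candidates)
```

Because the function is defined by a small coefficient vector rather than a list of sampled orientations, it can be queried at any direction on the sphere after a single forward pass, which is the efficiency argument the abstract makes against reasoning over a finite sample set.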