
Leveraging SE(3) Equivariance for Learning 3D Geometric Shape Assembly (2309.06810v2)

Published 13 Sep 2023 in cs.CV and cs.AI

Abstract: Shape assembly aims to reassemble parts (or fragments) into a complete object, a common task in daily life. Unlike semantic part assembly (e.g., assembling a chair's semantic parts, such as legs, into a whole chair), geometric part assembly (e.g., assembling bowl fragments into a complete bowl) is an emerging task in computer vision and robotics. This task focuses on the geometric information of parts rather than their semantic information. Because both the geometric space and the pose space of fractured parts are exceptionally large, disentangling shape from pose in part representations is beneficial to geometric shape assembly. In this paper, we propose to leverage SE(3) equivariance for such shape-pose disentanglement. Moreover, while previous works in vision and robotics consider SE(3) equivariance only for the representations of single objects, we go a step further and propose leveraging SE(3) equivariance for representations that account for multi-part correlations, which further boosts the performance of multi-part assembly. Experiments demonstrate the significance of SE(3) equivariance and of our proposed method for geometric shape assembly. Project page: https://crtie.github.io/SE-3-part-assembly/
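As a toy illustration of the property the abstract relies on (not the paper's network or method), the sketch below checks two things on a hand-made point set: a pose-like quantity (the centroid) moves along with a rigid transform, which is equivariance, while a shape-like quantity (the pairwise distances) does not change, which is invariance. The helper names, the sample fragment, and the restriction to rotations about the z-axis are all assumptions made for brevity.

```python
import math

def rotate_z(points, angle):
    """Rotate 3D points about the z-axis (one simple family of rotations)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def translate(points, t):
    """Shift every point by the vector t."""
    return [(x + t[0], y + t[1], z + t[2]) for x, y, z in points]

def rigid_transform(points, angle, t):
    """Apply a rotation followed by a translation (an element of SE(3))."""
    return translate(rotate_z(points, angle), t)

def centroid(points):
    """Pose-like feature: the mean position of the points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def pairwise_distances(points):
    """Shape-like feature: distances between all unordered point pairs."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]

def close(a, b, tol=1e-9):
    """Element-wise comparison with a small floating-point tolerance."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))

# Hypothetical fragment: four points, chosen only for illustration.
fragment = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
angle, shift = 0.7, (0.5, -1.0, 2.0)
moved = rigid_transform(fragment, angle, shift)

# Equivariance: transforming the input transforms the centroid the same way.
equivariant_ok = close(centroid(moved),
                       rigid_transform([centroid(fragment)], angle, shift)[0])

# Invariance: pairwise distances ignore the rigid transform entirely.
invariant_ok = close(pairwise_distances(moved), pairwise_distances(fragment))

print(equivariant_ok, invariant_ok)
```

Separating a feature that follows the pose from one that ignores it is the intuition behind shape-pose disentanglement; a learned equivariant network would replace these hand-written features with learned ones.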
