3D-COCO: extension of MS-COCO dataset for image detection and 3D reconstruction modules (2404.05641v3)
Published 8 Apr 2024 in cs.CV
Abstract: We introduce 3D-COCO, an extension of the original MS-COCO dataset that provides 3D models and 2D-3D alignment annotations. 3D-COCO was designed to support computer vision tasks such as 3D reconstruction or image detection configurable with textual, 2D image, and 3D CAD model queries. We complete the existing MS-COCO dataset with 28K 3D models collected from ShapeNet and Objaverse. Using an IoU-based method, we match each MS-COCO annotation with the best 3D models to provide a 2D-3D alignment. The open-source nature of 3D-COCO is a first that should pave the way for new research on 3D-related topics. The dataset and its source code are available at https://kalisteo.cea.fr/index.php/coco3d-object-detection-and-reconstruction/
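The IoU-based matching mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the actual pipeline, so everything below is an assumption for illustration only: that each 3D model is represented by a few rendered binary silhouettes, that the annotation mask and the silhouettes share the same resolution, and that a model's score is its best-view IoU. The names `mask_iou`, `rank_models`, `model_a`, and `model_b` are hypothetical and are not taken from the 3D-COCO source code.

```python
from typing import Dict, List, Tuple

Mask = List[List[int]]  # binary mask: 1 = object pixel, 0 = background


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two same-sized binary masks."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                inter += 1
            if pa or pb:
                union += 1
    # Two empty masks have no overlap to measure; score them as 0.
    return inter / union if union else 0.0


def rank_models(
    annotation: Mask,
    renders: Dict[str, List[Mask]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score each model by its best-view IoU and return the top_k models."""
    scores = []
    for model_id, views in renders.items():
        best = max((mask_iou(annotation, v) for v in views), default=0.0)
        scores.append((model_id, best))
    scores.sort(key=lambda s: s[1], reverse=True)
    return scores[:top_k]


# Toy 3x3 example (hypothetical data): one annotation mask, two candidate models.
annotation = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
renders = {
    "model_a": [[[0, 1, 1], [0, 1, 1], [0, 0, 0]]],  # silhouette identical to the annotation
    "model_b": [[[1, 1, 0], [1, 1, 0], [0, 0, 0]]],  # silhouette shifted one pixel left
}
print(rank_models(annotation, renders, top_k=2))
```

In this toy setup the model whose silhouette coincides with the annotation ranks first. A real pipeline would also need to normalize scale and position between the annotation and the rendered views before comparing them, which this sketch leaves out.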
- Maxence Bideaux
- Alice Phe
- Mohamed Chaouch
- Bertrand Luvison
- Quoc-Cuong Pham