Towards Sim-to-Real Industrial Parts Classification with Synthetic Dataset (2404.08778v1)

Published 12 Apr 2024 in cs.CV and cs.LG

Abstract: This paper addresses how to use synthetic data effectively for training deep neural networks for industrial parts classification, in particular by taking into account the domain gap with respect to real-world images. To this end, we introduce a synthetic dataset that may serve as a preliminary testbed for the Sim-to-Real challenge; it contains 17 objects from six industrial use cases, including isolated and assembled parts. A few subsets of objects are highly similar in shape and albedo, reflecting challenging cases of industrial parts. All sample images are provided both with and without random backgrounds and post-processing, so that the importance of domain randomization can be evaluated. We call it the Synthetic Industrial Parts dataset (SIP-17). We study the usefulness of SIP-17 by benchmarking five state-of-the-art deep network models, supervised and self-supervised, trained only on the synthetic data and tested on real data. By analyzing the results, we derive insights into the feasibility and challenges of using synthetic data for industrial parts classification and of developing larger-scale synthetic datasets. Our dataset and code are publicly available.

Authors (6)
  1. Xiaomeng Zhu
  2. Talha Bilal
  3. Pär Mårtensson
  4. Lars Hanson
  5. Mårten Björkman
  6. Atsuto Maki
