Investigation of the Impact of Synthetic Training Data in the Industrial Application of Terminal Strip Object Detection (2403.04809v1)
Abstract: In industrial manufacturing, many tasks that involve visually inspecting or detecting specific objects are currently performed manually or with classical image processing methods. Introducing recent deep learning models to industrial environments therefore holds the potential to increase productivity and enable new applications. However, gathering and labeling sufficient data is often intractable, which complicates the implementation of such projects. Hence, image synthesis methods are commonly used to generate synthetic training data from 3D models and annotate them automatically, although this results in a sim-to-real domain gap. In this paper, we investigate the sim-to-real generalization performance of standard object detectors on the complex industrial application of terminal strip object detection. Combining domain randomization and domain knowledge, we created an image synthesis pipeline for automatically generating the training data. Moreover, we manually annotated 300 real images of terminal strips for the evaluation. The results show that it is crucial for the objects of interest to have the same scale in both domains. Under optimized scaling conditions, the sim-to-real performance difference in mean average precision amounts to 2.69 % for RetinaNet and 0.98 % for Faster R-CNN, qualifying this approach for industrial requirements.
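The abstract's point about matching object scale across domains can be illustrated with a small sketch. This is not the authors' procedure; it only assumes that bounding boxes are available for both domains as `(x_min, y_min, x_max, y_max)` pixel tuples, and it estimates a single resize factor from the median box size. The function names `box_size` and `scale_matching_factor` are hypothetical.

```python
from math import sqrt
from statistics import median


def box_size(box):
    """Return the square root of the box area, a simple size measure in pixels."""
    x_min, y_min, x_max, y_max = box
    return sqrt((x_max - x_min) * (y_max - y_min))


def scale_matching_factor(synthetic_boxes, real_boxes):
    """Estimate the factor by which synthetic images would be resized so that
    the median object size matches the median object size in the real images.

    Assumption: one global factor is enough; this is an illustrative heuristic,
    not the scaling method used in the paper.
    """
    synthetic_median = median([box_size(b) for b in synthetic_boxes])
    real_median = median([box_size(b) for b in real_boxes])
    return real_median / synthetic_median


# Hypothetical boxes, chosen only to make the arithmetic easy to follow.
synthetic = [(0, 0, 10, 10), (0, 0, 20, 20), (0, 0, 30, 30)]
real = [(0, 0, 40, 40), (0, 0, 20, 20), (0, 0, 60, 60)]
factor = scale_matching_factor(synthetic, real)
print(factor)
```

A factor above 1 would suggest upscaling the synthetic images (or rendering closer to the object) before training; a factor below 1 would suggest the opposite.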
Authors:
- Nico Baumgart
- Markus Lange-Hegermann
- Mike Mücke