Synthetic Dataset Generation for Autonomous Mobile Robots Using 3D Gaussian Splatting for Vision Training (2506.05092v1)

Published 5 Jun 2025 in cs.RO and cs.CV

Abstract: Annotated datasets are critical for training neural networks for object detection, yet their manual creation is time- and labour-intensive, subject to human error, and often limited in diversity. This challenge is particularly pronounced in the domain of robotics, where diverse and dynamic scenarios further complicate the creation of representative datasets. To address this, we propose a novel method for automatically generating annotated synthetic data in Unreal Engine. Our approach leverages photorealistic 3D Gaussian splats for rapid synthetic data generation. We demonstrate that synthetic datasets can achieve performance comparable to that of real-world datasets while significantly reducing the time required to generate and annotate data. Additionally, combining real-world and synthetic data significantly increases object detection performance by leveraging the quality of real-world images with the easier scalability of synthetic data. To our knowledge, this is the first application of synthetic data for training object detection algorithms in the highly dynamic and varied environment of robot soccer. Validation experiments reveal that a detector trained on synthetic images performs on par with one trained on manually annotated real-world images when tested on robot soccer match scenarios. Our method offers a scalable and comprehensive alternative to traditional dataset creation, eliminating the labour-intensive, error-prone manual annotation process. By generating datasets in a simulator where all elements are intrinsically known, we ensure accurate annotations while significantly reducing manual effort, which makes it particularly valuable for robotics applications requiring diverse and scalable training data.

Summary

The paper presents a method that uses 3D Gaussian splats in Unreal Engine to generate annotated synthetic datasets, and shows that detectors trained on them perform comparably to detectors trained on real-world data.
The approach is validated on robot soccer, a highly dynamic task, using YOLOv8, with reported mAP50 scores of 0.962 for simple objects and 0.992 for complex objects trained on a hybrid dataset.
The work significantly reduces manual annotation labor and supports scalable, hybrid (real plus synthetic) vision training in autonomous robotics.

Synthetic Dataset Generation for Autonomous Mobile Robots Using 3D Gaussian Splatting for Vision Training

The paper "Synthetic Dataset Generation for Autonomous Mobile Robots Using 3D Gaussian Splatting for Vision Training" addresses a practical bottleneck in computer vision for robotics: the creation of annotated datasets. Annotated datasets are vital for training convolutional neural networks (CNNs) for object detection, yet creating them manually is laborious, error-prone, and tends to yield limited diversity. The authors propose training object detectors on synthetic datasets generated from photorealistic 3D Gaussian splats within Unreal Engine.

Proposed Methodology

The authors introduce an approach to synthetic data generation that uses 3D Gaussian splatting to produce annotated datasets automatically. Detectors trained on these synthetic datasets perform comparably to those trained on real-world data, while the time and effort needed to create the data drop drastically. The method starts by capturing images of each object and building a photorealistic 3D model from them, using software such as LUMA AI for Gaussian splat modeling. The models are then placed in virtual environments to generate large-scale synthetic datasets. Because every element of the simulated scene is known, annotations can be derived directly from the scene rather than drawn by hand.
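To make the idea of simulator-derived annotations concrete, the following is a minimal sketch of one way a 2D label could be computed from known 3D geometry. It is an illustration under stated assumptions, not the authors' pipeline: the summary does not describe how labels are computed in Unreal Engine. The sketch assumes a pinhole camera with hypothetical intrinsics (fx, fy, cx, cy), an axis-aligned 3D bounding box already expressed in the camera frame (z pointing forward), and YOLO-style normalized labels. All function names are hypothetical.

```python
import numpy as np

def box_corners(center, size):
    """Eight corners of an axis-aligned 3D box from its center and (w, h, d) size."""
    c = np.asarray(center, dtype=float)
    half = np.asarray(size, dtype=float) / 2.0
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    return c + signs * half

def project_points(points_cam, fx, fy, cx, cy):
    """Pinhole projection of Nx3 camera-frame points (z forward) to Nx2 pixels."""
    pts = np.asarray(points_cam, dtype=float)
    u = fx * pts[:, 0] / pts[:, 2] + cx
    v = fy * pts[:, 1] / pts[:, 2] + cy
    return np.stack([u, v], axis=1)

def yolo_label(class_id, corners_cam, fx, fy, cx, cy, img_w, img_h):
    """Return 'class xc yc w h' with normalized values, or None if not visible."""
    corners_cam = np.asarray(corners_cam, dtype=float)
    if np.any(corners_cam[:, 2] <= 0):
        return None  # assumption: skip objects that reach behind the camera
    px = project_points(corners_cam, fx, fy, cx, cy)
    # Clip the projected extent to the image bounds.
    x0, y0 = np.clip(px.min(axis=0), 0, [img_w, img_h])
    x1, y1 = np.clip(px.max(axis=0), 0, [img_w, img_h])
    if x1 <= x0 or y1 <= y0:
        return None  # box lies entirely outside the image
    xc = (x0 + x1) / 2.0 / img_w
    yc = (y0 + y1) / 2.0 / img_h
    w = (x1 - x0) / img_w
    h = (y1 - y0) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# Hypothetical example: a 0.2 m object centered 2 m in front of the camera.
corners = box_corners(center=(0.0, 0.0, 2.0), size=(0.2, 0.2, 0.2))
label = yolo_label(0, corners, fx=500, fy=500, cx=320, cy=240,
                   img_w=640, img_h=480)
```

A real pipeline would also need to handle occlusion and object rotation, which this sketch ignores; a tighter label could come from a rendered segmentation mask instead of projected box corners.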

Application and Validation

Robot soccer serves as the testing ground for this approach, given its highly dynamic and unpredictable environment. The method is validated using YOLOv8 object detection models trained on datasets comprising various robot models and balls, all rendered using 3D Gaussian splats. The comparison relies on standard detection metrics: Precision, Recall, F1-Score, Intersection over Union (IoU), and mean Average Precision (mAP).
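For readers less familiar with these metrics, the sketch below shows how IoU, Precision, Recall, and F1-Score are commonly computed for a single image. It is a simplified illustration, not the evaluation code used in the paper: it assumes boxes in (x1, y1, x2, y2) pixel format, a single class, and greedy one-to-one matching of predictions to ground truth in descending score order. The function names are hypothetical. mAP additionally averages precision over recall levels and classes and is omitted here.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def precision_recall_f1(preds, truths, iou_thr=0.5):
    """preds: list of (box, score); truths: list of boxes.

    Each ground-truth box can be matched by at most one prediction.
    """
    matched = set()
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, truth in enumerate(truths):
            if j in matched:
                continue
            value = iou(box, truth)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(truths) - tp  # ground-truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1

# Hypothetical example: one correct detection, one false positive, one miss.
preds = [((0, 0, 10, 10), 0.9), ((100, 100, 110, 110), 0.8)]
truths = [(0, 0, 10, 10), (50, 50, 60, 60)]
p, r, f1 = precision_recall_f1(preds, truths)
```

With the default threshold of 0.5, a prediction counts as a true positive only if it overlaps an unmatched ground-truth box with IoU of at least 0.5, which is the convention behind the mAP50 figures.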

The experiments show that detectors trained on synthetic datasets generated with 3D Gaussian splatting reach accuracy close to that of detectors trained on real-world datasets, while the synthetic data scales more easily. Combining real-world and synthetic datasets improves object detection performance further. For simple objects like spheres, low-fidelity models suffice, as evidenced by an mAP50 of 0.962. For complex objects, a hybrid dataset improved mAP50 to 0.992, suggesting a compromise that balances time efficiency and accuracy.
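A hybrid training set of this kind can be assembled in a few lines. The sketch below is a hypothetical illustration: the summary does not state the real-to-synthetic ratio or the mixing procedure the authors used, so the function name, the file names, and the choice to keep all real images while sampling a fixed number of synthetic ones are all assumptions.

```python
import random

def mix_datasets(real_items, synthetic_items, n_synthetic, seed=0):
    """Combine all real items with n_synthetic randomly sampled synthetic items.

    Returns a shuffled list of (item, source) pairs so the origin of each
    sample stays traceable for later analysis.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    n = min(n_synthetic, len(synthetic_items))
    sampled = rng.sample(list(synthetic_items), n)
    combined = [(item, "real") for item in real_items]
    combined += [(item, "synthetic") for item in sampled]
    rng.shuffle(combined)
    return combined

# Hypothetical file names; the real dataset layout is not given in the source.
real = [f"real_{i:04d}.jpg" for i in range(100)]
synthetic = [f"synth_{i:04d}.jpg" for i in range(1000)]
train = mix_datasets(real, synthetic, n_synthetic=300)
```

Varying `n_synthetic` while holding the real set fixed is one simple way to study how much synthetic data is worth adding before returns diminish.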

Implications and Future Directions

This work matters most for applications that need datasets quickly, such as autonomous robots deployed in constantly changing environments. The authors show that synthetic datasets scale well, reduce the dependence on tedious manual annotation, and make it possible to introduce domain randomization. The method proves effective for robot soccer, and it appears likely to generalize to other dynamic robotic domains.

Moving forward, this work could lead to more sophisticated synthetic data generation methods that incorporate additional environmental variations, enhance the photorealism of synthetic images, and further narrow the domain gap between synthetic and real-world datasets. Future research could also integrate techniques such as Neural Radiance Fields (NeRFs), or refine hybrid approaches that combine synthetic and real-world data, to improve how well CNNs capture the finer aspects of scene understanding.

In conclusion, the paper presents a synthetic data generation method for training vision models in robotics that makes dataset creation more efficient, more scalable, and less error-prone.