SynGen-Vision: Synthetic Data Generation for training industrial vision models (2509.04894v1)
Abstract: We propose an approach to generate synthetic data to train computer vision (CV) models for industrial wear and tear detection. Wear and tear detection is an important CV problem for predictive maintenance tasks in any industry. However, data curation for training such models is expensive and time-consuming due to the unavailability of datasets for different wear and tear scenarios. Our approach employs a vision LLM along with a 3D simulation and rendering engine to generate synthetic data for varying rust conditions. We evaluate our approach by training a CV model for rust detection on the generated dataset and testing the trained model on real images of rusted industrial objects. The model trained with the synthetic data generated by our approach outperforms the other approaches, with an mAP50 score of 0.87. The approach is customizable and can be easily extended to other industrial wear and tear detection scenarios.
Explain it Like I'm 14
What is this paper about?
This paper introduces SynGen-Vision, a way to create “fake but realistic” images to train computer vision systems that spot rust on industrial equipment. Instead of waiting months or years to collect real photos of machines rusting in different ways, the authors use generative AI and 3D tools to make high‑quality training pictures with automatic labels. These pictures help a computer learn to detect rust in real-world photos.
What questions were they trying to answer?
The authors focused on simple, practical questions:
- Can we generate realistic training images that show different kinds and amounts of rust?
- Can these synthetic images train a model that works well on real photos?
- Which steps make the synthetic data most useful: using GenAI alone, adding style transfer, or also cleaning up noisy textures?
How did they do it?
They built an end-to-end pipeline that turns text prompts into useful training data. Here’s the idea in everyday language:
- Start with 3D models and scenes. Think of a video game environment: you have 3D objects (like tanks or pipes) in a 3D world. The team uses these as the base.
- Generate rust “textures” with GenAI. A texture is like a sticker or skin that wraps around a 3D object to give it color and detail. Using a text-to-image model (Stable Diffusion), they type prompts like “complete rust” or “rust streaks” to create rust textures. They learned that adding words like “texture” or “surface” produces cleaner results (see the text-to-image sketch after this list).
- Blend the new rust with the original details (style transfer). If you slap a new rust skin on the object, you might lose important markings (logos, labels). Style transfer acts like combining two photos: it keeps the original object’s details (content) but adds the look of rust (style). This makes the rusted object look more realistic and keeps fine details (see the style-transfer sketch after this list).
- Remove bad textures (noise removal). Sometimes AI-generated images include watermarks, random text, or the wrong amount of rust. The team filters out those bad textures with image processing so only good ones are used (see the filtering sketch after this list).
- Wrap textures correctly (UV mapping) and build scenes. UV mapping is like unwrapping a 3D toy into a flat map so you can place the sticker precisely, then wrapping it back on. They use Blender (a 3D tool) to apply the rust textures and set up scenes with different camera angles, distances, and lighting (see the Blender sketch after this list).
- Render images and auto-label them. They render many images and automatically add “bounding boxes” (rectangles around the object) and a rust label like “complete rust” or “rust streaks.” These labeled images become the training set (see the auto-labeling sketch after this list).
- Train and test a detection model. They train a popular object detection model (YOLOv5) on 2,000 synthetic images and then test it on about 100 real photos they labeled by hand (see the training sketch after this list).
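To make the texture step concrete, here is a minimal text-to-image sketch using the Hugging Face diffusers library. The model ID, prompt wording, and resolution are illustrative assumptions, not the paper's exact setup.

```python
# A minimal sketch of rust-texture generation with Stable Diffusion.
# Model ID and prompts are assumptions; the paper does not publish its code.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Per the paper's observation, adding words like "texture" or "surface"
# tends to produce cleaner outputs.
prompts = [
    "complete rust texture, corroded metal surface",
    "rust streaks texture, weathered metal surface",
]

for i, prompt in enumerate(prompts):
    image = pipe(prompt, height=512, width=512).images[0]
    image.save(f"rust_texture_{i}.png")
```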
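The style-transfer step can be sketched with the classic Gatys optimization in PyTorch: start from the original texture and nudge it so its content features stay put while its style (Gram-matrix) features match the generated rust texture. Layer indices, loss weights, and file names below are common defaults, not values from the paper.

```python
# Gatys-style neural style transfer sketch (assumed method, common defaults).
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import models, transforms
from torchvision.utils import save_image

device = "cuda" if torch.cuda.is_available() else "cpu"

def load(path, size=512):
    # ImageNet normalization omitted for brevity; the mechanics are the same.
    tf = transforms.Compose([transforms.Resize((size, size)), transforms.ToTensor()])
    return tf(Image.open(path).convert("RGB")).unsqueeze(0).to(device)

content = load("original_texture.png")  # keeps logos/markings (content)
style = load("rust_texture_0.png")      # provides the rust look (style)

vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.to(device).eval()
for p in vgg.parameters():
    p.requires_grad_(False)

STYLE_LAYERS, CONTENT_LAYER = {0, 5, 10, 19, 28}, 21  # common Gatys choices

def features(x):
    style_feats, content_feat = [], None
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in STYLE_LAYERS:
            style_feats.append(x)
        if i == CONTENT_LAYER:
            content_feat = x
    return style_feats, content_feat

def gram(f):
    b, c, h, w = f.shape
    f = f.view(c, h * w)
    return (f @ f.t()) / (c * h * w)

with torch.no_grad():
    _, target_content = features(content)
    target_grams = [gram(f) for f in features(style)[0]]

img = content.clone().requires_grad_(True)  # optimize the image itself
opt = torch.optim.Adam([img], lr=0.02)

for step in range(300):
    opt.zero_grad()
    style_feats, content_feat = features(img)
    loss = F.mse_loss(content_feat, target_content)
    loss = loss + 1e4 * sum(F.mse_loss(gram(f), g)
                            for f, g in zip(style_feats, target_grams))
    loss.backward()
    opt.step()

save_image(img.detach().clamp(0, 1), "rusted_texture.png")
```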
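The paper filters out textures with watermarks, stray text, or the wrong amount of rust, but does not publish its exact rules. The OpenCV heuristics below are hypothetical stand-ins that capture the idea: estimate rust coverage by color, and reject images with many small, text-like edge components.

```python
# Hypothetical texture-filtering sketch; thresholds are illustrative guesses.
import cv2

def rust_coverage(img_bgr):
    """Fraction of pixels in an orange-brown (rust-like) HSV band."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (5, 60, 40), (25, 255, 220))  # assumed rust hues
    return mask.mean() / 255.0

def looks_like_text(img_bgr):
    """Crude watermark/text detector: many small high-contrast components."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    n, _, stats, _ = cv2.connectedComponentsWithStats(edges, connectivity=8)
    small = [s for s in stats[1:] if 20 < s[cv2.CC_STAT_AREA] < 500]
    return len(small) > 80  # assumed threshold

def keep_texture(path, label):
    img = cv2.imread(path)
    if img is None or looks_like_text(img):
        return False
    cov = rust_coverage(img)
    # "complete rust" should cover most of the image; "rust streaks" less.
    lo, hi = (0.6, 1.0) if label == "complete rust" else (0.1, 0.5)
    return lo <= cov <= hi
```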
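Inside Blender, the texturing and scene setup might look like the bpy sketch below: load a rust texture, wire it into the object's material (applied through its UV map), randomize the camera, and render. Object, light, and file names are hypothetical; this runs inside Blender's Python console.

```python
# Hypothetical Blender (bpy) scene-setup sketch; names and ranges are assumed.
import bpy
import math
import random

obj = bpy.data.objects["IndustrialTank"]  # assumed object name

# Build a material that uses the rust texture as base color (via the UV map).
mat = bpy.data.materials.new(name="RustMaterial")
mat.use_nodes = True
bsdf = mat.node_tree.nodes["Principled BSDF"]
tex = mat.node_tree.nodes.new("ShaderNodeTexImage")
tex.image = bpy.data.images.load("//textures/rusted_texture.png")
mat.node_tree.links.new(tex.outputs["Color"], bsdf.inputs["Base Color"])
obj.data.materials.clear()
obj.data.materials.append(mat)

# Randomize camera distance and angle around the object.
cam = bpy.data.objects["Camera"]
r = random.uniform(4.0, 10.0)
theta = random.uniform(0, 2 * math.pi)
cam.location = (r * math.cos(theta), r * math.sin(theta), random.uniform(1, 4))
track = cam.constraints.new(type="TRACK_TO")
track.target = obj
track.track_axis = "TRACK_NEGATIVE_Z"  # cameras look down -Z
track.up_axis = "UP_Y"

# Vary lighting strength (assumes the default light named "Light"), render.
bpy.data.lights["Light"].energy = random.uniform(200, 1500)
bpy.context.scene.render.filepath = "//renders/sample_0001.png"
bpy.ops.render.render(write_still=True)
```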
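Auto-labeling can be done by projecting the object's 3D bounding box into the camera view with Blender's bpy_extras helper and writing a YOLO-format label line. The object name and class mapping below are assumptions.

```python
# Hypothetical auto-labeling sketch (runs inside Blender alongside rendering).
import bpy
from bpy_extras.object_utils import world_to_camera_view
from mathutils import Vector

scene = bpy.context.scene
cam = bpy.data.objects["Camera"]
obj = bpy.data.objects["IndustrialTank"]  # assumed object name
CLASS_ID = 0  # assumed mapping, e.g. 0 = "complete rust"

# Project the 8 corners of the object's 3D bounding box into normalized
# camera coordinates (x, y in [0, 1], origin at the bottom-left).
pts = [world_to_camera_view(scene, cam, obj.matrix_world @ Vector(corner))
       for corner in obj.bound_box]
xs = [p.x for p in pts]
ys = [1.0 - p.y for p in pts]  # flip y: image coordinates start at the top

x_min, x_max = max(min(xs), 0.0), min(max(xs), 1.0)
y_min, y_max = max(min(ys), 0.0), min(max(ys), 1.0)

# YOLO label format: class x_center y_center width height (all normalized).
label = (f"{CLASS_ID} {(x_min + x_max) / 2:.6f} {(y_min + y_max) / 2:.6f} "
         f"{x_max - x_min:.6f} {y_max - y_min:.6f}\n")
with open(bpy.path.abspath("//labels/sample_0001.txt"), "w") as f:
    f.write(label)
```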
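Finally, a sketch of the training step, assuming the standard ultralytics/yolov5 repository layout (train.py, yolov5s.pt). Dataset paths, class names, and hyperparameters are illustrative, not the paper's reported settings.

```python
# Hypothetical YOLOv5 training driver: write a dataset config, then invoke
# the standard ultralytics/yolov5 train.py with common flags.
import subprocess
import textwrap

# Dataset config pointing at the rendered images and auto-generated labels.
with open("rust.yaml", "w") as f:
    f.write(textwrap.dedent("""\
        train: datasets/rust/images/train   # ~2,000 synthetic renders
        val: datasets/rust/images/val
        nc: 2
        names: ["complete rust", "rust streaks"]
    """))

subprocess.run(
    ["python", "train.py", "--img", "640", "--batch", "16",
     "--epochs", "100", "--data", "rust.yaml", "--weights", "yolov5s.pt"],
    check=True,
)
```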
Key terms in simple words:
- Synthetic data: fake but realistic images made by computer.
- Texture: the “skin” or surface pattern you wrap around a 3D model.
- UV map: a 2D layout of a 3D object’s surface so textures can be placed precisely.
- Bounding box: a rectangle drawn around what you’re trying to detect.
- mAP50: a score (0 to 1) showing how good detection is when a predicted box counts as correct only if it overlaps the real target by at least 50%. Higher is better (see the IoU example after this list).
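As a worked example of the 50% overlap rule behind mAP50, here is a small IoU (intersection over union) calculation; the box coordinates are made up.

```python
# IoU between two axis-aligned boxes given as (x_min, y_min, x_max, y_max).
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)  # intersection corners
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

pred, truth = (10, 10, 110, 110), (30, 20, 130, 120)
print(iou(pred, truth), iou(pred, truth) >= 0.5)  # ~0.56 -> counts at mAP50
```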
What did they find?
- Best performance came from combining three steps: GenAI + style transfer + noise removal. This combo produced the most realistic and useful training images.
- The trained model reached an mAP50 score of about 0.87 on real images. In simple terms, that’s strong performance for a model trained entirely on synthetic data.
- The model worked across different object shapes (not just a single kind of tank), showing it learned general rust patterns.
- Compared to using GenAI alone, adding style transfer kept important details and improved results; cleaning noisy textures improved them even more.
Why this matters:
- Getting lots of real photos of equipment with different rust levels is slow, costly, and sometimes impossible. Synthetic data speeds things up and lowers cost.
- Despite common GenAI issues (like watermarks or random text), careful filtering and style transfer can make the results clean and realistic.
Why does it matter?
This approach can make industrial maintenance smarter and cheaper:
- Faster training: Companies don’t need to wait for real rust to appear in many conditions to train a model.
- Safer inspections: Automated rust detection can help catch problems early, preventing breakdowns.
- Flexible use: The same pipeline can be adapted to other wear-and-tear signs, like cracks, dents, or aging paint—just change the prompts and textures.
- Better data with less effort: High-quality, labeled images can be generated on demand, which is a big deal when real data is scarce.
In short, SynGen-Vision shows that carefully crafted synthetic data—using GenAI, style transfer, cleanup steps, and 3D rendering—can train reliable vision models for real industrial problems.