
MatSynth: A Modern PBR Materials Dataset (2401.06056v3)

Published 11 Jan 2024 in cs.CV and cs.GR

Abstract: We introduce MatSynth, a dataset of 4,000+ CC0 ultra-high-resolution PBR materials. Materials are crucial components of virtual relightable assets, defining the interaction of light at the surface of geometries. Given their importance, significant research effort has been dedicated to their representation, creation, and acquisition. However, in the past six years, most research in material acquisition or generation relied either on the same unique dataset or on huge company-owned libraries of procedural materials. With this dataset we propose a significantly larger, more diverse, and higher-resolution set of materials than previously publicly available. We carefully discuss the data collection process and demonstrate the benefits of this dataset on material acquisition and generation applications. The complete data further contains metadata with each material's origin, license, category, tags, creation method and, when available, descriptions and physical size, as well as 3M+ renderings of the augmented materials, at 1K resolution, under various environment lightings. The MatSynth dataset is released through the project page at: https://www.gvecchio.com/matsynth.


Summary

  • The paper presents a curated dataset of over 4,000 unique 4K, tileable PBR materials with comprehensive metadata and licensing details.
  • The methodology includes rigorous quality control, diverse data augmentation techniques, and seamless integration with existing datasets.
  • Experimental results demonstrate significant improvements in material acquisition and generation, enhancing performance in models like SurfaceNet and MatFuse.

MatSynth: A Modern PBR Materials Dataset

Introduction

The MatSynth dataset is a comprehensive collection of Physically Based Rendering (PBR) materials designed to support modern learning-based techniques for material-related tasks such as acquisition, generation, and synthetic data augmentation. It aims to bridge the gap between public and private material datasets by providing a rich and diverse assortment of high-resolution materials under a permissive license, addressing the limitations of previous datasets through greater variety and volume.

Figure 1: Renderings under various environment maps. We show four materials (Metal, Leather, Plastic, and Pebbles) from the dataset rendered under the five chosen environment maps.

Materials Collection and Data Processing

MatSynth was meticulously curated from publicly available online sources, yielding an initial pool of over 6,000 materials that was filtered down to 4,069 unique, tileable 4K materials. Each material is represented by a comprehensive set of maps, including Base Color, Diffuse, Normal, Height, Roughness, Metallic, and Specular. All materials pass stringent quality control checks, ensuring their suitability for high-fidelity rendering tasks.
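
To make this map structure concrete, the sketch below shows one way a single material and its aligned maps could be held in memory. The class and field names are illustrative assumptions for this article, not an official MatSynth API.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class PBRMaterial:
    """Illustrative container for one MatSynth material; not an official API."""
    basecolor: np.ndarray   # (H, W, 3) albedo for the metallic workflow
    diffuse: np.ndarray     # (H, W, 3) diffuse color for the specular workflow
    normal: np.ndarray      # (H, W, 3) tangent-space normals
    height: np.ndarray      # (H, W)    displacement values
    roughness: np.ndarray   # (H, W)    microfacet roughness in [0, 1]
    metallic: np.ndarray    # (H, W)    metalness in [0, 1]
    specular: np.ndarray    # (H, W)    specular reflectance
    name: str = ""
    category: str = ""      # e.g. "Metal", "Leather", "Plastic"

    def resolution(self) -> tuple:
        """Spatial resolution shared by all maps (4K in the released dataset)."""
        h, w = self.basecolor.shape[:2]
        return h, w
```

All of a material's maps share the same resolution and alignment, which is what lets the same crop or rotation be applied to every map at once.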

Dataset Annotations and Augmentation

Each material in the dataset is accompanied by extensive metadata, including its source, tags, creation method, and licensing information. This comprehensive annotation supports a wide range of research applications, from machine learning-based material synthesis to detailed material property studies.
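
As a concrete illustration, a single material's metadata record might look like the following; the field names are assumptions inferred from the attributes the paper lists (origin, license, category, tags, creation method, and the optional description and physical size), not the dataset's literal schema.

```python
# Hypothetical metadata record. Field names are assumptions based on the
# attributes described in the paper, not the dataset's literal schema.
material_metadata = {
    "name": "weathered_red_brick",           # illustrative material name
    "source": "https://www.ambientcg.com/",  # originating library
    "license": "CC0",
    "category": "Ground",
    "tags": ["brick", "weathered", "outdoor"],
    "creation_method": "photogrammetry",     # e.g. procedural or captured
    "description": "Weathered red bricks with mossy mortar lines.",  # optional
    "physical_size_cm": [100.0, 100.0],      # optional real-world extent
}
```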

Data augmentation is a prominent feature of MatSynth, with numerous rotations, crops, and environment-map renderings of each material, resulting in millions of sample renderings. This augmentation strategy is essential for training robust machine learning models and enables more diverse real-world applications.

Figure 2: Render samples using the two-pass strategy. This ensures that the maps and the rendering are well aligned, avoiding parallax effects while preserving specular highlights.
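
Tileability is what makes the rotation-and-crop augmentation described above cheap: rotations by multiples of 90 degrees and wrap-around translations both preserve seamlessness. The following is a minimal sketch of such an augmentation step, assuming square maps stored as NumPy arrays; it is an illustration, not the paper's actual pipeline.

```python
import numpy as np


def augment_tileable(maps: dict, crop: int = 1024, rng=None) -> dict:
    """Apply one random rotate/shift/crop to a set of aligned tileable maps.

    `maps` holds per-material arrays (e.g. basecolor, normal, roughness)
    sharing the same square resolution; the identical transform is applied
    to every map so they stay aligned.
    """
    rng = rng or np.random.default_rng()
    h, w = next(iter(maps.values())).shape[:2]

    k = int(rng.integers(0, 4))    # k * 90 degree rotation keeps tileability
    dy = int(rng.integers(0, h))   # wrap-around shift keeps tileability
    dx = int(rng.integers(0, w))
    y0 = int(rng.integers(0, h - crop + 1))
    x0 = int(rng.integers(0, w - crop + 1))

    out = {}
    for name, img in maps.items():
        img = np.rot90(img, k, axes=(0, 1))
        img = np.roll(img, (dy, dx), axis=(0, 1))
        out[name] = img[y0:y0 + crop, x0:x0 + crop]
    return out
```

One caveat this sketch glosses over: rotating a tangent-space normal map also requires rotating the encoded XY vector components, not just the pixel grid, for the normals to remain correct.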

Compatibility and Integration with Existing Datasets

To ensure seamless integration with existing resources, the MatSynth dataset has been processed with compatibility in mind. It can be combined with prior datasets to form an extensive library for research and development. The dataset adheres to uniform standards in its material representation, facilitating its use alongside established workflows.
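
One concrete aspect of such standardization is bridging the two common PBR conventions. The sketch below applies the standard approximate conversion from metallic/roughness maps to diffuse/specular ones, using the usual ~4% reflectance assumption for dielectrics; this is a generic illustration, not necessarily the exact procedure MatSynth uses.

```python
import numpy as np

DIELECTRIC_F0 = 0.04  # common approximation of dielectric reflectance at normal incidence


def metallic_to_specular(basecolor: np.ndarray, metallic: np.ndarray):
    """Convert metallic-workflow maps to (diffuse, specular) maps.

    Standard approximation: metals contribute no diffuse reflection and
    take their specular color from the base color, while dielectrics
    keep the base color as diffuse and reflect roughly 4% specularly.
    """
    m = metallic[..., None] if metallic.ndim == 2 else metallic
    diffuse = basecolor * (1.0 - m)
    specular = DIELECTRIC_F0 * (1.0 - m) + basecolor * m
    return diffuse, specular
```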

Experimental Results and Evaluation

The quantitative and qualitative evaluation of MatSynth highlights its impact on material acquisition and generation tasks. When used to train state-of-the-art models such as SurfaceNet, the dataset improves on prior results in normal and roughness map recovery, demonstrating significant gains in material capture quality.
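
As a rough illustration of how such map recovery is typically scored, per-map error between predicted and ground-truth maps (e.g. RMSE) is a common choice. The snippet below is a generic evaluation sketch, not the paper's exact protocol or metric set.

```python
import numpy as np


def per_map_rmse(pred: dict, gt: dict) -> dict:
    """RMSE per map (e.g. normal, roughness) between prediction and ground truth."""
    return {
        name: float(np.sqrt(np.mean(
            (pred[name].astype(np.float64) - gt[name].astype(np.float64)) ** 2)))
        for name in gt
    }
```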

The dataset's effectiveness in generating diverse and high-quality synthetic materials is validated by training generative models (e.g., MatFuse), which show improved FID scores, indicative of higher realism and variety in generated textures. This diversity and realism are essential for advancing synthetic data generation and enriching virtual environments.

Figure 3: Qualitative material acquisition comparison on synthetic data. We compare the dataset against previous benchmarks, showing improved fidelity and diversity.
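
For reference, FID compares Gaussian fits of deep features from real and generated images: FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^(1/2)). A minimal computation from precomputed feature arrays might look like the sketch below; extracting the Inception features themselves is omitted, and this is not the paper's exact evaluation code.

```python
import numpy as np
from scipy import linalg


def fid(feats_real: np.ndarray, feats_gen: np.ndarray) -> float:
    """Frechet Inception Distance between two (N, D) deep-feature arrays."""
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)

    covmean, _ = linalg.sqrtm(cov_r @ cov_g, disp=False)
    covmean = covmean.real  # drop tiny imaginary parts from numerical error

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r + cov_g - 2.0 * covmean))
```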

Conclusion

MatSynth represents a substantial advancement in the availability of high-quality material datasets. By offering a broad assortment of materials with comprehensive metadata and rendering options, the dataset facilitates a variety of research endeavors in material acquisition and generation. It effectively brings to the wider research community benefits previously reserved for internal libraries, fostering broader innovation.

In summary, MatSynth promises to propel research and development in material appearance and computer graphics, offering tools to develop more accurate computational models and richer digital environments. As these fields continue to advance, MatSynth serves as a foundational resource for ongoing innovation.
