ObjBlur: A Curriculum Learning Approach With Progressive Object-Level Blurring for Improved Layout-to-Image Generation (2404.07564v1)
Abstract: We present ObjBlur, a novel curriculum learning approach to improve layout-to-image generation models, where the task is to produce realistic images from layouts composed of boxes and labels. Our method is based on progressive object-level blurring, which stabilizes training and enhances the quality of generated images. This curriculum learning strategy systematically applies varying degrees of blurring to individual objects or the background during training, starting with strong blurring and progressing to cleaner images. Our findings reveal that this approach yields significant performance improvements, more stable training, smoother convergence, and reduced variance between multiple runs. Moreover, the technique is compatible with both generative adversarial networks and diffusion models, underlining its applicability across generative modeling paradigms. With ObjBlur, we reach new state-of-the-art results on the complex COCO and Visual Genome datasets.
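The idea of progressive object-level blurring can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the linear decay schedule, the Gaussian kernel, the default `sigma_max`, and the function names `blur_sigma`, `gaussian_blur`, and `object_blur` are all assumptions made for illustration. The abstract states only that blurring is applied to individual objects or the background and decreases over training.

```python
import numpy as np


def blur_sigma(step, total_steps, sigma_max=4.0):
    """Assumed schedule: decay blur strength linearly from sigma_max to 0."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return sigma_max * (1.0 - progress)


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel with a radius of about 3 sigma."""
    radius = max(1, int(np.ceil(3.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def gaussian_blur(image, sigma):
    """Separable Gaussian blur of an (H, W, C) image with edge padding."""
    image = np.asarray(image, dtype=np.float64)
    if sigma <= 0:
        return image.copy()
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(image, ((r, r), (r, r), (0, 0)), mode="edge")
    # Filter along the height axis first, then along the width axis.
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, padded)
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, out)
    return out


def object_blur(image, boxes, sigma, blur_background=False):
    """Blur only inside the layout boxes, or only outside them.

    boxes holds (x0, y0, x1, y1) pixel coordinates; this format is an
    assumption of the sketch, not taken from the paper.
    """
    image = np.asarray(image, dtype=np.float64)
    mask = np.zeros(image.shape[:2], dtype=bool)
    for x0, y0, x1, y1 in boxes:
        mask[y0:y1, x0:x1] = True
    if blur_background:
        mask = ~mask
    blurred = gaussian_blur(image, sigma)
    return np.where(mask[..., None], blurred, image)
```

In a training loop, one would call `object_blur(image, boxes, blur_sigma(step, total_steps))` on each real image before passing it to the model, so that early steps see strongly blurred objects and late steps see clean images.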