
Image Inpainting via Conditional Texture and Structure Dual Generation (2108.09760v2)

Published 22 Aug 2021 in cs.CV

Abstract: Deep generative approaches have recently made considerable progress in image inpainting by introducing structure priors. Due to the lack of proper interaction with image texture during structure reconstruction, however, current solutions are incompetent in handling the cases with large corruptions, and they generally suffer from distorted results. In this paper, we propose a novel two-stream network for image inpainting, which models the structure-constrained texture synthesis and texture-guided structure reconstruction in a coupled manner so that they better leverage each other for more plausible generation. Furthermore, to enhance the global consistency, a Bi-directional Gated Feature Fusion (Bi-GFF) module is designed to exchange and combine the structure and texture information and a Contextual Feature Aggregation (CFA) module is developed to refine the generated contents by region affinity learning and multi-scale feature aggregation. Qualitative and quantitative experiments on the CelebA, Paris StreetView and Places2 datasets demonstrate the superiority of the proposed method. Our code is available at https://github.com/Xiefan-Guo/CTSDG.

References (40)
  1. Filling-in by joint interpolation of vector fields and gray levels. IEEE TIP, 10(8):1200–1211, 2001.
  2. PatchMatch: A randomized correspondence algorithm for structural image editing. ACM TOG, 28(3):24, 2009.
  3. Image inpainting. In SIGGRAPH, 2000.
  4. What makes Paris look like Paris? ACM TOG, 31(4):101:1–101:9, 2012.
  5. Image quilting for texture synthesis and transfer. In SIGGRAPH, 2001.
  6. Deep residual learning for image recognition. In CVPR, 2016.
  7. Globally and locally consistent image completion. ACM TOG, 36(4):107:1–107:14, 2017.
  8. Prior guided GAN based semantic inpainting. In CVPR, 2020.
  9. Precomputed real-time texture synthesis with Markovian generative adversarial networks. In ECCV, 2016.
  10. Progressive reconstruction of visual structure for image inpainting. In ICCV, 2019.
  11. Recurrent feature reasoning for image inpainting. In CVPR, 2020.
  12. Guidance and evaluation: Semantic-aware image inpainting for mixed scenes. In ECCV, 2020.
  13. Image inpainting for irregular holes using partial convolutions. In ECCV, 2018.
  14. Rethinking image inpainting via a mutual encoder-decoder with feature equalizations. In ECCV, 2020.
  15. Coherent semantic attention for image inpainting. In ICCV, 2019.
  16. Deep learning face attributes in the wild. In ICCV, 2015.
  17. Spectral normalization for generative adversarial networks. In ICLR, 2018.
  18. EdgeConnect: Structure guided image inpainting using edge prediction. In ICCVW, 2019.
  19. Context encoders: Feature learning by inpainting. In CVPR, 2016.
  20. StructureFlow: Image inpainting via structure-aware appearance flow. In ICCV, 2019.
  21. U-Net: Convolutional networks for biomedical image segmentation. In MICCAI, 2015.
  22. ImageNet large scale visual recognition challenge. IJCV, 115(3):211–252, 2015.
  23. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
  24. Contextual-based image inpainting: Infer, match, and translate. In ECCV, 2018.
  25. High-resolution image synthesis and semantic manipulation with conditional GANs. In CVPR, 2018.
  26. VCNet: A robust approach to blind image inpainting. In ECCV, 2020.
  27. Image inpainting with learnable bidirectional attention maps. In ICCV, 2019.
  28. Foreground-aware image inpainting. In CVPR, 2019.
  29. Image inpainting by patch propagation using patch sparsity. IEEE TIP, 19(5):1153–1165, 2010.
  30. Shift-Net: Image inpainting via deep feature rearrangement. In ECCV, 2018.
  31. High-resolution image inpainting using multi-scale neural patch synthesis. In CVPR, 2017.
  32. Learning to incorporate structure knowledge for image inpainting. In AAAI, 2020.
  33. Semantic image inpainting with deep generative models. In CVPR, 2017.
  34. Contextual residual aggregation for ultra high-resolution image inpainting. In CVPR, 2020.
  35. Generative image inpainting with contextual attention. In CVPR, 2018.
  36. Free-form image inpainting with gated convolution. In ICCV, 2019.
  37. High-resolution image inpainting with iterative confidence feedback and guided upsampling. In ECCV, 2020.
  38. UCTGAN: Diverse image inpainting based on unsupervised cross-space translation. In CVPR, 2020.
  39. Places: A 10 million image database for scene recognition. IEEE TPAMI, 40(6):1452–1464, 2018.
  40. Learning oracle attention for high-fidelity face completion. In CVPR, 2020.

Summary

  • The paper introduces a novel two-stream GAN architecture that jointly generates image structure and texture, improving restoration under large corruptions.
  • The Bi-GFF and CFA modules enable bi-directional feature fusion and multi-scale contextual aggregation to improve the coherence of restored images.
  • Experimental results on CelebA, Paris StreetView, and Places2 demonstrate superior performance over state-of-the-art methods using metrics like LPIPS, PSNR, and SSIM.

Image Inpainting via Conditional Texture and Structure Dual Generation

The paper "Image Inpainting via Conditional Texture and Structure Dual Generation" by Xiefan Guo, Hongyu Yang, and Di Huang introduces an approach to image inpainting that couples structure and texture information to better handle complex, large corruptions. The proposed method extends prior inpainting techniques with a two-stream network architecture that generates image structure and texture jointly, aiming to produce more visually plausible and semantically consistent inpainting results.

Methodology Overview

The core of the proposed approach is a two-stream generative adversarial network (GAN) built around structure-constrained texture synthesis and texture-guided structure reconstruction. These two generation tasks are designed to complement each other, exploiting the synergy between texture and structure cues within the network. The generator consists of two parallel branches, each responsible for one of the subtasks, and their outputs are assessed by a two-branch discriminator that judges both realism and structure-texture consistency.
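To make the two-stream idea concrete, the following is a minimal PyTorch-style sketch. It collapses the paper's two coupled encoder-decoders into a single shared encoder for brevity; the layer choices, channel counts, and mask convention are illustrative assumptions, not the authors' implementation (which is available at the linked repository).

```python
import torch
import torch.nn as nn

class TwoStreamGenerator(nn.Module):
    """Illustrative two-stream inpainting generator: one branch synthesizes
    texture conditioned on structure, the other reconstructs structure
    conditioned on texture. Shapes and layers are assumptions for clarity."""
    def __init__(self, ch=64):
        super().__init__()
        # Shared encoder over the corrupted RGB image plus its mask (4 channels).
        self.encoder = nn.Sequential(
            nn.Conv2d(4, ch, 7, stride=1, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Texture branch: predicts the completed RGB image.
        self.texture_dec = nn.Sequential(
            nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 3, 3, padding=1), nn.Tanh(),
        )
        # Structure branch: predicts an edge/structure map.
        self.structure_dec = nn.Sequential(
            nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, image, mask):
        # Assumed convention: mask is 1 on valid pixels, 0 inside holes.
        x = torch.cat([image * mask, mask], dim=1)
        feat = self.encoder(x)
        texture = self.texture_dec(feat)      # structure-constrained texture
        structure = self.structure_dec(feat)  # texture-guided structure
        return texture, structure
```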

Key innovations within this framework include the Bi-directional Gated Feature Fusion (Bi-GFF) module and the Contextual Feature Aggregation (CFA) module. The Bi-GFF module refines the consistency between structure and texture features by integrating them bi-directionally through soft gating mechanisms, thus enhancing the generation's coherence. The CFA module captures long-range dependencies and aggregates features at multiple scales to ensure detailed and context-aware inpainting, which is crucial for managing large missing image regions with complex patterns.
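The gating idea in Bi-GFF can be sketched as follows, under the same assumptions as above: each stream's features are softly modulated by a gate computed from both streams, then exchanged and concatenated. The exact layer recipe is not reproduced here, and the CFA module (which computes patch-level region affinities and aggregates features at several scales) is omitted for brevity.

```python
import torch
import torch.nn as nn

class BiGFF(nn.Module):
    """Sketch of bi-directional gated feature fusion following the paper's
    description: soft gates control how much of the other stream flows in."""
    def __init__(self, channels):
        super().__init__()
        self.gate_t = nn.Sequential(nn.Conv2d(channels * 2, channels, 3, padding=1), nn.Sigmoid())
        self.gate_s = nn.Sequential(nn.Conv2d(channels * 2, channels, 3, padding=1), nn.Sigmoid())

    def forward(self, f_texture, f_structure):
        both = torch.cat([f_texture, f_structure], dim=1)
        # Gates in [0, 1] weight the cross-stream contribution per location.
        f_t = f_texture + self.gate_t(both) * f_structure
        f_s = f_structure + self.gate_s(both) * f_texture
        return torch.cat([f_t, f_s], dim=1)  # fused features for decoding
```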

Experimental Evaluation

The effectiveness of the proposed model is rigorously evaluated on standard datasets, including CelebA, Paris StreetView, and Places2, where it demonstrates superior performance both qualitatively and quantitatively. Metrics such as LPIPS (lower is better), PSNR, and SSIM (higher is better) indicate that the method outperforms contemporary state-of-the-art approaches, such as EdgeConnect, MED, and PatchMatch, particularly in handling large corruptions. Visual comparisons highlight the method's ability to restore both the overall structure and fine textures more accurately than its predecessors.
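For reference, the three reported metrics can be computed per image pair as in the sketch below, using the scikit-image and lpips packages. The paper's exact evaluation protocol (resolutions, mask ratios, dataset splits) is not reproduced here.

```python
import numpy as np
import torch
import lpips  # pip install lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Instantiate the learned perceptual metric once; 'alex' is the common backbone.
lpips_fn = lpips.LPIPS(net='alex')

def evaluate_pair(pred: np.ndarray, target: np.ndarray) -> dict:
    """Score one inpainted image against its ground truth.
    Both inputs are HxWx3 float arrays with values in [0, 1]."""
    psnr = peak_signal_noise_ratio(target, pred, data_range=1.0)
    ssim = structural_similarity(target, pred, channel_axis=-1, data_range=1.0)
    # LPIPS expects NCHW tensors in [-1, 1]; lower values mean more similar.
    to_tensor = lambda a: torch.from_numpy(a).permute(2, 0, 1).unsqueeze(0).float() * 2 - 1
    with torch.no_grad():
        lp = lpips_fn(to_tensor(pred), to_tensor(target)).item()
    return {"PSNR": psnr, "SSIM": ssim, "LPIPS": lp}
```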

Implications and Future Directions

The dual generation strategy outlined in this work opens new avenues for more intelligent design of inpainting networks that effectively intertwine structure and texture synthesis tasks. This approach not only improves the visual authenticity of the inpainted regions but also addresses limitations observed in single-stream or structurally driven models. The paper’s findings suggest potential improvements in applications such as photo editing, object removal, and image restoration, where maintaining both local texture fidelity and global structural integrity is essential.

Future research could explore extending this dual generation framework to other domains beyond image inpainting, such as video restoration or 3D model reconstruction, where the integration of multi-source information is increasingly critical. Additionally, further refinements in the network architecture, perhaps through attention mechanisms or transformer models, could enhance the capability of dual methods to efficiently handle even more complex scenes.

In conclusion, the dual generation approach presented in this paper highlights a significant step forward in the field of image inpainting, encouraging further exploration into sophisticated network designs that balance and integrate diverse visual elements for comprehensive scene restoration.
