Cylin-Painting: Seamless 360° Panoramic Image Outpainting and Beyond (2204.08563v2)
Abstract: Image outpainting has gained increasing attention because it can generate a complete scene from a partial view, providing a valuable way to construct 360° panoramic images. As image outpainting suffers from the intrinsic issue of a unidirectional completion flow, previous methods convert the original problem into inpainting, which allows a bidirectional flow. However, we find that inpainting has its own limitations and is inferior to outpainting in certain situations. How the two can be combined to get the best of both remains under-explored. In this paper, we provide a deep analysis of the differences between inpainting and outpainting, which essentially depend on how the source pixels contribute to the unknown regions under different spatial arrangements. Motivated by this analysis, we present a Cylin-Painting framework that involves meaningful collaborations between inpainting and outpainting and efficiently fuses the different arrangements, with a view to leveraging their complementary benefits on a seamless cylinder. Nevertheless, straightforwardly applying the cylinder-style convolution often generates visually unpleasing results, as it discards important positional information. To address this issue, we further present a learnable positional embedding strategy that incorporates the missing positional encoding into the cylinder convolution, which significantly improves the panoramic results. Although developed for image outpainting, the proposed algorithm can be effectively extended to other panoramic vision tasks, such as object detection, depth estimation, and image super-resolution. Code will be made available at https://github.com/KangLiao929/Cylin-Painting.
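One way to read the claim that cylinder-style convolution discards positional information is sketched below as a 1-D, pure-Python toy. It is not the authors' implementation: the function names (`cylinder_conv1d`, `roll`, `add_positional_embedding`) are hypothetical, the convolution is a plain wrap-around cross-correlation along the width axis, and the embedding values are fixed placeholders standing in for parameters that would be learned. Under these assumptions, rotating the input around the cylinder only rotates the output, so the filter alone cannot tell where on the panorama a pattern sits; adding a per-column embedding breaks that symmetry.

```python
def cylinder_conv1d(row, kernel):
    """Cross-correlate `row` with an odd-length `kernel`, wrapping around at both ends."""
    n, half = len(row), len(kernel) // 2
    return [
        sum(w * row[(i + j - half) % n] for j, w in enumerate(kernel))
        for i in range(n)
    ]


def roll(row, shift):
    """Rotate `row` to the right by `shift` positions (a rotation of the cylinder)."""
    shift %= len(row)
    return row[-shift:] + row[:-shift] if shift else list(row)


def add_positional_embedding(row, embedding):
    """Add one embedding value per column before convolving."""
    return [x + p for x, p in zip(row, embedding)]


row = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
kernel = [1.0, 0.0, -1.0]
# Placeholder values (assumption); in a trained model these would be learned.
embedding = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]

# Without an embedding, rotating the input only rotates the output.
plain = cylinder_conv1d(row, kernel)
plain_rotated = cylinder_conv1d(roll(row, 2), kernel)
print(plain_rotated == roll(plain, 2))  # True

# With an embedding, the same content at a different angle responds differently.
embedded = cylinder_conv1d(add_positional_embedding(row, embedding), kernel)
embedded_rotated = cylinder_conv1d(
    add_positional_embedding(roll(row, 2), embedding), kernel
)
print(embedded_rotated == roll(embedded, 2))  # False
```

The wrap-around indexing `(i + j - half) % n` is what makes the left and right borders neighbours, which is the seamlessness the cylinder arrangement is meant to provide; the embedding is the simplest possible stand-in for restoring absolute position.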