
Implicit Neural Image Stitching (2309.01409v5)

Published 4 Sep 2023 in cs.CV

Abstract: Existing frameworks for image stitching often provide visually reasonable stitchings. However, they suffer from blurry artifacts and disparities in illumination, depth level, etc. Although the recent learning-based stitchings relax such disparities, the required methods impose sacrifice of image qualities failing to capture high-frequency details for stitched images. To address the problem, we propose a novel approach, implicit Neural Image Stitching (NIS) that extends arbitrary-scale super-resolution. Our method estimates Fourier coefficients of images for quality-enhancing warps. Then, the suggested model blends color mismatches and misalignment in the latent space and decodes the features into RGB values of stitched images. Our experiments show that our approach achieves improvement in resolving the low-definition imaging of the previous deep image stitching with favorable accelerated image-enhancing methods. Our source code is available at https://github.com/minshu-kim/NIS.

Implicit Neural Image Stitching with Enhanced and Blended Feature Reconstruction

The paper "Implicit Neural Image Stitching with Enhanced and Blended Feature Reconstruction" introduces a framework for image stitching that targets the blurry artifacts and the inconsistencies in illumination and depth common in existing approaches. The approach, termed Neural Image Stitching (NIS), builds on implicit neural representation techniques, which have proven effective at recovering high-frequency detail, particularly in super-resolution. Its primary objective is to improve the quality of stitched panoramic images while correcting common stitching artifacts.

Methodology Overview

The NIS framework is grounded in implicit neural representation (INR), which models an image as a continuous signal parameterized by a neural network and has recently been applied successfully to arbitrary-scale super-resolution. NIS estimates Fourier coefficients from the input images and uses them to produce quality-enhancing warps. In this way it extends arbitrary-scale super-resolution to image stitching, where the warped images must retain high-frequency detail.
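To make the role of Fourier coefficients concrete, the following is a minimal sketch of encoding continuous pixel coordinates as amplitude-weighted cosine and sine features. It is an illustration only, not the authors' implementation: the function name `fourier_features` is hypothetical, and the frequencies and amplitudes are fixed by hand here, whereas the summary above describes a model that estimates such coefficients from the images.

```python
import numpy as np

def fourier_features(coords, freqs, amps):
    """Encode 2-D coordinates as amplitude-weighted cos/sin features.

    coords: (N, 2) continuous pixel coordinates
    freqs:  (K, 2) frequency vectors (hand-picked here; an assumption)
    amps:   (K,)   amplitudes (hand-picked here; an assumption)
    returns (N, 2K) features: K cosine terms followed by K sine terms
    """
    phase = 2.0 * np.pi * coords @ freqs.T  # (N, K)
    return np.concatenate([amps * np.cos(phase), amps * np.sin(phase)], axis=1)

coords = np.array([[0.0, 0.0], [0.25, 0.0], [0.5, 0.5]])
freqs = np.array([[1.0, 0.0], [0.0, 2.0]])
amps = np.array([1.0, 0.5])
feats = fourier_features(coords, freqs, amps)
print(feats.shape)
```

Because the features are periodic functions of the coordinate, a decoder that consumes them can be queried at any sub-pixel location, which is what makes arbitrary-scale resampling possible in this family of methods.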

The architecture of NIS consists of three main components: a neural warping module, a blender, and a decoder. The neural warping module extracts high-frequency-aware features from the two inputs, a reference image and a target image, and aligns them using pre-trained transformation estimators. The blender then merges the aligned features in a latent space, where color mismatches are corrected and parallax errors are reduced. The decoder, implemented as a multilayer perceptron (MLP), maps the blended features to the RGB values of the final stitched image.
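The warp, blend, and decode stages described above can be sketched as follows. This is a toy stand-in under stated assumptions, not the NIS implementation: random arrays replace learned features, a hand-written translation homography replaces the pre-trained transformation estimator, nearest-neighbour lookup replaces neural warping, a masked average replaces the learned blender, and the MLP weights are random. All function names are hypothetical.

```python
import numpy as np

def warp_coords(H, xy):
    """Map (N, 2) canvas coordinates into a source image using a 3x3 homography H."""
    homog = np.concatenate([xy, np.ones((len(xy), 1))], axis=1) @ H.T
    return homog[:, :2] / homog[:, 2:3]

def sample_nearest(feat, xy):
    """Nearest-neighbour lookup in an (H, W, C) feature map.

    Returns the sampled features and a mask marking coordinates inside the map.
    """
    h, w, c = feat.shape
    cols = np.rint(xy[:, 0]).astype(int)
    rows = np.rint(xy[:, 1]).astype(int)
    valid = (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
    out = np.zeros((len(xy), c))
    out[valid] = feat[rows[valid], cols[valid]]
    return out, valid

def blend(f_ref, m_ref, f_tgt, m_tgt):
    """Average features where both views overlap; otherwise keep the one available."""
    weight = m_ref.astype(float) + m_tgt.astype(float)
    total = f_ref * m_ref[:, None] + f_tgt * m_tgt[:, None]
    return total / np.maximum(weight, 1.0)[:, None]

def decode(features, w1, b1, w2, b2):
    """Two-layer MLP (ReLU, then sigmoid) mapping features to RGB in (0, 1)."""
    hidden = np.maximum(features @ w1 + b1, 0.0)
    return 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))

rng = np.random.default_rng(0)
ref_feat = rng.normal(size=(4, 4, 8))  # stand-in for reference-image features
tgt_feat = rng.normal(size=(4, 4, 8))  # stand-in for target-image features
# Assumed alignment: the target image sits 2 pixels to the right of the reference.
H_tgt = np.array([[1.0, 0.0, -2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

# Output canvas of 4 rows by 6 columns, flattened in row-major order.
xs, ys = np.meshgrid(np.arange(6), np.arange(4))
canvas = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)

f_ref, m_ref = sample_nearest(ref_feat, warp_coords(np.eye(3), canvas))
f_tgt, m_tgt = sample_nearest(tgt_feat, warp_coords(H_tgt, canvas))
fused = blend(f_ref, m_ref, f_tgt, m_tgt)

w1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 3)), np.zeros(3)
rgb = decode(fused, w1, b1, w2, b2).reshape(4, 6, 3)
print(rgb.shape)
```

The point of the sketch is the ordering: features are aligned on a shared canvas first, fused in feature space second, and converted to color only at the end, so any correction of mismatches happens before pixel values exist.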

Key Technical Contributions

  1. Implicit Neural Representation for Stitching: The authors integrate INR to achieve high-frequency detail recovery in the stitched images, addressing the spectral bias limitation inherent in standard neural networks.
  2. Fourier Coefficient Estimation: By predicting Fourier coefficients, the model ensures high-quality reconstruction and effective image warping, which is instrumental in maintaining texture consistency across stitched frames.
  3. Simplified Pipeline: The proposed method unifies several processes, including warping, blending, and image enhancement, into a single, streamlined inference pipeline, potentially improving both efficiency and performance.

Experimental Results

The experiments report that NIS outperforms existing methods in both synthetic and real-world scenarios. Results on synthetic datasets are evaluated with PSNR and SSIM, and results on real datasets with the no-reference metrics NIQE, PIQE, and BRISQUE. The gain is most notable in resolving low-definition artifacts, where NIS improves PSNR over conventional interpolation methods such as bicubic and bilinear. On real-world datasets, the stitched images also show more detail and fewer artifacts than those of prior methods.
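For readers unfamiliar with PSNR, the metric used for the synthetic comparisons, here is a minimal sketch of its standard definition, 10 log10(MAX^2 / MSE). The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions for illustration; the paper's exact evaluation protocol is not described in this summary.

```python
import numpy as np

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

reference = np.full((2, 2), 100, dtype=np.uint8)
off_by_one = np.full((2, 2), 101, dtype=np.uint8)
print(psnr(reference, off_by_one))
```

Higher values indicate a reconstruction closer to the reference; since the scale is logarithmic, each gain of about 3 dB corresponds to roughly halving the mean squared error.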

Implications and Future Directions

The authors suggest that NIS's ability to maintain high-frequency detail can lead to more accurate and visually pleasing panoramic images, making it a promising tool for applications requiring high-quality panoramic views, such as virtual reality, medical imaging, and autonomous driving. Furthermore, the efficient blending of misaligned features has potential applications in enhanced remote sensing and surveillance imaging.

Moving forward, there are opportunities to extend the NIS approach beyond static frame stitching to dynamically captured scenes, potentially incorporating real-time adjustments and on-the-fly rendering for continuous and immersive viewing experiences. Moreover, adapting the framework to handle multiple input modalities could further improve its applicability across diverse domains.

In conclusion, the paper advances image stitching by applying recent developments in neural representation to longstanding challenges in the field. Its combination of Fourier-based feature prediction with implicit neural reconstruction opens the way for further exploration and deployment of neural methods in complex image processing tasks.

Authors (5)
  1. Minsu Kim (115 papers)
  2. Jaewon Lee (39 papers)
  3. Byeonghun Lee (3 papers)
  4. Sunghoon Im (30 papers)
  5. Kyong Hwan Jin (24 papers)
Citations (4)