
MetaISP -- Exploiting Global Scene Structure for Accurate Multi-Device Color Rendition (2401.03220v1)

Published 6 Jan 2024 in cs.CV

Abstract: Image signal processors (ISPs) are historically grown legacy software systems for reconstructing color images from noisy raw sensor measurements. Each smartphone manufacturer has developed its own ISP with characteristic heuristics for improving color rendition, for example, for skin tones and other visually essential colors. Recent work on replacing these legacy ISP systems with deep-learned pipelines that match DSLR image quality has improved structural features in the image. However, these works ignore the superior color processing, based on semantic scene analysis, that distinguishes mobile-phone ISPs from DSLRs. Here, we present MetaISP, a single model designed to learn how to translate between the color and local-contrast characteristics of different devices. MetaISP takes the RAW image from device A as input and translates it into RGB images that inherit the appearance characteristics of devices A, B, and C. We achieve this result by employing a lightweight deep-learning technique that conditions its output appearance on the device of interest. In this approach, we leverage novel attention mechanisms inspired by cross-covariance to learn global scene semantics. Additionally, we exploit the metadata that typically accompanies RAW images, and we estimate scene illuminants when they are unavailable.
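The abstract attributes the model's global scene semantics to attention mechanisms inspired by cross-covariance (in the spirit of XCiT): attention is computed over the feature channels rather than over the image tokens, so its cost grows linearly with image size instead of quadratically. A minimal pure-Python sketch of that channel-attention idea follows; the function names and the tiny dense-matrix helpers are illustrative assumptions, not code from the paper.

```python
import math

def matmul(A, B):
    # Naive dense multiply: (n x k) @ (k x m) -> (n x m).
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def l2_normalize_cols(A):
    # Normalize each column (the token dimension) to unit L2 norm,
    # as in cross-covariance attention, so the channel affinities are bounded.
    cols = transpose(A)
    cols = [[x / (math.sqrt(sum(v * v for v in c)) + 1e-8) for x in c]
            for c in cols]
    return transpose(cols)

def softmax_rows(A):
    out = []
    for row in A:
        m = max(row)
        e = [math.exp(x - m) for x in row]
        s = sum(e)
        out.append([x / s for x in e])
    return out

def xca(Q, K, V, tau=1.0):
    # Cross-covariance (XCiT-style) attention sketch: the affinity matrix is
    # d x d over feature channels, not N x N over tokens, so cost is linear
    # in the number of tokens N.
    Qh, Kh = l2_normalize_cols(Q), l2_normalize_cols(K)        # N x d each
    affinity = matmul(transpose(Kh), Qh)                       # d x d
    A = softmax_rows([[x / tau for x in row] for row in affinity])
    return matmul(V, A)                                        # N x d
```

For a RAW-to-RGB network, the tokens would be patch features of the mosaicked input; the d x d affinity captures global channel statistics of the whole scene, which is what makes this style of attention attractive for color rendition.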
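The abstract also says MetaISP estimates scene illuminants when they are missing from the RAW metadata. The paper's estimator is learned, but the classical gray-world baseline conveys the idea of recovering an illuminant from image statistics alone; the sketch below is that baseline, not the paper's method, and all names in it are illustrative.

```python
def gray_world_illuminant(pixels):
    # pixels: iterable of (r, g, b) floats in linear space.
    # Gray-world assumption: the average scene reflectance is achromatic,
    # so the per-channel means estimate the illuminant color.
    pixels = list(pixels)
    n = len(pixels)
    r = sum(p[0] for p in pixels) / n
    g = sum(p[1] for p in pixels) / n
    b = sum(p[2] for p in pixels) / n
    # Return white-balance gains normalized so the green gain is 1,
    # the conventional form in camera pipelines.
    return (g / r, 1.0, g / b)
```

A neutral gray frame yields gains of (1, 1, 1); a scene whose red mean is half the green mean yields a red gain of 2, i.e. the pipeline would boost red to compensate for a greenish illuminant.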

Authors (2)
  1. Matheus Souza (9 papers)
  2. Wolfgang Heidrich (34 papers)
Citations (2)
