Pix2HDR -- A pixel-wise acquisition and deep learning-based synthesis approach for high-speed HDR videos (2310.16139v2)
Abstract: Accurately capturing dynamic scenes with wide-ranging motion and light intensity is crucial for many vision applications. However, acquiring high-speed high dynamic range (HDR) video is challenging because the camera's frame rate restricts its dynamic range. Existing methods sacrifice speed to acquire multi-exposure frames, yet misaligned motion across these frames can still complicate HDR fusion algorithms and produce artifacts. Instead of frame-based exposures, we sample the videos using individual pixels at varying exposures and phase offsets. Implemented on a monochrome pixel-wise programmable image sensor, our sampling pattern simultaneously captures fast motion at high dynamic range. We then transform pixel-wise outputs into an HDR video using end-to-end learned weights from deep neural networks, achieving high spatiotemporal resolution with minimized motion blurring. We demonstrate aliasing-free HDR video acquisition at 1000 FPS, resolving fast motion under low-light conditions and against bright backgrounds, both challenging conditions for conventional cameras. By combining the versatility of pixel-wise sampling patterns with the strength of deep neural networks at decoding complex scenes, our method greatly enhances the vision system's adaptability and performance in dynamic conditions.
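To make the sampling idea concrete, the toy Python sketch below simulates a pixel-wise exposure pattern: each pixel in a small repeating tile integrates light for its own exposure length and starts at its own phase offset, so one coded readout interleaves short (highlight-preserving) and long (low-light) exposures in space and time. The tile size, exposure values, period, and function names are all hypothetical illustrations, not the paper's actual sensor schedule or reconstruction network.

```python
import numpy as np

TILE = 2  # hypothetical 2x2 repeating tile of per-pixel schedules
exposure_frames = np.array([[1, 4],
                            [2, 8]])  # exposure length, in sub-frames
phase_offset = np.array([[0, 1],
                         [2, 3]])     # shutter-open start, in sub-frames

def sample(radiance_video):
    """Integrate a (T, H, W) radiance video into one pixel-wise coded frame.

    Each pixel's electronic shutter opens periodically (period = 2x its
    exposure here, a simplifying assumption) and accumulates radiance only
    while open, emulating per-pixel programmable exposure.
    """
    T, H, W = radiance_video.shape
    # Tile the per-pixel schedule across the full frame (H, W divisible by TILE)
    e = np.tile(exposure_frames, (H // TILE, W // TILE))
    p = np.tile(phase_offset, (H // TILE, W // TILE))
    out = np.zeros((H, W))
    for t in range(T):
        open_mask = (t - p) % (2 * e) < e  # which shutters are integrating now
        out += radiance_video[t] * open_mask
    return out

# A constant-radiance clip: pixels with different schedules accumulate
# different counts, which a learned decoder would later invert into HDR video.
video = np.ones((8, 4, 4))
frame = sample(video)
```

In the actual system, the deep network replaces the trivial inversion one could do here for a static scene, jointly performing demosaicking of the exposure pattern, motion handling, and HDR fusion.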