HDRFlow: Real-Time HDR Video Reconstruction with Large Motions (2403.03447v1)
Abstract: Reconstructing High Dynamic Range (HDR) video from image sequences captured with alternating exposures is challenging, especially in the presence of large camera or object motion. Existing methods typically align low dynamic range sequences using optical flow or attention mechanisms for deghosting. However, they often struggle to handle large, complex motions and are computationally expensive. To address these challenges, we propose a robust and efficient flow estimator tailored for real-time HDR video reconstruction, named HDRFlow. HDRFlow has three novel designs: an HDR-domain alignment loss (HALoss), an efficient flow network with a multi-size large kernel (MLK), and a new HDR flow training scheme. The HALoss supervises our flow network to learn an HDR-oriented flow for accurate alignment in saturated and dark regions. The MLK can effectively model large motions at a negligible cost. In addition, we incorporate the synthetic Sintel dataset into our training data, using both its provided forward flow and backward flow that we generate ourselves to supervise our flow network, which improves performance in large-motion regions. Extensive experiments demonstrate that HDRFlow outperforms previous methods on standard benchmarks. To the best of our knowledge, HDRFlow is the first real-time HDR video reconstruction method for video sequences captured with alternating exposures, capable of processing 720p inputs in 25 ms.
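The idea of supervising a flow by how well it aligns frames in the HDR domain can be illustrated with a small NumPy sketch. This is not the paper's HALoss implementation, which the abstract does not spell out. It assumes a gamma-2.2 camera response, the mu-law tonemapper with mu = 5000 that is common in HDR reconstruction work, nearest-neighbour warping, and an L1 penalty. All function names (`ldr_to_hdr`, `mu_law`, `backward_warp`, `hdr_alignment_loss`, `demo`) are hypothetical.

```python
import numpy as np

GAMMA = 2.2   # assumed camera response exponent (not stated in the abstract)
MU = 5000.0   # assumed mu-law compression factor (not stated in the abstract)


def ldr_to_hdr(ldr, exposure, gamma=GAMMA):
    """Map an LDR frame in [0, 1] to linear HDR radiance for a given exposure."""
    return np.power(ldr, gamma) / exposure


def mu_law(hdr, mu=MU):
    """Compress linear HDR values in [0, 1] with the mu-law tonemapper."""
    return np.log1p(mu * hdr) / np.log1p(mu)


def backward_warp(frame, flow):
    """Nearest-neighbour backward warp: out[y, x] = frame[y + v, x + u].

    flow has shape (H, W, 2) holding (u, v) = (horizontal, vertical) offsets.
    Source coordinates outside the image are clamped to the border.
    """
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]


def hdr_alignment_loss(neighbor_hdr, reference_hdr, flow):
    """L1 distance, in the tonemapped HDR domain, between the warped
    neighbour frame and the reference frame."""
    warped = backward_warp(neighbor_hdr, flow)
    return float(np.mean(np.abs(mu_law(warped) - mu_law(reference_hdr))))


def demo(seed=0, shift=3):
    """Compare the loss of a correct flow against a zero flow on toy data."""
    rng = np.random.default_rng(seed)
    reference = rng.random((16, 16))
    # The neighbour is the reference moved `shift` pixels to the right.
    neighbor = np.roll(reference, shift, axis=1)
    good_flow = np.zeros((16, 16, 2))
    good_flow[..., 0] = shift
    zero_flow = np.zeros((16, 16, 2))
    return (hdr_alignment_loss(neighbor, reference, good_flow),
            hdr_alignment_loss(neighbor, reference, zero_flow))
```

Calling `demo()` returns two losses: the first for a flow that undoes the synthetic shift, the second for a zero flow. The first should be much smaller, with the remaining error coming only from the clamped border columns. A trainable version would need a differentiable (for example bilinear) warp in place of the nearest-neighbour lookup used here.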