Deep Optics for Video Snapshot Compressive Imaging
Abstract: Video snapshot compressive imaging (SCI) aims to capture a sequence of video frames with a single shot of a 2D detector. It rests on two components: optical modulation patterns (also known as masks) and a computational reconstruction algorithm. Advanced deep learning algorithms and mature hardware are bringing video SCI into practical applications. Yet there are two clouds in the sunshine of SCI: i) low dynamic range, a consequence of high temporal multiplexing, and ii) the degradation of existing deep learning algorithms on real systems. To address these challenges, this paper presents a deep optics framework that jointly optimizes the masks and a reconstruction network. Specifically, we first propose a new type of structural mask to realize motion-aware, full-dynamic-range measurement. Exploiting this motion awareness in the measurement domain, we develop an efficient Transformer-based network for video SCI reconstruction, dubbed Res2former, which captures long-term temporal dependencies. Moreover, the sensor response is introduced into the forward model of video SCI so that end-to-end training stays close to the real system. Finally, we implement the learned structural masks on a digital micro-mirror device. Experimental results on synthetic and real data validate the effectiveness of the proposed framework. We believe this is a milestone for real-world video SCI. The source code and data are available at https://github.com/pwangcs/DeepOpticsSCI.
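To make the measurement process described in the abstract concrete, the following is a minimal sketch of a generic video SCI forward model: each frame is multiplied element-wise by its mask and the modulated frames are summed into one 2D snapshot. The sensor response shown here (saturation at a full-well level followed by uniform quantization) is an illustrative assumption, as are the function name `sci_measure` and the parameters `bit_depth` and `full_well`; none of them are taken from the paper, and random binary masks stand in for the learned structural masks.

```python
import numpy as np

def sci_measure(frames, masks, bit_depth=8, full_well=1.0):
    """Simulate one snapshot measurement from T frames.

    frames, masks: arrays of shape (T, H, W); frames in [0, 1], masks in {0, 1}.
    bit_depth and full_well are hypothetical parameters chosen for illustration.
    Returns the ideal (unbounded) measurement and the sensor-limited one.
    """
    frames = np.asarray(frames, dtype=np.float64)
    masks = np.asarray(masks, dtype=np.float64)
    if frames.shape != masks.shape:
        raise ValueError("frames and masks must share shape (T, H, W)")
    # Ideal measurement: modulate each frame by its mask, then sum over time.
    ideal = (frames * masks).sum(axis=0)
    # Assumed sensor response: saturate at full well, then quantize uniformly.
    levels = 2 ** bit_depth - 1
    saturated = np.clip(ideal, 0.0, full_well)
    quantized = np.round(saturated / full_well * levels) / levels * full_well
    return ideal, quantized

# Example with random binary masks (a stand-in, not the learned structural masks).
rng = np.random.default_rng(0)
T, H, W = 8, 4, 4
frames = rng.random((T, H, W))
masks = (rng.random((T, H, W)) < 0.5).astype(np.float64)
ideal, measured = sci_measure(frames, masks)
print("pixels clipped by saturation:", int((ideal > 1.0).sum()), "of", H * W)
```

Under these assumptions, summing many modulated frames easily pushes pixels past the full-well level, which is one way to picture the dynamic-range loss that the abstract attributes to high temporal multiplexing.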