Deep Optics for Video Snapshot Compressive Imaging

Published 8 Apr 2024 in cs.CV | (2404.05274v1)

Abstract: Video snapshot compressive imaging (SCI) aims to capture a sequence of video frames in a single shot of a 2D detector. It rests on two components: optical modulation patterns (also known as masks) and a computational reconstruction algorithm. Advanced deep learning algorithms and mature hardware are bringing video SCI into practical applications. Yet two problems remain: i) low dynamic range, a consequence of high temporal multiplexing, and ii) the degradation of existing deep learning algorithms on real systems. To address these challenges, this paper presents a deep optics framework that jointly optimizes the masks and a reconstruction network. Specifically, we first propose a new type of structural mask that realizes motion-aware and full-dynamic-range measurement. Exploiting this motion awareness in the measurement domain, we develop an efficient Transformer-based network for video SCI reconstruction, dubbed Res2former, that captures long-term temporal dependencies. Moreover, the sensor response is introduced into the forward model of video SCI so that end-to-end training stays close to the real system. Finally, we implement the learned structural masks on a digital micro-mirror device. Experimental results on synthetic and real data validate the effectiveness of the proposed framework. We believe this is a milestone for real-world video SCI. The source code and data are available at https://github.com/pwangcs/DeepOpticsSCI.
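
To make the measurement process in the abstract concrete, here is a minimal sketch of the generic video SCI forward model: each frame is multiplied element-wise by its mask and the modulated frames are summed into one 2D snapshot. This is not the paper's implementation. The function name `sci_forward`, the parameters `full_well` and `bit_depth`, and the clip-then-quantize sensor response are illustrative assumptions; the abstract does not specify the form of the sensor response.

```python
import numpy as np

def sci_forward(frames, masks, full_well=1.0, bit_depth=8):
    """Simulate one snapshot measurement from T frames (hypothetical helper).

    frames, masks: arrays of shape (T, H, W) with values in [0, 1].
    full_well, bit_depth: assumed sensor parameters, not taken from the paper.
    """
    frames = np.asarray(frames, dtype=np.float64)
    masks = np.asarray(masks, dtype=np.float64)
    if frames.shape != masks.shape:
        raise ValueError("frames and masks must have the same (T, H, W) shape")

    # Optical modulation and temporal integration: Y = sum_t M_t * X_t
    measurement = (frames * masks).sum(axis=0)

    # Assumed sensor response: saturate at full well, then quantize uniformly.
    measurement = np.clip(measurement, 0.0, full_well)
    levels = 2 ** bit_depth - 1
    return np.round(measurement / full_well * levels) / levels * full_well

# Example: 8 random frames, random binary masks, one 2D measurement.
rng = np.random.default_rng(0)
T, H, W = 8, 4, 4
video = rng.random((T, H, W))
binary_masks = (rng.random((T, H, W)) > 0.5).astype(np.float64)
snapshot = sci_forward(video, binary_masks, full_well=float(T))
print(snapshot.shape)  # (4, 4)
```

In this sketch the sum of T modulated frames has to fit into the sensor's fixed range, so `full_well` is set to `T` to avoid saturation. Each individual frame then occupies only a fraction of the quantization levels, which is one way to picture the dynamic range problem the abstract attributes to high temporal multiplexing.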
