
EvPlug: Learn a Plug-and-Play Module for Event and Image Fusion (2312.16933v1)

Published 28 Dec 2023 in cs.CV and cs.AI

Abstract: Event cameras and RGB cameras exhibit complementary characteristics in imaging: the former possesses high dynamic range (HDR) and high temporal resolution, while the latter provides rich texture and color information. This makes the integration of event cameras into middle- and high-level RGB-based vision tasks highly promising. However, challenges arise in multi-modal fusion, data annotation, and model architecture design. In this paper, we propose EvPlug, which learns a plug-and-play event and image fusion module under the supervision of an existing RGB-based model. The learned fusion module integrates event streams with image features in the form of a plug-in, making the RGB-based model robust to HDR and fast-motion scenes while enabling high temporal resolution inference. Our method requires only unlabeled event-image pairs (no pixel-wise alignment required) and does not alter the structure or weights of the RGB-based model. We demonstrate the superiority of EvPlug in several vision tasks such as object detection, semantic segmentation, and 3D hand pose estimation.

