Video Frame Interpolation with Region-Distinguishable Priors from SAM (2312.15868v1)

Published 26 Dec 2023 in cs.CV

Abstract: In existing Video Frame Interpolation (VFI) approaches, motion estimation between neighboring frames plays a crucial role. However, estimation accuracy in existing methods remains a challenge, primarily due to the inherent ambiguity in identifying corresponding areas in adjacent frames for interpolation. Enhancing accuracy by distinguishing different regions before motion estimation is therefore important. In this paper, we introduce a novel solution that uses open-world segmentation models, e.g., SAM (Segment Anything Model), to derive Region-Distinguishable Priors (RDPs) in different frames. These RDPs are represented as spatially varying Gaussian mixtures, distinguishing an arbitrary number of areas with a unified modality. RDPs can be integrated into existing motion-based VFI methods to enhance features for motion estimation, facilitated by our plug-and-play Hierarchical Region-aware Feature Fusion Module (HRFFM). HRFFM incorporates RDPs into various hierarchical stages of the VFI encoder, using RDP-guided Feature Normalization (RDPFN) in a residual learning manner. With HRFFM and RDPs, the features within the VFI encoder exhibit similar representations for matched regions in neighboring frames, thus improving the synthesis of intermediate frames. Extensive experiments demonstrate that HRFFM consistently enhances VFI performance across various scenes.
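
The abstract describes two ideas only at a high level: a region prior built from a segmentation map, and a prior-guided feature normalization applied as a residual. The sketch below is one possible reading of those two ideas, not the paper's implementation. Everything in it is an assumption made for illustration: the function names `region_prior` and `guided_norm_residual`, the choice of one Gaussian per region with a randomly drawn mean, and the fixed `scale_weight` and `shift_weight` that stand in for what would presumably be learned layers. It uses plain Python lists on a single-channel feature map to stay self-contained.

```python
import math
import random


def region_prior(labels, seed=0, sigma=0.05):
    """Turn an integer label map into a float prior map.

    Assumption (not from the paper): each region id gets its own Gaussian,
    with a mean drawn once per region and a shared sigma, and every pixel
    samples from the Gaussian of the region it belongs to.
    """
    rng = random.Random(seed)
    means = {}
    prior = []
    for row in labels:
        out_row = []
        for region_id in row:
            if region_id not in means:
                means[region_id] = rng.uniform(-1.0, 1.0)
            out_row.append(rng.gauss(means[region_id], sigma))
        prior.append(out_row)
    return prior, means


def guided_norm_residual(feat, prior, scale_weight=0.5, shift_weight=0.5, eps=1e-5):
    """Normalize a single-channel feature map, modulate it with the prior,
    and add the result back to the input (residual connection).

    scale_weight and shift_weight are hypothetical stand-ins for learned
    layers that would map the prior to a per-pixel scale and shift.
    """
    values = [v for row in feat for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + eps)
    out = []
    for f_row, p_row in zip(feat, prior):
        out_row = []
        for f, p in zip(f_row, p_row):
            normed = (f - mean) / std
            gamma = scale_weight * p  # per-pixel scale derived from the prior
            beta = shift_weight * p   # per-pixel shift derived from the prior
            out_row.append(f + gamma * normed + beta)
        out.append(out_row)
    return out


if __name__ == "__main__":
    labels = [[0, 0, 1], [0, 1, 1]]  # toy stand-in for a segmentation map
    feat = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    prior, means = region_prior(labels)
    fused = guided_norm_residual(feat, prior)
    print(len(means), "regions;", len(fused), "x", len(fused[0]), "output")
```

Because the modulation is added to the input rather than replacing it, an all-zero prior leaves the features unchanged, which is the usual motivation for a residual formulation when attaching a module to an existing encoder.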

