
OCAI: Improving Optical Flow Estimation by Occlusion and Consistency Aware Interpolation (2403.18092v1)

Published 26 Mar 2024 in cs.CV

Abstract: The scarcity of ground-truth labels poses one major challenge in developing optical flow estimation models that are both generalizable and robust. While current methods rely on data augmentation, they have yet to fully exploit the rich information available in labeled video sequences. We propose OCAI, a method that supports robust frame interpolation by generating intermediate video frames alongside optical flows in between. Utilizing a forward warping approach, OCAI employs occlusion awareness to resolve ambiguities in pixel values and fills in missing values by leveraging the forward-backward consistency of optical flows. Additionally, we introduce a teacher-student style semi-supervised learning method on top of the interpolated frames. Using a pair of unlabeled frames and the teacher model's predicted optical flow, we generate interpolated frames and flows to train a student model. The teacher's weights are maintained using Exponential Moving Averaging of the student. Our evaluations demonstrate perceptually superior interpolation quality and enhanced optical flow accuracy on established benchmarks such as Sintel and KITTI.
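
Two of the mechanisms the abstract names, the forward-backward consistency of optical flows and the Exponential Moving Average teacher update, can be illustrated with a short NumPy sketch. This is not the paper's implementation. The function names, the dictionary-of-arrays weight format, the nearest-neighbour lookup, and the `decay` and `tol` values are all assumptions chosen for illustration.

```python
import numpy as np


def ema_update(teacher, student, decay=0.999):
    """Return new teacher weights as decay * teacher + (1 - decay) * student.

    `teacher` and `student` are hypothetical dicts mapping parameter
    names to NumPy arrays of matching shape.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


def forward_backward_mask(flow_fw, flow_bw, tol=1.0):
    """Mark pixels whose forward flow is undone by the backward flow.

    flow_fw, flow_bw: arrays of shape (H, W, 2) holding (dx, dy) per pixel.
    A pixel is consistent when it lands inside the image and the backward
    flow sampled at its landing point (nearest neighbour, an assumed
    simplification) cancels the forward flow to within `tol` pixels.
    """
    h, w, _ = flow_fw.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # Landing position of every source pixel, rounded to the pixel grid.
    tx = np.rint(xs + flow_fw[..., 0]).astype(int)
    ty = np.rint(ys + flow_fw[..., 1]).astype(int)
    inside = (tx >= 0) & (tx < w) & (ty >= 0) & (ty < h)

    # Clip only so that indexing is safe; out-of-frame pixels are
    # rejected by `inside` below.
    bw_at_target = flow_bw[np.clip(ty, 0, h - 1), np.clip(tx, 0, w - 1)]

    # For consistent motion, forward flow plus backward flow is near zero.
    err = np.linalg.norm(flow_fw + bw_at_target, axis=-1)
    return inside & (err <= tol)
```

In this sketch, pixels where the mask is false are candidates for occlusion or out-of-frame motion, which is the kind of region the abstract says is filled in using flow consistency. How OCAI actually uses the check is not specified in the abstract.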

