
SplatFlow: Learning Multi-frame Optical Flow via Splatting (2306.08887v2)

Published 15 Jun 2023 in cs.CV

Abstract: The occlusion problem remains a crucial challenge in optical flow estimation (OFE). Despite the recent significant progress brought about by deep learning, most existing deep learning OFE methods still struggle to handle occlusions; in particular, those based on two frames cannot correctly handle occlusions because occluded regions have no visual correspondences. However, there is still hope in multi-frame settings, which can potentially mitigate the occlusion issue in OFE. Unfortunately, multi-frame OFE (MOFE) remains underexplored, and the limited studies on it are mainly specially designed for pyramid backbones or else obtain the aligned previous frame's features, such as correlation volume and optical flow, through time-consuming backward flow calculation or non-differentiable forward warping transformation. This study proposes an efficient MOFE framework named SplatFlow to address these shortcomings. SplatFlow introduces the differentiable splatting transformation to align the previous frame's motion feature and designs a Final-to-All embedding method to input the aligned motion feature into the current frame's estimation, thus remodeling the existing two-frame backbones. The proposed SplatFlow is efficient yet more accurate, as it can handle occlusions properly. Extensive experimental evaluations show that SplatFlow substantially outperforms all published methods on the KITTI2015 and Sintel benchmarks. Especially on the Sintel benchmark, SplatFlow achieves errors of 1.12 (clean pass) and 2.07 (final pass), with surprisingly significant 19.4% and 16.2% error reductions, respectively, from the previous best results submitted. The code for SplatFlow is available at https://github.com/wwsource/SplatFlow.

Citations (5)

Summary

  • The paper introduces a differentiable splatting technique that aligns motion features at a sub-pixel level to improve occlusion handling in optical flow.
  • It presents a Final-to-All embedding method that integrates prior frame motion features consistently across iterations for enhanced estimation accuracy.
  • Benchmarking on Sintel shows errors of 1.12 (clean pass) and 2.07 (final pass), which are 19.4% and 16.2% lower than the previous best submitted results; the paper also reports state-of-the-art results on KITTI2015.
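To make the reported percentages concrete, the short calculation below backs out the previous best errors that the stated reductions imply. The helper name `implied_previous_error` is made up for this illustration, and the resulting baseline values are derived from the abstract's numbers rather than quoted from the paper.

```python
def implied_previous_error(new_error, relative_reduction):
    """Invert new = old * (1 - reduction) to recover the implied old error."""
    return new_error / (1.0 - relative_reduction)

# Errors and relative reductions as stated in the abstract.
reported = [("clean", 1.12, 0.194), ("final", 2.07, 0.162)]

for name, err, red in reported:
    prev = implied_previous_error(err, red)
    print(f"Sintel {name}: implied previous best ~ {prev:.2f}")
# Sintel clean: implied previous best ~ 1.39
# Sintel final: implied previous best ~ 2.47
```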

"SplatFlow: Learning Multi-frame Optical Flow via Splatting" addresses the problem of occlusion in optical flow estimation (OFE), with a focus on multi-frame optical flow estimation (MOFE). Two-frame OFE methods cannot handle occlusions correctly because occluded regions have no visual correspondence in the other frame. Recent deep learning advances have not removed this structural limitation, and the few existing multi-frame methods are either tied to pyramid backbones or align the previous frame's features through time-consuming backward flow calculation or non-differentiable forward warping.

The paper introduces SplatFlow, an efficient MOFE framework that addresses these shortcomings with two components: a differentiable splatting transformation, which aligns the previous frame's motion feature with the current frame, and a Final-to-All embedding method, which feeds the aligned feature into the current frame's estimation. Together they adapt existing two-frame backbones to the multi-frame setting.

The significant contributions of the paper are:

  1. New Alignment Technique: A differentiable splatting method aligns motion features at a sub-pixel level, avoiding both the time-consuming backward flow calculation and the non-differentiable forward warping used by earlier methods. The previous frame's features are splatted to the current frame's coordinates, so gradients can propagate through the alignment step during training.
  2. Embedding Method: The development of the Final-to-All embedding method leverages the aligned motion features from the last iteration of the previous frame estimation and applies them to all iterations of the current frame. This approach ensures a consistent motion reference throughout the iterative refinement in single-resolution backbone networks such as RAFT and GMA.
  3. Extensive Benchmarking: SplatFlow is tested against other published methods and outperforms them, particularly in occluded areas. On the Sintel benchmark it achieves errors of 1.12 (clean pass) and 2.07 (final pass), reductions of 19.4% and 16.2% relative to the previous best submitted results. It also outperforms published methods on KITTI2015. These results support the claim that multi-frame information helps overcome the limitations of two-frame approaches.
  4. Real-world Application: SplatFlow is also efficient, with competitive inference speed and lower resource requirements when estimating frames after the first. This makes it a candidate for applications that require real-time processing in constrained environments.
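The alignment step in contribution 1 can be illustrated with a minimal NumPy sketch of bilinear forward splatting. This is an assumption-laden illustration, not the paper's implementation: the function name `splat_forward`, the (H, W, C) layout, and the choice to normalize by accumulated weight are all choices made here, and the exact splatting variant used by SplatFlow is not specified in this summary. NumPy does not compute gradients; in an autodiff framework the bilinear weights below are differentiable with respect to the flow, which is what makes splatting trainable end to end.

```python
import numpy as np

def splat_forward(feat, flow):
    """Forward-splat feat (H, W, C) along flow (H, W, 2) with bilinear weights.

    flow[..., 0] is the x displacement and flow[..., 1] the y displacement.
    Returns the weight-normalized splatted features and the weight map.
    """
    H, W, C = feat.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    tx = xs + flow[..., 0]          # sub-pixel target x for every source pixel
    ty = ys + flow[..., 1]          # sub-pixel target y for every source pixel
    x0 = np.floor(tx).astype(int)
    y0 = np.floor(ty).astype(int)

    out = np.zeros((H, W, C))
    wsum = np.zeros((H, W))
    # Each source pixel contributes to the four integer neighbors of its target.
    for dy in (0, 1):
        for dx in (0, 1):
            xi, yi = x0 + dx, y0 + dy
            w = (1 - np.abs(tx - xi)) * (1 - np.abs(ty - yi))
            valid = (xi >= 0) & (xi < W) & (yi >= 0) & (yi < H)
            # np.add.at accumulates correctly when several sources hit one target.
            np.add.at(out, (yi[valid], xi[valid]), feat[valid] * w[valid][:, None])
            np.add.at(wsum, (yi[valid], xi[valid]), w[valid])

    covered = wsum > 1e-8           # targets that received no contribution stay zero
    out[covered] /= wsum[covered][:, None]
    return out, wsum

# Usage: shift a tiny feature map one pixel to the right.
feat = np.arange(6, dtype=np.float64).reshape(2, 3, 1)
flow = np.zeros((2, 3, 2))
flow[..., 0] = 1.0
aligned, weights = splat_forward(feat, flow)
```

Unlike backward warping, which needs a flow defined at the target frame, this uses the flow already estimated at the source frame. Target pixels with zero accumulated weight are positions that no source pixel maps to.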

Overall, SplatFlow uses multi-frame information to improve occlusion handling in optical flow, with lower computational overhead than earlier multi-frame methods and better accuracy than published two-frame methods on the reported benchmarks. Its two design choices, splatting-based feature alignment and Final-to-All embedding, show how an existing two-frame backbone can be extended to use previous frames, which should benefit downstream tasks that depend on precise motion estimation.
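The Final-to-All idea described above (the previous frame's final-iteration motion feature, once aligned, is supplied to every refinement iteration of the current frame) can be sketched as a small loop. Everything named here is hypothetical: `final_to_all_refinement`, the `encode_motion` and `update` callables, and the use of channel concatenation to fuse the two features are assumptions for illustration, not the paper's actual modules.

```python
import numpy as np

def final_to_all_refinement(flow_init, aligned_prev_motion,
                            encode_motion, update, num_iters=4):
    """Iteratively refine flow, feeding the same aligned previous-frame
    motion feature into every iteration.

    Returns the final flow and the motion feature from the last iteration,
    which would be splatted forward for use by the next frame.
    """
    flow = flow_init
    motion = None
    for _ in range(num_iters):
        motion = encode_motion(flow)                       # (H, W, C)
        # Assumed fusion: concatenate along channels -> (H, W, 2C).
        fused = np.concatenate([motion, aligned_prev_motion], axis=-1)
        flow = flow + update(fused)                        # residual update, (H, W, 2)
    return flow, motion

# Toy stand-ins for the learned modules (purely illustrative).
H, W, C = 4, 5, 3
rng = np.random.default_rng(0)
toy_encode = lambda f: np.repeat(f.mean(axis=-1, keepdims=True), C, axis=-1)
toy_update = lambda fused: 0.1 * fused[..., -2:]   # reads previous-frame channels only
prev_motion = rng.normal(size=(H, W, C))

flow, last_motion = final_to_all_refinement(
    np.zeros((H, W, 2)), prev_motion, toy_encode, toy_update)
```

Because the toy update reads only the previous-frame channels, every iteration adds the same increment, which makes it easy to check that the same reference was supplied at each step.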
