Revisiting Learning-based Video Motion Magnification for Real-time Processing (2403.01898v1)

Published 4 Mar 2024 in cs.CV and eess.IV

Abstract: Video motion magnification is a technique for capturing and amplifying subtle motion in a video that is invisible to the naked eye. Prior deep learning-based work models the motion magnification problem successfully, with outstanding quality compared to conventional signal processing-based methods. However, it still falls short of real-time performance, which prevents it from being extended to various online applications. In this paper, we investigate an efficient deep learning-based motion magnification model that runs in real time on full-HD resolution videos. Because of the specific network design of the prior art, i.e., its inhomogeneous architecture, directly applying existing neural architecture search methods is complicated. Instead of an automatic search, we carefully investigate the architecture module by module to determine each module's role and importance in the motion magnification task. Our two key findings are: 1) reducing the spatial resolution of the latent motion representation in the decoder provides a good trade-off between computational efficiency and task quality, and 2) surprisingly, a single linear layer and a single branch in the encoder are sufficient for the motion magnification task. Based on these findings, we introduce a real-time deep learning-based motion magnification model that requires 4.2× fewer FLOPs and runs 2.7× faster than the prior art while maintaining comparable quality.
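
As a rough illustration of why lowering the spatial resolution of a feature map saves computation, the sketch below counts multiply-accumulate operations for a single stride-1 convolution at full-HD resolution and at half that resolution. The channel counts and kernel size are hypothetical values chosen for the example and are not taken from the paper; the function name `conv2d_macs` is likewise an assumption.

```python
def conv2d_macs(height, width, in_channels, out_channels, kernel_size):
    """Approximate multiply-accumulate count for one stride-1, same-padded
    2D convolution: one kernel application per output pixel and channel."""
    return height * width * in_channels * out_channels * kernel_size * kernel_size


# Hypothetical layer: 32 -> 32 channels, 3x3 kernel (not from the paper).
full_res = conv2d_macs(1080, 1920, 32, 32, 3)  # full-HD feature map
half_res = conv2d_macs(540, 960, 32, 32, 3)    # height and width halved

ratio = full_res / half_res
print(f"full: {full_res:,} MACs, half: {half_res:,} MACs, ratio: {ratio}")
```

Halving both height and width cuts the cost of this layer by a factor of four, since the count scales with the number of spatial positions. The 4.2× overall figure reported in the abstract comes from the authors' full set of architectural changes, not from this calculation.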
