
RhythmFormer: Extracting Patterned rPPG Signals based on Periodic Sparse Attention

Published 20 Feb 2024 in cs.CV | (2402.12788v3)

Abstract: Remote photoplethysmography (rPPG) is a non-contact method for detecting physiological signals from facial videos, holding high potential in various applications. Due to the periodic nature of rPPG signals, the long-range dependency capturing capacity of the transformer was assumed to be advantageous for such signals. However, existing methods have not conclusively demonstrated the superior performance of transformers over traditional convolutional neural networks. This may be attributed to the quadratic scaling of transformers with sequence length, which results in coarse-grained feature extraction and in turn affects robustness and generalization. To address this, the paper proposes a periodic sparse attention mechanism based on the temporal attention sparsity induced by periodicity. A pre-attention stage is introduced before the conventional attention mechanism. This stage learns periodic patterns to filter out a large number of irrelevant attention computations, thus enabling fine-grained feature extraction. Moreover, because fine-grained features are more susceptible to noise interference, a fusion stem is proposed to effectively guide self-attention towards rPPG features. The fusion stem can be easily integrated into existing methods to enhance their performance. Extensive experiments show that the proposed method achieves state-of-the-art performance in both intra-dataset and cross-dataset evaluations. The code is available at https://github.com/zizheng-guo/RhythmFormer.
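
The idea of a pre-attention stage that prunes attention computations can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that); it is a generic coarse-to-fine sparse attention over a 1-D temporal sequence, written under stated assumptions. The function name `sparse_temporal_attention`, the parameters `region_len` and `top_k`, and the choice of mean-pooled region descriptors for the pre-attention step are all hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sparse_temporal_attention(q, k, v, region_len, top_k):
    """Illustrative coarse-to-fine attention (hypothetical helper).

    q, k, v: float arrays of shape (T, d), with T divisible by region_len.
    Returns the attended output (T, d) and the selected region indices.
    """
    T, d = q.shape
    R = T // region_len  # number of temporal regions
    qr = q.reshape(R, region_len, d)
    kr = k.reshape(R, region_len, d)
    vr = v.reshape(R, region_len, d)

    # Pre-attention (assumed form): region-level affinity from mean-pooled
    # queries and keys, then keep only the top_k most related regions.
    affinity = qr.mean(axis=1) @ kr.mean(axis=1).T      # (R, R)
    routes = np.argsort(-affinity, axis=1)[:, :top_k]   # (R, top_k)

    # Fine-grained attention restricted to the selected regions.
    out = np.empty((R, region_len, d))
    for i in range(R):
        k_sel = kr[routes[i]].reshape(-1, d)  # (top_k * region_len, d)
        v_sel = vr[routes[i]].reshape(-1, d)
        scores = qr[i] @ k_sel.T / np.sqrt(d)
        out[i] = softmax(scores) @ v_sel
    return out.reshape(T, d), routes

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(16, 4)) for _ in range(3))
out, routes = sparse_temporal_attention(q, k, v, region_len=4, top_k=2)
```

In this toy setting (T = 16, `region_len` = 4, `top_k` = 2), dense attention would compute 16 × 16 = 256 token-pair scores, while the sketch computes 4 × 4 = 16 region-level scores plus 16 × 8 = 128 token-level scores. Setting `top_k` equal to the number of regions recovers ordinary dense attention, which is a convenient sanity check.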

