PhySU-Net: Long Temporal Context Transformer for rPPG with Self-Supervised Pre-training (2402.11913v1)
Abstract: Remote photoplethysmography (rPPG) is a promising technology for contactless measurement of cardiac activity from facial videos. Most recent approaches either use convolutional networks with limited temporal modeling capability or ignore long temporal context. Supervised rPPG methods are also severely limited by the scarcity of labeled data. In this work, we propose PhySU-Net, the first rPPG transformer network that operates on long spatial-temporal maps, together with a self-supervised pre-training strategy that exploits unlabeled data to improve the model. The strategy uses traditional rPPG methods and image masking to provide pseudo-labels for self-supervised pre-training. The model is tested on two public datasets (OBF and VIPL-HR) and shows superior performance in supervised training. Furthermore, we demonstrate that the self-supervised pre-training strategy improves performance further by leveraging representations learned from unlabeled data.