Anatomy-guided domain adaptation for 3D in-bed human pose estimation (2211.12193v2)
Abstract: 3D human pose estimation is a key component of clinical monitoring systems. The clinical applicability of deep pose estimation models, however, is limited by their poor generalization under domain shifts along with their need for sufficient labeled training data. As a remedy, we present a novel domain adaptation method, adapting a model from a labeled source to a shifted unlabeled target domain. Our method comprises two complementary adaptation strategies based on prior knowledge about human anatomy. First, we guide the learning process in the target domain by constraining predictions to the space of anatomically plausible poses. To this end, we embed the prior knowledge into an anatomical loss function that penalizes asymmetric limb lengths, implausible bone lengths, and implausible joint angles. Second, we propose to filter pseudo labels for self-training according to their anatomical plausibility and incorporate the concept into the Mean Teacher paradigm. We unify both strategies in a point cloud-based framework applicable to unsupervised and source-free domain adaptation. Evaluation is performed for in-bed pose estimation under two adaptation scenarios, using the public SLP dataset and a newly created dataset. Our method consistently outperforms various state-of-the-art domain adaptation methods, surpasses the baseline model by 31% and 66%, and reduces the domain gap by 65% and 82% in the two scenarios, respectively. Source code is available at https://github.com/multimodallearning/da-3dhpe-anatomy.
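The two strategies named in the abstract can be illustrated with a small sketch. This is not the authors' implementation (that lives in the linked repository). It is a minimal pure-Python example that assumes a hypothetical six-joint arm skeleton, an illustrative plausible bone-length range, and a hypothetical filtering threshold. It covers only the symmetry and bone-length terms; the joint-angle term is omitted for brevity.

```python
import math

# Hypothetical toy skeleton: joint name -> index (not the paper's joint set).
JOINTS = {"l_shoulder": 0, "l_elbow": 1, "l_wrist": 2,
          "r_shoulder": 3, "r_elbow": 4, "r_wrist": 5}

# Each entry pairs a left bone with its right counterpart.
SYMMETRIC_BONES = [
    (("l_shoulder", "l_elbow"), ("r_shoulder", "r_elbow")),
    (("l_elbow", "l_wrist"), ("r_elbow", "r_wrist")),
]

# Assumed plausible bone-length range in metres (illustrative values only).
BONE_RANGE = (0.15, 0.45)


def bone_length(pose, bone):
    """Euclidean distance between the two joints of a bone."""
    start, end = bone
    return math.dist(pose[JOINTS[start]], pose[JOINTS[end]])


def anatomical_penalty(pose):
    """Sum of symmetry and bone-length penalties; 0 for a plausible pose."""
    lo, hi = BONE_RANGE
    penalty = 0.0
    for left, right in SYMMETRIC_BONES:
        len_l, len_r = bone_length(pose, left), bone_length(pose, right)
        penalty += abs(len_l - len_r)  # asymmetric limb lengths
        for length in (len_l, len_r):  # bones outside the plausible range
            penalty += max(0.0, lo - length) + max(0.0, length - hi)
    return penalty


def filter_pseudo_labels(poses, threshold=0.05):
    """Keep only predicted poses whose penalty is at most the threshold."""
    return [pose for pose in poses if anatomical_penalty(pose) <= threshold]


# A symmetric pose with bone lengths inside the assumed range.
plausible = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.55, 0.0, 0.0),
             (0.0, 0.4, 0.0), (0.3, 0.4, 0.0), (0.55, 0.4, 0.0)]
# Same pose, but the right wrist is moved so the right forearm is 1 m long.
implausible = plausible[:5] + [(1.3, 0.4, 0.0)]

kept = filter_pseudo_labels([plausible, implausible])
```

In a Mean Teacher setup, the poses passed to `filter_pseudo_labels` would be the teacher's predictions on unlabeled target samples, and the same penalty could serve as an additional loss term on the student's predictions. Both uses are sketched here under the stated assumptions only.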
Authors: Alexander Bigalke, Lasse Hansen, Jasper Diesel, Carlotta Hennigs, Philipp Rostalski, Mattias P. Heinrich