Prompt-driven Latent Domain Generalization for Medical Image Classification (2401.03002v1)
Abstract: Deep learning models for medical image analysis easily suffer from distribution shifts caused by dataset artifact bias, camera variations, differences in imaging stations, etc., leading to unreliable diagnoses in real-world clinical settings. Domain generalization (DG) methods, which train models on multiple domains so that they perform well on unseen domains, offer a promising direction for solving this problem. However, existing DG methods assume that the domain label of each image is available and accurate, which is typically feasible for only a limited number of medical datasets. To address these challenges, we propose a novel DG framework for medical image classification that does not rely on domain labels, called Prompt-driven Latent Domain Generalization (PLDG). PLDG consists of unsupervised domain discovery and prompt learning. The framework first discovers pseudo domain labels by clustering bias-associated style features, then leverages collaborative domain prompts to guide a Vision Transformer to learn knowledge from the discovered diverse domains. To facilitate cross-domain knowledge learning between different prompts, we introduce a domain prompt generator that enables knowledge sharing between domain prompts and a shared prompt. A domain mixup strategy is additionally employed to allow more flexible decision margins and to mitigate the risk of incorrect domain assignments. Extensive experiments on three medical image classification tasks and one debiasing task demonstrate that our method achieves comparable or even superior performance to conventional DG algorithms without relying on domain labels. Our code will be made publicly available upon acceptance of the paper.
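To make the two main ingredients of the abstract concrete, the sketch below illustrates (i) pseudo-domain discovery by clustering style statistics of intermediate features and (ii) a simple domain-level mixup. This is a minimal, hedged illustration rather than the authors' implementation: the choice of style statistics (channel-wise mean and standard deviation), the k-means clusterer, the number of pseudo domains, and the Beta mixing parameter are all illustrative assumptions.

```python
# Illustrative sketch (not the authors' code): pseudo-domain discovery by
# k-means clustering of style statistics, plus a simple mixup across the batch.
# Feature extractor, cluster count, and Beta(alpha, alpha) are assumptions.

import torch
from sklearn.cluster import KMeans


def style_statistics(feature_maps: torch.Tensor) -> torch.Tensor:
    """Channel-wise mean and std of feature maps (B, C, H, W), used here as a
    crude proxy for image 'style' (bias-associated appearance cues)."""
    mu = feature_maps.mean(dim=(2, 3))       # (B, C)
    sigma = feature_maps.std(dim=(2, 3))     # (B, C)
    return torch.cat([mu, sigma], dim=1)     # (B, 2C)


def discover_pseudo_domains(feature_maps: torch.Tensor, num_domains: int = 4):
    """Cluster style statistics with k-means to obtain pseudo domain labels."""
    stats = style_statistics(feature_maps).detach().cpu().numpy()
    return KMeans(n_clusters=num_domains, n_init=10, random_state=0).fit_predict(stats)


def domain_mixup(x: torch.Tensor, y: torch.Tensor, alpha: float = 0.2):
    """Mix images across a shuffled batch, returning both label sets and the
    mixing coefficient, to soften decision margins between pseudo domains."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    return x_mix, y, y[perm], lam


if __name__ == "__main__":
    fake_features = torch.randn(16, 64, 14, 14)   # stand-in for extractor output
    pseudo_labels = discover_pseudo_domains(fake_features, num_domains=4)
    print("pseudo domain labels:", pseudo_labels)

    images, targets = torch.randn(16, 3, 224, 224), torch.randint(0, 7, (16,))
    mixed, y_a, y_b, lam = domain_mixup(images, targets)
    print("mixup lambda:", round(lam, 3), "mixed batch shape:", tuple(mixed.shape))
```

In a full pipeline, the pseudo domain labels would index a set of learnable domain prompts attached to the Vision Transformer, and the mixed batch would be trained with the usual interpolated classification loss.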
Authors: Siyuan Yan, Chi Liu, Zhen Yu, Lie Ju, Dwarikanath Mahapatra, Brigid Betz-Stablein, Victoria Mar, Monika Janda, Peter Soyer, Zongyuan Ge