Latent Diffusion Models with Image-Derived Annotations for Enhanced AI-Assisted Cancer Diagnosis in Histopathology (2312.09792v1)
Abstract: AI-based image analysis has immense potential to support diagnostic histopathology, including cancer diagnostics. However, developing supervised AI methods requires large-scale annotated datasets. A potentially powerful solution is to augment training data with synthetic data. Latent diffusion models, which can generate high-quality, diverse synthetic images, are promising. However, the most common implementations rely on detailed textual descriptions, which are not generally available in this domain. This work proposes a method that constructs structured textual prompts from automatically extracted image features. We experiment with the PCam dataset, composed of tissue patches only loosely annotated as healthy or cancerous. We show that including image-derived features in the prompt, as opposed to only healthy and cancerous labels, improves the Fréchet Inception Distance (FID) from 178.8 to 90.2 (lower is better). We also show that pathologists find it challenging to detect synthetic images, with a median sensitivity/specificity of 0.55/0.55. Finally, we show that synthetic data effectively trains AI models.
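To make the idea of a structured prompt built from image-derived features concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say which features are extracted or how the prompt is worded, so the nearest-centroid assignment, the toy 2-D feature vectors, the prompt template, and the names `nearest_cluster` and `build_prompt` are all hypothetical.

```python
from math import dist


def nearest_cluster(features, centroids):
    """Return the index of the centroid closest to `features` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(features, centroids[i]))


def build_prompt(label, features, centroids):
    """Combine the coarse patch label with an image-derived cluster id.

    The template below is a hypothetical example of a structured prompt;
    the wording used in the paper is not given in the abstract.
    """
    cluster = nearest_cluster(features, centroids)
    return f"{label} tissue, cluster {cluster}"


# Toy centroids standing in for clusters of automatically extracted features.
centroids = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]

# A patch labelled "cancerous" whose (toy) feature vector lies near centroid 1.
prompt = build_prompt("cancerous", (0.9, 1.2), centroids)
print(prompt)
```

A label-only baseline would pass just "healthy" or "cancerous" as the prompt; the sketch shows how an extra image-derived token can be appended so that the text condition carries more than the binary label.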