
Source Matters: Source Dataset Impact on Model Robustness in Medical Imaging (2403.04484v2)

Published 7 Mar 2024 in cs.CV and cs.LG

Abstract: Transfer learning has become an essential part of medical imaging classification algorithms, often leveraging ImageNet weights. The domain shift from natural to medical images has prompted alternatives such as RadImageNet, often showing comparable classification performance. However, it remains unclear whether the performance gains from transfer learning stem from improved generalization or shortcut learning. To address this, we conceptualize confounders by introducing the Medical Imaging Contextualized Confounder Taxonomy (MICCAT) and investigate a range of confounders across it -- whether synthetic or sampled from the data -- using two public chest X-ray and CT datasets. We show that ImageNet and RadImageNet achieve comparable classification performance, yet ImageNet is much more prone to overfitting to confounders. We recommend that researchers using ImageNet-pretrained models reexamine their model robustness by conducting similar experiments. Our code and experiments are available at https://github.com/DovileDo/source-matters.

Authors (6)
  1. Dovile Juodelyte
  2. Yucheng Lu
  3. Amelia Jiménez-Sánchez
  4. Sabrina Bottazzi
  5. Enzo Ferrante
  6. Veronika Cheplygina
