
Enhancing Robustness of Foundation Model Representations under Provenance-related Distribution Shifts (2312.05435v1)

Published 9 Dec 2023 in cs.CL

Abstract: Foundation models are a current focus of attention in both industry and academia. While they have shown their capabilities in a variety of tasks, in-depth research is required to determine their robustness to distribution shift when used as a basis for supervised machine learning. This is especially important in the context of clinical data, with particular limitations related to data accessibility, lack of pretraining materials, and limited availability of high-quality annotations. In this work, we examine the stability of models based on representations from foundation models under distribution shift. We focus on confounding by provenance, a form of distribution shift that emerges in the context of multi-institutional datasets when there are differences in source-specific language use and class distributions. Using a sampling strategy that synthetically induces varying degrees of distribution shift, we evaluate the extent to which representations from foundation models result in predictions that are inherently robust to confounding by provenance. Additionally, we examine the effectiveness of a straightforward confounding adjustment method inspired by Pearl's conception of backdoor adjustment. Results indicate that while foundation models do show some out-of-the-box robustness to confounding-by-provenance related distribution shifts, this can be considerably improved through adjustment. These findings suggest a need for deliberate adjustment of predictive models using representations from foundation models in the context of source-specific distributional differences.
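The adjustment the abstract alludes to follows Pearl's backdoor formula: rather than predicting p(y | x) directly, one averages the source-conditional predictions over the marginal source distribution, p(y | do(x)) = Σ_s p(y | x, s) p(s). The sketch below illustrates this idea on synthetic data with scikit-learn; it is a minimal, hypothetical simplification (all variable names and the two-source setup are illustrative), not the paper's actual implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic "foundation model" embeddings from two sources (institutions).
# Source is a confounder: it shifts both the feature distribution
# (source-specific language use) and the class prior.
n = 1000
source = rng.integers(0, 2, n)                               # provenance s
y = (rng.random(n) < np.where(source == 1, 0.8, 0.2)).astype(int)
X = rng.normal(0.0, 1.0, (n, 8))
X[:, 0] += 1.5 * y        # genuine label signal
X[:, 1] += 2.0 * source   # source-specific shift

# Train with the confounder appended as an extra feature, so the model
# learns p(y | x, s) explicitly.
Xs = np.column_stack([X, source])
clf = LogisticRegression(max_iter=1000).fit(Xs, y)

def backdoor_predict_proba(clf, X, p_source):
    """Backdoor-adjusted prediction: average p(y | x, s) over p(s)."""
    total = np.zeros(len(X))
    for s, p_s in enumerate(p_source):
        Xs_fixed = np.column_stack([X, np.full(len(X), s)])
        total += p_s * clf.predict_proba(Xs_fixed)[:, 1]
    return total

p_source = np.bincount(source) / n   # marginal p(s)
probs = backdoor_predict_proba(clf, X, p_source)
```

Because the prediction marginalizes over the confounder instead of conditioning on the test-time source, it is less sensitive to source-specific class-distribution shifts of the kind the paper induces synthetically.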

References (31)
  1. Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873):583–589, 2021.
  2. A structural biology community assessment of AlphaFold2 applications. Nature Structural & Molecular Biology, 29(11):1056–1067, 2022.
  3. Machine learning-based prediction of drug–drug interactions by integrating drug phenotypic, therapeutic, chemical, and genomic properties. Journal of the American Medical Informatics Association, 21(e2):e278–e286, 2014.
  4. Next-generation phenotyping of electronic health records. Journal of the American Medical Informatics Association, 20(1):117–121, 2013.
  5. DeepEnroll: Patient-trial matching with deep embedding and entailment prediction. In Proceedings of The Web Conference 2020, pages 1029–1037, 2020.
  6. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
  7. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
  8. Language models are few-shot learners, 2020.
  9. Applications of transformer-based language models in bioinformatics: a survey. Bioinformatics Advances, 3(1):vbad001, 2023.
  10. AlexaTM 20B: Few-shot learning using a large-scale multilingual seq2seq model. arXiv preprint arXiv:2208.01448, 2022.
  11. A survey on evaluation of large language models. arXiv preprint arXiv:2307.03109, 2023.
  12. OpenAI. GPT-4 technical report, 2023.
  13. LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023a.
  14. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023b.
  15. BLOOM: A 176B-parameter open-access multilingual language model. arXiv preprint arXiv:2211.05100, 2022.
  16. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data. Scientific Reports, 10(1):12598, 2020.
  17. Backdoor adjustment of confounding by provenance for robust text classification of multi-institutional clinical notes. arXiv preprint arXiv:2310.02451, 2023.
  18. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084, 2019.
  19. Robust text classification in the presence of confounding bias. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30, 2016.
  20. Robust text classification under confounding shift. Journal of Artificial Intelligence Research, 63:391–419, 2018.
  21. Measuring robustness to natural distribution shifts in image classification. Advances in Neural Information Processing Systems, 33:18583–18599, 2020.
  22. Judea Pearl. Causality. Cambridge University Press, 2009.
  23. Reducing weight undertraining in structured discriminative learning. In Proceedings of the Human Language Technology Conference of the NAACL, Main Conference, pages 89–95, 2006.
  24. Annotating social determinants of health using active learning, and characterizing determinants using neural event extraction. Journal of Biomedical Informatics, 113:103631, 2021.
  25. The 2022 n2c2/UW shared task on extracting social determinants of health. arXiv preprint arXiv:2301.05571, 2023.
  26. MIMIC-III, a freely accessible critical care database. Scientific Data, 3(1):1–9, 2016.
  27. Directions in abusive language training data, a systematic review: Garbage in, garbage out. PLOS ONE, 15(12):e0243300, 2020.
  28. Learning from the worst: Dynamically generated datasets to improve online hate detection. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1667–1682, Online, August 2021. Association for Computational Linguistics. doi: 10.18653/v1/2021.acl-long.132.
  29. Hate speech dataset from a white supremacy forum. In Proceedings of the 2nd Workshop on Abusive Language Online (ALW2), pages 11–20, Brussels, Belgium, October 2018. Association for Computational Linguistics. doi: 10.18653/v1/W18-5102.
  30. LLM.int8(): 8-bit matrix multiplication for transformers at scale. arXiv preprint arXiv:2208.07339, 2022.
  31. Nuanced metrics for measuring unintended bias with real data for text classification. In Companion Proceedings of The 2019 World Wide Web Conference, pages 491–500, 2019.
