FEET: A Framework for Evaluating Embedding Techniques (2411.01322v1)
Abstract: In this study, we introduce FEET, a standardized protocol designed to guide the development and benchmarking of foundation models. While numerous benchmark datasets exist for evaluating these models, we propose a structured evaluation protocol across three distinct scenarios to gain a comprehensive understanding of their practical performance. We define three primary use cases: frozen embeddings, few-shot embeddings, and fully fine-tuned embeddings. Each scenario is detailed and illustrated through two case studies: one in sentiment analysis and another in the medical domain, demonstrating how these evaluations provide a thorough assessment of foundation models' effectiveness in research applications. We recommend this protocol as a standard for future research aimed at advancing representation learning models.
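The paper does not expose implementation details in this abstract, but the three scenarios can be made concrete with a minimal sketch. The following Python example, using Hugging Face transformers and scikit-learn, is an illustrative assumption only: the encoder (distilbert-base-uncased), the toy sentiment data, the mean-pooling strategy, the logistic-regression probe, and all hyperparameters are stand-ins, not the authors' actual FEET setup.

```python
# Hypothetical sketch of the three FEET evaluation scenarios on a toy sentiment
# task; the model, data, and hyperparameters are illustrative assumptions only.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "distilbert-base-uncased"   # stand-in encoder; any BERT-style model works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# Toy sentiment data standing in for a real benchmark such as SST-2.
train_texts = ["great movie", "terrible plot", "loved it", "awful acting"]
train_labels = np.array([1, 0, 1, 0])
test_texts = ["wonderful film", "boring and bad"]
test_labels = np.array([1, 0])

def embed(texts, encoder):
    """Mean-pooled last-hidden-state embeddings, computed with gradients disabled."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state            # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()        # (B, T, 1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()       # mean over tokens

# 1) Frozen embeddings: encoder weights are never updated; only a lightweight
#    probe (logistic regression here) is trained on the extracted features.
encoder = AutoModel.from_pretrained(MODEL_NAME).eval()
X_train, X_test = embed(train_texts, encoder), embed(test_texts, encoder)
probe = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
print("frozen accuracy:", probe.score(X_test, test_labels))

# 2) Few-shot embeddings: the same frozen features, but the probe sees only
#    k labeled examples per class.
k = 1
few_idx = np.concatenate([np.where(train_labels == c)[0][:k] for c in np.unique(train_labels)])
few_probe = LogisticRegression(max_iter=1000).fit(X_train[few_idx], train_labels[few_idx])
print("few-shot accuracy:", few_probe.score(X_test, test_labels))

# 3) Fully fine-tuned embeddings: every encoder weight is updated end-to-end on
#    the downstream task (one illustrative gradient step of supervised training).
clf = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
clf.train()
optimizer = torch.optim.AdamW(clf.parameters(), lr=2e-5)
batch = tokenizer(train_texts, padding=True, truncation=True, return_tensors="pt")
loss = clf(**batch, labels=torch.tensor(train_labels)).loss
loss.backward()
optimizer.step()
```

Under this reading, the three scenarios differ only in how much of the encoder is allowed to adapt to the downstream task: none (frozen), indirectly through a probe trained on very little data (few-shot), or entirely (full fine-tuning).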