Unveiling Key Aspects of Fine-Tuning in Sentence Embeddings: A Representation Rank Analysis (2405.11297v1)
Abstract: The latest advancements in unsupervised learning of sentence embeddings predominantly involve employing contrastive learning-based (CL-based) fine-tuning over pre-trained LLMs. In this study, we analyze the latest sentence embedding methods by adopting representation rank as the primary tool of analysis. We first define Phase 1 and Phase 2 of fine-tuning based on when representation rank peaks. Utilizing these phases, we conduct a thorough analysis and obtain essential findings across key aspects, including alignment and uniformity, linguistic abilities, and the correlation between performance and rank. For instance, we find that the dynamics of the key aspects can undergo significant changes as fine-tuning transitions from Phase 1 to Phase 2. Based on these findings, we experiment with a rank reduction (RR) strategy that facilitates rapid and stable fine-tuning of the latest CL-based methods. Through empirical investigations, we showcase the efficacy of RR in enhancing the performance and stability of five state-of-the-art sentence embedding methods.
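The analysis centers on representation rank; a common concrete instantiation is the effective rank of Roy and Vetterli (2007), defined as the exponential of the entropy of the normalized singular-value distribution of the embedding matrix. The sketch below is a minimal illustration of that quantity for a batch of sentence embeddings, assuming a PyTorch setup; the function name `effective_rank` and the mean-centering step are choices made here for illustration, not details taken from the paper.

```python
import torch

def effective_rank(embeddings: torch.Tensor) -> torch.Tensor:
    """Effective rank (Roy & Vetterli, 2007) of a batch of sentence embeddings.

    embeddings: (N, d) matrix, one row per sentence.
    Returns exp(entropy) of the normalized singular-value distribution.
    """
    # Center the representations so rank reflects variance structure, not the mean
    # (an illustrative choice, not necessarily the paper's exact protocol).
    centered = embeddings - embeddings.mean(dim=0, keepdim=True)
    # Singular values of the (N, d) matrix.
    s = torch.linalg.svdvals(centered)
    # Normalize singular values into a probability distribution.
    p = s / s.sum().clamp_min(1e-12)
    # Shannon entropy; the clamp avoids log(0) for near-zero singular values.
    entropy = -(p * torch.log(p.clamp_min(1e-12))).sum()
    return torch.exp(entropy)

# Example: effective rank of 128 random 768-dim vectors
# (standing in for, e.g., pooled BERT sentence embeddings).
if __name__ == "__main__":
    emb = torch.randn(128, 768)
    print(f"effective rank: {effective_rank(emb).item():.2f}")
```

A higher value indicates that variance is spread across more embedding directions; tracking this quantity over fine-tuning steps is the kind of measurement the paper's Phase 1 / Phase 2 distinction is built on.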