
Scaling Sentence Embeddings with Large Language Models (2307.16645v1)

Published 31 Jul 2023 in cs.CL

Abstract: LLMs have recently garnered significant interest. With in-context learning, LLMs achieve impressive results in various natural language tasks. However, the application of LLMs to sentence embeddings remains an area of ongoing research. In this work, we propose an in-context learning-based method aimed at improving sentence embeddings performance. Our approach involves adapting the previous prompt-based representation method for autoregressive models, constructing a demonstration set that enables LLMs to perform in-context learning, and scaling up the LLMs to different model sizes. Through extensive experiments, in-context learning enables LLMs to generate high-quality sentence embeddings without any fine-tuning. It helps LLMs achieve performance comparable to current contrastive learning methods. By scaling model size, we find scaling to more than tens of billion parameters harms the performance on semantic textual similarity (STS) tasks. However, the largest model outperforms other counterparts and achieves the new state-of-the-art result on transfer tasks. We also fine-tune LLMs with current contrastive learning approach, and the 2.7B OPT model, incorporating our prompt-based method, surpasses the performance of 4.8B ST5, achieving the new state-of-the-art results on STS tasks. Our code is available at https://github.com/kongds/scaling_sentemb.

Scaling Sentence Embeddings with LLMs: A Review

The paper, "Scaling Sentence Embeddings with LLMs," addresses the challenges and opportunities associated with leveraging LLMs for generating high-quality sentence embeddings. The paper focuses on enhancing the representation capabilities of LLMs through innovative methods, while scrutinizing their performances across various parameters and tasks. The authors propose a novel approach that integrates in-context learning and prompt engineering to refine sentence embeddings, thus equipping these models with enriched semantic understanding without extensive fine-tuning.

Key Contributions and Methodology

The primary contribution of the paper is a new sentence embedding method termed Prompt-based Explicit One word Limitation (PromptEOL), which extends earlier prompt-based representation techniques to autoregressive LLMs such as OPT and LLaMA. PromptEOL prompts the model to compress a sentence into a single word, using a template along the lines of: This sentence : "[text]" means in one word:". The hidden state at the final prompt token then serves as the sentence representation. This explicit one-word constraint yields coherent sentence embeddings without any fine-tuning.
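A minimal sketch of this extraction using Hugging Face transformers is shown below; the checkpoint, prompt wording, and helper name are illustrative assumptions rather than the authors' released code (linked in the abstract).

```python
# Sketch: PromptEOL-style sentence embedding extraction (illustrative, not the
# authors' reference implementation).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-2.7b"  # assumed checkpoint; any autoregressive LM should work
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

def prompteol_embedding(sentence: str) -> torch.Tensor:
    # The explicit one-word limitation: the prompt nudges the model to compress
    # the sentence into a single next token, so the hidden state at the last
    # prompt position is used as the sentence embedding.
    prompt = f'This sentence : "{sentence}" means in one word:"'
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    # Last layer, last token position.
    return outputs.hidden_states[-1][0, -1]

embedding = prompteol_embedding("A man is playing a guitar.")
print(embedding.shape)  # equals the model's hidden size (2560 for OPT-2.7B)
```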

Additionally, the paper exploits the in-context learning abilities of LLMs to further improve sentence embeddings. The authors construct a demonstration set of sentence–word pairs, with sentences generated by ChatGPT and words sourced from the Oxford dictionary, and prepend these examples to the prompt. With such demonstrations, the LLMs show improved performance on semantic textual similarity (STS) tasks, surpassing existing unsupervised methods such as SimCSE and PromptBERT.
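A hedged sketch of this in-context variant, reusing the same last-token extraction as above; the demonstration pair below is invented for illustration and is not drawn from the paper's actual demonstration set.

```python
# Sketch: prepend one demonstration (sentence plus its one-word summary) to the
# PromptEOL template for the query sentence.
def prompteol_icl_prompt(demo_sentence: str, demo_word: str, sentence: str) -> str:
    demo = f'This sentence : "{demo_sentence}" means in one word:"{demo_word}".'
    query = f'This sentence : "{sentence}" means in one word:"'
    return demo + " " + query

prompt = prompteol_icl_prompt(
    demo_sentence="The chef prepared a delicious meal for the guests.",  # invented example
    demo_word="Cooking",
    sentence="A man is playing a guitar.",
)
# Run `prompt` through the same last-token hidden-state extraction as above
# to obtain the in-context sentence embedding.
```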

The research also explores scaling LLMs from millions to tens of billions of parameters. Scaling improves transfer-task performance, and the largest model achieves new state-of-the-art results there, but pushing beyond tens of billions of parameters actually harms performance on STS tasks. This analysis underlines that, for semantic similarity, methodological improvements matter more than sheer model size.

Results

The paper's results highlight the effectiveness of PromptEOL in achieving state-of-the-art performance in sentence embeddings. Experiments demonstrate that LLMs, when combined with this prompt-based method and parameter-efficient fine-tuning techniques such as QLoRA, achieve competitive or superior results compared to models like BERT- and T5-based Sentence Transformers; notably, a 2.7B OPT model fine-tuned with the prompt-based method surpasses the 4.8B ST5 model, setting new state-of-the-art results on STS tasks.
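For intuition, fine-tuning of this kind typically optimizes a SimCSE-style InfoNCE objective over in-batch negatives; the sketch below is an assumption about that general recipe (temperature value and batching included), not the paper's exact training code.

```python
# Sketch: contrastive (InfoNCE) loss over prompt-based sentence embeddings.
import torch
import torch.nn.functional as F

def contrastive_loss(anchors: torch.Tensor, positives: torch.Tensor,
                     temperature: float = 0.05) -> torch.Tensor:
    # anchors, positives: [batch, hidden] embeddings, where the i-th anchor
    # and i-th positive form a matched pair; other rows act as negatives.
    sim = F.cosine_similarity(anchors.unsqueeze(1), positives.unsqueeze(0), dim=-1)
    labels = torch.arange(sim.size(0), device=sim.device)  # positives on the diagonal
    return F.cross_entropy(sim / temperature, labels)
```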

The empirical evaluations—conducted on standard benchmarks such as SentEval for STS tasks and transfer tasks—illustrate that large models, given the appropriate methodological improvements like prompt engineering and in-context learning, can capture semantics more effectively without extensive computational demands. This marks a significant stride in the scalability and applicability of LLMs for diverse NLP tasks beyond their primary autoregressive capabilities.
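As a point of reference, STS benchmarks are conventionally scored by the Spearman correlation between the cosine similarities of embedding pairs and human ratings; the snippet below sketches that protocol and is illustrative rather than taken from the SentEval toolkit.

```python
# Sketch: scoring an STS benchmark from precomputed sentence embeddings.
import torch.nn.functional as F
from scipy.stats import spearmanr

def sts_spearman(emb_a, emb_b, gold_scores):
    # emb_a, emb_b: [n_pairs, hidden] embeddings of the two sentences in each
    # pair; gold_scores: human similarity ratings for the same pairs.
    cos = F.cosine_similarity(emb_a, emb_b, dim=-1)
    return spearmanr(cos.cpu().numpy(), gold_scores).correlation
```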

Implications and Future Directions

The implications of this research are twofold. Practically, it demonstrates a feasible way to harness large-scale models without full fine-tuning, thereby reducing computational overhead. Theoretically, it opens avenues for exploring how far prompt engineering and in-context learning can go in enhancing representation quality and performance.

Future developments could explore integrating more nuanced context-learning strategies and expanding the demonstration set for in-context learning to cover other linguistic phenomena. Additionally, exploring broader datasets and real-world applications—such as domain-specific language tasks or multi-lingual embeddings—could further cement the utility of LLMs in generating high-quality sentence embeddings across various contexts.

In conclusion, this paper provides a substantial contribution to the domain of sentence embeddings, offering a methodologically sound approach that leverages both the scale and inherent capabilities of LLMs. The insights drawn from their experiments set the stage for future explorations into efficient and effective use of LLMs in natural language processing.

Authors (5)
  1. Ting Jiang (28 papers)
  2. Shaohan Huang (79 papers)
  3. Zhongzhi Luan (21 papers)
  4. Deqing Wang (36 papers)
  5. Fuzhen Zhuang (97 papers)
Citations (30)