
Meta-Task Prompting Elicits Embeddings from Large Language Models (2402.18458v2)

Published 28 Feb 2024 in cs.CL

Abstract: We introduce a new unsupervised text embedding method, Meta-Task Prompting with Explicit One-Word Limitation (MetaEOL), for generating high-quality sentence embeddings from LLMs without the need for model fine-tuning. Leveraging meta-task prompting, MetaEOL guides LLMs to produce embeddings through a series of carefully designed prompts that address multiple representational aspects. Our comprehensive experiments demonstrate that embeddings averaged from various meta-tasks are versatile embeddings that yield competitive performance on Semantic Textual Similarity (STS) benchmarks and excel in downstream tasks, surpassing contrastive-trained models. Our findings suggest a new scaling law, offering a versatile and resource-efficient approach for embedding generation across diverse scenarios.

Meta-Task Prompting with Explicit One-Word Limitation (MetaEOL) Enhances Embedding Extraction from LLMs

Introduction to MetaEOL

The paper presents Meta-Task Prompting with Explicit One-Word Limitation (MetaEOL), an unsupervised approach for generating sentence embeddings from LLMs such as GPT-3 and LLaMA. Unlike traditional methods that require model fine-tuning or task-specific engineering, MetaEOL uses a series of meta-task prompts to guide LLMs toward producing nuanced embeddings. The approach leverages the inherent strength of LLMs in understanding language in context, without additional training, consistent with a zero-resource setting.

Comprehensive Experiments

The empirical evaluation underscores MetaEOL's efficiency: it is competitive with contrastive-trained models on Semantic Textual Similarity (STS) tasks and surpasses them on downstream tasks. Key findings include:

  • Meta-Task Averaging: Simply aggregating embeddings from various meta-tasks, without extra training, results in embeddings that rival those from contrastive-trained models across STS benchmarks.
  • Integration of Meta-Tasks: Incrementally adding meta-tasks consistently enhances performance across STS tasks, underscoring the value of incorporating diverse perspectives.
  • Layer Selection Strategy: Instead of relying solely on the final layer for embedding extraction, employing a proportional layer-selection strategy based on model size yields further improvement.
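The meta-task averaging step above is simple to sketch: run one prompt per meta-task, read out an embedding for each, and take the element-wise mean. The snippet below is only an illustrative sketch in pure Python; `embed_with_prompt` is a toy deterministic stand-in for an actual LLM forward pass (the real method reads a hidden state for the predicted one-word continuation), and the task names and dimensionality are placeholders.

```python
# Sketch of meta-task averaging: the final sentence embedding is the
# mean of the embeddings obtained from each meta-task prompt.
import hashlib

DIM = 8  # toy dimensionality; real hidden states are e.g. 4096-dim


def embed_with_prompt(sentence: str, task: str) -> list[float]:
    """Toy stand-in for an LLM forward pass; hashes (task, sentence)
    into a deterministic pseudo-embedding."""
    digest = hashlib.sha256(f"{task}:{sentence}".encode()).digest()
    return [b / 255.0 for b in digest[:DIM]]


def meta_eol_embedding(sentence: str, tasks=("TC", "SA", "PI", "IE")):
    """Average the per-meta-task embeddings into one vector,
    with no extra training involved."""
    vectors = [embed_with_prompt(sentence, t) for t in tasks]
    return [sum(dims) / len(vectors) for dims in zip(*vectors)]


embedding = meta_eol_embedding("A quick brown fox.")
```

Each meta-task contributes one view of the sentence; averaging them is what makes the combined vector "versatile" across benchmarks without any contrastive training.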

Methodology

MetaEOL stands out by employing meta-task prompting, where each prompt is tailored to a specific usage scenario or task context. This process generates multiple embeddings for each sentence, each reflecting a different representational facet. A representative application of MetaEOL generates different templates via ChatGPT-4 to capture distinct semantic aspects such as Text Classification (TC), Sentiment Analysis (SA), Paraphrase Identification (PI), and Information Extraction (IE).
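To make the prompting pattern concrete, the sketch below assembles one prompt per meta-task with the explicit one-word limitation appended. The template wording here is hypothetical, written for illustration; the paper's actual templates differ.

```python
# Illustrative MetaEOL-style meta-task prompts. The exact wording of
# the paper's templates differs; these are hypothetical examples
# showing the one-word-limitation pattern applied to four meta-tasks.
META_TASK_TEMPLATES = {
    "TC": ('In this task, you classify the topic of a text.\n'
           'Text: "{sentence}"\n'
           'The topic of the text, in one word, is:'),
    "SA": ('In this task, you judge the sentiment of a text.\n'
           'Text: "{sentence}"\n'
           'The sentiment of the text, in one word, is:'),
    "PI": ('In this task, you restate the core meaning of a text.\n'
           'Text: "{sentence}"\n'
           'The core meaning of the text, in one word, is:'),
    "IE": ('In this task, you extract the key entity from a text.\n'
           'Text: "{sentence}"\n'
           'The key entity in the text, in one word, is:'),
}


def build_prompts(sentence: str) -> dict:
    """Return one prompt per meta-task for a single input sentence."""
    return {task: tpl.format(sentence=sentence)
            for task, tpl in META_TASK_TEMPLATES.items()}


prompts = build_prompts("The movie was a delightful surprise.")
```

The "in one word" constraint is what concentrates the model's representation of the whole sentence at the position where the embedding is read out.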

Analysis and Findings

A thorough analysis reveals several insights:

  1. Ablation Study: Demonstrates the complementary nature of meta-tasks; embeddings derived from diverse tasks show improved performance.
  2. Task Influence: Shows a direct relationship between the number of meta-tasks used and overall STS performance, supporting a multi-faceted view of sentence representation.
  3. Prompt Influence: Examines the impact of using multiple prompts for the same meta-task, finding that more prompts yield more nuanced embeddings and better STS performance.
  4. Output Layer Influence: Identifies the optimal layers for embedding extraction, challenging the conventional practice of using only the final layer.
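The output-layer finding suggests a proportional rule: rather than always reading the final layer, pick a layer at a fixed fraction of the model's depth, so the choice scales with model size. The sketch below illustrates this idea only; the depth fraction of 0.75 is a hypothetical value, not a number taken from the paper.

```python
# Sketch of proportional layer selection: map a depth fraction to a
# concrete layer index, so the rule transfers across model sizes.
# The 0.75 fraction is illustrative, not the paper's reported value.


def select_layer(num_layers: int, fraction: float = 0.75) -> int:
    """Return a 1-indexed layer at the given fraction of model depth."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return max(1, round(fraction * num_layers))


# The same fraction picks deeper absolute layers in deeper models,
# e.g. for 32-, 40-, and 80-layer transformers:
chosen = {layers: select_layer(layers) for layers in (32, 40, 80)}
```

A fixed fraction avoids hand-tuning a layer index per model: one hyperparameter covers the whole model family.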

Implications and Future Directions

The novel approach of MetaEOL offers both theoretical and practical implications:

  • Theoretical: It challenges existing embedding generation paradigms by demonstrating the effectiveness of unsupervised, prompt-based approaches.
  • Practical: MetaEOL presents a viable, cost-effective alternative for embedding generation in resource-constrained settings, mitigating the need for extensive computational resources associated with model training.

MetaEOL's success paves the way for future exploration of multilingual contexts and broader application scenarios, such as document retrieval. As LLMs continue to evolve, the scalability and adaptability of methods like MetaEOL will likely play a pivotal role in advancing the state of the art in NLP.

Limitations and Further Work

The paper acknowledges the computational overhead of MetaEOL, given the necessity to process multiple prompts per sentence. Additionally, current evaluations are limited to English and specific types of NLP tasks. Future research could extend MetaEOL's methodology to multilingual settings and explore its efficacy in more expansive task benchmarks, potentially enhancing its utility and applicability in the rapidly evolving landscape of generative AI and NLP.

Authors (7)
  1. Yibin Lei
  2. Di Wu
  3. Tianyi Zhou
  4. Tao Shen
  5. Yu Cao
  6. Chongyang Tao
  7. Andrew Yates