Using Context-to-Vector with Graph Retrofitting to Improve Word Embeddings (2210.16848v2)

Published 30 Oct 2022 in cs.CL and cs.AI

Abstract: Although contextualized embeddings generated from large-scale pre-trained models perform well in many tasks, traditional static embeddings (e.g., Skip-gram, Word2Vec) still play an important role in low-resource and lightweight settings due to their low computational cost, ease of deployment, and stability. In this paper, we aim to improve word embeddings by 1) incorporating more contextual information from existing pre-trained models into the Skip-gram framework, which we call Context-to-Vec; and 2) proposing a training-independent post-processing retrofitting method for static embeddings that employs a priori synonym knowledge and a weighted vector distribution. Across extrinsic and intrinsic tasks, our methods are shown to outperform the baselines by a large margin.
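To make the retrofitting idea concrete, here is a minimal sketch in the style of classic synonym-lexicon retrofitting (Faruqui et al., 2015), which iteratively pulls each word vector toward its synonyms while staying anchored to its original embedding. The function name `retrofit`, the `alpha`/`beta` weights, and the toy lexicon below are illustrative assumptions; the paper's weighted vector distribution may differ in its exact update rule.

```python
import numpy as np

def retrofit(embeddings, synonyms, n_iters=10, alpha=1.0, beta=1.0):
    """Retrofit static word vectors with a synonym lexicon.

    embeddings: dict mapping word -> np.ndarray (original static vectors)
    synonyms:   dict mapping word -> list of synonym words
    alpha:      weight anchoring each word to its original vector
    beta:       weight pulling each word toward its synonyms
    """
    new_vecs = {w: v.copy() for w, v in embeddings.items()}
    for _ in range(n_iters):
        for word in embeddings:
            neighbours = [s for s in synonyms.get(word, []) if s in embeddings]
            if not neighbours:
                continue  # words without known synonyms keep their vectors
            # Weighted average of the original vector and the synonym vectors.
            total = alpha * embeddings[word] + beta * sum(new_vecs[s] for s in neighbours)
            new_vecs[word] = total / (alpha + beta * len(neighbours))
    return new_vecs

# Toy usage: "happy" and "glad" move toward each other; "sad" is untouched.
vecs = {
    "happy": np.array([1.0, 0.0]),
    "glad":  np.array([0.9, 0.1]),
    "sad":   np.array([-1.0, 0.0]),
}
lexicon = {"happy": ["glad"], "glad": ["happy"]}
retrofitted = retrofit(vecs, lexicon)
```

Because this runs as a post-processing pass over fixed vectors, it is independent of the original embedding training, which matches the training-free property the abstract claims for the paper's retrofitting step.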

Authors (8)
  1. Jiangbin Zheng (25 papers)
  2. Yile Wang (24 papers)
  3. Ge Wang (214 papers)
  4. Jun Xia (76 papers)
  5. Yufei Huang (81 papers)
  6. Guojiang Zhao (12 papers)
  7. Yue Zhang (620 papers)
  8. Stan Z. Li (222 papers)
Citations (23)
