
Effect of Post-processing on Contextualized Word Representations (2104.07456v2)

Published 15 Apr 2021 in cs.CL and cs.AI

Abstract: Post-processing of static embeddings has been shown to improve their performance on both lexical and sequence-level tasks. However, post-processing for contextualized embeddings is an under-studied problem. In this work, we question the usefulness of post-processing for contextualized embeddings obtained from different layers of pre-trained language models. More specifically, we standardize individual neuron activations using z-score and min-max normalization, and remove top principal components using the all-but-the-top method. Additionally, we apply unit length normalization to word representations. On a diverse set of pre-trained models, we show that post-processing unwraps vital information present in the representations for both lexical tasks (such as word similarity and analogy) and sequence classification tasks. Our findings raise interesting points in relation to the research studies that use contextualized representations, and suggest z-score normalization as an essential step to consider when using them in an application.
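The four post-processing operations named in the abstract are standard transformations of an embedding matrix. Below is a minimal sketch of each, assuming the contextualized representations from one model layer are stacked into a NumPy array of shape (num_words, dim); the function names and the choice of removing the top 3 components are illustrative, not taken from the paper's code.

```python
import numpy as np

def z_score(X):
    """Standardize each neuron (column) to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

def min_max(X):
    """Rescale each neuron (column) to the [0, 1] range."""
    mins, maxs = X.min(axis=0), X.max(axis=0)
    return (X - mins) / (maxs - mins + 1e-8)

def all_but_the_top(X, d=3):
    """Subtract the mean, then project out the top-d principal
    components (the all-but-the-top method of Mu & Viswanath, 2018)."""
    Xc = X - X.mean(axis=0)
    # Principal directions via SVD of the centered matrix.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    top = Vt[:d]                    # shape (d, dim)
    return Xc - Xc @ top.T @ top

def unit_length(X):
    """L2-normalize each word representation (row)."""
    return X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)

# Example: post-process stand-in activations from one layer.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 768))    # hypothetical layer activations
X_processed = z_score(X)
```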

Authors (4)
  1. Hassan Sajjad (64 papers)
  2. Firoj Alam (75 papers)
  3. Fahim Dalvi (45 papers)
  4. Nadir Durrani (48 papers)
Citations (9)
