
Effect of Post-processing on Contextualized Word Representations (2104.07456v2)

Published 15 Apr 2021 in cs.CL and cs.AI

Abstract: Post-processing of static embeddings has been shown to improve their performance on both lexical and sequence-level tasks. However, post-processing for contextualized embeddings is an under-studied problem. In this work, we question the usefulness of post-processing for contextualized embeddings obtained from different layers of pre-trained language models. More specifically, we standardize individual neuron activations using z-score and min-max normalization, and remove top principal components using the all-but-the-top method. Additionally, we apply unit length normalization to word representations. On a diverse set of pre-trained models, we show that post-processing unwraps vital information present in the representations for both lexical tasks (such as word similarity and analogy) and sequence classification tasks. Our findings raise interesting points in relation to the research studies that use contextualized representations, and suggest z-score normalization as an essential step to consider when using them in an application.
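The four post-processing operations named in the abstract are standard transformations of an embedding matrix. Below is a minimal sketch of each, assuming the contextualized representations from one model layer are stacked into a NumPy array of shape (num_words, dim); the function names and the choice of removing the top 3 components are illustrative, not taken from the paper's code.

```python
import numpy as np

def z_score(X):
    """Standardize each neuron (column) to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

def min_max(X):
    """Rescale each neuron (column) to the [0, 1] range."""
    mins, maxs = X.min(axis=0), X.max(axis=0)
    return (X - mins) / (maxs - mins + 1e-8)

def all_but_the_top(X, d=3):
    """Subtract the mean, then project out the top-d principal
    components (the all-but-the-top method of Mu & Viswanath, 2018)."""
    Xc = X - X.mean(axis=0)
    # Principal directions via SVD of the centered matrix.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    top = Vt[:d]                    # shape (d, dim)
    return Xc - Xc @ top.T @ top

def unit_length(X):
    """L2-normalize each word representation (row)."""
    return X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)

# Example: post-process stand-in activations from one layer.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 768))    # hypothetical layer activations
X_processed = z_score(X)
```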

Authors (4)
  1. Hassan Sajjad (64 papers)
  2. Firoj Alam (75 papers)
  3. Fahim Dalvi (45 papers)
  4. Nadir Durrani (48 papers)
Citations (9)
