
Idioms, Probing and Dangerous Things: Towards Structural Probing for Idiomaticity in Vector Space (2304.14333v1)

Published 27 Apr 2023 in cs.CL, cs.AI, and cs.LG

Abstract: The goal of this paper is to learn more about how idiomatic information is structurally encoded in embeddings, using a structural probing method. We repurpose an existing English verbal multi-word expression (MWE) dataset to suit the probing framework and perform a comparative probing study of static (GloVe) and contextual (BERT) embeddings. Our experiments indicate that both encode some idiomatic information to varying degrees, but yield conflicting evidence as to whether idiomaticity is encoded in the vector norm, leaving this an open question. We also identify some limitations of the dataset used and highlight important directions for future work in improving its suitability for a probing analysis.
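
As a rough illustration of what it could mean for idiomaticity to be "encoded in the vector norm", the sketch below compares the average L2 norm of two groups of toy vectors. The vectors, the group labels, and the function names are hypothetical and are not taken from the paper or its dataset. This is not the paper's probing method, which applies a structural probing method to GloVe and BERT embeddings; it only shows the quantity the open question is about.

```python
import math

def l2_norm(vec):
    """Euclidean (L2) norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(x * x for x in vec))

def mean_norm(vectors):
    """Average L2 norm over a non-empty list of vectors."""
    return sum(l2_norm(v) for v in vectors) / len(vectors)

# Hypothetical toy embeddings (assumption: 2-dimensional for readability).
idiomatic = [[3.0, 4.0], [6.0, 8.0]]  # stand-ins for idiomatic usages
literal = [[1.0, 0.0], [0.0, 2.0]]    # stand-ins for literal usages

# A consistent gap in mean norm between the groups would be one (weak)
# hint that the norm carries idiomaticity information.
norm_gap = mean_norm(idiomatic) - mean_norm(literal)
print(norm_gap)
```

With real embeddings, a gap like this would only be suggestive; the abstract reports conflicting evidence on whether the norm encodes idiomaticity.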

Authors (3)
  1. Vasudevan Nedumpozhimana (5 papers)
  2. John D. Kelleher (37 papers)
  3. Filip Klubička (7 papers)
Citations (4)
