
Are Representations Built from the Ground Up? An Empirical Examination of Local Composition in Language Models (2210.03575v2)

Published 7 Oct 2022 in cs.CL

Abstract: Compositionality, the phenomenon where the meaning of a phrase can be derived from its constituent parts, is a hallmark of human language. At the same time, many phrases are non-compositional, carrying a meaning beyond that of each part in isolation. Representing both of these types of phrases is critical for language understanding, but it is an open question whether modern language models (LMs) learn to do so; in this work we examine this question. We first formulate a problem of predicting the LM-internal representations of longer phrases given those of their constituents. We find that the representation of a parent phrase can be predicted with some accuracy given an affine transformation of its children. While we would expect the predictive accuracy to correlate with human judgments of semantic compositionality, we find this is largely not the case, indicating that LMs may not accurately distinguish between compositional and non-compositional phrases. We perform a variety of analyses, shedding light on when different varieties of LMs do and do not generate compositional representations, and discuss implications for future modeling work.
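The probing setup described in the abstract — predicting a parent phrase's representation from an affine transformation of its children's representations — can be illustrated with a minimal sketch. The code below is not the authors' exact configuration: the synthetic data, pooling choice, and cosine-similarity scoring are assumptions standing in for real LM hidden states and the paper's evaluation details; it only shows the general shape of fitting and scoring such an affine probe.

```python
# Minimal sketch (assumptions, not the paper's exact setup): fit an affine map that
# predicts a parent phrase's representation from the concatenation of its two
# children's representations, then score how well the prediction matches the parent.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: d-dimensional vectors for n (child1, child2, parent) triples.
# In practice these would be LM-internal representations of phrases such as
# ("red", "apple", "red apple").
n, d = 1000, 64
left = rng.normal(size=(n, d))
right = rng.normal(size=(n, d))
parent = 0.6 * left + 0.4 * right + 0.05 * rng.normal(size=(n, d))  # toy "compositional" parents

# Affine probe: parent ≈ [left ; right ; 1] @ W, fit by ordinary least squares.
X = np.hstack([left, right, np.ones((n, 1))])   # shape (n, 2d + 1)
W, *_ = np.linalg.lstsq(X, parent, rcond=None)  # shape (2d + 1, d)
pred = X @ W

# Score predictions with cosine similarity; under this probe, a higher score means the
# parent representation is more predictable from (i.e., more "compositional" in) its children.
cos = np.sum(pred * parent, axis=1) / (
    np.linalg.norm(pred, axis=1) * np.linalg.norm(parent, axis=1)
)
print(f"mean cosine similarity: {cos.mean():.3f}")
```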

Authors (2)
  1. Emmy Liu (17 papers)
  2. Graham Neubig (342 papers)
Citations (8)
