
Making Transformers Solve Compositional Tasks (2108.04378v2)

Published 9 Aug 2021 in cs.AI and cs.CL

Abstract: Several studies have reported the inability of Transformer models to generalize compositionally, a key type of generalization in many NLP tasks such as semantic parsing. In this paper we explore the design space of Transformer models, showing that the inductive biases introduced by several design decisions significantly impact compositional generalization. Through this exploration, we identify Transformer configurations that generalize compositionally significantly better than previously reported in the literature across a diverse set of compositional tasks, and that achieve state-of-the-art results on a semantic parsing compositional generalization benchmark (COGS) and on a string edit operation composition benchmark (PCFG).
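
The abstract attributes compositional generalization largely to architectural inductive biases. To make one such design axis concrete, the sketch below shows self-attention with a learned relative-position bias, a common alternative to absolute position embeddings; the class name, the distance-clipping scheme, and the single-head simplification are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F
from torch import nn

class RelativeSelfAttention(nn.Module):
    """Single-head self-attention with a learned relative-position bias.

    Illustrative sketch only: position encoding is one of the design
    decisions the paper studies, but this is not its implementation.
    """

    def __init__(self, d_model: int, max_dist: int = 16):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.scale = d_model ** -0.5
        self.max_dist = max_dist
        # One learned scalar per clipped relative distance in [-max_dist, max_dist].
        self.rel_bias = nn.Embedding(2 * max_dist + 1, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        seq = x.size(1)
        scores = torch.matmul(self.q(x), self.k(x).transpose(-2, -1)) * self.scale
        pos = torch.arange(seq, device=x.device)
        # Relative distances, clipped and shifted into embedding-index range.
        rel = (pos[None, :] - pos[:, None]).clamp(-self.max_dist, self.max_dist) + self.max_dist
        scores = scores + self.rel_bias(rel).squeeze(-1)  # (seq, seq) bias, broadcast over batch
        return torch.matmul(F.softmax(scores, dim=-1), self.v(x))

# Usage: attention over a toy batch.
attn = RelativeSelfAttention(d_model=64)
out = attn(torch.randn(2, 10, 64))  # -> (2, 10, 64)
```

Because the bias depends only on token-to-token distance rather than absolute position, the same attention pattern transfers to sequences longer than those seen in training, which is the kind of inductive bias the abstract argues matters for compositional tasks.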

Authors (4)
  1. Joshua Ainslie (32 papers)
  2. Vaclav Cvicek (5 papers)
  3. Zachary Fisher (13 papers)
  4. Santiago Ontañón (28 papers)
Citations (65)
