Revisiting the Compositional Generalization Abilities of Neural Sequence Models (2203.07402v1)

Published 14 Mar 2022 in cs.CL

Abstract: Compositional generalization is a fundamental trait in humans, allowing us to effortlessly combine known phrases to form novel sentences. Recent works have claimed that standard seq-to-seq models severely lack the ability to compositionally generalize. In this paper, we focus on one-shot primitive generalization as introduced by the popular SCAN benchmark. We demonstrate that modifying the training distribution in simple and intuitive ways enables standard seq-to-seq models to achieve near-perfect generalization performance, thereby showing that their compositional generalization abilities were previously underestimated. We perform a detailed empirical analysis of this phenomenon. Our results indicate that the generalization performance of models is highly sensitive to the characteristics of the training data, which should be carefully considered when designing such benchmarks in the future.
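To make the setting concrete, below is a minimal sketch of a SCAN-style one-shot primitive split ("add jump") and one hypothetical way the training distribution might be modified. The toy grammar and the oversampling tweak are illustrative assumptions for exposition, not the paper's actual data-generation procedure or its specific modifications.

```python
# Toy SCAN-like setup: primitive commands map to actions, and simple modifiers
# compose them (the real SCAN grammar is richer). Illustrative only.
import random

PRIMITIVES = {"walk": "WALK", "run": "RUN", "look": "LOOK", "jump": "JUMP"}

def compose(prim: str) -> list[tuple[str, str]]:
    """Generate a few composed command/action pairs for one primitive."""
    act = PRIMITIVES[prim]
    return [
        (prim, act),
        (f"{prim} twice", f"{act} {act}"),
        (f"{prim} and walk", f"{act} WALK"),
        (f"turn left and {prim}", f"LTURN {act}"),
    ]

# One-shot split: training data contains full compositions of the familiar
# primitives, but only the bare "jump" example for the held-out primitive.
train = [pair for p in ("walk", "run", "look") for pair in compose(p)]
train.append(("jump", "JUMP"))

# Test set: compositions involving the held-out primitive.
test = [pair for pair in compose("jump") if pair != ("jump", "JUMP")]

# One simple, hypothetical modification of the training distribution:
# oversample the lone "jump" example so the held-out primitive appears
# roughly as often as the familiar ones during training. Whether this
# matches the paper's exact intervention is an assumption.
oversampled_train = train + [("jump", "JUMP")] * (len(train) // 4)
random.shuffle(oversampled_train)

print(f"train={len(train)}, test={len(test)}, oversampled={len(oversampled_train)}")
```

The point of the sketch is only to show the structure of the benchmark: at test time the model must compose a primitive it has seen exactly once in isolation, and the abstract's claim is that simple changes to how that primitive appears in training can largely close the generalization gap.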

Authors (4)
  1. Arkil Patel (14 papers)
  2. Satwik Bhattamishra (13 papers)
  3. Phil Blunsom (87 papers)
  4. Navin Goyal (42 papers)
Citations (30)
