Compositional Generalization in Semantic Parsing: Pre-training vs. Specialized Architectures (2007.08970v3)

Published 17 Jul 2020 in cs.CL and cs.LG

Abstract: While mainstream machine learning methods are known to have limited ability to compositionally generalize, new architectures and techniques continue to be proposed to address this limitation. We investigate state-of-the-art techniques and architectures in order to assess their effectiveness in improving compositional generalization in semantic parsing tasks based on the SCAN and CFQ datasets. We show that masked language model (MLM) pre-training rivals SCAN-inspired architectures on primitive holdout splits. On a more complex compositional task, we show that pre-training leads to significant improvements in performance vs. comparable non-pre-trained models, whereas architectures proposed to encourage compositional generalization on SCAN or in the area of algorithm learning fail to lead to significant improvements. We establish a new state of the art on the CFQ compositional generalization benchmark using MLM pre-training together with an intermediate representation.
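As context for the MLM pre-training the abstract refers to, the sketch below illustrates the standard BERT-style masking procedure used to build MLM training targets: mask a fraction of input tokens and train the model to recover the originals. This is a generic illustration, not the paper's exact setup; the 15%/80%/10%/10% rates, the `[MASK]` token, and the `mask_tokens` helper are assumptions for the sketch.

```python
import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """BERT-style MLM masking (generic sketch, not the paper's exact recipe).

    Returns (inputs, labels): labels hold the original token at masked
    positions and None elsewhere, so the loss is computed only on masked slots.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)                    # predict the original token
            r = rng.random()
            if r < 0.8:
                inputs.append(MASK_TOKEN)         # 80%: replace with [MASK]
            elif r < 0.9:
                inputs.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                inputs.append(tok)                # 10%: keep the token unchanged
        else:
            inputs.append(tok)
            labels.append(None)                   # not part of the MLM loss
    return inputs, labels

# Example on a CFQ-style natural-language question (illustrative only).
sentence = "Did M0 direct and edit M1".split()
vocab = sorted(set(sentence))
print(mask_tokens(sentence, vocab))
```

A model pre-trained with this objective on unlabeled text can then be fine-tuned on the SCAN or CFQ input-output pairs, which is the comparison the paper draws against specialized compositional architectures.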

Authors (4)
  1. Daniel Furrer
  2. Marc van Zee
  3. Nathan Scales
  4. Nathanael Schärli
Citations (109)
