
Explain-then-Process: Using Grammar Prompting to Enhance Grammatical Acceptability Judgments (2506.02302v1)

Published 2 Jun 2025 in cs.CL and cs.AI

Abstract: LLMs can explain grammatical rules, yet they often fail to apply those rules when judging sentence acceptability. We present "grammar prompting", an explain-then-process paradigm: a large LLM first produces a concise explanation of the relevant syntactic phenomenon, then that explanation is fed back as additional context to the target model -- either an LLM or a smaller LLM (SLM) -- before deciding which sentence of a minimal pair is grammatical. On the English BLiMP, Chinese SLING, and Russian RuBLiMP benchmarks, this simple prompt design yields substantial improvements over strong baselines across many syntactic phenomena. Feeding an LLM's metalinguistic explanation back to the target model bridges the gap between knowing a rule and using it. On SLMs, grammar prompting alone trims the average LLM-SLM accuracy gap by about 20%, and when paired with chain-of-thought, by 56% (13.0 pp -> 5.8 pp), all at negligible cost. The lightweight, language-agnostic cue lets low-cost SLMs approach frontier-LLM performance in multilingual settings.

This paper introduces "grammar prompting", an explain-then-process paradigm for improving the grammatical acceptability judgments of LLMs. Although LLMs generate coherent text and can articulate grammatical rules, they often fail to apply those rules when judging sentences. Grammar prompting aims to bridge this gap between a model's metalinguistic knowledge of a rule and its ability to use that rule.

Methodology and Results

The paradigm involves two steps: first, eliciting a concise explanation of the relevant syntactic phenomenon from a large LLM; second, feeding that explanation back to the target model, either the same LLM or a smaller one (SLM), before it decides which sentence of a minimal pair is grammatical. Experiments on three minimal-pair benchmarks, BLiMP (English), SLING (Chinese), and RuBLiMP (Russian), test the efficacy of grammar prompting.
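The pipeline itself is lightweight. Below is a minimal sketch in Python, assuming a generic chat-completion helper (the `chat` function is a hypothetical stand-in for whatever client you use) and illustrative prompt wording rather than the paper's exact templates:

```python
def chat(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion call; wire to a real LLM client."""
    raise NotImplementedError("connect an LLM API here")

def grammar_prompt_judge(explainer: str, target: str, phenomenon: str,
                         sent_a: str, sent_b: str) -> str:
    # Step 1: a large LLM produces a concise explanation of the phenomenon.
    explanation = chat(
        explainer,
        f"In one or two sentences, explain the grammatical rule behind "
        f"{phenomenon}.",
    )
    # Step 2: the explanation is fed back as context to the target model
    # (the same LLM or a smaller SLM), which then judges the minimal pair.
    verdict = chat(
        target,
        f"Rule: {explanation}\n"
        f"Which sentence is grammatical, A or B?\n"
        f"A: {sent_a}\nB: {sent_b}\n"
        f"Answer with a single letter.",
    )
    return verdict.strip()

# Example call (model names, phenomenon, and sentences are illustrative):
# grammar_prompt_judge("frontier-llm", "small-slm",
#                      "negative polarity item licensing",
#                      "No one has ever left.", "Someone has ever left.")
```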

On smaller LLMs (SLMs), grammar prompting alone reduces the average accuracy gap with frontier LLMs by about 20%, while combining it with chain-of-thought (CoT) reasoning achieves a 56% reduction (from 13.0 percentage points to 5.8 pp), all at negligible computational cost. The results indicate that grammar prompting significantly improves models' ability to make structured grammatical judgments, especially in categories governed by categorical linguistic constraints such as negative polarity items (e.g., "No one has ever left" is acceptable, while "Someone has ever left" is not).
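The headline reduction is simple arithmetic; as a quick sanity check, assuming the gap is the difference between average LLM and SLM accuracy in percentage points:

```python
def relative_gap_reduction(baseline_gap_pp: float, new_gap_pp: float) -> float:
    """Fraction of the baseline LLM-SLM accuracy gap that is closed."""
    return (baseline_gap_pp - new_gap_pp) / baseline_gap_pp

# Grammar prompting + chain-of-thought: 13.0 pp -> 5.8 pp
print(f"{relative_gap_reduction(13.0, 5.8):.1%}")  # 55.4%, in line with the reported ~56%
```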

Implications

The findings have important implications for improving the linguistic reasoning capabilities of models, particularly in multilingual contexts where access to extensive linguistic resources is limited. By leveraging language-agnostic cues, low-cost SLMs can approach the performance of frontier LLMs, promoting more equitable access to advanced AI capabilities. Furthermore, this approach provides a scalable solution for enhancing LLMs' proficiency in under-resourced languages.

Challenges and Future Directions

While grammar prompting shows substantial gains, challenges persist for phenomena that require finer-grained constituent recognition or language-specific lexical knowledge, and models occasionally misapply rules because of errors in constituent identification or in their reasoning chains. Future research may focus on improving models' parsing abilities and on integrating explicit grammatical rules with in-context examples to further improve judgment accuracy.

Conclusion

This paper presents grammar prompting as a promising way to move LLMs from implicit grammatical knowledge to explicit rule application. The framework not only improves grammatical competence but also acts as a low-cost equalizer across model sizes, making sophisticated language processing more accessible for diverse and previously underserved languages.

Authors (4)
  1. Russell Scheinberg (4 papers)
  2. Ameeta Agrawal (23 papers)
  3. Amber Shore (3 papers)
  4. So Young Lee (4 papers)