Explain-then-Process: Using Grammar Prompting to Enhance Grammatical Acceptability Judgments
This paper introduces "grammar prompting", an approach for improving the grammatical acceptability judgments of large language models (LLMs). Despite their proficiency in generating coherent text, LLMs often struggle to reason explicitly about grammatical structure. Grammar prompting aims to bridge the gap between LLMs' implicit grammatical knowledge and their explicit application of grammatical rules through an explain-then-process paradigm.
Methodology and Results
The core of this paradigm involves two steps: first, eliciting a concise explanation of a syntactic phenomenon from an LLM; second, feeding this explanation back to the same or another model to guide its grammatical judgments. Experiments on three minimal-pair benchmarks, BLiMP (English), SLING (Chinese), and RuBLiMP (Russian), test the efficacy of grammar prompting.
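A minimal sketch of this two-step pipeline is given below. The `query_model` helper, the prompt wording, and the A/B answer format are illustrative assumptions, not the paper's exact prompts.

```python
# Minimal sketch of the explain-then-process pipeline. The helper
# query_model, the prompt wording, and the A/B answer format are
# illustrative assumptions, not the authors' exact implementation.

def query_model(prompt: str) -> str:
    """Placeholder for a call to any chat-completion API."""
    raise NotImplementedError("wire this to your LLM client of choice")


def explain_then_process(phenomenon: str, sentence_a: str, sentence_b: str) -> str:
    # Step 1: elicit a concise explanation of the syntactic phenomenon.
    explanation = query_model(
        f"In two or three sentences, state the grammatical rule behind "
        f"the phenomenon '{phenomenon}'."
    )

    # Step 2: feed the explanation back to the same (or a smaller) model
    # to guide its acceptability judgment on a minimal pair.
    judgment = query_model(
        f"Grammar note: {explanation}\n\n"
        "Which sentence is grammatically acceptable?\n"
        f"(A) {sentence_a}\n(B) {sentence_b}\n"
        "Answer with A or B."
    )
    return judgment.strip()
```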
On small language models (SLMs), grammar prompting alone closes the accuracy gap with LLMs by 20%, while combining it with chain-of-thought (CoT) reasoning achieves a 56% reduction (from 13.0 to 5.8 percentage points), all at negligible computational cost. The results indicate that grammar prompting markedly improves models' ability to make structured grammatical judgments, especially in categories governed by categorical linguistic constraints (e.g., negative polarity items).
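Combining grammar prompting with chain-of-thought amounts to asking the judging model to reason before answering. The sketch below reuses the hypothetical `query_model` helper above and assumes a simple list of (phenomenon, acceptable, unacceptable) triples; it shows one way such a combination could be scored on a minimal-pair benchmark, not the paper's evaluation code.

```python
# Illustrative scoring loop over minimal pairs; the CoT phrasing and
# data layout are assumptions, not the paper's exact setup.

def judge_with_cot(explanation: str, sentence_a: str, sentence_b: str) -> str:
    """Combine the grammar explanation with a chain-of-thought cue."""
    return query_model(
        f"Grammar note: {explanation}\n\n"
        "Think step by step about which sentence obeys this rule, "
        "then give a final answer of A or B.\n"
        f"(A) {sentence_a}\n(B) {sentence_b}"
    )


def accuracy(pairs: list[tuple[str, str, str]]) -> float:
    """pairs: (phenomenon, acceptable sentence, unacceptable sentence) triples."""
    correct = 0
    for phenomenon, good, bad in pairs:
        explanation = query_model(
            f"In two or three sentences, state the grammatical rule behind "
            f"the phenomenon '{phenomenon}'."
        )
        answer = judge_with_cot(explanation, good, bad)
        # Crude answer check: the acceptable sentence is always option (A)
        # in this sketch; a real evaluation would randomize option order
        # and parse the model's final answer more carefully.
        correct += answer.strip().upper().endswith("A")
    return correct / len(pairs)
```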
Implications
The findings have important implications for improving the linguistic reasoning capabilities of models, particularly in multilingual contexts where access to extensive linguistic resources is limited. By leveraging language-agnostic cues, low-cost SLMs can approach the performance of frontier LLMs, promoting more equitable access to advanced AI capabilities. Furthermore, this approach provides a scalable solution for enhancing LLMs' proficiency in under-resourced languages.
Challenges and Future Directions
While grammar prompting yields substantial gains, challenges persist for phenomena that require finer-grained constituent recognition or language-specific lexical knowledge. Models can still misapply rules because of errors in constituent identification or in their reasoning chains. Future research may focus on improving models' parsing abilities and on integrating explicit grammatical rules with in-context learning examples to further improve accuracy.
Conclusion
This paper presents grammar prompting as a promising approach for moving LLMs' linguistic reasoning from implicit knowledge toward explicit rule application. The framework not only enhances grammatical competence but also acts as a low-cost equalizer between model sizes, making sophisticated language processing capabilities more accessible in diverse and previously underserved linguistic domains.