Cutting Through the Noise: Boosting LLM Performance on Math Word Problems (2406.15444v3)

Published 30 May 2024 in cs.CL

Abstract: LLMs excel at various tasks, including solving math word problems (MWPs), but struggle with real-world problems containing irrelevant information. To address this, we propose a prompting framework that generates adversarial variants of MWPs by adding irrelevant variables. We introduce a dataset, PROBLEMATHIC, containing both adversarial and non-adversarial MWPs. Our experiments reveal that LLMs are susceptible to distraction by numerical noise, resulting in an average relative performance drop of ~26% on adversarial MWPs. To mitigate this, we fine-tune LLMs (Llama-2, Mistral) on the adversarial samples from our dataset. Fine-tuning on adversarial training instances improves performance on adversarial MWPs by ~8%, indicating increased robustness to noise and improved ability to identify relevant data for reasoning. Finally, to assess the generalizability of our prompting framework, we introduce GSM-8K-Adv, an adversarial variant of the GSM-8K benchmark. LLMs continue to struggle when faced with adversarial information, reducing performance by up to 6%.
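
To make the two quantities in the abstract concrete, here is a minimal Python sketch. It is not the authors' prompting framework, which uses an LLM to generate the adversarial variants. The helper names `relative_drop` and `add_distractor`, the sample problem, and the accuracy figures are all hypothetical and chosen only for illustration. The first helper computes a relative performance drop as a fraction of clean accuracy. The second inserts an irrelevant numerical sentence just before the question.

```python
def relative_drop(clean_acc: float, adv_acc: float) -> float:
    """Relative performance drop, expressed as a fraction of clean accuracy."""
    if clean_acc <= 0:
        raise ValueError("clean_acc must be positive")
    return (clean_acc - adv_acc) / clean_acc


def add_distractor(problem: str, distractor: str) -> str:
    """Insert an irrelevant sentence before the final (question) sentence.

    Toy stand-in for adversarial augmentation: the distractor carries a
    number that plays no role in the solution.
    """
    body, sep, question = problem.rpartition(". ")
    if not sep:
        # Single-sentence problem: prepend the distractor instead.
        return f"{distractor} {problem}"
    return f"{body}. {distractor} {question}"


if __name__ == "__main__":
    # Hypothetical problem and distractor, not taken from PROBLEMATHIC.
    clean = "Sam has 5 apples. He buys 3 more. How many apples does Sam have?"
    noisy = add_distractor(clean, "His sister is 12 years old.")
    print(noisy)

    # Hypothetical accuracies: 0.80 on clean, 0.60 on adversarial problems.
    print(f"relative drop: {relative_drop(0.80, 0.60):.0%}")
```

A relative drop is measured against the clean score, so with these made-up figures a fall of 20 percentage points from 80% accuracy is a 25% relative drop.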

Authors (6)
  1. Ujjwala Anantheswaran (6 papers)
  2. Himanshu Gupta (54 papers)
  3. Kevin Scaria (7 papers)
  4. Shreyas Verma (7 papers)
  5. Chitta Baral (152 papers)
  6. Swaroop Mishra (60 papers)
Citations (3)