Learning to Reduce: Towards Improving Performance of Large Language Models on Structured Data (2407.02750v1)

Published 3 Jul 2024 in cs.CL

Abstract: LLMs have been achieving competent performance on a wide range of downstream tasks, yet existing work shows that inference on structured data is challenging for LLMs. This is because LLMs need to either understand long structured data or select the most relevant evidence before inference, and neither approach is trivial. This paper proposes a framework, Learning to Reduce, which fine-tunes an LLM with On-Policy Learning to generate a reduced version of an input structured data. When compared to state-of-the-art LLMs like GPT-4, Learning to Reduce not only achieves outstanding performance in reducing the input, but also shows generalizability across different datasets. We further show that the model fine-tuned with our framework helps LLMs perform better on table QA tasks, especially when the context is long.
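The abstract describes a two-stage pipeline: a fine-tuned reducer first prunes the structured input down to the evidence relevant to the question, and a downstream LLM then answers over the shorter, reduced context. The sketch below illustrates that flow under stated assumptions; `reduce_table` and `answer_question` are hypothetical stand-ins for the fine-tuned reducer and the downstream LLM call, neither of which is given as code in the paper.

```python
from typing import Callable, List

Table = List[List[str]]  # header row followed by data rows


def reduce_then_answer(
    table: Table,
    question: str,
    reduce_table: Callable[[Table, str], Table],    # fine-tuned reducer (hypothetical interface)
    answer_question: Callable[[Table, str], str],   # downstream LLM QA call (hypothetical interface)
) -> str:
    """Two-stage table QA: reduce the table first, then answer over the reduced context."""
    reduced = reduce_table(table, question)         # keep only evidence relevant to the question
    return answer_question(reduced, question)


# Toy stand-ins so the sketch runs end to end; a real system would back these with models.
def toy_reducer(table: Table, question: str) -> Table:
    header, *rows = table
    # Naive relevance filter: keep rows that share a token with the question.
    keep = [r for r in rows if any(tok.lower() in question.lower() for tok in r)]
    return [header] + (keep or rows)


def toy_qa(table: Table, question: str) -> str:
    return f"answer derived from {len(table) - 1} row(s) of reduced context"


if __name__ == "__main__":
    table = [["city", "population"], ["Lafayette", "70000"], ["Austin", "960000"]]
    print(reduce_then_answer(table, "What is the population of Austin?", toy_reducer, toy_qa))
```

The point of the design, as the abstract frames it, is that shrinking the structured input before inference matters most when the original context is long, since the downstream LLM then reasons over far fewer irrelevant rows.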

Authors (5)
  1. Younghun Lee (6 papers)
  2. Sungchul Kim (65 papers)
  3. Ryan A. Rossi (124 papers)
  4. Tong Yu (119 papers)
  5. Xiang Chen (343 papers)
Citations (1)