Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
102 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

TABLET: Learning From Instructions For Tabular Data (2304.13188v1)

Published 25 Apr 2023 in cs.LG and cs.CL

Abstract: Acquiring high-quality data is often a significant challenge in training ML models for tabular prediction, particularly in privacy-sensitive and costly domains like medicine and finance. Providing natural language instructions to LLMs offers an alternative solution. However, it is unclear how effectively instructions leverage the knowledge in LLMs for solving tabular prediction problems. To address this gap, we introduce TABLET, a benchmark of 20 diverse tabular datasets annotated with instructions that vary in their phrasing, granularity, and technicality. Additionally, TABLET includes the instructions' logic and structured modifications to the instructions. We find in-context instructions increase zero-shot F1 performance for Flan-T5 11b by 44% on average and 13% for ChatGPT on TABLET. Also, we explore the limitations of using LLMs for tabular prediction in our benchmark by evaluating instruction faithfulness. We find LLMs often ignore instructions and fail to predict specific instances correctly, even with examples. Our analysis on TABLET shows that, while instructions help LLM performance, learning from instructions for tabular data requires new capabilities.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (2)
  1. Dylan Slack (17 papers)
  2. Sameer Singh (96 papers)
Citations (12)