Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
127 tokens/sec
GPT-4o
11 tokens/sec
Gemini 2.5 Pro Pro
53 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
10 tokens/sec
DeepSeek R1 via Azure Pro
33 tokens/sec
2000 character limit reached

Hybrid Querying Over Relational Databases and Large Language Models (2408.00884v2)

Published 1 Aug 2024 in cs.DB and cs.CL

Abstract: Database queries traditionally operate under the closed-world assumption, providing no answers to questions that require information beyond the data stored in the database. Hybrid querying using SQL offers an alternative by integrating relational databases with LLMs to answer beyond-database questions. In this paper, we present the first cross-domain benchmark, SWAN, containing 120 beyond-database questions over four real-world databases. To leverage state-of-the-art LLMs in addressing these complex questions in SWAN, we present two solutions: one based on schema expansion and the other based on user defined functions. We also discuss optimization opportunities and potential future directions. Our evaluation demonstrates that using GPT-4 Turbo with few-shot prompts, one can achieves up to 40.0\% in execution accuracy and 48.2\% in data factuality. These results highlights both the potential and challenges for hybrid querying. We believe that our work will inspire further research in creating more efficient and accurate data systems that seamlessly integrate relational databases and LLMs to address beyond-database questions.

Summary

We haven't generated a summary for this paper yet.