DBTagger: Multi-Task Learning for Keyword Mapping in NLIDBs Using Bi-Directional Recurrent Neural Networks (2101.04226v1)

Published 11 Jan 2021 in cs.DB and cs.CL

Abstract: Translating Natural Language Queries (NLQs) to Structured Query Language (SQL) in interfaces deployed on relational databases is a challenging task that has recently been widely studied in the database community. Conventional rule-based systems use a pipeline of solutions to handle each step of this task, namely stop word filtering, tokenization, stemming/lemmatization, parsing, tagging, and translation. Recent works have mostly focused on the translation step and have overlooked the earlier steps, handling them with ad-hoc solutions. One of the most critical and challenging problems in the pipeline is keyword mapping: constructing a mapping between tokens in the query and relational database elements (tables, attributes, values, etc.). We define the keyword mapping problem as a sequence tagging problem, and propose a novel deep-learning-based supervised approach that utilizes the POS tags of NLQs. Our proposed approach, called DBTagger (DataBase Tagger), is an end-to-end and schema-independent solution, which makes it practical for various relational databases. We evaluate our approach on eight different datasets and report new state-of-the-art accuracy results, 92.4% on average. Our results also indicate that DBTagger is up to 10,000 times faster than its counterparts and scales to larger databases.
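
To make the sequence tagging framing concrete, the following minimal Python sketch shows what the output of keyword mapping could look like: one tag per query token, from which the tokens mapped to each kind of database element can be collected. It is an illustration only and not the DBTagger model. The tag names (`TABLE`, `ATTR`, `VALUE`, `O`), the helper `group_by_tag`, and the example query are assumptions introduced here; the abstract only states that tokens are mapped to tables, attributes, values, and similar elements.

```python
from collections import defaultdict

# Hypothetical label set: the abstract names tables, attributes, and values as
# mapping targets; these exact tag names are assumptions for illustration.
TAGS = {"TABLE", "ATTR", "VALUE", "O"}


def group_by_tag(tokens, tags):
    """Collect tokens per schema-element tag, skipping unmapped ('O') tokens."""
    if len(tokens) != len(tags):
        raise ValueError("sequence tagging assigns exactly one tag per token")
    grouped = defaultdict(list)
    for token, tag in zip(tokens, tags):
        if tag not in TAGS:
            raise ValueError(f"unknown tag: {tag}")
        if tag != "O":
            grouped[tag].append(token)
    return dict(grouped)


# Hand-labeled example query (invented data, not taken from the paper's datasets).
tokens = ["show", "movies", "directed", "by", "Nolan"]
tags = ["O", "TABLE", "ATTR", "O", "VALUE"]
mapping = group_by_tag(tokens, tags)
print(mapping)
```

In a learned tagger, the `tags` list would be predicted by the model rather than written by hand; the one-tag-per-token constraint checked above is what makes the problem a sequence tagging task.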

Authors (3)

Özgür Ulusoy
Arif Usta
Akifhan Karakayali
