
Reducing the impact of out of vocabulary words in the translation of natural language questions into SPARQL queries (2111.03000v1)

Published 4 Nov 2021 in cs.CL, cs.IR, and cs.LG

Abstract: Accessing the large volumes of information available in public knowledge bases can be difficult for users unfamiliar with the SPARQL query language. Automatic translation of natural language questions into SPARQL has the potential to overcome this problem. Existing systems based on neural machine translation are very effective but easily fail to recognize words that are out of the vocabulary (OOV) of the training set. This is a serious issue when querying large ontologies. In this paper, we combine Named Entity Linking, Named Entity Recognition, and Neural Machine Translation to automatically translate natural language questions into SPARQL queries. We demonstrate empirically that our approach is more effective and more resilient to OOV words than existing approaches by running experiments on Monument, QALD-9, and LC-QuAD v1, which are well-known datasets for Question Answering over DBpedia.
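
The abstract does not say how the three components are wired together, so the following is only a minimal sketch of one plausible arrangement, not the authors' method: entity mentions are replaced by placeholders before translation, so the translator never sees rare entity names, and the linked URIs are substituted back afterwards. Everything here is hypothetical and for illustration: the dictionary `TOY_LINKS` stands in for a real NER and NEL system, `toy_translate` stands in for a trained NMT model, and the `dbr:`/`dbo:` prefixes are assumed to be declared elsewhere.

```python
# Hypothetical toy lookup standing in for a real NER + NEL system:
# it maps surface mentions to DBpedia-style resource identifiers.
TOY_LINKS = {
    "Eiffel Tower": "dbr:Eiffel_Tower",
    "Colosseum": "dbr:Colosseum",
}

# Hypothetical template table standing in for a trained NMT model.
# It only ever sees placeholders, never the entity names themselves.
TOY_TEMPLATES = {
    "Where is the <ENT0> located?": "SELECT ?x WHERE { <ENT0> dbo:location ?x }",
}


def mask_entities(question, links):
    """Replace known mentions with placeholders; return masked text and slot map."""
    slots = {}
    masked = question
    # Try longer mentions first so a short mention cannot split a longer one.
    for mention in sorted(links, key=len, reverse=True):
        if mention in masked:
            slot = f"<ENT{len(slots)}>"
            masked = masked.replace(mention, slot)
            slots[slot] = links[mention]
    return masked, slots


def toy_translate(masked_question):
    """Stand-in for the translation step: look up a query template."""
    return TOY_TEMPLATES[masked_question]


def fill_slots(query_template, slots):
    """Substitute each placeholder in the query with its linked identifier."""
    query = query_template
    for slot, uri in slots.items():
        query = query.replace(slot, uri)
    return query


def question_to_sparql(question):
    """Mask entities, translate the masked question, then restore the entities."""
    masked, slots = mask_entities(question, TOY_LINKS)
    return fill_slots(toy_translate(masked), slots)


if __name__ == "__main__":
    print(question_to_sparql("Where is the Colosseum located?"))
```

In this arrangement the same template serves any entity the linker can resolve, which is why masking is a common way to keep entity names out of the translator's vocabulary.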

Authors (3)
  1. Manuel A. Borroto Santana (1 paper)
  2. Francesco Ricca (36 papers)
  3. Bernardo Cuteri (4 papers)
Citations (3)
