
Content-Based Table Retrieval for Web Queries (1706.02427v1)

Published 8 Jun 2017 in cs.CL

Abstract: Understanding the connections between unstructured text and semi-structured tables is an important yet neglected problem in natural language processing. In this work, we focus on content-based table retrieval: given a query, the task is to find the most relevant table from a collection of tables. Further progress in this area requires powerful models of semantic matching and richer training and evaluation resources. To address this, we present a ranking-based approach and implement both carefully designed features and neural network architectures to measure the relevance between a query and the content of a table. Furthermore, we release an open-domain dataset that includes 21,113 web queries for 273,816 tables. We conduct comprehensive experiments on both real-world and synthetic datasets. Results verify the effectiveness of our approach and highlight the challenges of this task.
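
To make the task definition concrete, the following is a minimal sketch of content-based table ranking using plain word overlap. It is not the authors' feature set or neural architecture; the table layout (a dict with "caption", "headers", and "cells") and the function names are assumptions made only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def table_text(table):
    # Flatten a table into one string. The dict layout
    # ("caption", "headers", "cells") is a hypothetical schema.
    parts = [table["caption"], *table["headers"]]
    for row in table["cells"]:
        parts.extend(row)
    return " ".join(parts)


def overlap_score(query, table):
    # Fraction of query tokens that also occur in the table content.
    query_counts = Counter(tokenize(query))
    table_counts = Counter(tokenize(table_text(table)))
    total = sum(query_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(c, table_counts[w]) for w, c in query_counts.items())
    return matched / total


def rank_tables(query, tables):
    # Return the candidate tables sorted from most to least relevant.
    return sorted(tables, key=lambda t: overlap_score(query, t), reverse=True)


population = {
    "caption": "Countries by population",
    "headers": ["Country", "Population"],
    "cells": [["France", "67 million"]],
}
medals = {
    "caption": "Olympic medals",
    "headers": ["Country", "Gold"],
    "cells": [["France", "10"]],
}
ranked = rank_tables("population of france", [medals, population])
print(ranked[0]["caption"])
```

A lexical baseline like this fails when the query and the table use different wording, which is the gap that semantic matching models are meant to close.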

Authors (7)
  1. Zhao Yan (16 papers)
  2. Duyu Tang (65 papers)
  3. Nan Duan (172 papers)
  4. Junwei Bao (34 papers)
  5. Yuanhua Lv (6 papers)
  6. Ming Zhou (182 papers)
  7. Zhoujun Li (122 papers)
Citations (20)