
Open Question Answering over Tables and Text (2010.10439v2)

Published 20 Oct 2020 in cs.CL and cs.AI

Abstract: In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question. Most open QA systems have considered only retrieving information from unstructured text. Here we consider for the first time open QA over both tabular and textual data and present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task. Most questions in OTT-QA require multi-hop inference across tabular data and unstructured text, and the evidence required to answer a question can be distributed in different ways over these two types of input, making evidence retrieval challenging -- our baseline model using an iterative retriever and BERT-based reader achieves an exact match score less than 10%. We then propose two novel techniques to address the challenge of retrieving and aggregating evidence for OTT-QA. The first technique is to use "early fusion" to group multiple highly relevant tabular and textual units into a fused block, which provides more context for the retriever to search for. The second technique is to use a cross-block reader to model the cross-dependency between multiple retrieved evidence with global-local sparse attention. Combining these two techniques improves the score significantly, to above 27%.

Overview of Open Question Answering over Tables and Text

The paper "Open Question Answering over Tables and Text" introduces an open-domain question answering (QA) system that jointly leverages structured tabular data and unstructured text. It also introduces the Open Table-and-Text Question Answering (OTT-QA) dataset, designed specifically to evaluate QA systems on this combined input format.

Key Contributions

OTT-QA tackles a limitation of existing open QA systems, which typically retrieve and read only unstructured text, by integrating tables as an additional source of evidence. Tables often store aggregated numeric facts and entity collections that are rarely stated in free text, so combining the two formats enables richer information extraction.

The key contributions of the paper include:

  1. OTT-QA Dataset: A significant addition to the QA domain, the OTT-QA dataset consists of 45,000 human-annotated questions requiring multi-hop reasoning across both tables and text. This dataset pushes the boundaries of current QA models by requiring them to effectively retrieve and synthesize dispersed evidence from mixed data formats.
  2. Fusion Retriever and Cross-Block Reader: The authors propose two new techniques to handle the challenges of evidence retrieval and aggregation:
    • Fusion Retriever: This technique involves early fusion of multiple relevant tabular and textual units into a single fused block, enhancing the context available for retrieval.
    • Cross-Block Reader: This reader uses global-local sparse attention, allowing it to model dependencies across multiple retrieved evidence blocks. This strategy effectively reduces the burden of processing long sequences and allows for cross-referencing between different blocks of evidence.
  3. Evaluation and Results: By combining the fusion retriever and cross-block reader, the proposed system demonstrates a significant improvement in performance with an exact match score above 27%, compared to less than 10% with baseline models using iterative retrievers and BERT-based readers.
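The early-fusion step can be illustrated with a short sketch. Here a table segment and the passages linked from its cells are concatenated into a single retrieval unit under a token budget; the function name `fuse_block`, the whitespace tokenizer, and the 512-token budget are illustrative assumptions, not the paper's exact implementation.

```python
def fuse_block(table_segment, linked_passages, max_tokens=512):
    """Fuse a table segment with its linked passages into one retrieval
    unit, truncating to a token budget (whitespace tokens here)."""
    tokens = []
    for part in [table_segment] + linked_passages:
        remaining = max_tokens - len(tokens)
        if remaining <= 0:
            break
        tokens.extend(part.split()[:remaining])
    return " ".join(tokens)

# Example: fuse a table row with descriptions of entities linked from its cells.
segment = "Title: Grammy Awards 1990 | Artist: Bonnie Raitt | Album: Nick of Time"
passages = [
    "Bonnie Raitt is an American singer, guitarist, and songwriter.",
    "Nick of Time is the tenth studio album by Bonnie Raitt, released in 1989.",
]
block = fuse_block(segment, passages)
```

Because the fused block carries both the row and its surrounding textual context, a single retrieval step can surface evidence that would otherwise require iterative hops between a table and its linked passages.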

Implications and Future Research Directions

The paper's approach has significant implications for the development of more flexible and effective QA systems. The techniques introduced could extend beyond text-only datasets, potentially influencing domains that utilize multi-modal data integration, such as information retrieval in business intelligence or scientific research, where data may be heterogeneous.

The integration of tables and text is a crucial step towards more realistic QA tasks that emulate human-like reasoning, where information is often gathered and synthesized from diverse sources.

Future developments in this area might involve further enhancing retrieval through advanced natural language understanding and the integration of additional data types, such as images or audio. The cross-block reader could also be extended to handle even longer sequences, for example by adopting other efficient long-sequence transformer architectures.
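The global-local sparse attention pattern behind the cross-block reader can be sketched as an attention mask: a few global tokens (e.g. the question) attend everywhere, while each evidence block's tokens attend only within their own block. This is a minimal, dependency-free illustration under those assumptions, not the paper's actual reader, which builds on a pretrained long-range transformer.

```python
def sparse_attention_mask(num_blocks, block_len, num_global):
    """Build a boolean mask where mask[i][j] means token i may attend
    to token j: global tokens get full rows and columns, while the
    remaining tokens attend only within their own evidence block."""
    n = num_global + num_blocks * block_len
    mask = [[False] * n for _ in range(n)]
    # Global tokens (e.g. the question) attend to and are attended by all.
    for i in range(n):
        for j in range(n):
            if i < num_global or j < num_global:
                mask[i][j] = True
    # Local tokens: attention restricted to their own block.
    for b in range(num_blocks):
        start = num_global + b * block_len
        for i in range(start, start + block_len):
            for j in range(start, start + block_len):
                mask[i][j] = True
    return mask

# 2 question tokens as global tokens, 3 evidence blocks of 4 tokens each.
mask = sparse_attention_mask(num_blocks=3, block_len=4, num_global=2)
```

The mask is linear in the number of blocks rather than quadratic in total sequence length, which is why this pattern scales to many retrieved blocks while still letting the global tokens cross-reference evidence across blocks.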

In conclusion, this work marks a pivotal step toward computational models capable of human-like question answering, setting the stage for further advances in multi-modal information retrieval and analysis.

Authors (5)
  1. Wenhu Chen (134 papers)
  2. Ming-Wei Chang (44 papers)
  3. Eva Schlinger (2 papers)
  4. William Wang (38 papers)
  5. William W. Cohen (79 papers)
Citations (173)