FinTextQA: A Dataset for Long-form Financial Question Answering (2405.09980v1)
Abstract: Accurate evaluation of financial question answering (QA) systems requires a comprehensive dataset encompassing diverse question types and contexts. However, current financial QA datasets lack scope diversity and question complexity. This work introduces FinTextQA, a novel dataset for long-form question answering (LFQA) in finance. FinTextQA comprises 1,262 high-quality, source-attributed QA pairs extracted and selected from finance textbooks and government agency websites. Moreover, we developed a Retrieval-Augmented Generation (RAG)-based LFQA system comprising an embedder, retriever, reranker, and generator. A multi-faceted evaluation approach, including human ranking, automatic metrics, and GPT-4 scoring, was employed to benchmark the performance of different LFQA system configurations under heightened noise conditions. The results indicate that: (1) among all compared generators, Baichuan2-7B competes closely with GPT-3.5-turbo in accuracy; (2) the most effective system configuration on our dataset sets the embedder, retriever, reranker, and generator to Ada2, Automated Merged Retrieval, Bge-Reranker-Base, and Baichuan2-7B, respectively; (3) models become less susceptible to noise once context length reaches a specific threshold.
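The four-stage pipeline described in the abstract can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: every component here is a hypothetical stand-in (a bag-of-words embedder for Ada2, a cosine-similarity retriever for Automated Merged Retrieval, a term-overlap reranker for Bge-Reranker-Base, and an echo stub in place of the Baichuan2-7B generator).

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9\-]+", text.lower())

def embed(text: str) -> Counter:
    # Toy embedder: bag-of-words term counts (stand-in for Ada2 vectors).
    return Counter(tokenize(text))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    # Toy retriever: top-k passages by embedding similarity
    # (stand-in for Automated Merged Retrieval).
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def rerank(query: str, passages: list[str]) -> list[str]:
    # Toy reranker: reorder candidates by exact query-term overlap
    # (stand-in for a cross-encoder such as Bge-Reranker-Base).
    q_terms = set(tokenize(query))
    return sorted(passages,
                  key=lambda p: len(q_terms & set(tokenize(p))),
                  reverse=True)

def generate(query: str, passages: list[str]) -> str:
    # Toy generator: the real system conditions an LLM on the reranked
    # context; here we just surface the top passage alongside the query.
    return f"Q: {query}\nContext: {passages[0]}"

corpus = [
    "A bond is a fixed-income instrument representing a loan to a borrower.",
    "Equity represents ownership in a company.",
    "Inflation is the rate at which prices rise over time.",
]
query = "What is a bond?"
answer = generate(query, rerank(query, retrieve(query, corpus)))
```

The point of the sketch is the data flow: the retriever narrows the corpus cheaply, the reranker reorders the survivors with a more precise (and more expensive) score, and only then does the generator see the context.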
Authors: Jian Chen, Peilin Zhou, Yining Hua, Yingxin Loh, Kehui Chen, Ziyuan Li, Bing Zhu, Junwei Liang