Case-Based Reasoning Approach for Solving Financial Question Answering (2405.13044v1)
Abstract: Measuring a machine's understanding of human language often involves assessing its reasoning skills, i.e. logical process of deriving answers to questions. While recent LLMs have shown remarkable proficiency in text based tasks, their efficacy in complex reasoning problems involving heterogeneous information such as text, tables, and numbers remain uncertain. Addressing this gap, FinQA introduced a numerical reasoning dataset for financial documents and simultaneously proposed a program generation approach . Our investigation reveals that half of the errors (48%) stem from incorrect operations being generated. To address this issue, we propose a novel approach to tackle numerical reasoning problems using case based reasoning (CBR), an artificial intelligence paradigm that provides problem solving guidance by offering similar cases (i.e. similar questions and corresponding logical programs). Our model retrieves relevant cases to address a given question, and then generates an answer based on the retrieved cases and contextual information. Through experiments on the FinQA dataset, we demonstrate competitive performance of our approach and additionally show that by expanding case repository, we can help solving complex multi step programs which FinQA showed weakness of.
- Agnar Aamodt and Enric Plaza. 1994. Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI communications, 7(1):39–59.
- Mathqa: Towards interpretable math word problem solving with operation-based formalisms. arXiv preprint arXiv:1905.13319.
- Natural language interfaces to databases–an introduction. Natural language engineering, 1(1):29–81.
- Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
- Neural symbolic reader: Scalable integration of distributed and symbolic representations for reading comprehension. In International Conference on Learning Representations.
- Finqa: A dataset of numerical reasoning over financial data. arXiv preprint arXiv:2109.00122.
- Electra: Pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555.
- Case-based reasoning for natural language queries over knowledge bases. arXiv preprint arXiv:2104.08762.
- Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
- Wizard of wikipedia: Knowledge-powered conversational agents. arXiv preprint arXiv:1811.01241.
- Table pre-training: A survey on model architectures, pre-training objectives, and downstream tasks. arXiv preprint arXiv:2201.09745.
- Tapas: Weakly supervised table parsing via pre-training. arXiv preprint arXiv:2004.02349.
- Poly-encoders: Transformer architectures and pre-training strategies for fast and accurate multi-sentence scoring. arXiv preprint arXiv:1905.01969.
- Dense passage retrieval for open-domain question answering. arXiv preprint arXiv:2004.04906.
- Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942.
- Vladimir I Levenshtein et al. 1966. Binary codes capable of correcting deletions, insertions, and reversals. In Soviet physics doklady, volume 10, pages 707–710. Soviet Union.
- Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692.
- Training millions of personalized dialogue agents. arXiv preprint arXiv:1809.01984.
- Panupong Pasupat and Percy Liang. 2015. Compositional semantic parsing on semi-structured tables. arXiv preprint arXiv:1508.00305.
- Squad: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250.
- Okapi at trec-3. Nist Special Publication Sp, 109:109.
- Qa dataset explosion: A taxonomy of nlp resources for question answering and reading comprehension. ACM Computing Surveys, 55(10):1–45.
- Karen Sparck Jones. 1972. A statistical interpretation of term specificity and its application in retrieval. Journal of documentation, 28(1):11–21.
- Sequence to sequence learning with neural networks. Advances in neural information processing systems, 27.
- Jesse Vig and Kalai Ramea. 2019. Comparison of transfer-learning approaches for response selection in multi-turn conversations. In Workshop on DSTC7.
- Transfertransfo: A transfer learning approach for neural network based conversational agents. arXiv preprint arXiv:1901.08149.
- Tabert: Pretraining for joint understanding of textual and tabular data. arXiv preprint arXiv:2005.08314.
- Multihiertt: Numerical reasoning over multi hierarchical tabular and textual data. arXiv preprint arXiv:2206.01347.
- Tat-qa: A question answering benchmark on a hybrid of tabular and textual content in finance. arXiv preprint arXiv:2105.07624.
- Yikyung Kim (1 paper)
- Jay-Yoon Lee (16 papers)