Towards reducing hallucination in extracting information from financial reports using Large Language Models (2310.10760v1)

Published 16 Oct 2023 in cs.CL, q-fin.PM, q-fin.ST, and stat.AP

Abstract: For a financial analyst, the question and answer (Q&A) segment of a company's financial report is a crucial source of information for various analyses and investment decisions. However, extracting valuable insights from the Q&A section poses considerable challenges: conventional methods such as detailed reading and note-taking lack scalability and are susceptible to human error, while Optical Character Recognition (OCR) and similar techniques have difficulty accurately processing unstructured transcript text and often miss subtle linguistic nuances that drive investor decisions. Here, we demonstrate the use of LLMs to extract information from earnings report transcripts rapidly and with high accuracy, reducing hallucination by combining retrieval-augmented generation with metadata. We evaluate the outputs of various LLMs with and without our proposed approach on several objective metrics for evaluating Q&A systems, and empirically demonstrate the superiority of our method.
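
The combination of retrieval-augmented generation and metadata mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Chunk` type, the metadata keys (`company`, `quarter`), the bag-of-words similarity, the prompt wording, and the sample transcript lines are all assumptions made for illustration. A real system would likely use a learned embedding model and a vector store. The idea shown is only that filtering candidate passages by metadata before ranking keeps passages from the wrong company or quarter out of the prompt.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    metadata: dict  # hypothetical keys, e.g. {"company": ..., "quarter": ...}


def vectorize(text):
    # Bag-of-words term counts; a stand-in for a real embedding model.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, filters, k=2):
    # Step 1: keep only chunks whose metadata matches every filter.
    candidates = [
        c for c in chunks
        if all(c.metadata.get(key) == value for key, value in filters.items())
    ]
    # Step 2: rank the remaining chunks by similarity to the question.
    q = vectorize(question)
    ranked = sorted(candidates, key=lambda c: cosine(q, vectorize(c.text)), reverse=True)
    return ranked[:k]


def build_prompt(question, retrieved):
    # Restrict the model to the retrieved context to discourage unsupported answers.
    context = "\n".join(
        f"[{c.metadata.get('company')} {c.metadata.get('quarter')}] {c.text}"
        for c in retrieved
    )
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Invented sample data, for illustration only.
chunks = [
    Chunk("Revenue grew eight percent driven by cloud subscriptions",
          {"company": "ACME", "quarter": "Q2 2023"}),
    Chunk("Revenue grew three percent in the first quarter",
          {"company": "ACME", "quarter": "Q1 2023"}),
    Chunk("Revenue declined due to weaker hardware demand",
          {"company": "Globex", "quarter": "Q2 2023"}),
]

question = "How much did revenue grow?"
top = retrieve(question, chunks, {"company": "ACME", "quarter": "Q2 2023"})
prompt = build_prompt(question, top)
```

In this sketch the resulting `prompt` would then be sent to an LLM. Because the metadata filter runs before similarity ranking, the Globex and Q1 passages never reach the model, even though they share vocabulary with the question.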
