
ESGReveal: An LLM-based approach for extracting structured data from ESG reports (2312.17264v1)

Published 25 Dec 2023 in cs.CL and cs.IR

Abstract: ESGReveal is a method for efficiently extracting and analyzing Environmental, Social, and Governance (ESG) data from corporate reports, addressing the critical need for reliable ESG information retrieval. The approach uses Large Language Models (LLMs) enhanced with Retrieval Augmented Generation (RAG) techniques. The ESGReveal system includes an ESG metadata module for targeted queries, a preprocessing module for assembling databases, and an LLM agent for data extraction. Its efficacy was evaluated using ESG reports from 166 companies across various sectors listed on the Hong Kong Stock Exchange in 2022, ensuring comprehensive industry and market capitalization representation. With GPT-4, ESGReveal achieved an accuracy of 76.9% in data extraction and 83.7% in disclosure analysis, an improvement over baseline models. This highlights the framework's capacity to improve the precision of ESG data analysis. The evaluation also revealed a need for stronger ESG disclosures: environmental and social data disclosure rates stood at 69.5% and 57.2%, respectively, leaving room for greater corporate transparency. Current iterations of ESGReveal do not process pictorial information, a functionality intended for future enhancement. The study also calls for continued research to further develop and compare the analytical capabilities of various LLMs. In summary, ESGReveal is a step forward in ESG data processing, offering stakeholders a tool to better evaluate and advance corporate sustainability efforts. Its further development is promising for promoting transparency in corporate reporting and aligning with broader sustainable development aims.
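
The three components named in the abstract (metadata for targeted queries, a preprocessed database of report text, and an LLM agent) suggest a retrieve-then-prompt flow. The following is a minimal sketch of that flow under stated assumptions, not the authors' implementation: the function names, the bag-of-words similarity, the prompt wording, and the sample report sentences are all hypothetical, and the LLM call itself is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query.

    A toy stand-in for the retrieval step; a real system would
    likely use learned embeddings and a vector store instead.
    """
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(metadata, context):
    """Assemble an extraction prompt from a metadata entry and retrieved text."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        f"Indicator: {metadata['indicator']}\n"
        f"Expected unit: {metadata['unit']}\n"
        f"Report excerpts:\n{joined}\n"
        "Extract the value for the indicator, or answer 'not disclosed'."
    )


# Hypothetical report chunks (invented for illustration only).
chunks = [
    "Total greenhouse gas emissions in 2022 were 12,500 tonnes CO2e.",
    "The board met six times during the year.",
    "Employee turnover rate was 8.4% in 2022.",
]

# Hypothetical metadata entry describing one ESG indicator to look up.
metadata = {"indicator": "greenhouse gas emissions", "unit": "tonnes CO2e"}

top = retrieve(metadata["indicator"], chunks, k=1)
prompt = build_prompt(metadata, top)
print(prompt)
```

The resulting prompt would then be passed to an LLM such as GPT-4. Driving retrieval from a metadata entry rather than a free-form question keeps each query tied to one indicator, which matches the abstract's description of "targeted queries".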
