INACIA: Integrating Large Language Models in Brazilian Audit Courts: Opportunities and Challenges (2401.05273v3)
Abstract: This paper introduces INACIA (Instrução Assistida com Inteligência Artificial), a groundbreaking system designed to integrate LLMs into the operational framework of the Brazilian Federal Court of Accounts (TCU). The system automates various stages of case analysis, including basic information extraction, admissibility examination, Periculum in mora and Fumus boni iuris analyses, and recommendation generation. Through a series of experiments, we demonstrate INACIA's potential in extracting relevant information from case documents, evaluating its legal plausibility, and formulating propositions for judicial decision-making. Utilizing a validation dataset alongside LLMs, our evaluation methodology presents a novel approach to assessing system performance, correlating highly with human judgment. These results underscore INACIA's potential in complex legal task handling while also acknowledging the current limitations. This study discusses possible improvements and the broader implications of applying AI in legal contexts, suggesting that INACIA represents a significant step towards integrating AI in legal systems globally, albeit with cautious optimism grounded in the empirical findings.