DISC-LawLLM: Fine-tuning Large Language Models for Intelligent Legal Services (2309.11325v2)

Published 20 Sep 2023 in cs.CL

Abstract: We propose DISC-LawLLM, an intelligent legal system utilizing LLMs to provide a wide range of legal services. We adopt legal syllogism prompting strategies to construct supervised fine-tuning datasets in the Chinese Judicial domain and fine-tune LLMs with legal reasoning capability. We augment LLMs with a retrieval module to enhance models' ability to access and utilize external legal knowledge. A comprehensive legal benchmark, DISC-Law-Eval, is presented to evaluate intelligent legal systems from both objective and subjective dimensions. Quantitative and qualitative results on DISC-Law-Eval demonstrate the effectiveness of our system in serving various users across diverse legal scenarios. The detailed resources are available at https://github.com/FudanDISC/DISC-LawLLM.

DISC-LawLLM: Advancing Legal Services with LLMs

The paper introduces DISC-LawLLM, a system designed to leverage LLMs for a wide range of intelligent legal services. Built around a legal syllogism prompting strategy, the system fine-tunes an LLM to improve legal reasoning in the Chinese judicial context. A retrieval module lets DISC-LawLLM draw on external legal knowledge, keeping its answers aligned with evolving statutes and case law.

Methodology and Dataset Construction

The authors construct a supervised fine-tuning dataset, DISC-Law-SFT, organized into distinct subsets that target legal reasoning and the integration of domain-specific knowledge. The data is drawn from multiple sources, including public legal NLP task datasets, raw legal text, and open-source instruction datasets. GPT-3.5-turbo is used to restructure outputs so they follow the legal syllogism form (major premise: the applicable law; minor premise: the case facts; conclusion: the judgment), yielding instruction samples for tasks such as legal information extraction, judgment prediction, and text summarization. A sketch of what such a sample might look like follows.
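
To make the data format concrete, here is a minimal sketch of a syllogism-structured instruction sample. The field names, the file name, and the example case are illustrative assumptions, not the paper's exact schema.

```python
import json

# Hypothetical instruction sample for judgment prediction, with the output
# restructured as a legal syllogism: major premise (applicable law), minor
# premise (case facts), conclusion (judgment). Field names are illustrative.
sample = {
    "instruction": "Predict the judgment for the following case facts.",
    "input": "The defendant took goods worth 8,000 RMB from a store without paying.",
    "output": (
        "Major premise: Article 264 of the Criminal Law provides that theft of "
        "property of a relatively large amount is punishable by fixed-term "
        "imprisonment, criminal detention, or public surveillance.\n"
        "Minor premise: The defendant secretly took goods worth 8,000 RMB, "
        "which constitutes a relatively large amount.\n"
        "Conclusion: The defendant is guilty of theft and should be sentenced "
        "accordingly."
    ),
}

# SFT corpora of this kind are commonly stored as JSON Lines, one sample per line.
with open("disc_law_sft_sample.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```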

Training and Model Architecture

DISC-LawLLM is built in two primary stages: supervised fine-tuning (SFT) followed by retrieval augmentation. The architecture is based on Baichuan-13B-Base, a 13.2-billion-parameter model, which is fine-tuned on DISC-Law-SFT. Retrieval augmentation then attaches an external retrieval framework that queries an evolving legal knowledge base at inference time, so responses can draw on accurate and current legal references.
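
The following is a minimal sketch of retrieval augmentation at inference time, assuming a dense retriever over a legal knowledge base. The embed() function is a toy stand-in for a real text encoder, and the knowledge-base entries are illustrative; the paper's actual retriever, encoder, and knowledge base differ.

```python
import numpy as np

# Toy legal knowledge base; in practice this would hold statutes, regulations,
# and judicial interpretations, kept up to date as the law changes.
KNOWLEDGE_BASE = [
    "Article 264: Whoever steals a relatively large amount of property ...",
    "Article 266: Whoever defrauds a relatively large amount of property ...",
    "Article 232: Whoever intentionally commits homicide ...",
]

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy bag-of-hashed-tokens embedding; replace with a real encoder."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k knowledge-base entries most similar to the query."""
    q = embed(query)
    scores = [float(q @ embed(doc)) for doc in KNOWLEDGE_BASE]
    top = np.argsort(scores)[::-1][:k]
    return [KNOWLEDGE_BASE[i] for i in top]

def build_prompt(query: str) -> str:
    """Prepend retrieved statutes so the fine-tuned model can ground its answer."""
    context = "\n".join(retrieve(query))
    return f"Reference articles:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("What is the punishment for stealing goods from a store?"))
```

The design point is that the knowledge base can be updated without retraining the model: only the retrieval index changes, while the fine-tuned LLM consumes whatever references are prepended to its prompt.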

Evaluation Framework

The authors propose a comprehensive evaluation framework, DISC-Law-Eval, covering both objective and subjective assessment. The objective track tests legal knowledge and reasoning with multiple-choice questions drawn from various legal examinations. The subjective track uses a question-answering setup in which GPT-3.5 scores responses for accuracy, completeness, and clarity.
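
Here is a minimal sketch of the two evaluation modes, assuming simple data structures and a 1-10 scale per criterion; the actual DISC-Law-Eval harness, judge prompts, and scoring rubric are more involved.

```python
def objective_accuracy(predictions: list[str], answers: list[str]) -> float:
    """Exact-match accuracy on multiple-choice questions (e.g., 'A', 'BD')."""
    correct = sum(p.strip().upper() == a.strip().upper()
                  for p, a in zip(predictions, answers))
    return correct / len(answers)

# Hypothetical judge prompt; the paper's actual prompt and scale may differ.
JUDGE_TEMPLATE = (
    "You are grading a legal answer. Reference answer:\n{reference}\n\n"
    "Model answer:\n{candidate}\n\n"
    "Rate accuracy, completeness, and clarity from 1 to 10 each, "
    "and reply as three comma-separated integers."
)

def subjective_scores(reference: str, candidate: str, judge) -> dict[str, int]:
    """Ask an LLM judge (a callable prompt -> str) for per-criterion scores."""
    reply = judge(JUDGE_TEMPLATE.format(reference=reference, candidate=candidate))
    acc, comp, clar = (int(x) for x in reply.split(","))
    return {"accuracy": acc, "completeness": comp, "clarity": clar}

# Example: the objective side needs no API call.
print(objective_accuracy(["A", "bd", "C"], ["A", "BD", "D"]))  # 0.666...
```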

Results and Implications

The results show that DISC-LawLLM surpasses existing general-purpose and legal LLMs on the objective evaluation, outperforming GPT-3.5-turbo in multiple legal domains. This points to stronger jurisprudential reasoning, particularly on complex legal tasks. In the subjective evaluation, DISC-LawLLM achieves higher average scores across the assessed dimensions, underscoring its applicability to real-world scenarios.

Practical and Theoretical Contributions

From a practical perspective, DISC-LawLLM offers substantial advantages over traditional legal information systems: it streamlines tasks for legal professionals, broadens access to legal consultation, and supports law students in their studies. Theoretically, the paper contributes to the field of LegalAI by demonstrating how fine-tuning with legal syllogism data and retrieval mechanisms can strengthen LLM capabilities in a specialized domain.

Future Directions

This paper opens avenues for extending DISC-LawLLM to other legal systems and languages, with the potential to integrate even broader repositories of legal knowledge. Future developments could explore multi-modal inputs and deeper integration with court databases to further enrich the system's applicability and reliability in diverse legal contexts.

Overall, DISC-LawLLM represents a significant step forward in utilizing LLMs for legal applications, setting a robust foundation for future advancements in AI-driven legal services.

References (40)
  1. Baichuan-inc. 2023. Baichuan-13B. https://github.com/baichuan-inc/Baichuan-13B.
  2. LexNLP: Natural language processing and information extraction for legal and regulatory texts. Research Handbook on Big Data Law.
  3. CAIL. 2020. CAIL2020. https://github.com/china-ai-law-challenge/CAIL2020.
  4. CAIL. 2022. CAIL2022. https://github.com/china-ai-law-challenge/CAIL2022.
  5. Joint entity and relation extraction for legal documents with legal feature enhancement. In Proceedings of the 28th International Conference on Computational Linguistics, pages 1561–1571, Barcelona, Spain (Online). International Committee on Computational Linguistics.
  6. ChatLaw: Open-source legal large language model with integrated external knowledge bases.
  7. Efficient and effective text encoding for Chinese LLaMA and Alpaca. arXiv preprint arXiv:2304.08177.
  8. GLM: General language model pretraining with autoregressive blank infilling. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 320–335.
  9. CJRC: A reliable human-annotated benchmark dataset for Chinese judicial reading comprehension. In Chinese Computational Linguistics, pages 439–451, Cham. Springer International Publishing.
  10. Anne von der Lieth Gardner. 1987. An Artificial Intelligence Approach to Legal Reasoning. MIT Press.
  11. Lawyer LLaMA technical report. arXiv preprint arXiv:2305.15062.
  12. IDEA-CCNL. 2021. Fengshenbang-LM. https://github.com/IDEA-CCNL/Fengshenbang-LM.
  13. Incorporating argument-level interactions for persuasion comments evaluation using co-attention model. In Proceedings of the 27th International Conference on Computational Linguistics, pages 3703–3714.
  14. Discrete argument representation learning for interactive argument pair identification. arXiv preprint arXiv:1911.01621.
  15. Cong Jiang and Xiaolei Yang. 2023. Legal syllogism prompting: Teaching large language models for legal judgment prediction. arXiv preprint arXiv:2307.08321.
  16. Answering legal questions by learning neural attentive text representation. In Proceedings of the 28th International Conference on Computational Linguistics, pages 988–998.
  17. Haitao Li. 2023. LexiLaw. https://github.com/CSHaitao/LexiLaw.
  18. LawGPT. https://github.com/LiuHC0428/LAW_GPT.
  19. LeCaRD: A legal case retrieval dataset for Chinese law system. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2342–2348.
  20. Meta. 2023. LLaMA. https://github.com/facebookresearch/llama.
  21. Crosslingual generalization through multitask finetuning.
  22. OpenAI. 2022. ChatGPT: Optimizing language models for dialogue.
  23. OpenAI. 2023. GPT-4 technical report.
  24. Instruction tuning with GPT-4. arXiv preprint arXiv:2304.03277.
  25. Richard A. Posner. 1990. The Problems of Jurisprudence. Harvard University Press.
  26. DeepSpeed: System optimizations enable training deep learning models with over 100 billion parameters. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 3505–3506.
  27. Pengxiao Song. 2023. LaWGPT. https://github.com/pengxiao-song/LaWGPT.
  28. Yun Song and Zhongyu Wei. 2021. Inferring association between alcohol addiction and defendant’s emotion based on sound at court. Frontiers in Psychology, 12:669780.
  29. Self-Instruct: Aligning language model with self generated instructions. arXiv preprint arXiv:2212.10560.
  30. CAIL2018: A large-scale legal dataset for judgment prediction. arXiv preprint arXiv:1807.02478.
  31. Jianxin Yang. 2023. Firefly. https://github.com/yangjianxin1/Firefly.
  32. Legal judgment prediction via multi-perspective bi-feedback network. arXiv preprint arXiv:1905.03969.
  33. LEVEN: A large-scale Chinese legal event detection dataset. In Findings of the Association for Computational Linguistics: ACL 2022, pages 183–201, Dublin, Ireland. Association for Computational Linguistics.
  34. Interpretable charge predictions for criminal cases: Learning to generate court views from fact descriptions. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1854–1864, New Orleans, Louisiana. Association for Computational Linguistics.
  35. ymcui. 2023. Chinese-LLaMA-Alpaca-2. https://github.com/ymcui/Chinese-LLaMA-Alpaca-2.
  36. Overview of SMP-CAIL2020-Argmine: The interactive argument-pair extraction in judgement document challenge. Data Intelligence, 3(2):287–307.
  37. Chinese Open Instruction Generalist: A preliminary release.
  38. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena.
  39. How does NLP benefit legal system: A summary of legal artificial intelligence. arXiv preprint arXiv:2004.12158.
  40. JEC-QA: A legal-domain question answering dataset. In Proceedings of AAAI.
Authors (11)
  1. Shengbin Yue
  2. Wei Chen
  3. Siyuan Wang
  4. Bingxuan Li
  5. Chenchen Shen
  6. Shujun Liu
  7. Yuxuan Zhou
  8. Yao Xiao
  9. Song Yun
  10. Xuanjing Huang
  11. Zhongyu Wei
Citations (57)