Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model (2306.16092v2)

Published 28 Jun 2023 in cs.CL

Abstract: AI legal assistants based on LLMs can provide accessible legal consulting services, but the hallucination problem poses potential legal risks. This paper presents Chatlaw, an innovative legal assistant utilizing a Mixture-of-Experts (MoE) model and a multi-agent system to enhance the reliability and accuracy of AI-driven legal services. By integrating knowledge graphs with artificial screening, we construct a high-quality legal dataset to train the MoE model. This model utilizes different experts to address various legal issues, optimizing the accuracy of legal responses. Additionally, Standardized Operating Procedures (SOP), modeled after real law firm workflows, significantly reduce errors and hallucinations in legal services. Our MoE model outperforms GPT-4 in the Lawbench and Unified Qualification Exam for Legal Professionals by 7.73% in accuracy and 11 points, respectively, and also surpasses other models in multiple dimensions during real-case consultations, demonstrating our robust capability for legal consultation.

ChatLaw: Open-Source Legal LLM with Integrated External Knowledge Bases

The paper "ChatLaw: Open-Source Legal LLM with Integrated External Knowledge Bases" addresses a significant gap in the development of targeted large-scale LLMs within the Chinese legal domain. Unlike earlier endeavors such as BloombergGPT and FinGPT, the paper focuses on creating a dedicated and open-source legal LLM named ChatLaw to bolster digital transformation in legal fields.

Core Contributions

The authors identify several key contributions with the development of ChatLaw:

  1. Mitigation of Hallucination: The paper presents a strategy to reduce hallucinations by improving the model's training and adding modules at inference time. Four modules, "consult," "reference," "self-suggestion," and "response," inject domain-specific knowledge and verified information from external sources.
  2. Legal Feature Word Extraction Model: A model is trained to extract legal feature words from user input, enabling targeted analysis of the legal context of a query.
  3. Legal Text Similarity Calculation Model: A BERT-based model measures textual similarity, enabling efficient retrieval of related legal documents for further analysis (a minimal sketch follows this list).
  4. Chinese Legal Exam Testing Dataset: A dataset is curated specifically for evaluating model performance on legal multiple-choice questions, supplemented with an Elo arena scoring mechanism.
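
To make the similarity component concrete, here is a minimal sketch of BERT-style similarity retrieval using the sentence-transformers library. The checkpoint name and example texts are illustrative assumptions; the paper trains its own similarity model on legal data rather than using an off-the-shelf one.

```python
# Sketch: rank candidate legal texts by embedding similarity to a query.
# The model name below is a generic stand-in, not the paper's model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

statutes = [
    "A civil juristic act performed under duress may be revoked.",
    "A contract formed in accordance with law is legally binding.",
]
query = "Is a contract signed under duress valid?"

# Encode query and candidates into dense vectors, then rank by cosine.
query_emb = model.encode(query, convert_to_tensor=True)
statute_embs = model.encode(statutes, convert_to_tensor=True)
scores = util.cos_sim(query_emb, statute_embs)[0]

best = int(scores.argmax())
print(statutes[best], float(scores[best]))
```

In the paper's pipeline, the top-ranked texts would presumably feed the "reference" module described above rather than being printed directly.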

Dataset and Methodology

Dataset construction follows a multi-step process designed for comprehensiveness. It draws on real-world legal data such as news articles, social media content, legal regulations, judicial interpretations, and legal consultation records. After curation, cleaning passes filter out incoherent content, and the ChatGPT API is employed for data augmentation.
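
As a rough illustration of ChatGPT-API-based augmentation, the sketch below rewrites a raw consultation record into a cleaner training example. The prompt wording, model choice, and helper name are assumptions for illustration, not the authors' exact pipeline.

```python
# Sketch: use a chat-completion API to turn noisy consultation text
# into a clean question-answer training pair. Prompt is illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def augment_consultation(raw_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": ("You are a legal data annotator. Rewrite the "
                         "consultation below as one clear question and one "
                         "answer that cites the relevant statute.")},
            {"role": "user", "content": raw_text},
        ],
        temperature=0.7,
    )
    return response.choices[0].message.content
```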

Building on Ziya-LLaMA-13B, the authors fine-tune ChatLaw with Low-Rank Adaptation (LoRA) and further curb hallucinations with a self-suggestion role. Dedicated pre-trained models also handle keyword extraction, and a purpose-built retrieval algorithm uses the extracted keywords to improve the accuracy of legal text lookup.
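
A minimal sketch of this kind of LoRA setup with Hugging Face's peft library follows; the checkpoint name and hyperparameters (rank, alpha, target modules) are illustrative assumptions, not the paper's reported configuration.

```python
# Sketch: attach LoRA adapters to a LLaMA-style base model so that only
# small low-rank matrices in the attention projections are trained.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "IDEA-CCNL/Ziya-LLaMA-13B-v1"  # assumed checkpoint identifier
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

config = LoraConfig(
    r=8,                # low-rank dimension (illustrative)
    lora_alpha=32,      # scaling factor (illustrative)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of weights
```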

Results and Analysis

Experimental evaluation uses a compilation of national judicial exam questions. Because accuracy is low across all models, raw accuracy alone does not separate them meaningfully, so the authors adopt an Elo-based scoring mechanism for comparison.
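
For reference, a standard Elo update over pairwise model comparisons looks like the sketch below; the K-factor and initial ratings are conventional defaults, not values reported in the paper.

```python
# Sketch: update two models' Elo ratings after one judged matchup.
def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """score_a is 1.0 if model A wins, 0.5 for a tie, 0.0 if it loses."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    r_a += k * (score_a - expected_a)
    r_b += k * ((1.0 - score_a) - (1.0 - expected_a))
    return r_a, r_b

ratings = {"ChatLaw": 1000.0, "GPT-4": 1000.0}
# One comparison where the judge prefers ChatLaw's answer:
ratings["ChatLaw"], ratings["GPT-4"] = elo_update(
    ratings["ChatLaw"], ratings["GPT-4"], score_a=1.0
)
print(ratings)  # ChatLaw gains exactly the points GPT-4 loses
```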

Significant insights include:

  • Incorporation of legal domain data improves model performance on multiple-choice questions.
  • Task-specific training enhances performance, as shown by ChatLaw outperforming GPT-4 on these questions.
  • Larger models generally show enhanced capabilities in handling complex legal logic and reasoning tasks.

Implications and Future Work

The implications of this work establish a solid foundation for future research on legal-domain LLMs. By integrating vector knowledge bases with LLMs, the authors chart a path toward fewer hallucinations and stronger problem-solving in specialized domains.
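
A minimal sketch of such a vector knowledge base, using FAISS (reference 4), appears below. The embedding dimension and random placeholder vectors are illustrative; in practice the statutes would be embedded with the similarity model described earlier.

```python
# Sketch: index statute embeddings once, then retrieve nearest
# neighbors for each user query. Vectors here are random stand-ins.
import faiss
import numpy as np

dim = 384  # must match the embedding model's output size
index = faiss.IndexFlatIP(dim)  # inner product = cosine after L2-normalizing

statute_embs = np.random.rand(1000, dim).astype("float32")
faiss.normalize_L2(statute_embs)
index.add(statute_embs)

query_emb = np.random.rand(1, dim).astype("float32")
faiss.normalize_L2(query_emb)
scores, ids = index.search(query_emb, 3)  # top-3 most similar statutes
print(ids[0], scores[0])
```

Retrieved passages would then be handed to the LLM's "reference" module to ground its answer.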

However, the work also acknowledges limitations, particularly concerning general tasks and logical reasoning due to the base model's scale. Future research directions may involve refining generalization capabilities and minimizing social risks associated with model deployment.

In conclusion, the paper provides a robust framework for developing domain-specific LLMs in the legal sector, with potential applications beyond the immediate legal setting, and it invites further work on improving performance and extending the approach to broader application areas.

References (14)
  1. Falcon-40B: An Open Large Language Model with State-of-the-Art Performance. 2023.
  2. Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality. March 2023.
  3. LoRA: Low-Rank Adaptation of Large Language Models. In International Conference on Learning Representations (ICLR), 2022.
  4. Billion-Scale Similarity Search with GPUs. IEEE Transactions on Big Data, 7(3):535–547, 2019.
  5. OpenAI. GPT-4 Technical Report. 2023.
  6. HuggingGPT: Solving AI Tasks with ChatGPT and Its Friends in Hugging Face. 2023.
  7. LLaMA: Open and Efficient Foundation Language Models. arXiv preprint arXiv:2302.13971, 2023.
  8. HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge. 2023.
  9. BloombergGPT: A Large Language Model for Finance. 2023.
  10. FinGPT: Open-Source Financial Large Language Models. 2023.
  11. Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective. 2022.
  12. GLM-130B: An Open Bilingual Pre-Trained Model. In The Eleventh International Conference on Learning Representations (ICLR), 2023.
  13. Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. 2023.
  14. ChatMed: A Chinese Medical Large Language Model. https://github.com/michael-wzhu/ChatMed, 2023.
Authors (9)
  1. Jiaxi Cui
  2. Zongjian Li
  3. Yang Yan
  4. Bohua Chen
  5. Li Yuan
  6. Munan Ning
  7. Hao Li
  8. Bin Ling
  9. Yonghong Tian
Citations (96)