
AgentsCourt: Building Judicial Decision-Making Agents with Court Debate Simulation and Legal Knowledge Augmentation (2403.02959v3)

Published 5 Mar 2024 in cs.CL and cs.AI

Abstract: With the development of deep learning, natural language processing technology has effectively improved the efficiency of various aspects of the traditional judicial industry. However, most current efforts focus on tasks within individual judicial stages, making it difficult to handle complex tasks that span multiple stages. Autonomous agents powered by LLMs are becoming increasingly smart and able to make complex decisions in real-world settings, offering new insights for judicial intelligence. In this paper, (1) we propose a novel multi-agent framework, AgentsCourt, for judicial decision-making. Our framework follows the classic court trial process, consisting of court debate simulation, legal resources retrieval, and decision-making refinement, to simulate the decision-making of a judge. (2) We introduce SimuCourt, a judicial benchmark that encompasses 420 Chinese judgment documents, spanning the three most common types of judicial cases. Furthermore, to support this task, we construct a large-scale legal knowledge base, Legal-KB, with multi-resource legal knowledge. (3) Extensive experiments show that our framework outperforms existing advanced methods in various aspects, especially in generating legal articles, where our model achieves significant improvements of 8.6% and 9.1% F1 score in the first and second instance settings, respectively.
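
The abstract reports gains in F1 score on generated legal articles but does not say how the score is computed. As a minimal sketch, assuming each case is scored by set overlap between predicted and gold article identifiers (the function name `article_f1` and the labels "Article A", "Article B", "Article C" are hypothetical and not taken from the paper), the calculation could look like this:

```python
def article_f1(predicted, gold):
    """Set-based F1 between predicted and gold legal articles for one case.

    Assumption: articles are compared as exact-match identifiers; the
    paper's actual evaluation protocol may differ.
    """
    pred, ref = set(predicted), set(gold)
    overlap = len(pred & ref)
    # No shared articles (or an empty side) gives an F1 of zero.
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)  # share of predictions that are correct
    recall = overlap / len(ref)      # share of gold articles that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: one of two predictions matches one of two gold articles.
score = article_f1(["Article A", "Article B"], ["Article B", "Article C"])
```

Here precision and recall are both 0.5, so `score` is 0.5. A benchmark-level figure would typically average such per-case scores, which is again an assumption rather than a detail stated in the abstract.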

Authors (10)
  1. Zhitao He (9 papers)
  2. Pengfei Cao (39 papers)
  3. Chenhao Wang (31 papers)
  4. Zhuoran Jin (23 papers)
  5. Yubo Chen (58 papers)
  6. Jiexin Xu (5 papers)
  7. Huaijun Li (4 papers)
  8. Xiaojian Jiang (5 papers)
  9. Kang Liu (207 papers)
  10. Jun Zhao (469 papers)