
Explicitly Integrating Judgment Prediction with Legal Document Retrieval: A Law-Guided Generative Approach (2312.09591v2)

Published 15 Dec 2023 in cs.IR

Abstract: Legal document retrieval and judgment prediction are crucial tasks in intelligent legal systems. In practice, determining whether two documents share the same judgments is essential for establishing their relevance in legal retrieval. However, existing legal retrieval studies either ignore the vital role of judgment prediction or rely on implicit training objectives, expecting a proper alignment of legal documents in vector space based on their judgments. Neither approach provides explicit evidence of judgment consistency for relevance modeling, leading to inaccuracies and a lack of transparency in retrieval. To address this issue, we propose a law-guided method, namely GEAR, within the generative retrieval framework. GEAR explicitly integrates judgment prediction with legal document retrieval in a sequence-to-sequence manner. Experiments on two Chinese legal case retrieval datasets show the superiority of GEAR over state-of-the-art methods while maintaining competitive judgment prediction performance. Moreover, we validate its robustness across languages and domains on a French statutory article retrieval dataset.
