
CaseGNN++: Graph Contrastive Learning for Legal Case Retrieval with Graph Augmentation (2405.11791v1)

Published 20 May 2024 in cs.IR

Abstract: Legal case retrieval (LCR) is a specialised information retrieval task that aims to find cases relevant to a given query case. LCR is pivotal in helping legal practitioners find precedents. Most existing LCR methods are based on traditional lexical models and LLMs, which have achieved promising retrieval performance. However, the domain-specific structural information inherent in legal documents has yet to be exploited to further improve performance. Our previous work, CaseGNN, harnesses text-attributed graphs and graph neural networks to address this neglect of legal structural information. Nonetheless, two aspects remain for further investigation: (1) the underutilisation of rich edge information within text-attributed case graphs limits CaseGNN's ability to generate informative case representations; (2) the scarcity of labelled data in legal datasets hinders the training of the CaseGNN model. In this paper, CaseGNN++, an extension of CaseGNN, is proposed to simultaneously leverage edge information and additional label data to unlock the latent potential of LCR models. Specifically, an edge feature-based graph attention layer (EUGAT) is proposed to comprehensively update node and edge features during graph modelling, resulting in full utilisation of the structural information of legal cases. Moreover, a novel graph contrastive learning objective with graph augmentation is developed in CaseGNN++ to provide additional training signals, thereby enhancing the legal comprehension capabilities of the CaseGNN++ model. Extensive experiments on two benchmark datasets from COLIEE 2022 and COLIEE 2023 demonstrate that CaseGNN++ not only significantly improves on CaseGNN but also achieves superior performance compared to state-of-the-art LCR methods. Code has been released at https://github.com/yanran-tang/CaseGNN.
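
The abstract mentions a graph contrastive learning objective with graph augmentation but does not spell out its form. As a rough illustration of the general idea only, and not the authors' implementation, the sketch below assumes random edge dropping as the augmentation and an InfoNCE-style loss over pooled graph embeddings. The function names `drop_edges` and `info_nce`, the temperature value, and the choice of augmentation are all hypothetical.

```python
import numpy as np


def drop_edges(edges, drop_prob, rng):
    """Assumed augmentation: remove each edge independently with probability drop_prob."""
    edges = np.asarray(edges)
    keep = rng.random(len(edges)) >= drop_prob
    return edges[keep]


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss: row i of `anchors` should match row i of `positives`;
    every other row in the batch serves as a negative."""
    # L2-normalise so the dot product is cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Average negative log-probability of the matching (diagonal) pairs.
    return float(-np.mean(np.diag(log_prob)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    case_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # toy case graph
    print(drop_edges(case_edges, drop_prob=0.5, rng=rng))

    emb = rng.normal(size=(4, 8))  # stand-in for pooled graph embeddings
    print(info_nce(emb, emb))        # aligned views give a low loss
    print(info_nce(emb, emb[::-1]))  # mismatched views give a higher loss
```

In a setup like this, the embedding of an augmented case graph would play the role of the positive for the original case, so the model receives a training signal even when relevance labels are scarce.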

Authors (5)
  1. Yanran Tang (6 papers)
  2. Ruihong Qiu (26 papers)
  3. Yilun Liu (28 papers)
  4. Xue Li (124 papers)
  5. Zi Huang (126 papers)
