Exploiting Contrastive Learning and Numerical Evidence for Confusing Legal Judgment Prediction (2211.08238v3)
Abstract: Given the fact description text of a legal case, legal judgment prediction (LJP) aims to predict the case's charge, law article and penalty term. A core problem of LJP is how to distinguish confusing legal cases, where only subtle text differences exist. Previous studies fail to distinguish different classification errors with a standard cross-entropy classification loss, and ignore the numbers in the fact description for predicting the term of penalty. To tackle these issues, in this work, first, we propose a moco-based supervised contrastive learning to learn distinguishable representations, and explore the best strategy to construct positive example pairs to benefit all three subtasks of LJP simultaneously. Second, in order to exploit the numbers in legal cases for predicting the penalty terms of certain cases, we further enhance the representation of the fact description with extracted crime amounts which are encoded by a pre-trained numeracy model. Extensive experiments on public benchmarks show that the proposed method achieves new state-of-the-art results, especially on confusing legal cases. Ablation studies also demonstrate the effectiveness of each component.
- Predicting judicial decisions of the european court of human rights: A natural language processing perspective. PeerJ Computer Science, 2:e93.
- Learning representations by maximizing mutual information across views. Advances in neural information processing systems, 32.
- Neural legal judgment prediction in english. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4317–4323.
- A simple framework for contrastive learning of visual representations. In International conference on machine learning, pages 1597–1607. PMLR.
- Learning a similarity metric discriminatively, with application to face verification. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), 20-26 June 2005, San Diego, CA, USA, pages 539–546. IEEE Computer Society.
- A survey on legal judgment prediction: Datasets, metrics, models and challenges. arXiv preprint arXiv:2204.04859.
- Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186.
- Legal judgment prediction: A survey of the state of the art. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pages 5461–5469. International Joint Conferences on Artificial Intelligence Organization. Survey Track.
- Legal judgment prediction via event extraction with constraints. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 648–664.
- Supervised contrastive learning for pre-trained language model fine-tuning. In International Conference on Learning Representations.
- Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9729–9738.
- Learning deep representations by mutual information estimation and maximization. arXiv preprint arXiv:1808.06670.
- Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation, 9(8):1735–1780.
- Few-shot charge prediction with discriminative legal attributes. In Proceedings of the 27th International Conference on Computational Linguistics, pages 487–498.
- Supervised contrastive learning. Advances in Neural Information Processing Systems, 33:18661–18673.
- Ml-ljp: Multi-law aware legal judgment prediction. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1023–1034.
- Improving legal judgment prediction through reinforced criminal element extraction. Information Processing & Management, 59(1):102780.
- Thong Nguyen and Anh Tuan Luu. 2021. Contrastive learning for neural topic model. Advances in neural information processing systems, 34:11974–11986.
- Adaptive contrastive learning on multimodal transformer for review helpfulness prediction. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 10085–10096.
- Methods for numeracy-preserving word embeddings. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, pages 4742–4753. Association for Computational Linguistics.
- Varsha Suresh and Desmond Ong. 2021. Not all negatives are equal: Label-aware contrastive loss for fine-grained text classification. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 4381–4394.
- Mitigating data sparsity for short text topic modeling by topic-semantic contrastive learning. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2748–2760.
- Unsupervised feature learning via non-parametric instance discrimination. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3733–3742.
- CAIL2018: A large-scale legal dataset for judgment prediction. CoRR, abs/1807.02478.
- Cail2018: A large-scale legal dataset for judgment prediction. arXiv preprint arXiv:1807.02478.
- Distinguish confusing law articles for legal judgment prediction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 3086–3095.
- Legal judgment prediction via multi-perspective bi-feedback network. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, pages 4085–4091.
- Hierarchical attention networks for document classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1480–1489, San Diego, California. Association for Computational Linguistics.
- Neurjudge: a circumstance-aware neural framework for legal judgment prediction. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 973–982.
- Contrastive learning for legal judgment prediction. ACM Transactions on Information Systems, 41(4):1–25.
- Use all the labels: A hierarchical multi-label contrastive learning framework. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16660–16669.
- Legal judgment prediction via topological learning. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3540–3549.
- How does nlp benefit legal system: A summary of legal artificial intelligence. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5218–5230.
- Open chinese language pre-trained model zoo. Technical report.
- Leilei Gan (21 papers)
- Baokui Li (2 papers)
- Kun Kuang (114 papers)
- Yating Zhang (21 papers)
- Lei Wang (975 papers)
- Luu Anh Tuan (55 papers)
- Yi Yang (855 papers)
- Fei Wu (317 papers)