
CAPTAIN at COLIEE 2023: Efficient Methods for Legal Information Retrieval and Entailment Tasks (2401.03551v1)

Published 7 Jan 2024 in cs.CL and cs.IR

Abstract: The Competition on Legal Information Extraction/Entailment (COLIEE) is held annually to encourage advances in the automatic processing of legal texts. Processing legal documents is challenging because of the intricate structure and meaning of legal language. In this paper, we outline our strategies for tackling Task 2, Task 3, and Task 4 of the COLIEE 2023 competition. Our approach combines appropriate state-of-the-art deep learning methods, methods designed from observed domain characteristics, and meticulous engineering practices. As a result, we achieved first place in Task 2 and Task 3 and promising results in Task 4. Our source code is available at https://github.com/Nguyen2015/CAPTAIN-COLIEE2023/tree/coliee2023.
