CiteCaseLAW: Citation Worthiness Detection in Caselaw for Legal Assistive Writing (2305.03508v1)

Published 3 May 2023 in cs.CL and cs.LG

Abstract: In legal document writing, properly citing case law and other sources is key to substantiating claims and arguments. Understanding the legal domain and identifying appropriate citation contexts, or cite-worthy sentences, are challenging tasks that demand expensive manual annotation. Jargon, language semantics, and high domain specificity make legal language complex, so any associated legal task is hard to automate. The current work focuses on the problem of citation-worthiness identification, designed as the initial step in today's citation recommendation systems to lighten the burden of extracting an adequate set of citation contexts. To accomplish this, we introduce a labeled dataset of 178M sentences for citation-worthiness detection in the legal domain, drawn from the Caselaw Access Project (CAP). We examine the performance of various deep learning models on this novel dataset. The domain-specific pre-trained model tends to outperform the other models, reaching an 88% F1-score on the citation-worthiness detection task.

Authors (7)
  1. Mann Khatri (2 papers)
  2. Pritish Wadhwa (1 paper)
  3. Gitansh Satija (1 paper)
  4. Reshma Sheik (1 paper)
  5. Yaman Kumar (23 papers)
  6. Rajiv Ratn Shah (108 papers)
  7. Ponnurangam Kumaraguru (129 papers)
Citations (1)