
NESTLE: a No-Code Tool for Statistical Analysis of Legal Corpus (2309.04146v2)

Published 8 Sep 2023 in cs.CL and cs.AI

Abstract: Statistical analysis of a large-scale legal corpus can yield valuable legal insights. Such analysis requires one to (1) select a subset of the corpus using document retrieval tools, (2) structure the text using an information extraction (IE) system, and (3) visualize the data for statistical analysis. Each step demands either specialized tools or programming skills, and no comprehensive unified "no-code" tool has been available. Here we present NESTLE, a no-code tool for large-scale statistical analysis of legal corpora. Powered by an LLM and an internal custom end-to-end IE system, NESTLE can extract any type of information not predefined in the IE system, opening up the possibility of unlimited customizable statistical analysis of the corpus without writing a single line of code. We validate the system on 15 Korean precedent IE tasks and 3 legal text classification tasks from LexGLUE. Comprehensive experiments reveal that NESTLE can achieve GPT-4-comparable performance by training the internal IE module with 4 human-labeled and 192 LLM-labeled examples.
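The three-step workflow the abstract describes (retrieve a subset, extract structured fields, then aggregate for visualization) can be illustrated with a minimal sketch. Everything below is hypothetical: the toy corpus, the keyword retrieval, and the regex-based extractor are stand-ins, not NESTLE's actual LLM-backed components.

```python
import re
from collections import Counter

# Toy stand-in for a legal precedent corpus (hypothetical example sentences).
corpus = [
    "Defendant sentenced to 3 years imprisonment for fraud.",
    "Defendant sentenced to 2 years imprisonment for fraud.",
    "Civil dispute over contract terms; claim dismissed.",
    "Defendant sentenced to 3 years imprisonment for embezzlement.",
]

# Step 1: document retrieval -- select the subset of interest by keyword.
def retrieve(docs, keyword):
    return [d for d in docs if keyword in d]

# Step 2: information extraction -- structure each document into a record.
# (NESTLE would use its LLM/IE module here; a regex suffices for the sketch.)
def extract(doc):
    m = re.search(r"sentenced to (\d+) years imprisonment for (\w+)", doc)
    return {"years": int(m.group(1)), "crime": m.group(2)} if m else None

# Step 3: aggregation -- the statistics that would back a chart.
subset = retrieve(corpus, "sentenced")
records = [r for r in (extract(d) for d in subset) if r]
by_crime = Counter(r["crime"] for r in records)
avg_years = sum(r["years"] for r in records) / len(records)

print(by_crime)   # Counter({'fraud': 2, 'embezzlement': 1})
print(avg_years)  # mean sentence length in years, about 2.67
```

The point of a no-code tool is that steps 1 and 2 are replaced by natural-language instructions to an LLM rather than hand-written retrieval and extraction code like this.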

