Reformulating Domain Adaptation of Large Language Models as Adapt-Retrieve-Revise: A Case Study on Chinese Legal Domain (2310.03328v3)

Published 5 Oct 2023 in cs.CL

Abstract: While LLMs like GPT-4 have recently demonstrated astonishing zero-shot capabilities in general-domain tasks, they often generate hallucinated content in specific domains such as Chinese law, which hinders their application in these areas. This is typically because training data covering such a specific domain is absent, so GPT-4 cannot acquire the in-domain knowledge. A pressing challenge is that it is not feasible to continue training an LLM of this scale on in-domain data. This paper introduces a simple and effective domain adaptation framework for GPT-4 that reformulates generation as an adapt-retrieve-revise process. The first step is to adapt an affordable 7B LLM to the target domain by continual learning on in-domain data. When solving a task, we use the adapted LLM to generate a draft answer for the task query. The draft answer is then used to retrieve supporting evidence candidates from an external in-domain knowledge base. Finally, the draft answer and the retrieved evidence are concatenated into a single prompt so that GPT-4 can assess the evidence and revise the draft answer into the final answer. Our proposal combines the efficiency of adapting a smaller 7B model with the evidence-assessing capability of GPT-4 and effectively prevents GPT-4 from generating hallucinatory content. In the zero-shot setting on four Chinese legal tasks, our method improves accuracy by 33.3% over direct generation by GPT-4. Compared with two stronger retrieval-based baselines, our method outperforms them by 15.4% and 23.9%. Our code will be released.
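
The abstract describes a three-step generation pipeline. The sketch below is a minimal illustration of how those steps fit together; the function names, prompt wording, and toy stand-ins are assumptions made for readability, not the authors' released implementation. In practice the three callables would wrap the continually trained 7B model, a retriever over the in-domain legal knowledge base, and the GPT-4 API, respectively.

```python
from typing import Callable, List


def adapt_retrieve_revise(
    query: str,
    draft_lm: Callable[[str], str],          # adapted in-domain 7B model (assumed wrapper)
    retriever: Callable[[str, int], List[str]],  # search over the in-domain knowledge base
    reviser: Callable[[str], str],           # large general model used as the reviser
    top_k: int = 3,
) -> str:
    """Run one query through the adapt-retrieve-revise pipeline (illustrative only)."""
    # Adapt: the 7B model, already continually trained on in-domain text,
    # produces a knowledge-rich draft answer for the query.
    draft = draft_lm(query)

    # Retrieve: the draft answer (rather than the bare query) is used to pull
    # supporting evidence candidates from the external in-domain knowledge base.
    evidence = retriever(draft, top_k)

    # Revise: query, draft, and evidence are concatenated into one prompt, and
    # the large model (GPT-4 in the paper) assesses the evidence and revises.
    prompt = (
        f"Question:\n{query}\n\n"
        f"Draft answer:\n{draft}\n\n"
        "Evidence candidates:\n"
        + "\n".join(f"- {e}" for e in evidence)
        + "\n\nAssess whether each evidence candidate supports the draft, "
        "then revise the draft so the final answer is grounded in the evidence."
    )
    return reviser(prompt)


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without model weights or API keys.
    knowledge_base = [
        "Criminal Law Article 266: whoever defrauds public or private property ...",
        "Criminal Law Article 264: whoever steals public or private property ...",
    ]
    answer = adapt_retrieve_revise(
        query="What penalty applies to fraud under Chinese criminal law?",
        draft_lm=lambda q: "Fraud is punished under Article 266 of the Criminal Law.",
        retriever=lambda text, k: [d for d in knowledge_base if "Article 266" in d][:k],
        reviser=lambda p: "Final (toy) answer: per Article 266, fraud is punishable by ...",
    )
    print(answer)
```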

Authors (5)
  1. Yating Zhang (21 papers)
  2. Yexiang Wang (2 papers)
  3. Fei Cheng (46 papers)
  4. Sadao Kurohashi (55 papers)
  5. Zhen Wan (42 papers)
Citations (8)