Comparing the Performance of NLP Toolkits and Evaluation measures in Legal Tech (2103.11792v1)

Published 12 Mar 2021 in cs.CL and cs.LG

Abstract: Recent developments in Natural Language Processing have led to the introduction of state-of-the-art neural language models, enabled with unsupervised transfer learning and trained with different pretraining objectives. While these models achieve excellent results on downstream NLP tasks, various domain adaptation techniques can improve their performance on domain-specific tasks. We compare and analyze the pretrained neural language models XLNet (autoregressive) and BERT (autoencoder) on legal tasks. Results show that XLNet performs better on our sequence classification task of Legal Opinions Classification, whereas BERT produces better results on the NER task. We use domain-specific pretraining and additional legal vocabulary to adapt the BERT model further to the legal domain. We prepared multiple variants of the BERT model, using each method and their combination. Comparing these legal-domain variants, we conclude that both additional pretraining and additional vocabulary enhance BERT's performance on the Legal Opinions Classification task. Additional legal vocabulary improves BERT's performance on the NER task. Combining the pretraining and vocabulary techniques further improves the final results. Our Legal-Vocab-BERT model gives the best results on the Legal Opinions task, outperforming the larger pretrained general language models, i.e., BERT-Base and XLNet-Base.
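To make the vocabulary technique concrete, the following is a minimal sketch of why extra domain vocabulary can matter for a BERT-style model. It implements greedy longest-match-first WordPiece splitting, the scheme BERT's tokenizer uses, over a toy vocabulary. The vocabularies, the example words, and the function name `wordpiece` are illustrative assumptions, not taken from the paper, and the abstract does not describe how the author actually added the legal vocabulary.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first WordPiece split of one lowercase word."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry a "##" prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # No piece matched at this position: the whole word is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Toy general-domain vocabulary (hypothetical, not BERT's real vocabulary).
general_vocab = {"plain", "##ti", "##ff", "es", "##top", "##pel", "court"}

# The same vocabulary with two hypothetical legal terms added as whole words.
legal_vocab = general_vocab | {"plaintiff", "estoppel"}

for word in ["plaintiff", "estoppel"]:
    print(word, wordpiece(word, general_vocab), wordpiece(word, legal_vocab))
```

With the general vocabulary each legal term is broken into several subword pieces, while the extended vocabulary keeps it as a single token. In a real model, each newly added token would also need an embedding, which is typically learned during further pretraining or fine-tuning.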


Author: Muhammad Zohaib Khan

