Analyzing Vietnamese Legal Questions Using Deep Neural Networks with Biaffine Classifiers (2304.14447v1)

Published 27 Apr 2023 in cs.CL

Abstract: In this paper, we propose using deep neural networks to extract important information from Vietnamese legal questions, a fundamental task towards building a question answering system in the legal domain. Given a legal question in natural language, the goal is to extract all the segments that contain the needed information to answer the question. We introduce a deep model that solves the task in three stages. First, our model leverages recent advanced autoencoding language models to produce contextual word embeddings, which are then combined with character-level and POS-tag information to form word representations. Next, bidirectional long short-term memory networks are employed to capture the relations among words and generate sentence-level representations. At the third stage, borrowing ideas from graph-based dependency parsing methods which provide a global view on the input sentence, we use biaffine classifiers to estimate the probability of each pair of start-end words being an important segment. Experimental results on a public Vietnamese legal dataset show that our model outperforms the previous work by a large margin, achieving 94.79% in the F1 score. The results also prove the effectiveness of using contextual features extracted from pre-trained language models combined with other types of features, such as character-level and POS-tag features, when training on a limited dataset.
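The third stage described above scores every (start, end) word pair with a biaffine classifier, in the spirit of graph-based dependency parsers. Below is a minimal PyTorch sketch of that idea; the layer sizes, module names, and shapes are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of a biaffine span scorer: BiLSTM outputs are projected
# into "start" and "end" views of each word, and a biaffine layer scores every
# (start, end) pair as a candidate important segment. Dimensions are assumed.
import torch
import torch.nn as nn

class BiaffineSpanScorer(nn.Module):
    def __init__(self, lstm_dim=512, hidden_dim=256):
        super().__init__()
        # Separate MLPs produce start- and end-specific representations.
        self.start_mlp = nn.Sequential(nn.Linear(lstm_dim, hidden_dim), nn.ReLU())
        self.end_mlp = nn.Sequential(nn.Linear(lstm_dim, hidden_dim), nn.ReLU())
        # Biaffine weight; the appended 1s below fold in the linear/bias terms.
        self.W = nn.Parameter(torch.empty(hidden_dim + 1, hidden_dim + 1))
        nn.init.xavier_uniform_(self.W)

    def forward(self, lstm_out):
        # lstm_out: (batch, seq_len, lstm_dim) from the BiLSTM encoder.
        h_s = self.start_mlp(lstm_out)            # (batch, seq_len, hidden_dim)
        h_e = self.end_mlp(lstm_out)
        ones = lstm_out.new_ones(*h_s.shape[:2], 1)
        h_s = torch.cat([h_s, ones], dim=-1)      # append bias feature
        h_e = torch.cat([h_e, ones], dim=-1)
        # scores[b, i, j] = h_s[b, i] @ W @ h_e[b, j]: logit that the span
        # starting at word i and ending at word j is an important segment.
        scores = torch.einsum("bik,kl,bjl->bij", h_s, self.W, h_e)
        return torch.sigmoid(scores)

# Usage: score all spans of a 20-word question encoded by the BiLSTM.
scorer = BiaffineSpanScorer()
span_probs = scorer(torch.randn(1, 20, 512))      # (1, 20, 20) probabilities
```

Scoring all start-end pairs jointly gives the classifier a global view of the sentence, rather than labeling tokens one at a time as in sequence-tagging approaches.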

Authors (4)
  1. Nguyen Anh Tu (6 papers)
  2. Hoang Thi Thu Uyen (3 papers)
  3. Tu Minh Phuong (8 papers)
  4. Ngo Xuan Bach (4 papers)
Citations (1)