LaVy: Vietnamese Multimodal Large Language Model (2404.07922v6)

Published 11 Apr 2024 in cs.CL, cs.CV, and cs.LG

Abstract: LLMs and Multimodal LLMs (MLLMs) have taken the world by storm with impressive abilities in complex reasoning and linguistic comprehension. While there is a plethora of work on Vietnamese LLMs, the lack of high-quality multimodal resources limits the progress of Vietnamese MLLMs. In this paper, we pioneer in addressing this gap by introducing LaVy, a state-of-the-art Vietnamese MLLM, and LaVy-Bench, a benchmark designed for evaluating MLLMs' understanding of Vietnamese visual language tasks. Our project is public at https://github.com/baochi0212/LaVy
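The abstract does not spell out LaVy's architecture or inference interface, so the snippet below is only an illustrative sketch of what querying a LLaVA-style MLLM with a Vietnamese visual question looks like in practice; the Hugging Face checkpoint, image URL, prompt template, and question are stand-ins of my choosing, not LaVy's released artifacts (see the project repository for the actual model and the LaVy-Bench evaluation data).

```python
# Hypothetical sketch: asking a LLaVA-style multimodal model a Vietnamese question
# about an image. The checkpoint and URL are illustrative stand-ins, not LaVy's
# official release (see https://github.com/baochi0212/LaVy for the real project).
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # stand-in checkpoint, not LaVy
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id, device_map="auto")

# Load an arbitrary test image (placeholder URL).
image = Image.open(requests.get("https://example.com/street.jpg", stream=True).raw)

question = "Trong bức ảnh này có gì?"  # "What is in this picture?"
prompt = f"USER: <image>\n{question} ASSISTANT:"  # LLaVA-1.5 chat format

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```

A benchmark harness in the spirit of LaVy-Bench would loop such a call over image-question pairs and score the generated Vietnamese answers against references.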

Authors (2)
  1. Chi Tran (6 papers)
  2. Huong Le Thanh (1 paper)
Citations (3)