
YAD: Leveraging T5 for Improved Automatic Diacritization of Yorùbá Text (2412.20218v1)

Published 28 Dec 2024 in cs.CL

Abstract: In this work, we present the Yorùbá automatic diacritization (YAD) benchmark dataset for evaluating Yorùbá diacritization systems. In addition, we pre-train a text-to-text transformer (T5) model for Yorùbá and show that this model outperforms several multilingually trained T5 models. Lastly, we show that more data and larger models are better at diacritization for Yorùbá.
