Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew (2309.14568v1)
Abstract: We present DictaLM, a large-scale language model tailored for Modern Hebrew. Boasting 7B parameters, this model is predominantly trained on Hebrew-centric data. As a commitment to promoting research and development in the Hebrew language, we release both the foundation model and the instruct-tuned model under a Creative Commons license. Concurrently, we introduce DictaLM-Rab, another foundation model geared towards Rabbinic/Historical Hebrew. These foundation models serve as ideal starting points for fine-tuning on various Hebrew-specific tasks, such as instruction following, Q&A, sentiment analysis, and more. This release represents a preliminary step, offering an initial Hebrew LLM for the Hebrew NLP community to experiment with.