Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew (2309.14568v1)

Published 25 Sep 2023 in cs.CL

Abstract: We present DictaLM, a large-scale language model tailored for Modern Hebrew. Boasting 7B parameters, this model is predominantly trained on Hebrew-centric data. As a commitment to promoting research and development in the Hebrew language, we release both the foundation model and the instruct-tuned model under a Creative Commons license. Concurrently, we introduce DictaLM-Rab, another foundation model geared towards Rabbinic/Historical Hebrew. These foundation models serve as ideal starting points for fine-tuning on various Hebrew-specific tasks, such as instruction following, Q&A, sentiment analysis, and more. This release represents a preliminary step, offering an initial Hebrew LLM for the Hebrew NLP community to experiment with.
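
Since the release is positioned as a starting point for downstream Hebrew work, here is a minimal sketch of loading and prompting the foundation model with the Hugging Face transformers library. The repository ID dicta-il/dictalm-7b is an assumption based on the paper's naming, as are the generation settings; verify both against the actual published checkpoint.

```python
# Minimal sketch: load an assumed DictaLM checkpoint and generate Hebrew text.
# The repo ID below is inferred from the paper's naming and may differ from
# the real release on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dicta-il/dictalm-7b"  # assumed repository ID; check the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on one GPU
    device_map="auto",          # requires the `accelerate` package
    trust_remote_code=True,     # custom architectures may require this
)

# Generate a continuation for a Hebrew prompt ("Natural language processing is").
prompt = "עיבוד שפה טבעית הוא"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The instruct-tuned variant would be loaded the same way under its own (likewise assumed) repository ID, with prompts formatted according to whatever instruction template the release documents.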

