MindLLM: Pre-training Lightweight Large Language Model from Scratch, Evaluations and Domain Applications (2310.15777v2)

Published 24 Oct 2023 in cs.CL and cs.AI

Abstract: Large language models (LLMs) have demonstrated remarkable performance across various natural language tasks, marking significant strides towards general artificial intelligence. While general artificial intelligence is pursued by developing increasingly large-scale models, there could be another branch: developing lightweight custom models that better serve certain domains, taking into account the high cost of training and deploying LLMs and the scarcity of resources. In this paper, we present MindLLM, a novel series of bilingual lightweight LLMs, trained from scratch, alleviating such burdens by offering models with 1.3 billion and 3 billion parameters. A thorough account of experiences accrued during large model development is given, covering every step of the process, including data construction, model architecture, evaluation, and applications. We hope these insights are valuable for fellow academics and developers. MindLLM consistently matches or surpasses the performance of other larger open-source models on some public benchmarks. We also introduce an innovative instruction tuning framework tailored for smaller models to enhance their capabilities efficiently. Moreover, we explore the application of MindLLM in specific vertical domains such as law and finance, underscoring the agility and adaptability of our lightweight models.

Authors (8)
  1. Yizhe Yang (12 papers)
  2. Huashan Sun (7 papers)
  3. Jiawei Li (115 papers)
  4. Runheng Liu (2 papers)
  5. Yinghao Li (27 papers)
  6. Yuhang Liu (57 papers)
  7. Heyan Huang (107 papers)
  8. Yang Gao (761 papers)
Citations (6)