HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance (2401.08772v2)
Abstract: In this work, we present HuixiangDou, a technical assistant powered by Large Language Models (LLMs). The system is designed to assist algorithm developers by providing insightful responses to questions about open-source algorithm projects, such as the computer vision and deep learning projects from OpenMMLab. We further explore integrating this assistant into the group chats of instant messaging (IM) tools such as WeChat and Lark. Through several iterations and trials, we have developed a sophisticated technical chat assistant capable of effectively answering users' technical questions without causing message flooding. This paper's contributions include: 1) designing an algorithm pipeline specifically for group chat scenarios; 2) verifying the reliable performance of text2vec in task rejection; 3) identifying three critical requirements for LLMs in technical-assistant-like products, namely scoring ability, In-Context Learning (ICL), and Long Context. We have made the source code, Android app, and web service available on GitHub (https://github.com/internlm/huixiangdou), OpenXLab (https://openxlab.org.cn/apps/detail/tpoisonooo/huixiangdou-web), and YouTube (https://youtu.be/ylXrT-Tei-Y) to aid future research and application. HuixiangDou is applicable to any group chat within IM tools.
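The second contribution, using text2vec embeddings for task rejection (deciding whether a group-chat message deserves a technical answer at all), can be illustrated with a small similarity-threshold sketch. This is not the paper's exact pipeline: the sentence-transformers library, the model name, the example knowledge-base snippets, and the 0.6 threshold below are all illustrative assumptions.

```python
# Minimal sketch of embedding-based task rejection, assuming a text2vec-style
# model served through the sentence-transformers library. All names and the
# threshold are illustrative, not the authors' configuration.
from sentence_transformers import SentenceTransformer, util

# Any text2vec-style embedding model can be substituted here.
model = SentenceTransformer("shibing624/text2vec-base-chinese")

# Snippets from the project's own documentation serve as the reference corpus.
knowledge_base = [
    "How to configure a custom dataset in mmdetection",
    "Installing mmcv with CUDA support",
]
kb_embeddings = model.encode(knowledge_base, convert_to_tensor=True)

def should_answer(message: str, threshold: float = 0.6) -> bool:
    """Reject chit-chat: reply only if the message is close to the knowledge base."""
    query_embedding = model.encode(message, convert_to_tensor=True)
    max_score = util.cos_sim(query_embedding, kb_embeddings).max().item()
    return max_score >= threshold

print(should_answer("How do I train mmdetection on my own dataset?"))  # likely True
print(should_answer("Anyone up for lunch?"))                           # likely False
```

In a group-chat deployment, messages that fall below the threshold are silently ignored, which is what keeps the assistant from flooding the conversation with unwanted replies.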