
HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance (2401.08772v2)

Published 16 Jan 2024 in cs.CL

Abstract: In this work, we present HuixiangDou, a technical assistant powered by Large Language Models (LLMs). This system is designed to assist algorithm developers by providing insightful responses to questions related to open-source algorithm projects, such as computer vision and deep learning projects from OpenMMLab. We further explore the integration of this assistant into the group chats of instant messaging (IM) tools such as WeChat and Lark. Through several iterative improvements and trials, we have developed a sophisticated technical chat assistant capable of effectively answering users' technical questions without causing message flooding. This paper's contributions include: 1) Designing an algorithm pipeline specifically for group chat scenarios; 2) Verifying the reliable performance of text2vec in task rejection; 3) Identifying three critical requirements for LLMs in technical-assistant-like products, namely scoring ability, In-Context Learning (ICL), and Long Context. We have made the source code, Android app and web service available at GitHub (https://github.com/internlm/huixiangdou), OpenXLab (https://openxlab.org.cn/apps/detail/tpoisonooo/huixiangdou-web) and YouTube (https://youtu.be/ylXrT-Tei-Y) to aid in future research and application. HuixiangDou is applicable to any group chat within IM tools.


Summary

  • The paper presents HuixiangDou, an LLM-based system that uses a two-stage pipeline to filter and accurately respond to technical queries in group chat environments.
  • It integrates In-Context Learning, text2vec ranking, and specialized scoring algorithms to ensure precise answers and reduce irrelevant responses.
  • Experimental results demonstrate the system’s effective balance between response precision and maintaining a positive user experience in technical group chats.
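The two-stage flow summarized above can be sketched as a simple gate-then-answer function. This is a hedged illustration only: the three callables are hypothetical stand-ins, not the paper's actual interfaces.

```python
def answer_in_group_chat(message, is_question, is_relevant, generate):
    """Sketch of a two-stage group-chat pipeline: a reject stage decides
    whether to respond at all, then a response stage produces the answer.
    `is_question`, `is_relevant`, and `generate` are hypothetical stand-ins."""
    if not is_question(message):
        return None  # not a question: stay silent to avoid message flooding
    if not is_relevant(message):
        return None  # off-topic for the project's knowledge base
    return generate(message)
```

Staying silent (returning nothing) on non-questions and off-topic chatter is what keeps the assistant from flooding the group.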

Introduction to HuixiangDou

Developers often create user groups on instant messaging platforms to facilitate discussions about open-source projects. As these groups expand, organizers look to minimize time spent answering repetitive questions while still providing necessary assistance. Automated responses and pinned messages are the traditional solutions, but they fall short because they cannot address queries specific to users' local development environments, and they add to the volume of irrelevant messages. HuixiangDou, the technical assistant system presented here, fills this gap by answering technical queries within such group chats, particularly for computer vision and deep learning projects.

The Evolution of HuixiangDou

HuixiangDou evolved through three iterations, from a basic prototype to a sophisticated assistant suited to the unique demands of group chats. The initial approach, code-named "Dagger," fine-tuned LLMs to respond to queries directly. This method, however, suffered from message flooding and inaccurate responses, prompting a two-part improvement.

The "Spear" iteration introduced a two-stage Reject Pipeline for dismissing off-topic conversations and a Response Pipeline employing specialized LLM prompting techniques for domain-specific inquiries. The final "Rake" version built on this foundation, refining the system's handling of extensive context and incorporating a more complex Response Pipeline. The resulting system maintains precision in non-response scenarios while increasing the likelihood of accurate assistance.
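A minimal sketch of a similarity-based rejection check, assuming the query and knowledge-base chunks have already been embedded by a text2vec model; the 0.4 threshold is an illustrative value, not the paper's:

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def should_reject(query_vec, kb_vecs, threshold=0.4):
    """Reject the query when its best similarity to any knowledge-base
    chunk falls below the threshold (0.4 is a made-up example value)."""
    best = max((cosine(query_vec, v) for v in kb_vecs), default=0.0)
    return best < threshold
```

In practice the vectors would come from a text2vec embedding model; the point of the sketch is that rejection reduces to a max-similarity threshold test against the project's knowledge base.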

Technicalities and Results

HuixiangDou's sophisticated system comprises several components to accurately assess and respond to user questions without disrupting chat group dynamics. It uses a combination of algorithms including text2vec for relevance assessment and ranking, In-Context Learning to enhance understanding, and specialized scoring algorithms to ascertain the importance of user questions. Importantly, it avoids generating unreliable or illegal content while preserving the chat's user experience.
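One of the three LLM requirements the paper identifies is scoring ability: asking the model to rate how likely a message is a genuine technical question. A hedged sketch of extracting and thresholding a numeric score from a model's free-text reply follows; the parsing rules and the cutoff of 5 are assumptions for illustration, not the paper's implementation.

```python
import re

def parse_score(reply, default=0):
    """Pull the first integer out of an LLM reply such as 'Score: 8/10'
    and clamp it to the 0-10 range; fall back to `default` if absent."""
    m = re.search(r"\d+", reply)
    return max(0, min(int(m.group()), 10)) if m else default

def worth_answering(reply, cutoff=5):
    """Treat the message as a real technical question when the model's
    self-reported score clears the (illustrative) cutoff."""
    return parse_score(reply) >= cutoff
```

Clamping and a safe default matter here because LLM replies are free text and the scorer must fail closed rather than crash mid-pipeline.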

Experimentation with HuixiangDou emphasized fine-tuning on domain-specific QA pairs and evaluating a range of models on refusal-to-answer accuracy. Scalable methods were also implemented to work within the maximum token length LLMs can handle, ensuring the system can process detailed inquiries effectively. Ultimately, HuixiangDou demonstrated a promising ability to reject off-topic interactions correctly and to provide precise technical assistance in test scenarios.
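Working within a fixed token budget is often handled on the retrieval side by splitting documents into overlapping windows. The sketch below uses whitespace tokens as a crude stand-in for a real tokenizer, and the window and overlap sizes are illustrative assumptions:

```python
def chunk_text(text, max_tokens=512, overlap=64):
    """Split `text` into overlapping windows of at most `max_tokens`
    whitespace tokens, so each chunk fits a fixed LLM context budget."""
    tokens = text.split()
    chunks, start = [], 0
    while start < len(tokens):
        end = min(start + max_tokens, len(tokens))
        chunks.append(" ".join(tokens[start:end]))
        if end == len(tokens):
            break
        start = end - overlap  # overlap preserves context across splits
    return chunks
```

The overlap keeps sentences that straddle a boundary visible in both neighboring chunks, at the cost of slightly more tokens overall.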

Conclusion and Further Directions

In sum, HuixiangDou is a novel LLM-based system that operates effectively within group chat environments, addressing the challenges of automated technical assistance. While showcasing the importance of domain-specific knowledge, the work also highlights three requirements for LLMs in such products: scoring ability, In-Context Learning, and long-context handling. The system's performance underscores the balance between response precision and user experience needed to keep technical discussion groups valuable.

However, there's more progress to be made. Future enhancements may explore ways for LLMs to understand entire repositories to provide even more accurate responses, along with addressing multimodal inquiries that include images or non-textual data. The source code and software for HuixiangDou are available publicly for further research and implementation, with the potential for this technology to be applied across various group chat contexts.
