
HuixiangDou2 Framework Overview

Updated 18 August 2025
  • HuixiangDou2 is an advanced technical assistant that uses LLMs to filter and process noisy, multi-user group chats for technical queries.
  • It employs a multi-stage algorithm pipeline combining text normalization with dual-stage rejection (text2vec similarity plus LLM scoring), achieving 0.99 precision and 0.92 recall on task rejection.
  • Integrated with IM platforms like WeChat and Lark, its open-source implementation supports both research and practical real-world deployments.

The HuixiangDou2 Framework is an advanced technical assistant system leveraging LLMs for robust operation within high-noise, multi-participant group chat environments. Designed for integration with instant messaging (IM) platforms such as WeChat and Lark, it enables algorithm developers to query open-source projects in computer vision and deep learning, while effectively filtering non-technical or off-topic chatter. The architecture combines a multi-stage algorithm pipeline with state-of-the-art retrieval, scoring, and context management techniques, and is validated through rigorous precision and recall studies. Open-source implementations support adoption and future extension in both research and real-world applications.

1. Algorithm Pipeline Specialization

The pipeline of HuixiangDou2 is precisely engineered for the unique constraints of group chat scenarios. It decomposes into three primary stages:

  • Input Preprocessing: User messages are normalized by concatenating group and user identifiers, overcoming the two-role limitation of default LLM chat templates. The system aggregates temporally proximate messages, applies OCR to image content, and discards unsupported media such as emojis, videos, and voice notes. Short or evidently off-topic utterances are filtered preemptively.
  • Rejection Pipeline: To reduce response noise and avoid LLM hallucinations, a two-stage rejection module is employed. First, a text2vec model measures the semantic proximity of the message to predefined technical domains, filtering overtly irrelevant content. Then, an LLM-based scoring step assesses context and tone, further refining message selection. This dual-stage gating (as depicted in the source's Figure 1) ensures only genuine technical queries propagate forward; a minimal sketch of the gate follows this list.
  • Response Pipeline: Accepted queries proceed to staged augmentation and answering. Keywords are extracted with LLM-native NLP routines, and supporting documents are retrieved using a fusion of LangChain, BCEmbedding, targeted web search, and a bespoke knowledge graph for repository-specific queries. Scoring and partial ordering prioritize background content. A hybrid LLM orchestration dynamically routes queries, exploiting differential model strengths (e.g., InternLM2 for scoring and Kimi Chat for extended context) to generate vetted responses. Safety checks are performed before message dispatch (as detailed in Figure 2 of the source).
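The dual-stage gate can be pictured as two sequential predicates over each incoming message. The following is a minimal sketch, assuming a sentence-transformers-compatible text2vec checkpoint and a hypothetical `llm_complete` helper; the thresholds, seed sentences, and prompt wording are illustrative assumptions rather than values from the source.

```python
# Minimal sketch of the dual-stage rejection gate (illustrative values).
from sentence_transformers import SentenceTransformer, util

# One public text2vec-large-chinese checkpoint; swap in your own.
encoder = SentenceTransformer("GanymedeNil/text2vec-large-chinese")

# Seed sentences describing the supported technical domains (illustrative).
DOMAIN_SEEDS = [
    "How do I install mmdetection from source?",
    "Training diverges with NaN loss in the segmentation head.",
]
seed_vecs = encoder.encode(DOMAIN_SEEDS, convert_to_tensor=True)

def llm_complete(prompt: str) -> str:
    """Placeholder for any chat-completion call (hypothetical helper)."""
    raise NotImplementedError("wire up your LLM client here")

def stage1_text2vec(message: str, threshold: float = 0.5) -> bool:
    """Stage 1: reject messages semantically far from the technical domains."""
    vec = encoder.encode(message, convert_to_tensor=True)
    return util.cos_sim(vec, seed_vecs).max().item() >= threshold

SCORE_PROMPT = (
    "Rate from 0 to 10 how likely the following group-chat message is a "
    "genuine technical question about a software project. Reply with only "
    "the integer.\nMessage: {msg}"
)

def stage2_llm_score(message: str, threshold: int = 6) -> bool:
    """Stage 2: LLM scoring on context and tone; keep only real queries."""
    reply = llm_complete(SCORE_PROMPT.format(msg=message))
    try:
        return int(reply.strip()) >= threshold
    except ValueError:
        return False  # unparsable score -> reject conservatively

def accept(message: str) -> bool:
    """A message propagates forward only if both gates pass."""
    return stage1_text2vec(message) and stage2_llm_score(message)
```

The cheap embedding check runs first so that the comparatively expensive LLM scoring call is only paid for messages that are already plausibly on-topic.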

2. Performance Validation

Central to the framework is quantitative evaluation of its rejection mechanisms. Multiple text2vec models were investigated using manually annotated message corpora, establishing the precision and recall of group chat topic discrimination.

| Model | Precision | Recall |
|---|---|---|
| text2vec-large-chinese | 0.99 | 0.92 |

Table: Task rejection performance on technical group chats.

This high precision (0.99) and robust recall (0.92) confirm that the rejection module effectively discards non-technical or off-topic inputs while minimizing false negatives. Additional validation showed that alternative context splitting strategies (e.g., langchain.MarkdownHeaderTextSplitter vs. CharacterTextSplitter) did not substantively alter accuracy, underscoring robustness to implementation details in upstream document processing.
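For reference, the reported figures are standard precision and recall over the manually annotated corpus. The sketch below shows how such numbers are derived; the label convention (positive = message should be rejected) is an assumption made for illustration.

```python
# Sketch: computing rejection precision/recall from an annotated corpus.
# Convention (an assumption here): "positive" = message should be rejected.
def precision_recall(predictions, labels):
    tp = sum(p and l for p, l in zip(predictions, labels))
    fp = sum(p and not l for p, l in zip(predictions, labels))
    fn = sum(not p and l for p, l in zip(predictions, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Under this convention, precision 0.99 means 99% of rejected messages
# were truly off-topic; recall 0.92 means 92% of all off-topic messages
# were caught.
```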
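The splitting comparison mentioned above can be reproduced with stock LangChain utilities. A minimal sketch follows; import paths vary by LangChain version, and the document text and chunk sizes are illustrative.

```python
# Sketch: the two context-splitting strategies compared in the source.
# Depending on your LangChain version these classes may live in the
# separate `langchain_text_splitters` package instead.
from langchain.text_splitter import (
    CharacterTextSplitter,
    MarkdownHeaderTextSplitter,
)

markdown_doc = "# Install\npip install mmdet\n## From source\nclone and build"

# Structure-aware: split on markdown headers, keeping them as metadata.
header_splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=[("#", "h1"), ("##", "h2")]
)
header_chunks = header_splitter.split_text(markdown_doc)

# Structure-agnostic: split on a separator with a fixed chunk size.
char_splitter = CharacterTextSplitter(separator="\n", chunk_size=200,
                                      chunk_overlap=20)
char_chunks = char_splitter.split_text(markdown_doc)

# Per the source's validation, downstream rejection accuracy is robust
# to which of these strategies feeds the retrieval index.
```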

3. Technical Requirements for LLMs

Three critical competencies for LLM deployment in group chat assistance are identified:

  • Scoring Ability: LLMs are routinely prompted to assign integer scores (0–10) for tasks such as query relevance, whether a message is a genuine question, and risk of hallucination. These scores drive fast, interpretable routing logic and validation gates within the pipeline. The system also leverages LLM scoring for security filtering.
  • In-Context Learning (ICL): The assistant is explicitly designed to maximize ICL, feeding contextually ranked and filtered snippets into the LLM prompt window. Data sources include domain-specific documentation, GitHub issue caches, and prior conversation turns. The architecture prioritizes maximizing effective context to counteract hallucinations and maintain technical specificity.
  • Long Context Handling: To accommodate lengthy group conversations and background documents that may reach or exceed 16k tokens, the framework adopts advanced strategies including ReRoPE (Rectified Rotary Position Embeddings) and dynamic NTK (Neural Tangent Kernel)-aware scaling. As empirically validated in the source, this allows context windows up to 40k tokens on an A100 80G GPU, retaining query precision and avoiding information loss; a sketch of the NTK scaling recipe follows this list.
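As a concrete piece of the long-context strategy, dynamic NTK-aware scaling enlarges the RoPE frequency base once the live sequence outgrows the training window. The sketch below follows the widely used community recipe (as found in common open-source implementations); parameter names and the example figures are assumptions, not code from the source.

```python
import torch

def dynamic_ntk_rope_base(base: float, dim: int, seq_len: int,
                          max_trained: int, scaling_factor: float = 1.0) -> float:
    """Dynamic NTK-aware scaling: grow the RoPE base once the sequence
    exceeds the trained window, so high-frequency dimensions keep their
    resolution while low-frequency ones are interpolated (community recipe)."""
    if seq_len <= max_trained:
        return base
    alpha = (scaling_factor * seq_len / max_trained) - (scaling_factor - 1)
    return base * alpha ** (dim / (dim - 2))

def rope_inv_freq(base: float, dim: int) -> torch.Tensor:
    """Standard RoPE inverse frequencies for a head dimension `dim`."""
    return 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))

# Example (illustrative): a model trained at 4k positions serving a
# 40k-token prompt, matching the context lengths discussed above.
new_base = dynamic_ntk_rope_base(base=10000.0, dim=128,
                                 seq_len=40_000, max_trained=4_096)
inv_freq = rope_inv_freq(new_base, dim=128)
```

Because the base is recomputed per sequence length, short prompts behave exactly as the model was trained, and scaling only engages when long group transcripts demand it.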

4. System Integration with Instant Messaging Platforms

HuixiangDou2 is tailored for seamless operation within leading instant messaging platforms, specifically addressing the complexities of high-noise group chats:

  • Message Aggregation: The framework aggregates messages across group participants using unique groupid and userid concatenations, offering granularity beyond typical LLM role templates (sketched after this list).
  • Noise Suppression: The dual-stage rejection pipeline is critical for ensuring that only technical questions elicit system responses, thereby avoiding group message flooding and off-topic answer generation.
  • Platform-Specific Interface Adaptation: APIs and custom plugins are used to interface with external IM systems. These integrations ensure that only filtered, contextually enriched responses are delivered, and that conversation flow disruption is minimized.
  • Information Security and Identity: The concatenation of user and group identifiers supplements conversational context and mitigates misattribution, which is otherwise common in multi-participant LLM chats.
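The identity handling in the first and last bullets can be sketched as a small transcript builder: bursts from the same sender are merged, and each utterance is rendered with its groupid|userid tag before entering the prompt window, so the model can attribute statements even though the chat template has only two roles. The data shapes and formatting below are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class GroupMessage:
    group_id: str
    user_id: str
    text: str
    timestamp: float  # seconds since epoch

def build_transcript(messages, window_sec: float = 60.0) -> str:
    """Merge temporally proximate messages from the same sender, then
    render each utterance tagged with 'group_id|user_id' so a two-role
    chat template can still attribute speakers (illustrative format)."""
    merged: list[GroupMessage] = []
    for m in sorted(messages, key=lambda m: m.timestamp):
        if (merged and merged[-1].user_id == m.user_id
                and m.timestamp - merged[-1].timestamp <= window_sec):
            merged[-1].text += " " + m.text  # aggregate a message burst
            merged[-1].timestamp = m.timestamp
        else:
            # copy, so caller's messages are left untouched
            merged.append(GroupMessage(m.group_id, m.user_id, m.text,
                                       m.timestamp))
    return "\n".join(f"{m.group_id}|{m.user_id}: {m.text}" for m in merged)
```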

5. Open-Source Implementation and Accessibility

The entire system stack is distributed under open-source terms, with public resources comprising:

  • Source Code: Full codebase available at https://github.com/internlm/huixiangdou
  • Android Application: Distributed through the GitHub repository for direct mobile experimentation.
  • Web Service: Online access provided by OpenXLab
  • Demonstration Video: Platform overview hosted on YouTube

Researchers can clone the repository and follow the provided documentation for both experimental and production integration within their own chat environments.

6. Prospective Research and Extension

Several future trajectories are identified:

  • Deeper Source Code Understanding: Increasing the depth of LLM comprehension for domain codebases via further pretraining or targeted fine-tuning on project-specific data.
  • Contextual Representation Expansion: Investigating advanced paging and selective retrieval methods to scale effective context past the 40k token window.
  • Multimodal Input Handling: Enhancing the framework to parse images (e.g., log screenshots) by integrating multimodal models capable of OCR and semantic image analysis, addressing current model limitations to textual inputs.
  • Chat Format and UX: The ChatML format, presently employed, is noted to degrade context tracking in multi-turn technical discussions (illustrated below). Future work aims to improve chat serialization formats to more effectively capture conversational topology, speaker turns, and technical nuance.
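To make the limitation concrete: ChatML natively distinguishes only system, user, and assistant roles, so a multi-speaker thread must be flattened into a single user turn, with speaker identity surviving only as inline text. A schematic illustration follows (message content invented for the example):

```python
# Three group-chat speakers collapse into one ChatML "user" turn;
# the per-speaker tags inside the content are a workaround, not roles.
messages = [
    {"role": "system",
     "content": "You are a technical assistant for an open-source CV project."},
    {"role": "user",
     "content": ("group42|alice: training crashes at epoch 3 with CUDA OOM\n"
                 "group42|bob: same here, batch size 16 on A100\n"
                 "group42|carol: does gradient checkpointing help?")},
]
# Rendered ChatML: <|im_start|>user ... <|im_end|> -- per-speaker turn
# boundaries exist only as plain text, which is what degrades context
# tracking in long multi-turn technical discussions.
```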

7. Synthesis and Impact

HuixiangDou2 demonstrates that careful engineering of input preprocessing, hierarchical rejection and scoring schemes, and modular LLM orchestration can deliver precise, context-aware technical assistance in noisy, real-world group chat scenarios. The framework advances group chat AI by grounding each response in rigorously filtered and ranked technical content, maximizing reliability, security, and user relevance. Open-source availability and modular architecture facilitate both academic and practical adoption, with significant implications for the deployment of LLM assistants in collaborative technical environments.
