A "Wenlu" Brain System for Multimodal Cognition and Embodied Decision-Making: A Secure New Architecture for Deep Integration of Foundation Models and Domain Knowledge (2506.00570v1)
Abstract: With the rapid penetration of artificial intelligence across industries and scenarios, a key challenge in building the next-generation intelligent core lies in effectively integrating the language understanding capabilities of foundation models with domain-specific knowledge bases in complex real-world applications. This paper proposes a multimodal cognition and embodied decision-making brain system, Wenlu", designed to enable secure fusion of private knowledge and public models, unified processing of multimodal data such as images and speech, and closed-loop decision-making from cognition to automatic generation of hardware-level code. The system introduces a brain-inspired memory tagging and replay mechanism, seamlessly integrating user-private data, industry-specific knowledge, and general-purpose LLMs. It provides precise and efficient multimodal services for enterprise decision support, medical analysis, autonomous driving, robotic control, and more. Compared with existing solutions,
Wenlu" demonstrates significant advantages in multimodal processing, privacy security, end-to-end hardware control code generation, self-learning, and sustainable updates, thus laying a solid foundation for constructing the next-generation intelligent core.