- The paper presents SPECI, a framework integrating dynamic skill prompts into hierarchical continual imitation learning for advanced robot manipulation.
- It introduces multimodal perception, a dynamic skill codebook, and an attention-driven selection mechanism for context-aware skill reuse.
- Experiments show stronger forward and backward knowledge transfer and higher AUC than existing methods.
 
SPECI: Skill Prompts-based Hierarchical Continual Imitation Learning for Robot Manipulation
Introduction
The paper introduces Skill Prompts-based Hierarchical Continual Imitation Learning (SPECI), a framework for robot manipulation in dynamic environments. Traditional imitation learning (IL) is effective for fixed tasks but struggles with lifelong adaptation, which is crucial for real-world applications. Continual imitation learning (CIL) offers incremental task adaptation, but existing approaches often either neglect the intrinsic skills needed for manipulation or rely on rigid skills, which limits cross-task knowledge transfer.
SPECI Framework
SPECI is designed as a hierarchical CIL policy architecture that integrates skill acquisition and reuse, enhancing task-level knowledge transfer. This framework consists of three modules:
- Multimodal Perception and Fusion Module: Utilizes modality-specific encoders to process heterogeneous sensory data, enabling comprehensive environment representation.
- Skill Inference Module: Employs dynamic skill extraction and selection via an expandable skill codebook, facilitating implicit skill acquisition and efficient reuse.
- Action Execution Module: Generates precise actions through mode approximation and task-sharing parameters, enhancing bidirectional knowledge transfer.
Figure 1: Framework of the proposed SPECI for robot continual imitation learning, illustrating its hierarchical modules for perception, skill inference, and action execution.
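The data flow through the three modules can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the random linear maps stand in for learned encoders, and the names `perceive`, `infer_skill`, `act`, the mean-fusion step, and all dimensions are assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Random linear map standing in for a learned network (placeholder)."""
    W = rng.normal(scale=0.1, size=(in_dim, out_dim))
    return lambda x: x @ W

D = 32  # hypothetical shared embedding size

# 1) Multimodal perception: one encoder per modality (input sizes are made up).
encoders = {
    "image": linear(64, D),
    "proprio": linear(9, D),
    "language": linear(16, D),
}

def perceive(obs):
    """Encode each modality separately, then fuse by averaging (assumed fusion)."""
    tokens = np.stack([encoders[name](obs[name]) for name in encoders])
    return tokens.mean(axis=0)

# 2) Skill inference: soft attention over a codebook of skill vectors.
def infer_skill(state, codebook):
    scores = codebook @ state / np.sqrt(state.size)
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ codebook

# 3) Action execution: a plain linear head here; the paper's mode
#    approximation is not modeled in this sketch.
action_head = linear(2 * D, 7)

def act(obs, codebook):
    state = perceive(obs)
    skill = infer_skill(state, codebook)
    return action_head(np.concatenate([state, skill]))

obs = {
    "image": rng.normal(size=64),
    "proprio": rng.normal(size=9),
    "language": rng.normal(size=16),
}
codebook = rng.normal(size=(4, D))  # four hypothetical skills
action = act(obs, codebook)
```

The point of the sketch is the hierarchy: perception produces a fused state, the state queries the skill codebook, and the action head conditions on both the state and the selected skill.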
Skill Codebook and Attention Mechanism
SPECI employs a dynamic skill codebook that expands automatically as new tasks are learned. Skills are therefore acquired implicitly, without manual definition, which supports knowledge transfer at both the skill and task levels. An attention-driven skill selection mechanism then chooses skills based on the current context, so that skills learned on one task can be reused on others.
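A minimal sketch of an expandable codebook with attention-based selection is given below. The summary does not specify the codebook's layout, so the key/prompt split, the `SkillCodebook` class, its `expand` and `select` methods, and the scaled dot-product scoring are all assumptions made for illustration.

```python
import numpy as np

class SkillCodebook:
    """Hypothetical codebook: each skill has a key (for matching) and a prompt (its content)."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = np.random.default_rng(seed)
        self.keys = np.empty((0, dim))
        self.prompts = np.empty((0, dim))

    def expand(self, n_new):
        """Append n_new skill slots for a new task; existing entries are left untouched."""
        new_keys = self.rng.normal(size=(n_new, self.dim))
        new_prompts = self.rng.normal(size=(n_new, self.dim))
        self.keys = np.vstack([self.keys, new_keys])
        self.prompts = np.vstack([self.prompts, new_prompts])

    def select(self, query):
        """Softmax attention of the context query over all keys, old and new."""
        scores = self.keys @ query / np.sqrt(self.dim)
        weights = np.exp(scores - scores.max())  # numerically stable softmax
        weights /= weights.sum()
        return weights, weights @ self.prompts

codebook = SkillCodebook(dim=8)
codebook.expand(3)  # slots added when the first task arrives
codebook.expand(2)  # more slots added for the second task
query = np.ones(8)  # stand-in for a fused observation embedding
weights, prompt = codebook.select(query)
```

Because the query attends over every entry, a later task can place weight on skills added for an earlier task, which is the kind of cross-task reuse described above.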
Experimental Results
Extensive experiments on diverse manipulation task suites show that SPECI outperforms state-of-the-art CIL methods, particularly in bidirectional knowledge transfer. Forward Transfer (FWT, higher is better) indicates that SPECI adapts quickly to new tasks, while Negative Backward Transfer (NBT, lower is better) shows that it retains previously learned tasks. The Area Under the Success Rate Curve (AUC), which aggregates success rates over the whole learning sequence, also favors SPECI.
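The summary does not state the exact formulas for these metrics. The sketch below uses simplified versions of the definitions common in lifelong robot learning benchmarks, computed from a matrix `c` where `c[i, j]` is the success rate on task `j` after training through task `i`. The function name `continual_metrics` and the example numbers are made up for illustration and are not results from the paper.

```python
import numpy as np

def continual_metrics(c):
    """Simplified FWT, NBT, and AUC from a K x K success-rate matrix (assumed definitions)."""
    c = np.asarray(c, dtype=float)
    K = c.shape[0]
    # FWT: success on each task right after learning it, averaged over tasks.
    fwt = float(np.mean(np.diag(c)))
    # NBT: average drop on task k after learning later tasks (lower is better).
    nbt_terms = [np.mean(c[k, k] - c[k + 1:, k]) for k in range(K - 1)]
    nbt = float(np.mean(nbt_terms)) if nbt_terms else 0.0
    # AUC: average success on task k from the time it is learned onward.
    auc = float(np.mean([np.mean(c[k:, k]) for k in range(K)]))
    return {"FWT": fwt, "NBT": nbt, "AUC": auc}

# Illustrative success rates for three sequential tasks (not from the paper).
c = [
    [0.8, 0.0, 0.0],
    [0.6, 0.9, 0.0],
    [0.5, 0.7, 0.6],
]
metrics = continual_metrics(c)
```

Under these definitions, a method with high FWT but also high NBT learns new tasks well and then forgets them, whereas AUC rewards methods that both learn quickly and retain what they learned.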

Figure 2: Comparison of different policy architectures under ER learning paradigm, showcasing SPECI’s superior forward and backward knowledge transfer.
Mode Approximation
To further bolster task-level knowledge sharing, SPECI incorporates mode approximation within its architecture. This involves enriching the policy with task-specific and task-sharing parameters, enhancing the model's ability to balance stability and adaptability across varying tasks. Mode approximation particularly strengthens performance in complex, long-horizon tasks where procedural and declarative knowledge integration is crucial.
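The summary does not describe how the task-specific and task-sharing parameters are arranged. One plausible reading, shown here purely as an assumption, is a low-rank weight update whose factor matrices are shared across tasks while a small coefficient vector is kept per task. The names `U`, `V`, `lam`, and `task_weight` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank, n_tasks = 6, 4, 3, 2  # illustrative sizes

W_frozen = rng.normal(size=(d_in, d_out))  # base policy weight, kept fixed
U = rng.normal(size=(d_in, rank))          # task-sharing factor
V = rng.normal(size=(d_out, rank))         # task-sharing factor
lam = rng.normal(size=(n_tasks, rank))     # task-specific coefficients

def task_weight(task_id):
    """Base weight plus a rank-limited update: sum_r lam[t, r] * outer(U[:, r], V[:, r])."""
    delta = (U * lam[task_id]) @ V.T
    return W_frozen + delta

x = rng.normal(size=d_in)
y_task0 = x @ task_weight(0)
y_task1 = x @ task_weight(1)
```

In this arrangement, most adaptation capacity lives in the shared factors, which every task can improve, while each task adds only `rank` parameters of its own. That is one way to trade off stability against adaptability, though the paper's actual scheme may differ.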

Figure 3: Visualization of the FWT and AUC metric gaps between upper bounds and different policy architectures, evaluated for lifelong learning tasks.
Conclusion
SPECI represents a significant advancement in hierarchical CIL, offering a robust framework for robot manipulation across evolving environments. Its novel skill acquisition mechanisms and attention-driven selection processes directly address limitations in prior methods, providing an effective solution for lifelong robot learning. Future research may explore further integration of rehearsal-free CL paradigms and enhanced hierarchical planning to extend the capabilities of SPECI.
The potential for SPECI to adapt to diverse, unforeseen tasks and enhance robotic autonomy in dynamic settings marks a promising direction for future developments in AI and robotics.