SEAL: Searching Expandable Architectures for Incremental Learning (2505.10457v1)

Published 15 May 2025 in cs.LG, cs.AI, and cs.CV

Abstract: Incremental learning is a machine learning paradigm where a model learns from a sequential stream of tasks. This setting poses a key challenge: balancing plasticity (learning new tasks) and stability (preserving past knowledge). Neural Architecture Search (NAS), a branch of AutoML, automates the design of the architecture of Deep Neural Networks and has shown success in static settings. However, existing NAS-based approaches to incremental learning often rely on expanding the model at every task, making them impractical in resource-constrained environments. In this work, we introduce SEAL, a NAS-based framework tailored for data-incremental learning, a scenario where disjoint data samples arrive sequentially and are not stored for future access. SEAL adapts the model structure dynamically by expanding it only when necessary, based on a capacity estimation metric. Stability is preserved through cross-distillation training after each expansion step. The NAS component jointly searches for both the architecture and the optimal expansion policy. Experiments across multiple benchmarks demonstrate that SEAL effectively reduces forgetting and enhances accuracy while maintaining a lower model size compared to prior methods. These results highlight the promise of combining NAS and selective expansion for efficient, adaptive learning in incremental scenarios.

Summary

SEAL: Searching Expandable Architectures for Incremental Learning

Incremental learning (IL), also referred to as continual learning, is an area of machine learning in which models learn from a sequential stream of data arriving as successive tasks. This paradigm presents two fundamental challenges: catastrophic forgetting and plasticity loss. Catastrophic forgetting is the failure of a model to retain previously acquired knowledge when exposed to new tasks, while plasticity loss is the diminished capacity to integrate new knowledge effectively over time.

To address these challenges, many approaches seek to balance plasticity and stability, for example by penalizing changes to parameters deemed important for earlier tasks or by leveraging knowledge distillation. Neural architecture search (NAS), a prominent branch of automated machine learning (AutoML), offers a principled way to construct adaptive neural network architectures. However, existing NAS-based methods often expand the model at every task, which is inefficient in resource-constrained environments such as IoT and TinyML platforms.
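
As a rough illustration of the stability mechanisms mentioned above, the sketch below shows a soft-target distillation loss and an EWC-style quadratic penalty on important parameters. This is a minimal PyTorch sketch of generic techniques, not code from the SEAL paper; the temperature, penalty weight, and Fisher estimates are illustrative placeholders.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target KL divergence between a frozen copy of the old model
    (the 'teacher') and the current model (the 'student')."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    log_student = F.log_softmax(student_logits / temperature, dim=1)
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * temperature ** 2

def ewc_penalty(model, old_params, fisher, lam=1.0):
    """Quadratic penalty on drift in parameters deemed important for earlier
    data, weighted by a precomputed Fisher-information estimate."""
    loss = 0.0
    for name, p in model.named_parameters():
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return lam * loss
```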

The study introduces SEAL, a NAS-based framework designed for data-incremental scenarios, in which disjoint data samples arrive sequentially and are not retained for future access. SEAL adapts the model structure dynamically, expanding it only when a capacity estimation metric indicates that additional capacity is needed, and preserves stability through cross-distillation training after each expansion step. Uniquely, SEAL's NAS component jointly searches for the architecture and the expansion policy. This joint search is key to mitigating forgetting and maintaining high accuracy while keeping the model smaller than prior methods.
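
To make the training flow concrete, the following is a minimal sketch of a data-incremental loop in the spirit of SEAL: train on each incoming data chunk, expand the backbone only when a capacity check fires, and distill from a frozen pre-expansion copy of the model. The capacity check and the `expand_fn`, `train_fn`, and `eval_fn` hooks are hypothetical placeholders; the paper's actual capacity metric, searched expansion policy, and cross-distillation loss are not reproduced here.

```python
import copy

def capacity_exceeded(val_loss_history, threshold=0.05):
    """Placeholder capacity check: trigger expansion when validation loss
    has stopped improving by more than `threshold` on recent chunks.
    The paper defines its own capacity estimation metric."""
    if len(val_loss_history) < 2:
        return False
    return (val_loss_history[-2] - val_loss_history[-1]) < threshold

def train_incrementally(model, data_chunks, expand_fn, train_fn, eval_fn):
    """Hypothetical data-incremental loop: disjoint chunks arrive one at a
    time and are discarded after use; the model grows only when needed."""
    val_loss_history = []
    for chunk in data_chunks:
        if capacity_exceeded(val_loss_history):
            teacher = copy.deepcopy(model).eval()  # frozen pre-expansion copy
            model = expand_fn(model)               # grow per the expansion policy
            # Cross-distillation: fit the new chunk while matching the frozen
            # teacher's outputs on it to preserve past knowledge.
            train_fn(model, chunk, teacher=teacher)
        else:
            train_fn(model, chunk)
        val_loss_history.append(eval_fn(model, chunk))
    return model
```

In the full method, the NAS search would also decide where and how much to grow the backbone inside `expand_fn`, since the architecture and the expansion policy are optimized jointly.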

Empirical results across several benchmarks show that SEAL substantially reduces forgetting and improves accuracy. These findings underscore the potential of combining NAS with selective model expansion for efficient, adaptive learning in incremental scenarios.

The implications of this research are both practical and theoretical. SEAL provides a framework well suited to environments with strict resource limits, and it advances the understanding of how model expandability affects learning efficiency. Practically, SEAL can ease the deployment of machine learning models where computational resources and memory are severely constrained, supporting IoT and other ubiquitous computing domains. Theoretically, the work opens avenues for hybrid frameworks that use NAS to optimize the expansion policy alongside the architecture, potentially extending to other incremental learning settings such as class-incremental and domain-incremental tasks.

Looking ahead, one promising direction is to strengthen the role of NAS within the expansion policy. In particular, a capacity-related control could be incorporated directly into the NAS process, allowing the search to determine dynamically when and how the model should expand. Such integration could prove especially valuable when data streams are more complex than those studied here, requiring robust adaptation mechanisms within the NAS framework.

Overall, SEAL demonstrates the promise of the intersection between NAS and IL, paving the way for efficient learning in increasingly dynamic data environments while addressing the fundamental stability-plasticity trade-off.
