Compositional Learning & Subsystem Identification

Updated 10 April 2026

Compositional learning and subsystem identification are methods for decomposing complex systems into interacting, reusable modules to achieve scalability and data efficiency.
They leverage structured composition laws, physics-informed architectures, and uni-modular experiments to uniquely recover subsystem behavior from limited observations.
This approach enhances interpretability, generalization, and transferability across applications such as reinforcement learning, automata, and synthetic biology.

Compositional learning and subsystem identification refer to a paradigm in which complex systems are modeled, predicted, or controlled by decomposing them into interacting, reusable modules (subsystems). The paradigm appears across machine learning, dynamical system modeling, reinforcement learning, causal inference, automata learning, and synthetic biology. The behavior of each module is learned, identified, or inferred individually, and the modules are then composed to predict or analyze previously unseen composite systems. These concepts are central to scalability, data efficiency, modular knowledge transfer, interpretability, and verifiable generalization, and are supported by theoretical, algorithmic, and empirical literature in each of these fields.

1. Formal Foundations of Compositional Learning

Compositionality entails that the global behavior of a system can be expressed as a well-defined function (often parameterized by an explicit "composition law") of the behaviors of its components/subsystems. In learning, the objective is to parameterize and identify subsystem functions $f_j$ and a global composition map $G$ such that, for system-level input $u$, the output $y$ satisfies $y = G(f_1(u_1), f_2(u_2), \ldots, f_n(u_n); \theta)$, where $G$ is structurally known (e.g., feedback interconnection, algebraic composition) and only a subset of the variables (e.g., module inputs, outputs, or internal states) is directly observable. Modular identifiability is then the property that each subsystem's behavior can be uniquely recovered (up to functional equivalence) from system-level observations under suitable experiments, and that the learned models provide valid predictions under novel modular compositions (Wang et al., 23 Sep 2025).

Subsystem identification in compositional settings is governed by conditions on observability, excitation, and the mathematical structure of $G$. Notably, modular identifiability (for static systems) can be achieved by "uni-modular" experiments, where only one module's input varies at a time, reducing the required dataset size from exponential in the system size to linear in the number of modules (Wang et al., 23 Sep 2025). Similar principles extend to dynamical systems via port-Hamiltonian neural networks (PHNNs) (Neary et al., 2022, Otterdijk et al., 2024), automata via context-sensitive learning (Fujinami et al., 6 Aug 2025, Henry et al., 23 Apr 2025), RL systems via specification decomposition (Neary et al., 2021, Neary et al., 2023), and sequential interventions via low-rank multiplicative factorization (Yu et al., 2024).
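As a minimal sketch of the uni-modular idea, the following Python example assumes a two-module static system whose composition law is a plain product, $y = f_1(u_1) f_2(u_2)$. The module functions `f1` and `f2`, the input levels, and the product law are all hypothetical choices made for illustration, not taken from the cited work. Each module's response is recovered only up to a scale factor, which is enough to predict every unseen input combination.

```python
import numpy as np

# Hypothetical ground-truth module responses (unknown to the learner).
def f1(u):
    return 1.0 + u ** 2

def f2(u):
    return np.exp(-0.5 * u)

def system(u1, u2):
    """Black-box composite system; only y is observable. Assumes G is a product."""
    return f1(u1) * f2(u2)

levels = np.linspace(0.0, 2.0, 5)   # k = 5 input values per module
ref = levels[0]                     # reference level for the module held fixed

# Uni-modular experiments: vary one module's input, hold the other at `ref`.
y_vary_1 = np.array([system(u, ref) for u in levels])
y_vary_2 = np.array([system(ref, u) for u in levels])
y_ref = system(ref, ref)

# Under the product law: y(u1, u2) = y(u1, ref) * y(ref, u2) / y(ref, ref).
predicted = np.outer(y_vary_1, y_vary_2) / y_ref

# Compare against the full grid, which the learner never had to measure.
truth = np.array([[system(a, b) for b in levels] for a in levels])
max_abs_error = float(np.max(np.abs(predicted - truth)))

n_unimodular = 2 * len(levels) - 1   # the shared reference point is counted once
n_full_grid = len(levels) ** 2
```

With two modules the saving is modest (9 experiments instead of 25), but the uni-modular count grows linearly with the number of modules while the full grid grows exponentially.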

2. Architectures and Learning Algorithms

Port-Hamiltonian Neural Networks and Dynamical System Composition

PHNNs provide a physics-informed neural ODE formulation for dynamical systems, leveraging the port-Hamiltonian structure $\dot x = [J(x) - R(x)] \nabla_x H(x) + G(x)u, \quad y = G(x)^\top \nabla_x H(x)$, where energy conservation, dissipativity, and interconnections are enforced at the architectural level. Subsystem PHNNs are trained on isolated subsystem data; composition for prediction or control proceeds by block-diagonal aggregation of $H$, $R$, and $G$, and structured addition to $J$ to encode the known or learned interconnection topology. If the interconnection matrix is unknown, a small number of transitions from the composite system suffice to solve for the interconnection parameters via least squares (Neary et al., 2022). PHNN compositions maintain passivity properties and admit theoretical error bounds in terms of subsystem identification and interconnection estimation errors (Neary et al., 2022). Extensions handle subsystem identification from system-level I/O only, accommodating noise via an output-error loss over simulated traces (e.g., the SUBNET/OE framework) (Otterdijk et al., 2024).
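The composition step can be illustrated with linear subsystems standing in for trained PHNNs. The sketch below assumes quadratic Hamiltonians $H(x) = \tfrac{1}{2} x^\top Q x$, mass-spring-damper subsystems with made-up parameters, and an interconnection that enters as a skew-symmetric term added to the block-diagonal $J$. The helper names (`block_diag`, `mass_spring_damper`) and the coupling block `K` are hypothetical.

```python
import numpy as np

def block_diag(a, b):
    """Place two matrices on the diagonal of a larger zero matrix."""
    top = np.hstack([a, np.zeros((a.shape[0], b.shape[1]))])
    bottom = np.hstack([np.zeros((b.shape[0], a.shape[1])), b])
    return np.vstack([top, bottom])

def mass_spring_damper(mass, stiffness, damping):
    """Linear port-Hamiltonian subsystem with state x = (position, momentum)."""
    J = np.array([[0.0, 1.0], [-1.0, 0.0]])      # skew-symmetric structure matrix
    R = np.array([[0.0, 0.0], [0.0, damping]])   # positive semidefinite dissipation
    G = np.array([[0.0], [1.0]])                 # force enters the momentum equation
    Q = np.diag([stiffness, 1.0 / mass])         # H(x) = 0.5 * x^T Q x
    return J, R, G, Q

J1, R1, G1, Q1 = mass_spring_damper(mass=1.0, stiffness=2.0, damping=0.3)
J2, R2, G2, Q2 = mass_spring_damper(mass=2.0, stiffness=1.0, damping=0.1)

# Hypothetical coupling block; C is skew-symmetric by construction.
K = np.array([[0.0, 0.0], [0.4, 0.0]])
C = np.block([[np.zeros((2, 2)), K], [-K.T, np.zeros((2, 2))]])

# Composite model: block-diagonal aggregation plus structured addition to J.
J_c = block_diag(J1, J2) + C
R_c = block_diag(R1, R2)
G_c = block_diag(G1, G2)
Q_c = block_diag(Q1, Q2)   # the composite Hamiltonian is the sum H1 + H2

def dynamics(x, u):
    grad_H = Q_c @ x
    return (J_c - R_c) @ grad_H + G_c @ u

# Passivity check at a random state with zero input: dH/dt must not be positive.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
energy_rate = float((Q_c @ x) @ dynamics(x, np.zeros(2)))
```

Because `J_c` stays skew-symmetric and `R_c` stays positive semidefinite, the stored energy of the composite cannot increase without external input, which is the passivity property the composition is meant to preserve.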

Modular and Lifelong Learning

Lifelong compositional learning frameworks posit a two-stage cycle:

Assimilation: Given new data, determine a composition (routing weights, gating, graph structure) of previously learned modules that best explains the new task's data.
Accommodation: If existing modules cannot fit the new data to within a performance threshold, adapt them (fine-tuning with regularization to avoid catastrophic forgetting) or add new modules, which can then be reused or recombined in future tasks (Mendez et al., 2020, Mendez, 2022).

Module selection during assimilation leverages sparsity-inducing penalties or utility scores, while accommodation typically employs regularization schemes such as Elastic Weight Consolidation or experience replay to preserve prior knowledge. Subsystem identification is often realized by examining usage patterns of modules across tasks (e.g., clustering gating vectors), and adjusting per-module learning rates based on recent demand or drift measurements (Mendez, 2022). The compositional lifelong setting enables accelerated transfer, subsystem reusability, and targeted adaptation to non-stationary environments.
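The assimilation/accommodation cycle can be sketched with a deliberately simple stand-in: each module is a linear predictor (a weight vector), a task is solved by a weighted combination of modules, and accommodation only adds a new module fitted to the residual. Existing modules are kept frozen here, which replaces the regularized fine-tuning described above. All function names, the tolerance `tol`, and the synthetic tasks are hypothetical.

```python
import numpy as np

def assimilate(modules, X, y):
    """Fit combination weights over existing modules; return (weights, mse)."""
    if not modules:
        return np.zeros(0), float(np.mean(y ** 2))
    Z = X @ np.column_stack(modules)          # column j: prediction of module j
    alpha, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return alpha, float(np.mean((Z @ alpha - y) ** 2))

def accommodate(modules, X, y, alpha):
    """Add one module fitted to the residual left by the current composition."""
    residual = y - (X @ np.column_stack(modules) @ alpha if modules else 0.0)
    w_new, *_ = np.linalg.lstsq(X, residual, rcond=None)
    return modules + [w_new]

def learn_task(modules, X, y, tol=1e-8):
    alpha, mse = assimilate(modules, X, y)
    if mse > tol:                             # existing modules are not enough
        modules = accommodate(modules, X, y, alpha)
        alpha, mse = assimilate(modules, X, y)
    return modules, alpha, mse

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_a = np.array([1.0, 0.0, -1.0])
w_b = np.array([0.0, 2.0, 0.0])

# Tasks 2 and 4 are recombinations of earlier tasks and should reuse modules.
task_weights = [w_a, 3.0 * w_a, w_b, w_a - 0.5 * w_b]

modules, library_sizes, final_mses = [], [], []
for w in task_weights:
    modules, alpha, mse = learn_task(modules, X, X @ w)
    library_sizes.append(len(modules))
    final_mses.append(mse)
```

In this toy run the library grows only when a task falls outside the span of the existing modules, so `library_sizes` stays flat on the recombined tasks.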

Reinforcement Learning via Parametric MDP Decomposition

Compositional RL frameworks model global tasks as abstract parametric Markov Decision Processes (pMDPs) with "actions" corresponding to invoking lower-level RL subsystems. Each subsystem is defined by an entry set, an exit set, and a time horizon, and is trained to satisfy a specific success probability bound. The global reachability objective decomposes into bilinear constraints on the subsystem specifications; the overall composition is planned by solving for the minimal feasible set of per-subsystem guarantees that enable the global specification (Neary et al., 2021, Neary et al., 2023). Training proceeds iteratively: estimate current subsystem performance, update the pMDP parameters, and focus further training on bottleneck subsystems. Empirically, compositional RL achieves orders-of-magnitude sample efficiency improvements and verifiable satisfaction of global specifications.
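A simplified view of the high-level planning step is sketched below. It does not solve the bilinear program described above; it only evaluates a fixed table of estimated subsystem success probabilities, finds the route that maximizes the probability of reaching the goal, and reports the weakest subsystem on that route as the candidate for further training. The subsystem names, states, probabilities, and the required bound of 0.75 are hypothetical.

```python
# Hypothetical high-level model: each subsystem moves the agent from `src` to
# `dst` with an estimated success probability; failure is treated as absorbing.
subsystems = {
    "c1": ("start", "hall", 0.90),
    "c2": ("hall", "goal", 0.80),
    "c3": ("start", "bridge", 0.95),
    "c4": ("bridge", "goal", 0.70),
}

def plan(subsystems, goal, sweeps=50):
    """Value iteration for the maximal probability of reaching `goal`."""
    states = {s for src, dst, _ in subsystems.values() for s in (src, dst)}
    value = {s: 1.0 if s == goal else 0.0 for s in states}
    choice = {}
    for _ in range(sweeps):
        for name, (src, dst, p) in subsystems.items():
            if src != goal and p * value[dst] > value[src]:
                value[src] = p * value[dst]
                choice[src] = name
    return value, choice

def bottleneck(subsystems, choice, start, goal):
    """Return the planned route and its subsystem with the lowest estimate."""
    route, state = [], start
    while state != goal:
        name = choice[state]
        route.append(name)
        state = subsystems[name][1]
    return route, min(route, key=lambda n: subsystems[n][2])

value, choice = plan(subsystems, "goal")
route, weakest = bottleneck(subsystems, choice, "start", "goal")

required = 0.75                          # hypothetical global specification
meets_spec = value["start"] >= required  # if False, train `weakest` further
```

Here the best route is `c1` then `c2`, with a composite success probability of about 0.72, so the specification is not yet met and `c2` is the subsystem to refine.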

Compositional Automata and System Identification

Componentwise automata learning algorithms such as CCwL* (Fujinami et al., 6 Aug 2025) and CoalA (Henry et al., 23 Apr 2025) directly tackle compositional system identification from modularized or partially-modularized black-box systems. Given access to component-level queries (or by automated inference of subsystem alphabets when only global queries are available), these learners maintain observation tables per component, prune unreachable states (component redundancies), and minimize learning and query costs by exploiting context-sensitive reachability. CoalA further automates subsystem discovery by iteratively refining the decomposition of the global alphabet via the detection of synchronization-induced discrepancies in system behavior.
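The following sketch shows why component-level queries can answer a global membership query. It assumes components with prefix-closed languages that synchronize on shared symbols, so a global word is possible exactly when each component accepts the projection of the word onto its own alphabet. This is an illustration of the underlying composition semantics, not an implementation of CCwL* or CoalA; the `Component` class and the producer/consumer example are hypothetical.

```python
class Component:
    """Deterministic component over a local alphabet (prefix-closed language)."""

    def __init__(self, alphabet, transitions, initial):
        self.alphabet = set(alphabet)
        self.transitions = transitions   # {(state, symbol): next_state}
        self.initial = initial

    def accepts(self, word):
        state = self.initial
        for symbol in word:
            if (state, symbol) not in self.transitions:
                return False
            state = self.transitions[(state, symbol)]
        return True

def project(word, alphabet):
    """Keep only the symbols that a component can observe."""
    return [s for s in word if s in alphabet]

def global_membership(word, components):
    """A global word is possible iff every component accepts its projection."""
    known = set().union(*(c.alphabet for c in components))
    return all(s in known for s in word) and all(
        c.accepts(project(word, c.alphabet)) for c in components
    )

# Two components that synchronize on the shared symbol "put".
producer = Component(
    {"produce", "put"},
    {("idle", "produce"): "ready", ("ready", "put"): "idle"},
    "idle",
)
consumer = Component(
    {"put", "consume"},
    {("empty", "put"): "full", ("full", "consume"): "empty"},
    "empty",
)
system = [producer, consumer]

ok_word = global_membership(["produce", "put", "consume"], system)
blocked_by_producer = global_membership(["put"], system)
blocked_by_consumer = global_membership(["produce", "put", "produce", "put"], system)
```

A learner that keeps one observation table per component can therefore work with the short projected words rather than with the much larger global state space.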

3. Theoretical Guarantees and Identifiability

Theoretical results underpinning compositional learning frameworks address both uniqueness of subsystem recovery and generalization guarantees for composed models:

Subsystem Identifiability: For static or equilibrium systems, modular identifiability theorems demonstrate that subsystem input-output functions are uniquely determined (up to coordinate transformations) by "uni-modular" experiments, provided the composition map is sufficiently non-degenerate and function spaces are appropriately regular (Wang et al., 23 Sep 2025). For dynamical systems with modular neural-ODE architectures, the corresponding guarantees are error bounds stated in terms of prediction loss on held-out composite-system trajectories (Neary et al., 2022).
Generalization and Sample Complexity: Compositional architectures significantly reduce the amount of data needed. For $n$ modules with $k$ input values each, structure-agnostic approaches require $O(k^n)$ samples, while compositional approaches need only $O(nk)$ samples under modular identifiability (Wang et al., 23 Sep 2025). In meta-learning with hypernetworks, identification up to linear transformation is provable with $O(M)$ tasks, where $M$ is the number of ground-truth modules, so that sample complexity scales only linearly (not exponentially) (Schug et al., 2023).
Stable Modular Learning: The use of second-order loss approximations and regularization aligned with the Hessian/Fisher information ensures that incremental compositional updates are non-interfering and supports theoretical bounds on forgetting and specialization (Porrello et al., 2024).
Causal and Sequential Intervention Models: Identification theorems for sequential compositional intervention models provide point-identifiability under assumptions involving unit-specific burn-in, full-rankness, no repeated interventions, and temporal isolation between sequential actions, and allow concrete workflows for decomposable effect estimation and counterfactual extrapolation (Yu et al., 2024).

4. Empirical Demonstrations and Case Studies

Compositional learning and subsystem identification are empirically validated on diverse testbeds:

Dynamical System Modeling: PHNNs trained on isolated spring-mass-damper subsystems are reported to predict the composite system accurately, both in one-step MSE and in long-term rollouts, for systems of multiple interacting components and without additional data (Neary et al., 2022). In multi-physics transfer, subsystems learned within one interconnection are recomposed in new settings (e.g., mechanical plus thermodynamic) and maintain predictive accuracy (Otterdijk et al., 2024).
Lifelong and Incremental Learning: Modular learners demonstrate accelerated convergence and reduced catastrophic forgetting in split-task image classification (CIFAR-100, ImageNet-R, CUB-200) and multitask RL settings, with modular gating allowing subsystem discovery and reuse (Mendez, 2022, Porrello et al., 2024). Task-unlearning and compositional specialization are supported via additive and subtractive composition of LoRA/PEFT task vectors (Porrello et al., 2024).
Reinforcement Learning: In gridworld and physics domains, compositional RL algorithms satisfy reachability specifications with high probability using substantially fewer training steps than monolithic end-to-end RL approaches (Neary et al., 2021, Neary et al., 2023).
Automata/System Integration: CCwL* and CoalA yield substantial reductions in membership queries in concurrent systems, aggressive pruning of redundant component states, and automatic discovery of subsystem boundaries via alphabet decomposition, with scalable performance on networks with realistic concurrency (Fujinami et al., 6 Aug 2025, Henry et al., 23 Apr 2025).
Synthetic Biology: Modular learning frameworks recover biocircuit module transfer curves from small uni-modular datasets, whereas monolithic networks trained on the same data cannot generalize to unseen combinatorial inputs (Wang et al., 23 Sep 2025).
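The additive and subtractive composition of task vectors mentioned under lifelong and incremental learning can be illustrated with plain parameter arithmetic. This is a minimal sketch that treats a model as a flat weight vector; the weights, the two "fine-tuned" models, and the helper names are hypothetical, and real LoRA/PEFT deltas would be composed per layer rather than on one vector.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_pre = rng.normal(size=6)   # hypothetical pre-trained weights

# Hypothetical fine-tuned weights for two tasks (normally obtained by training).
theta_task_a = theta_pre + np.array([0.5, 0.0, 0.0, -0.2, 0.0, 0.0])
theta_task_b = theta_pre + np.array([0.0, 0.3, 0.0, 0.0, 0.0, 0.1])

def task_vector(theta_finetuned, theta_base):
    """Task vector: the parameter change induced by fine-tuning on one task."""
    return theta_finetuned - theta_base

def compose(theta_base, vectors, coefficients):
    """Add (positive coefficient) or subtract (negative) task vectors."""
    return theta_base + sum(c * v for c, v in zip(coefficients, vectors))

tau_a = task_vector(theta_task_a, theta_pre)
tau_b = task_vector(theta_task_b, theta_pre)

theta_multi = compose(theta_pre, [tau_a, tau_b], [1.0, 1.0])   # both tasks
theta_forget_a = compose(theta_multi, [tau_a], [-1.0])         # unlearn task A
```

In this toy setting, subtracting `tau_a` from the composed model returns exactly the task-B model. With real networks such arithmetic is only approximate, which is the motivation for the second-order analysis cited above.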

5. Practical Considerations and Limitations

Data and Structural Assumptions

All compositional approaches depend fundamentally on appropriate structural assumptions: known or discoverable composition laws, sufficiently informative experiments (e.g., unit-specific burn-in, full-rank basis for sequential interventions (Yu et al., 2024)), and excitation of all modules or subsystems during training (Schug et al., 2023, Otterdijk et al., 2024). Inadequate excitation or ill-posed composition maps can preclude identifiability.

Interconnection Identification and Scalability

For dynamical and network systems, the identification of unknown interconnections (e.g., off-diagonal structure matrices) may require specific excitation patterns or additional parametric modeling (Neary et al., 2022, Otterdijk et al., 2024). For large-scale and highly concurrent systems, reachability analysis and context-sensitive learning algorithms are needed to avoid intractable enumeration or redundant state learning (Fujinami et al., 6 Aug 2025, Henry et al., 23 Apr 2025).
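When the subsystem terms are already identified, an unknown off-diagonal coupling can be recovered by ordinary least squares. The sketch below assumes linear subsystems with $\nabla_x H(x) = Qx$, zero input, direct access to state derivatives (in practice these might come from finite differences of observed transitions), and an interconnection of the form $C = [[0, K], [-K^\top, 0]]$. All matrices and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 2, 2

# Known (already identified) subsystem terms; values are illustrative.
J0 = np.kron(np.eye(2), np.array([[0.0, 1.0], [-1.0, 0.0]]))  # block-diagonal J
R = np.diag([0.0, 0.3, 0.0, 0.1])
Q = np.diag([2.0, 1.0, 1.0, 0.5])                              # grad H(x) = Q x

# Unknown coupling block K, used here only to generate synthetic data.
K_true = np.array([[0.0, 0.0], [0.7, 0.0]])
C_true = np.block([[np.zeros((n1, n1)), K_true],
                   [-K_true.T, np.zeros((n2, n2))]])

# A small batch of composite-system samples: states and their derivatives.
X = rng.normal(size=(8, n1 + n2))
grad_H = X @ Q.T
X_dot = grad_H @ (J0 + C_true - R).T

# Subtract the known part; the remainder equals C @ grad_H for each sample.
residual = X_dot - grad_H @ (J0 - R).T

# The first n1 residual components satisfy r1 = K @ g2, i.e. R1 = G2 @ K^T
# in row-stacked form, which is a standard least-squares problem for K^T.
K_T_hat, *_ = np.linalg.lstsq(grad_H[:, n1:], residual[:, :n1], rcond=None)
K_hat = K_T_hat.T
```

The regression is well posed only if the sampled gradients of the second subsystem span its state space, which is one concrete form of the excitation requirement discussed above.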

Stability–Plasticity Tradeoff

Separation of assimilation and accommodation stages in modular lifelong learning explicitly manages the balance between knowledge retention and flexibility, but requires careful regularization and expansion criteria to avoid both catastrophic forgetting and unbounded module growth (Mendez, 2022, Mendez et al., 2020). In nonstationary environments, per-module adaptation rates provide targeted tracking of subsystem drift.

Open Questions

Outstanding research directions include scaling subsystem identification to deeply nonlinear/latent-state interconnections (Otterdijk et al., 2024), automated discovery of structural composition graphs (Wang et al., 23 Sep 2025), rich causal and sequential effect modeling beyond current low-rank formalisms (Yu et al., 2024), and extending compositional learning to architectures beyond regular automata (e.g., register/timed automata) (Henry et al., 23 Apr 2025).

6. Applications and Impact

Compositional learning and subsystem identification have broad practical and scientific significance:

Engineering and Physical Systems: Data-efficient modeling and control of robots, power electronic networks, multi-physics processes, and synthetic biological circuits, with modular transferability and compositional prediction (Neary et al., 2022, Otterdijk et al., 2024, Wang et al., 23 Sep 2025).
Lifelong and Continual Learning: Domain-agnostic frameworks for building and maintaining reusable subsystem libraries across sequentially arriving tasks, mitigating catastrophic forgetting and enabling fast transfer (Mendez, 2022, Porrello et al., 2024).
Reinforcement Learning: Verifiable modular RL agents for safety-critical applications, automated subtask decomposition, and scalable policy synthesis in complex MDPs and POMDPs (Neary et al., 2021, Neary et al., 2023).
Automata and Formal Methods: Efficient model learning and component extraction in concurrent and legacy systems, program synthesis, and verification (Fujinami et al., 6 Aug 2025, Henry et al., 23 Apr 2025).
Causal Inference: Subsystem identification in personalized interventions and generalization to unseen combinations in causal effect estimation and policy prediction (Yu et al., 2024).

In all these domains, compositional frameworks facilitate interpretability, transfer, data efficiency, and principled subsystem reuse, establishing a unified methodology for modeling, control, and learning in modular complex systems.