OrthoReg: Orthogonal Regularization for Hybrid Symbolic-Neural Dynamical Systems

Published 17 Jun 2026 in cs.LG, cs.AI, and eess.SY | (2606.19145v1)

Abstract: Dynamical systems are fundamental to modeling the natural world, yet modeling them involves a persistent trade-off: manually prescribed mechanistic models are interpretable by design but often overly simplistic and misspecified; in contrast, flexible data-driven neural methods lack physical insight. Hybrid modeling aims for the best of both worlds by combining a prescribed or symbolic, physics-based component with a flexible neural network. A critical challenge, however, is that the neural component may relearn mechanistic parts, yielding redundant and uninterpretable models, especially when the symbolic structure itself is discovered from data. Existing methods based on standard $L^2$ regularization rely on a projection argument that breaks when the symbolic component is learned through sparse discovery, allowing the neural augmentation to overlap with symbolic structure. We introduce OrthoReg (Orthogonal Regularization), which directly penalizes overlap between the symbolic and neural components, preventing symbolic structure from being absorbed by the neural residual. This yields a complementary decomposition: the symbolic part captures what the library can express, and the neural part captures what remains. On benchmark dynamical systems with partial library mismatch, OrthoReg improves symbolic recovery and out-of-distribution behavior.


Authors (2)

Summary

The paper introduces OrthoReg, which enforces empirical orthogonality between symbolic and neural components to achieve a clear, non-redundant decomposition in hybrid dynamical systems.
It demonstrates superior symbolic recovery and, on the damped pendulum, a two-orders-of-magnitude reduction in out-of-distribution derivative MSE compared to standard $L^2$ regularization.
Empirical evaluations on systems like the damped pendulum and Duffing oscillator validate OrthoReg's enhanced interpretability and robust transferability under partial library mismatch.

Orthogonal Regularization for Hybrid Symbolic-Neural Dynamical Systems

Motivation and Problem Setup

Hybrid modeling seeks a principled synthesis of manually prescribed (symbolic, mechanistic) models and flexible neural architectures in dynamical systems, targeting the interpretability–expressiveness trade-off central to scientific ML. Existing hybrid approaches typically represent a dynamical vector field as $f = f_{\mathrm{phy}} + f_{\mathrm{aug}}$, with $f_{\mathrm{phy}}$ built from a symbolic library (e.g., polynomial, trigonometric) and $f_{\mathrm{aug}}$ a learned neural residual. While norm-based regularization ($L^2$, as in APHYNITY [yin2021augmenting]) provably enforces orthogonality when the symbolic component is fixed, this guarantee collapses under joint symbolic discovery via sparse regression, as is common in SINDy-style pipelines [brunton2016discovering]. The neural residual can then redundantly re-express symbolic directions, eroding interpretability and transferring poorly out of distribution.

This work introduces OrthoReg, an empirical inner-product regularization that directly penalizes overlap between symbolic and neural components, restoring a clean decomposition: the symbolic part captures within-library structure, and the neural part captures the orthogonal residual. The regime addressed is partial library mismatch, where the true dynamics are only partly captured by the symbolic library.

Figure 1: Symbolic–neural decompositions under library mismatch; OrthoReg enforces separation, pushing the neural residual outside the symbolic span under partial library coverage.

Method: OrthoReg Objective and Theoretical Guarantees

The OrthoReg objective augments the sparse symbolic–neural joint training loss with an empirical orthogonality penalty: $\mathcal{L}_{\mathrm{fit}} + \mu \|w\|_1 + \lambda \sum_{j=1}^M \left(\langle \hat{f}_{\mathrm{aug}}, \phi_j \rangle_{\mathcal{D}}\right)^2$, where the empirical inner product is computed over observed states. $\mu$ controls symbolic sparsity; $\lambda$ controls orthogonality.
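To make the three terms concrete, here is a minimal sketch in plain Python. It assumes a scalar vector field, a mean-squared-error fit loss, and an empirical inner product normalized by the number of samples; none of these choices is stated in the summary, and the helper names `empirical_inner` and `orthoreg_terms` are hypothetical. The toy target $f(x) = -x + 0.5x^2$ with a one-term library $\{x\}$ is likewise an illustration, not one of the paper's benchmarks.

```python
def empirical_inner(u, v):
    """Empirical inner product <u, v>_D: average of pointwise products over samples."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


def orthoreg_terms(f_obs, library, w, f_aug, mu, lam):
    """Return (fit, mu * ||w||_1, lam * sum_j <f_aug, phi_j>_D^2) on the sampled states."""
    n = len(f_obs)
    # Symbolic prediction: linear combination of library features with weights w.
    f_phy = [sum(wj * phi[i] for wj, phi in zip(w, library)) for i in range(n)]
    resid = [f_obs[i] - f_phy[i] - f_aug[i] for i in range(n)]
    fit = empirical_inner(resid, resid)  # mean squared error (assumed fit loss)
    sparsity = mu * sum(abs(wj) for wj in w)
    overlap = lam * sum(empirical_inner(f_aug, phi) ** 2 for phi in library)
    return fit, sparsity, overlap


# Toy data: symmetric grid on [-1, 1], one library term phi(x) = x.
xs = [(i - 10) / 10 for i in range(21)]
library = [xs]
f_obs = [-x + 0.5 * x * x for x in xs]

# Candidate A: the symbolic part keeps the whole linear term;
# the residual holds only the out-of-library x^2 term.
aug_a = [0.5 * x * x for x in xs]
terms_a = orthoreg_terms(f_obs, library, [-1.0], aug_a, mu=0.01, lam=10.0)

# Candidate B: same total prediction, but the residual has absorbed
# part of the linear term, so the symbolic weight shrinks to -0.6.
aug_b = [-0.4 * x + 0.5 * x * x for x in xs]
terms_b = orthoreg_terms(f_obs, library, [-0.6], aug_b, mu=0.01, lam=10.0)

print("A (fit, sparsity, overlap):", terms_a)
print("B (fit, sparsity, overlap):", terms_b)
```

Both candidates fit the data equally well, and candidate B has the smaller $\ell_1$ term, so a sparse objective without the overlap penalty would prefer it. With the penalty switched on, B pays for the linear component hidden in its residual, and A has the lower total.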

OrthoReg guarantees that at global optima (for sufficiently large $\lambda$), empirical neural–symbolic overlap converges to zero, yielding a direct-sum error decomposition under the empirical inner product: $\|f-(\hat f_{\mathrm{phy}}+\hat f_{\mathrm{aug}})\|_{\mathcal{D}}^2 = \|P_{\mathcal{F}_{\mathrm{phy}}}^{\mathcal{D}}(f)-\hat f_{\mathrm{phy}}\|_{\mathcal{D}}^2 + \|f-P_{\mathcal{F}_{\mathrm{phy}}}^{\mathcal{D}}(f)-\hat f_{\mathrm{aug}}\|_{\mathcal{D}}^2$. This formalizes the desired decomposition: the symbolic part fits what the library can express; the neural part captures what it cannot.
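The identity is a Pythagorean split, and a short sketch shows why. The sketch assumes three things that the summary implies but does not spell out: $P$ abbreviates $P_{\mathcal{F}_{\mathrm{phy}}}^{\mathcal{D}}$, the orthogonal projection onto $\mathcal{F}_{\mathrm{phy}} = \mathrm{span}\{\phi_1, \dots, \phi_M\}$ under $\langle\cdot,\cdot\rangle_{\mathcal{D}}$; $\hat f_{\mathrm{phy}}$ lies in that span; and the overlap is exactly zero, i.e. $\langle \hat f_{\mathrm{aug}}, \phi_j \rangle_{\mathcal{D}} = 0$ for all $j$. The paper's own proof may proceed differently.

```latex
\begin{aligned}
r &:= f - \hat f_{\mathrm{phy}} - \hat f_{\mathrm{aug}}
   = \underbrace{\bigl(P f - \hat f_{\mathrm{phy}}\bigr)}_{a \,\in\, \mathcal{F}_{\mathrm{phy}}}
   + \underbrace{\bigl(f - P f - \hat f_{\mathrm{aug}}\bigr)}_{b}, \\
\langle a, b \rangle_{\mathcal{D}}
  &= \underbrace{\langle a, f - P f \rangle_{\mathcal{D}}}_{=\,0 \text{ (projection residual)}}
   - \underbrace{\langle a, \hat f_{\mathrm{aug}} \rangle_{\mathcal{D}}}_{=\,0 \text{ (zero overlap)}}
   = 0, \\
\|r\|_{\mathcal{D}}^2
  &= \|a\|_{\mathcal{D}}^2 + 2\,\langle a, b \rangle_{\mathcal{D}} + \|b\|_{\mathcal{D}}^2
   = \|a\|_{\mathcal{D}}^2 + \|b\|_{\mathcal{D}}^2 .
\end{aligned}
```

If the overlap is only approximately zero, the cross term $2\langle a, b\rangle_{\mathcal{D}}$ does not vanish and the two error terms are no longer cleanly separated.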

By contrast, $L^2$ regularization only shrinks the neural residual norm, allowing significant symbolic–neural overlap when $f_{\mathrm{phy}}$ is learned via sparsity, as established in the paper's theoretical analysis.
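A small numerical illustration of why norm and overlap are different quantities: the cosine between a residual and a library term is scale-invariant, so shrinking the residual does not rotate it away from the symbolic span. The sketch assumes the same sample-averaged inner product as the objective; the helper `empirical_cosine` and the toy residuals are hypothetical and are not taken from the paper.

```python
import math


def empirical_inner(u, v):
    """Empirical inner product <u, v>_D: average of pointwise products over samples."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


def empirical_norm(u):
    """Norm induced by the empirical inner product."""
    return math.sqrt(empirical_inner(u, u))


def empirical_cosine(u, v):
    """Cosine of the angle between u and v under the empirical inner product."""
    return empirical_inner(u, v) / (empirical_norm(u) * empirical_norm(v))


xs = [(i - 10) / 10 for i in range(21)]   # symmetric grid on [-1, 1]
phi = xs                                  # library term phi(x) = x

small_aligned = [0.01 * x for x in xs]    # tiny norm, same direction as phi
large_orthogonal = [5.0 * x * x for x in xs]  # large norm, even function

for name, g in [("small_aligned", small_aligned),
                ("large_orthogonal", large_orthogonal)]:
    print(name, "norm:", empirical_norm(g), "cosine with phi:", empirical_cosine(g, phi))
```

The first residual would look harmless to a norm penalty, yet it points entirely along the library term. The second has a much larger norm but no overlap with it. An inner-product penalty ranks the two the other way around.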

Empirical Results: Partial Library Misspecification and OOD Generalization

Experiments span four systems (modified damped pendulum, Lotka–Volterra, time-modulated SIR, Duffing oscillator), probing OOD generalization, symbolic recovery, and robustness under partial and severe library mismatch.

OrthoReg yields:

Superior symbolic recovery: Highest F1 scores and sparser symbolic supports, especially in partial mismatch regimes.
Major OOD error reduction: On the damped pendulum, OrthoReg achieves two orders of magnitude lower OOD derivative MSE than $L^2$-regularized hybrids, at the cost of a slight decrease in in-distribution fit.
Robust regime transfer: Duffing oscillator cross-basin evaluations demonstrate OrthoReg's improved global generalization, as symbolic structure transfers to new basins and neural residuals capture out-of-library effects.
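The summary does not say how the symbolic-recovery F1 score is computed. One common convention, assumed here, compares the set of active library terms in the learned coefficient vector against the ground-truth support. The function `support_f1`, its tolerance, and the example coefficients are hypothetical.

```python
def support_f1(w_true, w_est, tol=1e-8):
    """F1 score between the active-term supports of two coefficient vectors.

    A term counts as active when its absolute coefficient exceeds `tol`.
    Returns 0.0 when there are no true positives (including empty supports).
    """
    true_on = {j for j, w in enumerate(w_true) if abs(w) > tol}
    est_on = {j for j, w in enumerate(w_est) if abs(w) > tol}
    tp = len(true_on & est_on)
    if tp == 0:
        return 0.0
    precision = tp / len(est_on)
    recall = tp / len(true_on)
    return 2 * precision * recall / (precision + recall)


# Ground truth uses terms 0 and 2; the estimate also switches on term 1.
w_true = [-1.0, 0.0, 0.5, 0.0]
w_est = [-0.9, 0.2, 0.4, 0.0]

f1 = support_f1(w_true, w_est)
print("support F1:", f1)  # precision 2/3 and recall 1 give F1 of about 0.8
```

Under this convention a spurious extra term lowers precision and a missed term lowers recall, so a sparser support that still contains the true terms scores higher.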
Figure 2: Ablations. (a) OrthoReg is most effective under partial library mismatch. (b) Irregular sampling degrades all methods, but OrthoReg retains competitive OOD behavior. (c) An intermediate $\lambda$ gives the best trade-off between symbolic separation and OOD error; the right panel visualizes residual–symbolic separation.

Ablations show:

OrthoReg gains are most pronounced under intermediate library mismatch, diminishing in extreme cases where residuals dominate or libraries are nearly sufficient.
Irregular sampling degrades all methods, but OrthoReg maintains a competitive symbolic–neural split.
Noise robustness degrades comparably across methods; OrthoReg's benefit is primarily separation, not denoising.
Figure 3: (a) Damped pendulum trajectories. (b) Duffing oscillator trajectories; OrthoReg captures the global basin structure, whereas pure symbolic and $L^2$ hybrids distort the dynamics.

Practical and Theoretical Implications

OrthoReg is most effective when a partially expressive symbolic library exists and structured effects lie outside its span, i.e., in realistic regimes where scientific knowledge is incomplete. The method enforces a complementary decomposition, thereby preserving the interpretability and transferability of the symbolic components.

There are practical implications for scientific machine learning pipelines:

Interpretability retention: Symbolic components are not erased by neural augmentation.
Stable OOD transfer: Symbolic structure learned within OrthoReg generalizes, even as neural residuals handle unmodeled phenomena.
Robust hybrid modeling: Additive inner-product penalties succeed where compositional architectures, staged residual learning, and standard $L^2$ regularization fail.

Theoretically, OrthoReg's error decomposition under the empirical inner product formalizes the symbolic–neural split. Extending this framework to population-level guarantees and to broader symbolic regression pipelines (e.g., genetic programming, transformer-based symbolic methods) is a promising direction.

Speculation on Future Developments

Hybrid approaches, strengthened by orthogonality constraints, are poised to become central in scientific AI for settings with incomplete physical models. OrthoReg provides a pathway for scalable, robust symbolic–neural integration, suggesting future extensions:

Population-level orthogonality bounds via uniform convergence/Rademacher complexity analysis;
Automated symbolic library refinement alongside OrthoReg, potentially coupling transformer-based discovery and orthogonal residuals;
Canonical SINDy-style selection–refit pipelines integrated with orthogonal residual learning to separate symbolic and neural contributions beyond additive decompositions.

Conclusion

OrthoReg addresses, both empirically and theoretically, the fragility of symbolic–neural separation under partial library misspecification. Unlike standard regularization, OrthoReg enforces empirical orthogonality, yielding interpretable symbolic cores and robust residual learning. This mechanism supports practical hybrid modeling by preserving transferability and non-redundancy as physical knowledge is algorithmically extracted and augmented. Extensions to broader symbolic regression paradigms and population-level guarantees are natural future directions.

Figure 4: Monte Carlo sampling ablation; OrthoReg's symbolic recovery and OOD stability concentrate as sample size increases, visualized by F1 scores and residual–symbolic cosine diagnostics.