
Self-Supervised Learning for Inverse Problems

Updated 5 October 2025
  • Self-supervised learning for inverse problems is a framework that trains reconstruction networks using observed measurements and physical models without paired ground-truth data.
  • It integrates forward operators and physics-consistency losses to ensure realistic reconstructions even in ill-posed, non-linear, or noisy scenarios.
  • Techniques like measurement splitting and equivariance constraints enable robust performance in applications such as CT, MRI, deblurring, and audio declipping.

Self-supervised learning for inverse problems refers to a class of methods in which a reconstruction algorithm, often a neural network, is trained solely with access to observed measurements and mathematical models of the forward process, without paired ground-truth data. This paradigm leverages known physical operators, architectural priors, intrinsic data invariances, or measurement splitting to provide supervision. Inverse problems in imaging (e.g., CT, MRI, super-resolution, deblurring), physics (e.g., photoacoustics), robotics, and signal processing are increasingly solved with self-supervised techniques, yielding competitive, and in some cases state-of-the-art, performance even when high-quality ground truth is unattainable or the forward operator is non-linear, non-unitary, or ill-posed.

1. Self-Supervised Loss Formulations for Inverse Problems

Self-supervised approaches construct learning objectives that do not require reference signals but instead enforce consistency with the measured data or known transformations. A central class of objectives involves the measurement-consistency loss: for forward model $A$, measurement $y$, and reconstruction network $f_\theta$, the loss is typically

$$L_\mathrm{MC}(\theta) = \|A f_\theta(y) - y\|^2,$$

which penalizes the discrepancy between forward-modeled reconstructions and the observed measurements. In the presence of noise, this may be combined with sophisticated weighting or divergence corrections, as in SURE-based estimators (Scanvic et al., 2023).
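
A minimal sketch of this objective, assuming a PyTorch setup in which the forward operator $A$ is available as a dense matrix and a hypothetical network `f_theta` maps batches of measurements to reconstructions (both assumptions are illustrative, not tied to any particular paper's implementation):

```python
import torch

def measurement_consistency_loss(f_theta, A, y):
    """L_MC(theta) = ||A f_theta(y) - y||^2 for a linear forward operator.

    f_theta : network mapping measurements (batch, m) to reconstructions (batch, n)
    A       : forward operator as a dense (m, n) matrix
    y       : observed measurements, shape (batch, m)
    """
    x_hat = f_theta(y)            # reconstruct from the measurements alone
    y_hat = x_hat @ A.T           # re-apply the known forward model
    return ((y_hat - y) ** 2).sum(dim=-1).mean()
```

In practice the matrix product is replaced by whatever differentiable physics operator is available (an FFT-based MRI model, a Radon transform, etc.).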

For highly ill-posed or non-linear problems, traditional consistency is insufficient. Recent advances introduce auxiliary losses exploiting inherent data symmetries (translation, scale, or amplitude invariance) to enforce additional constraints (see Section 3). The combination of measurement-based and invariance-based objectives is exemplified by scale-equivariant (Scanvic et al., 2023), translation-equivariant (Sechaud et al., 30 Sep 2025), and amplitude-equivariant (Sechaud et al., 3 Sep 2024) loss functions: $L_\mathrm{eq}(\theta) = \mathbb{E}_{g}\,\|\mathcal{T}_g f_\theta(y) - f_\theta(\mathcal{T}_g f_\theta(y))\|^2$, where $\mathcal{T}_g$ is a group action (scaling, translation, etc.).
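
One common instantiation of this idea is an equivariant-imaging-style loss, in which the transformed reconstruction is passed back through the forward operator and re-reconstructed. The sketch below assumes the same dense-matrix operator and a user-supplied `transform` implementing $\mathcal{T}_g$; it is one possible reading of the loss above, not a definitive implementation of any cited method:

```python
import torch

def equivariance_loss(f_theta, A, y, transform, num_g=1):
    """Encourage f_theta to be equivariant to a group action T_g.

    transform : callable applying a randomly drawn T_g to a batch of signals
    """
    x_hat = f_theta(y)                         # current reconstruction
    loss = 0.0
    for _ in range(num_g):
        x_g = transform(x_hat)                 # T_g applied to the reconstruction
        y_g = x_g @ A.T                        # virtual measurements of the transformed signal
        loss = loss + ((f_theta(y_g) - x_g) ** 2).sum(dim=-1).mean()
    return loss / num_g
```

For translations, `transform` can be as simple as `lambda x: torch.roll(x, shifts=1, dims=-1)`; some variants additionally stop gradients through one of the two branches.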

An alternative is the splitting loss, which simulates supervision by partitioning the incomplete measurement and using part as input, part as target, to drive the network toward the MMSE estimator (see Section 4).

2. Integration of Physical Models and Forward Operators

The forward operator, describing the physics of the data acquisition process, is critical in self-supervision. In many works, the operator is explicitly embedded in the training loss or network architecture, enforcing that reconstructions are physically plausible. In the general framework (Zhang et al., 2021), the loss is

$$L(\theta) = \|y - H(f_\theta(y))\|^2,$$

where $H$ is the measurement matrix (e.g., based on time-of-flight for ultrasound, or the Radon transform for CT (Bubba et al., 28 Feb 2025)), and $f_\theta$ outputs the reconstructed object.

A physics-consistency loss is central to GedankenNet (Huang et al., 2022), where a differentiable wave-propagation operator is used to enforce Maxwell-consistent hologram reconstructions. In nonlinear inverse problems, e.g., phase retrieval ($y = |A x|^2$) (Sechaud et al., 30 Sep 2025) or photoacoustic tomography with unknown sound speed (Hwang et al., 2023), the measurement operator itself is often parameterized or learned jointly within the self-supervised framework.
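
For the phase-retrieval example, the only change to the measurement-consistency loss is the nonlinear forward map. A hedged sketch, assuming a complex-valued measurement matrix `A` as an illustrative stand-in for whatever propagation model the problem dictates:

```python
import torch

def phase_retrieval_loss(f_theta, A, y):
    """Measurement consistency for the nonlinear model y = |A x|^2.

    A : complex measurement matrix, shape (m, n)
    y : intensity-only measurements, shape (batch, m)
    """
    x_hat = f_theta(y)                               # reconstruction, shape (batch, n)
    y_hat = (x_hat.to(A.dtype) @ A.T).abs() ** 2     # re-apply the nonlinear forward model
    return ((y_hat - y) ** 2).mean()
```

When parts of the physics are unknown (such as the speed of sound in photoacoustics), those unknowns can be exposed as additional learnable parameters of the forward map and optimized with the same data-matching loss, in the spirit of the joint learning described above.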

In highly ill-posed or ill-conditioned cases (e.g., sparse-angle CT), theoretical results guarantee that, under certain rank and independence assumptions, self-supervised training with appropriately weighted loss functions yields gradient updates equivalent to those from a supervised loss incorporating the forward operator (Bubba et al., 28 Feb 2025, Gan et al., 2022).

3. Exploiting Data Symmetries: Equivariance-Based Self-Supervision

Self-supervised methods circumvent the lack of ground-truth by exploiting intrinsic invariances or equivariances in the signal class. Several classes of equivariance-based strategies have been detailed:

  • Translation Equivariance: For phase retrieval, natural images are approximately translation-invariant. By enforcing that $f_\theta(h(T_g x)) \approx T_g x$, where $T_g$ is a translation and $h$ is the measurement process, the network is supervised to reconstruct realistic images even from nonlinear measurements (Sechaud et al., 30 Sep 2025).
  • Scale Equivariance: For super-resolution and deblurring with high-frequency nullspaces, enforcing consistency across image scales (with appropriate gradient stopping) forces the network to hallucinate missing details in the nullspace of the degradation operator (Scanvic et al., 2023).
  • Amplitude Equivariance: For nonlinear audio declipping, self-supervision is enforced by amplitude scaling: $f_\theta(\eta(g x)) = g f_\theta(\eta(x))$ for all $g$, with $\eta$ the clipping operator (Sechaud et al., 3 Sep 2024); a minimal sketch of this idea follows the list.
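
A minimal sketch of the amplitude-equivariance idea for declipping, using the current reconstruction as a surrogate clean signal; the hard-clipping threshold, the range of amplitude scales, and the surrogate-signal trick are illustrative assumptions rather than the exact recipe of the cited paper:

```python
import torch

def clip(x, tau=1.0):
    """Hard clipping operator eta, the nonlinear forward model for declipping."""
    return x.clamp(-tau, tau)

def amplitude_equivariance_loss(f_theta, y, g_range=(0.5, 1.5)):
    """Encourage f_theta(eta(g * x)) = g * f_theta(eta(x)) for random amplitude scales g."""
    x_hat = f_theta(y)                                        # surrogate clean signal
    g = torch.empty(x_hat.shape[0], 1, device=x_hat.device).uniform_(*g_range)  # one scale per example
    y_g = clip(g * x_hat)                                     # rescale, then re-clip
    return ((f_theta(y_g) - g * x_hat) ** 2).mean()
```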

These strategies are typically effective only when the group action does not commute with the forward operator, so that the equivariance loss provides information complementary to the measurement-consistency loss (Sechaud et al., 30 Sep 2025).

The design of network architectures that are equivariant to these groups, or the incorporation of virtual transformations of the forward operator, is central to recent advances in settings with incomplete data (Sechaud et al., 1 Oct 2025).

4. Splitting, Masking, and Measurement Partitioning

The splitting loss (also termed measurement splitting) is a methodology in which the available measurements are partitioned into disjoint subsets; one subset is used as input and the other as pseudo-target. In image inpainting, MRI, and compressed sensing, this technique enables self-supervision from a single incomplete acquisition model (Sechaud et al., 1 Oct 2025): $\mathcal{L}_\mathrm{split}(y, A, f) = \mathbb{E}_{y_1, A_1 \mid y, A}\,\|A_{T_g} f(y_1, A_1) - y\|^2$. Theoretical analysis guarantees that, under group invariance of the data distribution and appropriate rank conditions, this loss converges, in expectation, to the minimizer of the supervised MSE loss, i.e., the MMSE estimator.
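
A hedged sketch of the splitting idea, in the variant where the held-out measurements serve as the pseudo-target (the random masking scheme, the dense-matrix operator, and the network interface `f_theta(y1, A1)` are illustrative assumptions):

```python
import torch

def splitting_loss(f_theta, A, y, keep_frac=0.7):
    """Randomly partition the measurements; reconstruct from one part,
    penalize the predicted measurements on the held-out part.

    A : (m, n) forward matrix, y : (batch, m) measurements
    """
    m = y.shape[-1]
    mask = torch.rand(m, device=y.device) < keep_frac   # True -> input split, False -> target split
    A1, y1 = A[mask], y[..., mask]                       # sub-problem seen by the network
    A2, y2 = A[~mask], y[..., ~mask]                     # held-out pseudo-targets
    x_hat = f_theta(y1, A1)                              # network conditioned on the sub-operator
    return ((x_hat @ A2.T - y2) ** 2).mean()
```

Averaging over many random splits approximates the expectation over $(y_1, A_1)$ in the loss above.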

When noise is present, variants such as recorrupted-to-recorrupted (R2R) adjust the splitting loss to maintain unbiasedness in the MMSE estimate. The approach has demonstrated near-supervised reconstruction quality in high-rank-deficiency regimes, particularly when paired with architectures enforcing appropriate equivariance (Sechaud et al., 1 Oct 2025).
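
For the simplest case of i.i.d. Gaussian noise of known standard deviation, the recorrupted-to-recorrupted pairing can be sketched as follows (a denoising-style illustration; in the inverse-problem setting the same pairing is combined with the forward-operator losses above):

```python
import torch

def r2r_loss(f_theta, y, sigma, alpha=0.5):
    """Recorrupted-to-recorrupted: y1 = y + alpha*sigma*z and y2 = y - sigma*z/alpha
    are conditionally independent given the clean signal, so y2 can serve as a
    noisy training target for the reconstruction of y1."""
    z = torch.randn_like(y)
    y1 = y + alpha * sigma * z
    y2 = y - sigma * z / alpha
    return ((f_theta(y1) - y2) ** 2).mean()
```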

Masking and recombination have also been used in audio (Sechaud et al., 3 Sep 2024) and image (Kobayashi et al., 2020) domains to separate supervised and unsupervised regions, ensuring that self-supervision is concentrated on ill-posed parts of the problem.

5. Handling Non-Linearity, Correlated Noise, and Unknown Factors

Classical self-supervised strategies are tailored for linear operators and uncorrelated noise, but extensions now address nonlinear and correlated contexts:

  • Non-Linear Operators: For audio declipping (Sechaud et al., 3 Sep 2024) and phase retrieval (Sechaud et al., 30 Sep 2025), equivariance-based self-supervised losses exploit problem-specific invariances to propagate supervisory signals into the nullspaces of complicated nonlinear measurement functions. For instance, in phase retrieval, translation-invariance is enforced using cosine similarity metrics that are invariant to global phase.
  • Correlated Noise: In computed tomography or microscopy with spatially structured noise, Noisier2Inverse (Gruber et al., 25 Mar 2025) demonstrates that directly generating noisier measurements (using the known correlation structure) and targeting $2y - z$, rather than $y$ itself, in the measurement loss yields robust learning without the instability of extrapolation (see the sketch after this list).
  • Unknown Physics/Model Uncertainty: When the forward operator contains unknown functions (e.g., unknown speed of sound in photoacoustics), a mapping network is introduced to jointly learn the unknown physical parameterization alongside the reconstruction, supervised only by the data-matching loss (Hwang et al., 2023).
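
The correlated-noise idea in the second bullet can be sketched as follows, assuming access to a sampler for noise with the known correlation structure; the sampler, operator representation, and loss domain are illustrative assumptions, not the exact recipe of the cited paper:

```python
import torch

def noisier2inverse_style_loss(f_theta, A, y, sample_noise):
    """Add a fresh, independent draw of the (known) correlated noise to obtain a
    noisier measurement z = y + n', then regress the re-measured reconstruction
    onto the pseudo-target 2y - z (= y - n')."""
    n_prime = sample_noise(y.shape)      # noise with the same correlation structure as in y
    z = y + n_prime                      # noisier measurement fed to the network
    target = 2 * y - z                   # unbiased pseudo-target in the measurement domain
    return ((f_theta(z) @ A.T - target) ** 2).mean()
```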

Combined, these techniques support robust signal recovery in scenarios previously inaccessible to ground-truth-based or purely linear self-supervised schemes.

6. Theoretical Guarantees and Convergence Analysis

Several recent contributions rigorously establish when self-supervised learning achieves the same minimizer as the corresponding supervised problem:

  • In deep equilibrium models (DEQ), under suitable rank and independence assumptions, the gradient update from the self-supervised loss coincides with that from the supervised mean-squared error loss, for both unitary and non-unitary forward operators (Gan et al., 2022, Bubba et al., 28 Feb 2025).
  • For splitting-based losses, Theorem 1 of (Sechaud et al., 1 Oct 2025) formalizes that equivariant reconstruction networks trained with the splitting loss achieve the MMSE estimator in expectation, under group invariance of the measurement distribution.
  • In inertial-accelerated deep inverse priors (Buskulic et al., 3 Jun 2025), explicit exponential convergence rates and recovery bounds are established in both continuous and discrete time, relating the decay rate to the minimal singular values of the operator and the network Jacobian: $\mathcal{L}(y(t)) \leq \xi\, \mathcal{L}(y(0))\, \exp\left(-\frac{\sigma_{\min}(J_g(\theta_0))\, \sigma_{\min}(A)}{2}\, t\right)$.
  • Measurement-consistency-based self-supervised learning in medical imaging has also been shown to deliver improvements over total-variation regularization (2–3 dB PSNR) and to approach supervised performance in certain settings (Senouf et al., 2019).

These guarantees guide both algorithm design and stopping criteria, revealing the effect of architecture, overparameterization, hyperparameters, and regularization choices on the learning efficacy.

7. Practical Implementations and Application Domains

Self-supervised techniques have found applications across imaging, signal recovery, robotics, and environmental sciences:

  • In imaging: plane-wave ultrasound, photoacoustic image reconstruction, CT, MRI, and image deblurring/super-resolution benefit from physics-guided self-supervision (Zhang et al., 2021, Bubba et al., 28 Feb 2025, Scanvic et al., 2023).
  • In audio: scale-amplitude equivariant self-supervision enables declipping of music signals with performance competitive with supervised methods when clean data is unavailable (Sechaud et al., 3 Sep 2024).
  • In robotics: embodied self-supervised learning coordinates sampling, feedback, and model retraining in robot arm inverse kinematics even with non-convex solution spaces (Weiming et al., 2023).
  • In hydrology: knowledge-guided contrastive self-supervision infers static basin characteristics from noisy, time-varying driver and response data (Ghosh et al., 2021).
  • In medical imaging: self-supervised approaches are now achieving robust performance even under severe undersampling or rank deficiency, substantially closing the gap to supervised training (Senouf et al., 2019, Sechaud et al., 1 Oct 2025).
  • In complex physics: self-supervised inverse rendering models, such as SS-SfP, leverage joint physics-image fusion and reflectance estimation to achieve 3D shape recovery from polarization without ground-truth surface normals (Tiwari et al., 12 Jul 2024).

Self-supervised strategies thus form a growing methodological toolbox for tackling inverse problems where data is incomplete, noisy, or governed by unknown or nonlinear forward operators. Continued progress in theoretical convergence, architectural equivariance, and loss construction is expected to further expand the reach of these methods across scientific and engineering domains.
