Zero-Shot Physics-Informed Fine-Tuning
- Zero-shot physics-informed fine-tuning is a method that adapts pre-trained models to new physical systems solely by enforcing physics constraints without paired training data.
- It leverages physics-driven loss functions such as PDE residuals and boundary conditions, while employing parameter-efficient strategies like LoRA for rapid adaptation.
- Empirical results highlight enhanced sample efficiency and significantly reduced training times compared to traditional supervised methods.
Zero-shot physics-informed fine-tuning refers to a suite of techniques for adapting a neural model to a new task or physical system without any supervised data from the new domain, relying solely on physical laws or governing equations. This methodology enforces known physics (e.g., PDEs, boundary conditions, conservation laws) as the only source of supervision during adaptation. Key applications involve scientific machine learning, operator learning, inverse problems, and generative modeling, where labeled data may be difficult or impossible to acquire for the downstream task.
1. Foundations and Key Concepts
Zero-shot physics-informed fine-tuning unifies developments across inverse imaging, operator learning, generative modeling, and symbolic dynamics. In all cases, a model pre-trained on generic data or auxiliary tasks is adapted to a new physical task by enforcing the governing equations, often formulated as PINN-style objective terms, residual minimization, or reward signals.
Distinguishing features:
- Zero-shot: No paired input–output data for the new task is ever required for adaptation. The only "labels" are physics constraints.
- Physics-informed loss: Adaptation is driven by minimizing violations of the known physical model (e.g., enforcing PDE residuals, boundary/initial conditions, or projecting iterates onto physical constraints).
- Fine-tuning regime: Adaptation typically updates all parameters, a small subset (parameter-efficient fine-tuning), or auxiliary coefficients/bases, starting from a pre-trained model that need not have seen the target physics.
Prototypical frameworks include plug-and-play phase retrieval via physics projection (Kumar, 2021), neural operator fine-tuning via PINN residuals (Zhang et al., 2024, Wu, 2024), reward-fine-tuning of generative diffusion models (Yuan et al., 24 Sep 2025), and structure-imposed zero-shot generalization in graph networks (Cranmer et al., 2019).
2. Methodological Variants
2.1 Physics-Informed Fine-Tuning in Operator Learning
Operator learning frameworks such as DeepONet utilize a two-step protocol (Zhang et al., 2024, Wu, 2024):
- Distributed Pre-training: Train a neural operator across multiple source tasks (e.g., different PDEs), aggregating parameters (trunk/branch nets) for a strong prior.
- Zero-Shot Physics-Informed Fine-Tuning: Adapt the operator to a new PDE solely by minimizing a composite PINN loss,

$$\mathcal{L}_{\text{PINN}} = \lambda_{r}\,\mathcal{L}_{\text{res}} + \lambda_{\text{ic}}\,\mathcal{L}_{\text{ic}} + \lambda_{\text{bc}}\,\mathcal{L}_{\text{bc}},$$

whose terms are the physics residual and the initial- and boundary-condition mismatches, evaluated at collocation points with no supervised data (Zhang et al., 2024). Fine-tuning can be full (all weights) or parameter-efficient (e.g., LoRA adapters or re-weighting only the output coefficients) (Wu, 2024).
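The sketch below illustrates this kind of composite, label-free objective for a generic surrogate `model(x, t) -> u`. The 1D heat equation, the sine initial condition, the Dirichlet boundaries, and the weight values are illustrative assumptions, not the setups used in the cited works.

```python
import torch

# Minimal sketch of a composite zero-shot PINN loss for fine-tuning a
# pre-trained surrogate `model(x, t) -> u`. PDE, conditions, and weights
# are illustrative assumptions.

def grad(outputs, inputs):
    """First derivative of `outputs` w.r.t. `inputs`, keeping the graph."""
    return torch.autograd.grad(outputs, inputs,
                               grad_outputs=torch.ones_like(outputs),
                               create_graph=True)[0]

def physics_informed_loss(model, kappa=0.1, n_col=1024, n_bnd=128,
                          w_res=1.0, w_ic=10.0, w_bc=10.0):
    # Interior collocation points; no labels are used anywhere.
    x = torch.rand(n_col, 1, requires_grad=True)
    t = torch.rand(n_col, 1, requires_grad=True)
    u = model(x, t)
    u_t = grad(u, t)
    u_x = grad(u, x)
    u_xx = grad(u_x, x)
    res = u_t - kappa * u_xx                         # residual of u_t = kappa * u_xx
    loss_res = (res ** 2).mean()

    # Initial condition u(x, 0) = sin(pi x), assumed for illustration.
    x0 = torch.rand(n_bnd, 1)
    t0 = torch.zeros_like(x0)
    loss_ic = ((model(x0, t0) - torch.sin(torch.pi * x0)) ** 2).mean()

    # Dirichlet boundaries u(0, t) = u(1, t) = 0.
    tb = torch.rand(n_bnd, 1)
    loss_bc = (model(torch.zeros_like(tb), tb) ** 2).mean() \
            + (model(torch.ones_like(tb), tb) ** 2).mean()

    return w_res * loss_res + w_ic * loss_ic + w_bc * loss_bc
```

In the zero-shot setting, an optimizer would minimize this loss over only the adapter or coefficient parameters discussed in Section 4.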
2.2 Sparse Reward Maximization for Generative Models
Zero-shot physics-informed reward fine-tuning (PIRF) frames the enforcement of physical constraints in generative diffusion models as maximization of a terminal (sparse) reward,

$$r(x_0) = -\,\big\|\mathcal{R}(x_0)\big\|^2,$$

where $\mathcal{R}$ denotes the PDE residual of the generated sample $x_0$ (Yuan et al., 24 Sep 2025). Updates propagate gradients from this reward back through the diffusion sampling trajectory, with weight-based regularization and layer-wise truncated backpropagation incorporated for stable, efficient training.
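A minimal sketch of the sparse-reward idea follows, assuming a noise-prediction `denoiser(x, t)` module and a deliberately crude sampler. The discrete-Laplacian residual, the update rule, and the step-wise truncation (a simplified stand-in for the layer-wise truncation described above) are illustrative assumptions; the weight-based regularization is omitted for brevity.

```python
import torch

# Minimal sketch of sparse-reward fine-tuning: the only learning signal is a
# terminal reward r(x0) = -||R(x0)||^2 on the generated sample. All components
# here are illustrative assumptions, not the PIRF implementation.

def pde_residual(x):
    # Placeholder residual operator: a discrete Laplacian the field should satisfy.
    return (-4 * x
            + torch.roll(x, 1, -1) + torch.roll(x, -1, -1)
            + torch.roll(x, 1, -2) + torch.roll(x, -1, -2))

def reward_finetune_step(denoiser, optimizer, steps=20, truncate=4,
                         shape=(1, 1, 64, 64)):
    x = torch.randn(shape)
    for k in reversed(range(steps)):
        t = torch.full((shape[0],), k, dtype=torch.long)
        # Truncated backprop: only the last few denoising steps keep a graph.
        with torch.set_grad_enabled(k < truncate):
            eps = denoiser(x, t)          # predicted noise (hypothetical interface)
            x = x - eps / steps           # crude, illustrative update rule
        if k >= truncate:
            x = x.detach()
    reward = -(pde_residual(x) ** 2).mean()   # sparse terminal reward
    loss = -reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```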
2.3 Plug-and-Play Priors and Physics Loop for Inverse Problems
In inverse imaging (e.g., phase retrieval), a neural network pretrained for generic denoising is looped with a physics-projection step: at each iteration, (a) impose physical constraints (e.g., measured intensity, diffractive forward model), (b) apply the deep prior, (c) iterate until convergence (Kumar, 2021). No ground-truth phase examples are ever used; only physical measurement and forward operator are available.
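A minimal sketch of such a loop for far-field phase retrieval is shown below, assuming a pre-trained image `denoiser` with an NCHW interface; the magnitude-replacement projection is one simple instance of the physics step, not necessarily the exact operator used in the cited work.

```python
import torch

# Minimal sketch of a plug-and-play loop: alternate a physics projection
# (impose the measured Fourier magnitude) with a pre-trained denoiser acting
# as a deep prior. Forward model and denoiser interface are assumptions.

def physics_projection(x, measured_magnitude):
    X = torch.fft.fft2(x)
    X = measured_magnitude * torch.exp(1j * torch.angle(X))  # keep phase, impose magnitude
    return torch.fft.ifft2(X).real

def plug_and_play_retrieval(denoiser, measured_magnitude, n_iter=100):
    x = torch.rand_like(measured_magnitude)            # random initial estimate
    for _ in range(n_iter):
        x = physics_projection(x, measured_magnitude)  # (a) enforce measurement
        with torch.no_grad():
            x = denoiser(x.unsqueeze(0).unsqueeze(0)).squeeze()  # (b) deep prior
    return x
```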
2.4 Architecture-Induced Physics Priors for Symbolic Dynamics
Graph Networks can encode physics priors via architecture, not explicit loss functions: message-passing with dimension and aggregation constraints ensures that internal representations reflect force-like quantities, enabling zero-shot generalization to varying system size or force law and post-hoc symbolic recovery (Cranmer et al., 2019).
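A minimal sketch of this architectural prior, assuming 2D dynamics: constraining pairwise messages to a force-like dimensionality and aggregating them by summation (mirroring superposition of forces) lets the same weights apply to any number of particles, which is what enables zero-shot generalization to larger systems. Layer sizes and the 2D message dimension are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal sketch of an architecture-level physics prior in a graph network:
# low-dimensional pairwise messages summed per receiver, then a node update.

class PairwiseForceGN(nn.Module):
    def __init__(self, node_dim=6, msg_dim=2, hidden=128):
        super().__init__()
        self.edge_mlp = nn.Sequential(nn.Linear(2 * node_dim, hidden), nn.ReLU(),
                                      nn.Linear(hidden, msg_dim))
        self.node_mlp = nn.Sequential(nn.Linear(node_dim + msg_dim, hidden), nn.ReLU(),
                                      nn.Linear(hidden, 2))   # e.g. 2D acceleration

    def forward(self, nodes, senders, receivers):
        # nodes: [N, node_dim]; senders/receivers: [E] long index tensors.
        msgs = self.edge_mlp(torch.cat([nodes[senders], nodes[receivers]], dim=-1))
        # Sum-aggregation of force-like messages at each receiving node.
        agg = torch.zeros(nodes.size(0), msgs.size(-1)).index_add_(0, receivers, msgs)
        return self.node_mlp(torch.cat([nodes, agg], dim=-1))
```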
3. Physics-Informed Loss Formulations
Zero-shot adaptation universally relies on loss objectives that encode the governing physics without ground-truth solutions. Typical forms include:
- PDE residual loss: at collocation points $\{x_i\}_{i=1}^{N}$, minimize $\frac{1}{N}\sum_{i=1}^{N}\big\|\mathcal{N}[u_\theta](x_i)\big\|^2$, where $\mathcal{N}$ denotes the governing differential operator.
- Boundary and initial condition loss: Enforcing known BC/IC at sample points.
- Physics-projection (inverse problems): Projecting iterates onto the set of variables satisfying physical measurement constraints.
- Sparse reward (generative): The reward is minus the squared norm of the residual under the physical law (Yuan et al., 24 Sep 2025).
The weights assigned to individual terms, selection of collocation grids, and regularization design (LoRA rank, output coefficient parametrization, or direct weight decay) are tuned based on the problem class (Zhang et al., 2024, Wu, 2024, Yuan et al., 24 Sep 2025).
4. Parameter-Efficient Adaptation Strategies
A salient direction is parameter-efficient fine-tuning, aiming to update only a minimal subset of weights or adapt low-rank structures for rapid, stable adaptation:
- LoRA (Low-Rank Adaptation): Inserts low-rank adapters into linear layers; during fine-tuning, only the adapters' weights are updated, giving a substantial reduction in trainable parameters with little performance loss (Zhang et al., 2024, Wu, 2024); a minimal adapter sketch follows this list.
- Branch output reweighting: Freeze all main network parameters and optimize only output coefficients on the physics loss (e.g., the branch-net outputs in DeepONet are replaced by new trainable coefficients α) (Wu, 2024).
- Trunk-net expansions/ensembles: Augment the set of basis functions by pooling pre-trained bases from multiple sources or introducing scale-variant copies (Wu, 2024).
- Layer-wise truncated backpropagation: In diffusion models, limit physics-informed updates to only highest-resolution decoder blocks where PDE residuals are most sensitive (Yuan et al., 24 Sep 2025).
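As referenced above, a minimal LoRA-style adapter around a frozen pre-trained linear layer might look like the following; the rank, scaling, and initialization are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal sketch of a LoRA adapter on a frozen pre-trained linear layer:
# during physics-informed fine-tuning only A and B are trainable.

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank=8, alpha=16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                       # freeze pre-trained weights
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at start
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())
```

Only `A` and `B` (or, for branch-output reweighting, a vector of output coefficients) would be handed to the optimizer, so the physics loss updates a small fraction of the model.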
5. Empirical Results and Comparative Performance
Across domains, zero-shot physics-informed fine-tuning produces accurate adapted models with drastically improved sample efficiency compared to training from scratch or standard supervised fine-tuning.
Operator learning (relative errors; Zhang et al., 2024):

| Method | Burgers | Porous Media | Diff.-Reaction | Random Init |
|---|---|---|---|---|
| PI-LoRA + D2NO init | 3.11% | 5.49% | 3.24% | 21.14% |
| PI-Full + D2NO init | 4.99% | — | — | — |
| PI-Full from random init | 21.14% | 12.57% | 4.66% | — |
PINN Acceleration (Wu, 2024)
- FTO-PINN achieves PINN-level accuracy on Burgers’, advection, and interface tasks in 1–2 seconds, versus hundreds to thousands of seconds for from-scratch training.
- The number of optimized parameters drops substantially relative to training a full PINN from scratch.
Generative Models (Yuan et al., 24 Sep 2025)
- PIRF achieves lower PDE-residual MSE on Darcy and Helmholtz benchmarks vs. CoCoGen and DiffusionPDE, while halving fine-tuning time and maintaining strong visual fidelity during sampling.
- For 80 steps, PIRF fine-tuning requires ~180k trajectories and completes in ~15 hours on two A100 GPUs.
Inverse Imaging (Kumar, 2021)
- Phy-ZSN and PhyTV-ZSN reconstruct quantitatively superior phase images versus classical total-variation (TV) and error-reduction baselines, reaching final quality ~8.5× faster on real datasets.
6. Practical Guidelines and Limitations
Best practices:
- Multi-source pre-training yields models that adapt to a wider range of downstream tasks (Zhang et al., 2024).
- LoRA or α-parameter adaptation is preferred for memory efficiency and speed on large models (Wu, 2024).
- Select loss weights according to the problem scale and the relative importance of each physics term.
- Choose collocation/sample points according to the PDE order and domain size.
- Monitor physics residuals rather than a supervised validation metric, as no ground-truth outputs are available (Yuan et al., 24 Sep 2025); a minimal loop is sketched after this list.
- Architectural priors (as in graph networks) can facilitate zero-shot generalization even in the absence of fine-tuning (Cranmer et al., 2019).
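Putting these guidelines together, a zero-shot adaptation loop monitors only the physics loss, since no labeled validation set exists. In the sketch below, `loss_fn` stands for a composite physics objective such as the one sketched in Section 2.1, and all names are illustrative assumptions.

```python
import torch

# Minimal sketch of a zero-shot fine-tuning loop that logs only physics
# residual losses; only parameters with requires_grad=True (e.g. LoRA
# adapters) are optimized.

def finetune_zero_shot(model, loss_fn, n_steps=2000, lr=1e-3, log_every=100):
    trainable = [p for p in model.parameters() if p.requires_grad]
    opt = torch.optim.Adam(trainable, lr=lr)
    for step in range(n_steps):
        loss = loss_fn(model)              # composite physics loss, no labels
        opt.zero_grad()
        loss.backward()
        opt.step()
        if step % log_every == 0:
            print(f"step {step:5d}  physics loss {loss.item():.3e}")
    return model
```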
Remaining limitations:
- Success depends on pre-trained model proximity to the target task (distribution shift remains a barrier).
- Some nonlinearities or strong domain mismatch may require limited supervised adaptation or richer augmentation (Zhang et al., 2024, Wu, 2024).
7. Extensions and Theoretical Significance
Zero-shot physics-informed fine-tuning provides a scaffolding for data-efficient scientific modeling where ground-truth solutions are unobtainable or cost-prohibitive. Theoretically, it bridges operator learning, PDE-constrained optimization, probabilistic generative modeling, and causal inference by leveraging the expressivity of neural architectures with generic priors and enforcing physical feasibility post hoc.
This paradigm is extensible to arbitrary forward models (tomography, superresolution, generative processes, dynamical systems) as long as a physically meaningful loss or reward can be formulated. A plausible implication is for general-purpose scientific foundation models, where physics-informed fine-tuning delivers domain- and task-specific specialization without curated labeled datasets (Zhang et al., 2024, Yuan et al., 24 Sep 2025, Wu, 2024).