NeuroSymBO: Adaptive Prompt Tuning for PDEs
- NeuroSymBO is a discrete Bayesian Optimization framework that adaptively tunes prompts for LLM-driven symbolic discovery of partial differential equations.
- It formalizes instruction selection as a Markov Decision Process, dynamically choosing from a curated bank of 100 reasoning strategies to overcome instruction brittleness.
- Empirical results demonstrate significant improvements in recovery rate, accuracy, and parsimony over static prompt baselines across various PDE benchmarks.
NeuroSymBO is a discrete Bayesian Optimization (BO) framework for adaptive prompt (instruction) tuning in LLM-accelerated discovery of partial differential equations (PDEs). It addresses the instability and suboptimality inherent in static prompt-based symbolic regression, leveraging a sequential decision process that dynamically selects from a bank of reasoning strategies at each generation step. NeuroSymBO formalizes prompt engineering as a Markov Decision Process (MDP), enabling algorithmic adaptation to the evolving state of the symbolic search and leading to substantial gains in recovery rate, accuracy, and parsimony relative to fixed-prompt baselines (Qu et al., 31 Dec 2025).
1. Instruction Brittleness in LLM-Driven Equation Discovery
LLM-based equation discovery is acutely sensitive to small changes in prompt phrasing, a phenomenon termed instruction brittleness. For two semantically similar instruction strings $I$ and $I'$, small textual changes (e.g., “simplest” vs. “most accurate”) can produce qualitatively different symbolic outputs from the LLM proposer, even under a shared generation history $H$. In the context of PDE discovery, this brittleness causes optimization plateaus and suboptimal solutions: prompts that emphasize parsimony may eliminate essential terms, while accuracy-oriented prompts can induce spurious or overly complex interactions. Static prompt selection is thus ill-suited to the iterative, multi-step search typical of scientific symbolic regression.
2. Markov Decision Process Formulation
NeuroSymBO reframes instruction selection as an episodic MDP with discrete actions corresponding to prompt strategies. The state $s_t$ at iteration $t$ includes the top-$N$ candidate equations with their fitness scores and the best-so-far fitness $\mathcal{F}^*$. Actions are indices $a_t \in \{1, \dots, K\}$ referring to strategies in the bank $\mathcal{B}$. The numerical reward $r_t$ is defined as the highest composite fitness score among the candidates generated at step $t$. The overarching objective is to learn a policy $\pi$ that maximizes expected cumulative reward over $T$ rounds, aligning the synthesis trajectory with the evolving needs of the search.
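The state–action–reward structure above can be sketched in a few lines. This is an illustrative container, not the paper's implementation; the field names and equation strings are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class SearchState:
    """Illustrative MDP state s_t: the top-N candidate equations with
    their fitness scores, plus the best-so-far fitness F*.
    Field names are hypothetical, not taken from the paper."""
    top_candidates: list[tuple[str, float]]  # (equation string, fitness), sorted
    best_fitness: float

def reward(candidate_fitnesses: list[float]) -> float:
    """Reward r_t: the highest composite fitness among the candidates
    generated at step t."""
    return max(candidate_fitnesses)

# One step that produced three candidates:
r_t = reward([0.71, 0.85, 0.62])  # -> 0.85
```

The reward depends only on the current step's candidates, which is what makes the per-step instruction choice a well-posed bandit/MDP action.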
3. Discrete Library of Reasoning Strategies
The strategy bank contains diverse instruction variants, generated by a meta-LLM (e.g., GPT-4o). Each strategy encodes a distinct approach to symbolic generation, spanning axes of exploration versus exploitation and structural versus coefficient refinement. Representative strategies include:
- Exploration: “Ignore previous bests. Propose a completely new functional form.”
- Mutation: “Keep the core structure but replace nonlinear interactions.”
- Parsimony: “The best equation is too complex. Remove redundant terms.”
- Refinement: “Focus on adjusting existing terms’ functional shapes.”
This curated bank enables targeted perturbations of symbolic trajectories, facilitating both escape from local minima and convergence to parsimonious correct solutions.
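A minimal strategy bank can be represented as an indexed list of instruction strings, with the BO/MDP action being the index. The entries below paraphrase only the four representative strategies; the paper's full bank is much larger:

```python
# Illustrative strategy bank: action index -> instruction string.
STRATEGY_BANK = [
    "Ignore previous bests. Propose a completely new functional form.",  # exploration
    "Keep the core structure but replace nonlinear interactions.",       # mutation
    "The best equation is too complex. Remove redundant terms.",         # parsimony
    "Focus on adjusting existing terms' functional shapes.",             # refinement
]

def instruction(action_index: int) -> str:
    """Map a discrete action to the instruction text inserted into the prompt."""
    return STRATEGY_BANK[action_index]
```

Because the action space is a fixed finite set of strings, the optimizer never has to search over free-form text, only over indices into this bank.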
4. Bayesian Optimization over Instruction Space
NeuroSymBO treats the mapping from instruction strategy index to maximum achieved fitness as an expensive black-box function, optimizing it with BO. A Gaussian Process (GP) surrogate models prior/posterior beliefs about performance. Acquisition functions guide selection:
- Expected Improvement (EI): $\alpha_{\mathrm{EI}}(a) = \mathbb{E}\left[\max\{0,\ f(a) - f^* - \xi\}\right]$, with analytic form using the posterior mean $\mu(a)$ and variance $\sigma^2(a)$, where the margin $\xi \ge 0$ balances exploration–exploitation.
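The analytic EI formula (standard for Gaussian posteriors, not specific to this paper) can be written directly in terms of the posterior mean and standard deviation:

```python
import math

def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu: float, sigma: float, f_best: float) -> float:
    """Analytic EI for maximization:
    EI = (mu - f_best) * Phi(z) + sigma * phi(z), z = (mu - f_best) / sigma.
    Degenerates to max(0, mu - f_best) when the posterior is deterministic."""
    if sigma <= 0.0:
        return max(0.0, mu - f_best)
    z = (mu - f_best) / sigma
    return (mu - f_best) * normal_cdf(z) + sigma * normal_pdf(z)
```

Note that EI is strictly positive whenever $\sigma > 0$, so even strategies whose posterior mean trails the incumbent retain some chance of being selected.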
Numerical feedback for each candidate is computed via a composite fitness $\mathcal{F}$ that rewards accuracy and penalizes complexity: the accuracy term is the normalized root-mean-square error (NRMSE) obtained after fitting coefficients via STRidge, complexity denotes the number of active terms, and a penalty weight $\lambda$ enforces system-level parsimony.
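One concrete realization of such a fitness is sketched below. The exact functional form used by NeuroSymBO is not reproduced here; this is a common accuracy-minus-complexity choice with the same ingredients (NRMSE, active-term count, penalty weight $\lambda$):

```python
def composite_fitness(nrmse: float, n_active_terms: int, lam: float = 0.01) -> float:
    """Illustrative composite fitness (assumed form, not the paper's):
    reward low NRMSE, subtract a parsimony penalty proportional to the
    number of active terms."""
    return (1.0 - nrmse) - lam * n_active_terms
```

Under this form, a slightly less accurate but much sparser equation can outscore an overfit one, which is the behavior the parsimony penalty is meant to induce.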
5. Workflow and Algorithmic Pseudocode
The operational pipeline proceeds as follows:
- Initialization: GP surrogate, empty history $H$, observation set $D = \varnothing$, and best fitness $\mathcal{F}^* = -\infty$.
- For $t = 1, \dots, T$ iterations:
  - If $t \le t_{\mathrm{init}}$, sample $a_t$ randomly; else, fit the GP to $D$ and select $a_t = \arg\max_a \alpha_{\mathrm{EI}}(a)$.
  - Form the prompt from the task specification, the top-$N$ history elements, and the selected instruction $I_{a_t}$.
  - Generate candidate equations with the LLM, fit coefficients via STRidge, and compute $\mathcal{F}$ for each.
  - Update $D \leftarrow D \cup \{(a_t, r_t)\}$ and the history $H$.
  - If $r_t > \mathcal{F}^*$, update $\mathcal{F}^* \leftarrow r_t$ and record the best candidate $E^*$.
- Return the best-recovered equation $E^*$.
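The loop above can be sketched end to end as follows. This is a self-contained toy: `propose(a)` stands in for prompt assembly, LLM generation, and STRidge fitting, and an independent Gaussian per discrete strategy (empirical mean/variance plus analytic EI) replaces the exact GP surrogate used in the paper:

```python
import math
import random

def _phi(z):  # standard normal pdf
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def _Phi(z):  # standard normal cdf
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def _ei(mu, sigma, f_best):
    """Analytic Expected Improvement for maximization."""
    if sigma <= 0.0:
        return max(0.0, mu - f_best)
    z = (mu - f_best) / sigma
    return (mu - f_best) * _Phi(z) + sigma * _phi(z)

def run_neurosymbo(propose, n_strategies, T, t_init=3, seed=0):
    """Sketch of the Section 5 workflow. `propose(a)` returns the step
    reward r_t (best composite fitness among that step's candidates) for
    strategy index a. A per-strategy Gaussian stands in for the GP."""
    rng = random.Random(seed)
    obs = {a: [] for a in range(n_strategies)}   # observation set D
    best_f, best_a = -math.inf, None             # F* and its strategy
    for t in range(T):
        if t < t_init:                           # random warm-up phase
            a = rng.randrange(n_strategies)
        else:                                    # acquisition: maximize EI
            def score(arm):
                xs = obs[arm]
                if not xs:
                    return math.inf              # always try untried arms first
                mu = sum(xs) / len(xs)
                var = sum((x - mu) ** 2 for x in xs) / len(xs)
                return _ei(mu, math.sqrt(var) + 1e-6, best_f)
            a = max(range(n_strategies), key=score)
        r = propose(a)                           # expensive black-box call
        obs[a].append(r)                         # update D
        if r > best_f:                           # track best-so-far
            best_f, best_a = r, a
    return best_f, best_a
```

With a deterministic `propose` such as `lambda a: [0.2, 0.9, 0.5][a]`, the loop identifies the best strategy once every arm has been tried, mirroring how BO concentrates queries on high-reward instructions.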
6. Empirical Performance and Benchmarks
NeuroSymBO was evaluated on a suite of one-dimensional PDE benchmarks: Burgers, Fisher-KPP, Chafee-Infante, PDE_divide, and synthetic Allen-Cahn, with data sampled on $200$–$256$ spatiotemporal grids. The baseline comprised a fixed-prompt LLM (Llama-3.2-3B-Instruct) instructed to “Find the simplest equation.” BO used BoTorch’s exact GP and EI acquisition over $T$ iterations and $5$ seeds.
Key metrics were:
- Recovery rate: percentage of runs exactly retrieving the ground-truth symbolic form.
- Train/test fitness: composite fitness on training data and on held-out test data.
- Parsimony: average number of active terms.
| PDE | Method | Train | Test | Recovery Rate | Avg. # Terms |
|---|---|---|---|---|---|
| Allen-Cahn | Fixed | 0.7968 | 0.7824 | 40% | 7.2 |
| Allen-Cahn | NeuroSymBO | 0.9107 | 0.8914 | 100% | 4.1 |
| Burgers | Fixed | 0.8102 | 0.8242 | 60% | 5.0 |
| Burgers | NeuroSymBO | 0.8699 | 0.8791 | 100% | 3.8 |
| Chafee-Infante | Fixed | 0.9894 | 0.9886 | 80% | 4.4 |
| Chafee-Infante | NeuroSymBO | 0.9951 | 0.9947 | 100% | 3.0 |
| PDE_divide | Fixed | 0.9927 | 0.9922 | 20% | 6.5 |
| PDE_divide | NeuroSymBO | 0.9942 | 0.9941 | 100% | 2.9 |
| Fisher-KPP | Fixed | 0.9952 | 0.9953 | 100% | 3.0 |
| Fisher-KPP | NeuroSymBO | 0.9999 | 0.9999 | 100% | 2.0 |
Optimization trajectories demonstrate that dynamic instruction selection (NeuroSymBO) exhibits stepwise improvements at moments of strategy adaptation, while performance under static prompting saturates prematurely.
7. Scalability, Limitations, and Extension Pathways
Exact GP surrogate fitting scales cubically with the number of observations, but at the small iteration budgets used here the associated overhead is substantially less than that of LLM inference. Sparse GP methods could further accelerate surrogate modeling for longer runs or extended instruction banks. The framework is also contingent on the inherent expressiveness of the LLM: symbolic recovery is unattainable if the LLM's inductive biases omit required operators, regardless of instruction tuning. NeuroSymBO’s application thus far is restricted to 1D PDEs; extension to higher dimensions or chaotic systems would necessitate both richer in-context history and prompt modifications.
Possible future directions include:
- On-the-fly meta-LLM-driven expansion of the strategy bank.
- Integration of alternative acquisition criteria (UCB, Thompson Sampling).
- Hierarchical BO over both strategy and structural hyperparameters (e.g., the parsimony weight $\lambda$ in the composite fitness).
- Coupling with active learning for data or regime selection.
- Exploiting the MDP framework for RL-based instruction policy learning.
NeuroSymBO establishes instruction-tuned BO as an effective paradigm for robust, parsimonious scientific symbolic discovery, substantiated by enhanced performance metrics over static prompting (Qu et al., 31 Dec 2025).