
NeuroSymBO: Adaptive Prompt Tuning for PDEs

Updated 7 January 2026
  • NeuroSymBO is a discrete Bayesian Optimization framework that adaptively tunes prompts for LLM-driven symbolic discovery of partial differential equations.
  • It formalizes instruction selection as a Markov Decision Process, dynamically choosing from a curated bank of 100 reasoning strategies to overcome instruction brittleness.
  • Empirical results demonstrate significant improvements in recovery rate, accuracy, and parsimony over static prompt baselines across various PDE benchmarks.

NeuroSymBO is a discrete Bayesian Optimization (BO) framework for adaptive prompt (instruction) tuning in LLM-accelerated discovery of partial differential equations (PDEs). It addresses the instability and suboptimality inherent in static prompt-based symbolic regression, leveraging a sequential decision process that dynamically selects from a bank of reasoning strategies at each generation step. NeuroSymBO formalizes prompt engineering as a Markov Decision Process (MDP), enabling algorithmic adaptation to the evolving state of the symbolic search and leading to substantial gains in recovery rate, accuracy, and parsimony relative to fixed-prompt baselines (Qu et al., 31 Dec 2025).

1. Instruction Brittleness in LLM-Driven Equation Discovery

LLM-based equation discovery is acutely sensitive to small changes in prompt phrasing, a phenomenon termed instruction brittleness. For two semantically similar instruction strings $I$ and $I'$, small textual changes (e.g., “simplest” vs. “most accurate”) can produce qualitatively different symbolic outputs under an LLM proposer $P(\cdot)$, even under a shared history $\mathcal{H}$. In the context of PDE discovery, this brittleness causes optimization plateaus and suboptimality. Prompts that emphasize parsimony may eliminate essential terms, while accuracy-oriented prompts can induce spurious or overcomplex interactions. Static prompt selection is thus ill-suited for the iterative, multi-step search processes typical of scientific symbolic regression.

2. Markov Decision Process Formulation

NeuroSymBO reframes instruction selection as an episodic MDP with discrete actions corresponding to prompt strategies. The state $s_t$ at iteration $t$ includes the top-$N$ candidate equations with their fitness scores, $\mathcal{H}_t = \{(\hat{u}_i, S_i)\}_{i=1}^N$, and the best-so-far fitness $y^*$. Actions $a_t$ are indices $k_t$ ($1 \le k_t \le K$) referring to strategies in the bank $\mathfrak{B}$. The numerical reward $r_t$ is defined as the highest composite fitness score $S_t = \max_{j} S(\hat{u}_j)$ among candidates generated at step $t$. The overarching objective is to learn a policy $\pi(a \mid s)$ that maximizes expected cumulative reward over $T$ rounds, aligning the synthesis trajectory with the evolving needs of the search.
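The MDP ingredients above map directly onto a few lines of code; the following is a minimal sketch (names such as `State` and `reward` are illustrative, not from the paper):

```python
from dataclasses import dataclass

@dataclass
class State:
    """State s_t: top-N (equation, fitness) pairs H_t plus best-so-far y*."""
    history: list[tuple[str, float]]
    y_best: float

# Action a_t: an index k_t into the strategy bank (0-indexed here).
# Reward r_t: the highest composite fitness among the candidates at step t.
def reward(candidate_scores: list[float]) -> float:
    return max(candidate_scores)
```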

3. Discrete Library of Reasoning Strategies

The strategy bank $\mathfrak{B} = \{s_1, \ldots, s_K\}$ contains $K = 100$ diverse instruction variants, generated by a meta-LLM (e.g., GPT-4o). Each strategy encodes a distinct approach to symbolic generation, spanning axes of exploration versus exploitation and structural versus coefficient refinement. Representative strategies include:

  • $s_1$ (Exploration): “Ignore previous bests. Propose a completely new functional form.”
  • $s_2$ (Mutation): “Keep the core structure but replace nonlinear interactions.”
  • $s_3$ (Parsimony): “The best equation is too complex. Remove redundant terms.”
  • $s_4$ (Refinement): “Focus on adjusting existing terms’ functional shapes.”

This curated bank enables targeted perturbations of symbolic trajectories, facilitating both escape from local minima and convergence to parsimonious correct solutions.
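In code, the bank is just an indexed list of instruction strings that the selected index $k_t$ dereferences into the prompt; a miniature stand-in using the four representative strategies above (the real bank has $K = 100$ meta-LLM-generated entries):

```python
# Miniature stand-in for the strategy bank B; the full bank is meta-LLM-generated.
STRATEGY_BANK = [
    "Ignore previous bests. Propose a completely new functional form.",  # exploration
    "Keep the core structure but replace nonlinear interactions.",       # mutation
    "The best equation is too complex. Remove redundant terms.",         # parsimony
    "Focus on adjusting existing terms' functional shapes.",             # refinement
]

def instruction(k: int) -> str:
    """Look up the instruction text for strategy index k (0-indexed here)."""
    return STRATEGY_BANK[k]
```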

4. Bayesian Optimization over Instruction Space

NeuroSymBO treats the mapping from instruction strategy index $k$ to maximum achieved fitness $f(k)$ as an expensive black-box function, optimizing it with BO. A Gaussian Process (GP) surrogate $f(k) \sim \mathcal{GP}(m(k), \kappa(k, k'))$ models prior/posterior beliefs about performance. Acquisition functions guide selection:

  • Expected Improvement (EI):

EI(k) = \mathbb{E}\big[\max(0, f(k) - y^* - \xi)\big],

with an analytic form in terms of the posterior mean $\mu_n(k)$ and variance $\sigma_n^2(k)$.

  • Upper Confidence Bound (UCB):

UCB(k) = \mu_n(k) + \beta \, \sigma_n(k),

where $\beta > 0$ balances exploration and exploitation.
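Both acquisition functions have simple closed forms given the GP posterior at a candidate index. A self-contained scalar sketch, using the standard analytic EI expression (standard-normal pdf/cdf via `math.erf`):

```python
import math

def expected_improvement(mu, sigma, y_best, xi=0.01):
    """Analytic EI for a Gaussian posterior N(mu, sigma^2) at one candidate."""
    if sigma <= 0.0:
        return max(0.0, mu - y_best - xi)
    z = (mu - y_best - xi) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - y_best - xi) * cdf + sigma * pdf

def ucb(mu, sigma, beta=2.0):
    """Upper confidence bound: optimism proportional to posterior uncertainty."""
    return mu + beta * sigma
```

Note that at `mu == y_best` EI is still positive whenever `sigma > 0`, which is what drives exploration of uncertain strategies.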

Numerical feedback for each candidate $\hat{u}$ is computed via the composite fitness

S(\hat{u}) = \frac{1 - \lambda \cdot \mathrm{complexity}(\hat{u})}{1 + \mathrm{NRMSE}(\hat{u})},

where $\mathrm{NRMSE}$ is the normalized root mean square error after coefficient fitting via STRidge, and $\mathrm{complexity}$ is the number of active terms. Setting $\lambda = 0.01$ enforces system-level parsimony.
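Given the NRMSE and term count for a fitted candidate, the score is a one-liner; a direct transcription of the formula above (inputs assumed precomputed by the STRidge fit):

```python
def composite_fitness(nrmse: float, n_terms: int, lam: float = 0.01) -> float:
    """Composite score S = (1 - lam * complexity) / (1 + NRMSE).

    Rewards low error (denominator) and few active terms (numerator).
    """
    return (1.0 - lam * n_terms) / (1.0 + nrmse)
```

The score decreases monotonically in both error and term count, so a perfectly fit two-term equation scores higher than an equally accurate five-term one.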

5. Workflow and Algorithmic Pseudocode

The operational pipeline proceeds as follows:

  1. Initialization: GP surrogate, empty history $\mathcal{H}$, observation set $\mathcal{O}$, and $y^* \leftarrow -\infty$.
  2. For $t = 1, \ldots, T$ iterations:
    • If $t \leq K_\text{init}$, sample $k_t$ randomly; else, fit the GP to $\mathcal{O}$ and select $k_t = \arg\max_k EI(k)$.
    • Form prompt $P_t$ from the task specification, the top-$N$ history elements, and the selected instruction $I^{(k_t)}_\text{strategy}$.
    • Generate $M$ candidate equations with the LLM, fit coefficients via STRidge, and compute $S_j$ for each.
    • Update $\mathcal{H} \leftarrow \mathcal{H} \cup \{(\hat{u}_t, S_t)\}$ and $\mathcal{O} \leftarrow \mathcal{O} \cup \{(k_t, S_t)\}$.
    • If $S_t > y^*$, update $y^* \leftarrow S_t$ and $\hat{u}^* \leftarrow \hat{u}_t$.
  3. Return the best-recovered equation $\hat{u}^*$.
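The steps above can be sketched as an outer loop with the expensive components injected as callables; `propose_candidates` (the LLM proposal plus STRidge fit) and `fit_gp_and_maximize_ei` (the surrogate update and acquisition maximization) are hypothetical stand-ins, not the paper's interfaces:

```python
import random

def run_neurosymbo(propose_candidates, fit_gp_and_maximize_ei,
                   K=100, T=300, K_init=10, seed=0):
    """Sketch of the NeuroSymBO outer loop; heavy components passed in."""
    rng = random.Random(seed)
    history, observations = [], []        # H and O
    y_best, u_best = float("-inf"), None  # y* and u-hat*
    for t in range(1, T + 1):
        if t <= K_init:                   # warm-up: random strategy index
            k_t = rng.randrange(K)
        else:                             # BO step: fit GP to O, maximize EI
            k_t = fit_gp_and_maximize_ei(observations, K)
        # LLM proposes candidates under strategy k_t; coefficients fit via
        # STRidge; each candidate returned as (equation, composite_fitness).
        candidates = propose_candidates(k_t, history)
        u_t, S_t = max(candidates, key=lambda c: c[1])
        history.append((u_t, S_t))
        observations.append((k_t, S_t))
        if S_t > y_best:
            y_best, u_best = S_t, u_t
    return u_best, y_best
```

Keeping the LLM and GP behind callables makes the loop itself trivially testable with dummy proposers.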

6. Empirical Performance and Benchmarks

NeuroSymBO was evaluated on a suite of one-dimensional PDE benchmarks: Burgers, Fisher-KPP, Chafee-Infante, PDE_divide, and synthetic Allen-Cahn, with data sampled on spatiotemporal grids of $200$–$300 \times 100$–$256$ points. The baseline comprised a fixed-prompt LLM (Llama-3.2-3B-Instruct) instructed to “Find the simplest equation.” BO used BoTorch’s exact GP and EI acquisition over $T = 300$ iterations and 5 seeds.

Key metrics were:

  • Recovery rate: percentage of runs exactly retrieving the ground-truth symbolic form.
  • Train/test $R^2$ on held-out data.
  • Parsimony: average number of active terms.

PDE         Method              Train R^2   Test R^2   Recovery Rate   Avg. # Terms
Allen-Cahn  Fixed               0.7968      0.7824     40%             7.2
Allen-Cahn  Ours (NeuroSymBO)   0.9107      0.8914     100%            4.1
Burgers     Fixed               0.8102      0.8242     60%             5.0
Burgers     Ours                0.8699      0.8791     100%            3.8
Chafee      Fixed               0.9894      0.9886     80%             4.4
Chafee      Ours                0.9951      0.9947     100%            3.0
Divide      Fixed               0.9927      0.9922     20%             6.5
Divide      Ours                0.9942      0.9941     100%            2.9
Fisher      Fixed               0.9952      0.9953     100%            3.0
Fisher      Ours                0.9999      0.9999     100%            2.0

Optimization trajectories demonstrate that dynamic instruction selection (NeuroSymBO) exhibits stepwise improvements at moments of strategy adaptation, while performance under static prompting saturates prematurely.

7. Scalability, Limitations, and Extension Pathways

GP surrogate fitting incurs $O(t^3)$ computational cost, but with $T \leq 300$ the associated overhead is substantially less than that of LLM inference. Sparse GP methods can further accelerate surrogate modeling for larger $T$ or extended instruction banks. The framework is contingent on the inherent expressiveness of the LLM: symbolic recovery is unattainable if the LLM's inductive biases omit certain operators, regardless of instruction tuning. NeuroSymBO’s application thus far is restricted to 1D PDEs; extension to higher dimensions or chaotic systems would necessitate both richer in-context history and prompt modifications.

Possible future directions include:

  • On-the-fly meta-LLM-driven expansion of the strategy bank.
  • Integration of alternative acquisition criteria (UCB, Thompson Sampling).
  • Hierarchical BO over both strategy and structural hyperparameters (e.g., $\lambda$ in $S(\cdot)$).
  • Coupling with active learning for data or regime selection.
  • Exploiting the MDP framework for RL-based instruction policy learning.

NeuroSymBO establishes instruction-tuned BO as an effective paradigm for robust, parsimonious scientific symbolic discovery, substantiated by enhanced performance metrics over static prompting (Qu et al., 31 Dec 2025).
