
NeuroSymBO: Adaptive Prompt Tuning for PDEs

Updated 7 January 2026
  • NeuroSymBO is a discrete Bayesian Optimization framework that adaptively tunes prompts for LLM-driven symbolic discovery of partial differential equations.
  • It formalizes instruction selection as a Markov Decision Process, dynamically choosing from a curated bank of 100 reasoning strategies to overcome instruction brittleness.
  • Empirical results demonstrate significant improvements in recovery rate, accuracy, and parsimony over static prompt baselines across various PDE benchmarks.

NeuroSymBO is a discrete Bayesian Optimization (BO) framework for adaptive prompt (instruction) tuning in LLM-accelerated discovery of partial differential equations (PDEs). It addresses the instability and suboptimality inherent in static prompt-based symbolic regression, leveraging a sequential decision process that dynamically selects from a bank of reasoning strategies at each generation step. NeuroSymBO formalizes prompt engineering as a Markov Decision Process (MDP), enabling algorithmic adaptation to the evolving state of the symbolic search and leading to substantial gains in recovery rate, accuracy, and parsimony relative to fixed-prompt baselines (Qu et al., 31 Dec 2025).

1. Instruction Brittleness in LLM-Driven Equation Discovery

LLM-based equation discovery is acutely sensitive to small changes in prompt phrasing, a phenomenon termed instruction brittleness. For two semantically similar instruction strings $I$ and $I'$, small textual changes (e.g., “simplest” vs. “most accurate”) can produce qualitatively different symbolic outputs under an LLM proposer $P(\cdot)$, even under a shared history $\mathcal{H}$. In the context of PDE discovery, this brittleness causes optimization plateaus and suboptimality. Prompts that emphasize parsimony may eliminate essential terms, while accuracy-oriented prompts can induce spurious or overcomplex interactions. Static prompt selection is thus ill-suited for the iterative, multi-step search processes typical of scientific symbolic regression.

2. Markov Decision Process Formulation

NeuroSymBO reframes instruction selection as an episodic MDP with discrete actions corresponding to prompt strategies. The state $s_t$ at iteration $t$ includes the top-$N$ candidate equations with their fitness scores, $\mathcal{H}_t = \{(\hat{u}_i, S_i)\}_{i=1}^N$, and the best-so-far fitness $y^*$. Actions $a_t$ are indices $k_t$ ($1 \le k_t \le K$) referring to strategies in the bank $\mathfrak{B}$. The numerical reward $r_t$ is defined as the highest composite fitness score $S_t = \max_{j} S(\hat{u}_j)$ among candidates generated at step $t$. The overarching objective is to learn a policy $\pi(a \mid s)$ that maximizes expected cumulative reward over $T$ rounds, aligning the synthesis trajectory with the evolving needs of the search.
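The MDP ingredients above map directly onto a few lines of code; the following is a minimal sketch (names such as `State` and `reward` are illustrative, not from the paper):

```python
from dataclasses import dataclass

@dataclass
class State:
    """State s_t: top-N (equation, fitness) pairs H_t plus best-so-far y*."""
    history: list[tuple[str, float]]
    y_best: float

# Action a_t: an index k_t into the strategy bank (0-indexed here).
# Reward r_t: the highest composite fitness among the candidates at step t.
def reward(candidate_scores: list[float]) -> float:
    return max(candidate_scores)
```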

3. Discrete Library of Reasoning Strategies

The strategy bank $\mathfrak{B} = \{s_1, \ldots, s_K\}$ contains $K = 100$ diverse instruction variants, generated by a meta-LLM (e.g., GPT-4o). Each strategy encodes a distinct approach to symbolic generation, spanning axes of exploration versus exploitation and structural versus coefficient refinement. Representative strategies include:

  • $s_1$ (Exploration): “Ignore previous bests. Propose a completely new functional form.”
  • $s_2$ (Mutation): “Keep the core structure but replace nonlinear interactions.”
  • $s_3$ (Parsimony): “The best equation is too complex. Remove redundant terms.”
  • $s_4$ (Refinement): “Focus on adjusting existing terms’ functional shapes.”

This curated bank enables targeted perturbations of symbolic trajectories, facilitating both escape from local minima and convergence to parsimonious correct solutions.
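In code, the bank is just an indexed list of instruction strings that the selected index $k_t$ dereferences into the prompt; a miniature stand-in using the four representative strategies above (the real bank has $K = 100$ meta-LLM-generated entries):

```python
# Miniature stand-in for the strategy bank B; the full bank is meta-LLM-generated.
STRATEGY_BANK = [
    "Ignore previous bests. Propose a completely new functional form.",  # exploration
    "Keep the core structure but replace nonlinear interactions.",       # mutation
    "The best equation is too complex. Remove redundant terms.",         # parsimony
    "Focus on adjusting existing terms' functional shapes.",             # refinement
]

def instruction(k: int) -> str:
    """Look up the instruction text for strategy index k (0-indexed here)."""
    return STRATEGY_BANK[k]
```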

4. Bayesian Optimization over Instruction Space

NeuroSymBO treats the mapping from instruction strategy index $k$ to maximum achieved fitness $f(k)$ as an expensive black-box function, optimizing it with BO. A Gaussian Process (GP) surrogate $f(k) \sim \mathcal{GP}(m(k), \kappa(k, k'))$ models prior/posterior beliefs about performance. Acquisition functions guide selection:

  • Expected Improvement (EI):

EI(k) = \mathbb{E}\big[\max(0, f(k) - y^* - \xi)\big],

with an analytic form in terms of the posterior mean $\mu_n(k)$ and variance $\sigma_n^2(k)$.

  • Upper Confidence Bound (UCB):

UCB(k) = \mu_n(k) + \beta \, \sigma_n(k),

where $\beta > 0$ balances exploration and exploitation.
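Both acquisition functions have simple closed forms given the GP posterior at a candidate index. A self-contained scalar sketch, using the standard analytic EI expression (standard-normal pdf/cdf via `math.erf`):

```python
import math

def expected_improvement(mu, sigma, y_best, xi=0.01):
    """Analytic EI for a Gaussian posterior N(mu, sigma^2) at one candidate."""
    if sigma <= 0.0:
        return max(0.0, mu - y_best - xi)
    z = (mu - y_best - xi) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - y_best - xi) * cdf + sigma * pdf

def ucb(mu, sigma, beta=2.0):
    """Upper confidence bound: optimism proportional to posterior uncertainty."""
    return mu + beta * sigma
```

Note that at `mu == y_best` EI is still positive whenever `sigma > 0`, which is what drives exploration of uncertain strategies.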

Numerical feedback for each candidate $\hat{u}$ is computed via the composite fitness

S(\hat{u}) = \frac{1 - \lambda \cdot \mathrm{complexity}(\hat{u})}{1 + \mathrm{NRMSE}(\hat{u})},

where $\mathrm{NRMSE}$ is the normalized root mean square error after coefficient fitting via STRidge, and $\mathrm{complexity}$ is the number of active terms. Setting $\lambda = 0.01$ enforces system-level parsimony.
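Given the NRMSE and term count for a fitted candidate, the score is a one-liner; a direct transcription of the formula above (inputs assumed precomputed by the STRidge fit):

```python
def composite_fitness(nrmse: float, n_terms: int, lam: float = 0.01) -> float:
    """Composite score S = (1 - lam * complexity) / (1 + NRMSE).

    Rewards low error (denominator) and few active terms (numerator).
    """
    return (1.0 - lam * n_terms) / (1.0 + nrmse)
```

The score decreases monotonically in both error and term count, so a perfectly fit two-term equation scores higher than an equally accurate five-term one.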

5. Workflow and Algorithmic Pseudocode

The operational pipeline proceeds as follows:

  1. Initialization: GP surrogate, empty history $\mathcal{H}$, observation set $\mathcal{O}$, and $y^* \leftarrow -\infty$.
  2. For $t = 1, \ldots, T$ iterations:
    • If $t \leq K_\text{init}$, sample $k_t$ randomly; else, fit the GP to $\mathcal{O}$ and select $k_t = \arg\max_k EI(k)$.
    • Form prompt $P_t$ from the task specification, the top-$N$ history elements, and the selected instruction $I^{(k_t)}_\text{strategy}$.
    • Generate $M$ candidate equations with the LLM, fit coefficients via STRidge, and compute $S_j$ for each.
    • Update $\mathcal{H} \leftarrow \mathcal{H} \cup \{(\hat{u}_t, S_t)\}$ and $\mathcal{O} \leftarrow \mathcal{O} \cup \{(k_t, S_t)\}$.
    • If $S_t > y^*$, update $y^* \leftarrow S_t$ and $\hat{u}^* \leftarrow \hat{u}_t$.
  3. Return the best-recovered equation $\hat{u}^*$.
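The steps above can be sketched as an outer loop with the expensive components injected as callables; `propose_candidates` (the LLM proposal plus STRidge fit) and `fit_gp_and_maximize_ei` (the surrogate update and acquisition maximization) are hypothetical stand-ins, not the paper's interfaces:

```python
import random

def run_neurosymbo(propose_candidates, fit_gp_and_maximize_ei,
                   K=100, T=300, K_init=10, seed=0):
    """Sketch of the NeuroSymBO outer loop; heavy components passed in."""
    rng = random.Random(seed)
    history, observations = [], []        # H and O
    y_best, u_best = float("-inf"), None  # y* and u-hat*
    for t in range(1, T + 1):
        if t <= K_init:                   # warm-up: random strategy index
            k_t = rng.randrange(K)
        else:                             # BO step: fit GP to O, maximize EI
            k_t = fit_gp_and_maximize_ei(observations, K)
        # LLM proposes candidates under strategy k_t; coefficients fit via
        # STRidge; each candidate returned as (equation, composite_fitness).
        candidates = propose_candidates(k_t, history)
        u_t, S_t = max(candidates, key=lambda c: c[1])
        history.append((u_t, S_t))
        observations.append((k_t, S_t))
        if S_t > y_best:
            y_best, u_best = S_t, u_t
    return u_best, y_best
```

Keeping the LLM and GP behind callables makes the loop itself trivially testable with dummy proposers.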

6. Empirical Performance and Benchmarks

NeuroSymBO was evaluated on a suite of one-dimensional PDE benchmarks: Burgers, Fisher-KPP, Chafee-Infante, PDE_divide, and synthetic Allen-Cahn, with data sampled on spatiotemporal grids of $200$–$300 \times 100$–$256$ points. The baseline comprised a fixed-prompt LLM (Llama-3.2-3B-Instruct) instructed to “Find the simplest equation.” BO used BoTorch’s exact GP and EI acquisition over $T = 300$ iterations and 5 seeds.

Key metrics were:

  • Recovery rate: percentage of runs exactly retrieving the ground-truth symbolic form.
  • Train/test $R^2$ on held-out data.
  • Parsimony: average number of active terms.

PDE         Method              Train R^2   Test R^2   Recovery Rate   Avg. # Terms
Allen-Cahn  Fixed               0.7968      0.7824     40%             7.2
Allen-Cahn  Ours (NeuroSymBO)   0.9107      0.8914     100%            4.1
Burgers     Fixed               0.8102      0.8242     60%             5.0
Burgers     Ours                0.8699      0.8791     100%            3.8
Chafee      Fixed               0.9894      0.9886     80%             4.4
Chafee      Ours                0.9951      0.9947     100%            3.0
Divide      Fixed               0.9927      0.9922     20%             6.5
Divide      Ours                0.9942      0.9941     100%            2.9
Fisher      Fixed               0.9952      0.9953     100%            3.0
Fisher      Ours                0.9999      0.9999     100%            2.0

Optimization trajectories demonstrate that dynamic instruction selection (NeuroSymBO) exhibits stepwise improvements at moments of strategy adaptation, while performance under static prompting saturates prematurely.

7. Scalability, Limitations, and Extension Pathways

GP surrogate fitting incurs $O(t^3)$ computational cost, but with $T \leq 300$ the associated overhead is substantially less than that of LLM inference. Sparse GP methods can further accelerate surrogate modeling for larger $T$ or extended instruction banks. The framework is contingent on the inherent expressiveness of the LLM: symbolic recovery is unattainable if the LLM's inductive biases omit certain operators, regardless of instruction tuning. NeuroSymBO’s application thus far is restricted to 1D PDEs; extension to higher dimensions or chaotic systems would necessitate both richer in-context history and prompt modifications.

Possible future directions include:

  • On-the-fly meta-LLM-driven expansion of the strategy bank.
  • Integration of alternative acquisition criteria (UCB, Thompson Sampling).
  • Hierarchical BO over both strategy and structural hyperparameters (e.g., $\lambda$ in $S(\cdot)$).
  • Coupling with active learning for data or regime selection.
  • Exploiting the MDP framework for RL-based instruction policy learning.

NeuroSymBO establishes instruction-tuned BO as an effective paradigm for robust, parsimonious scientific symbolic discovery, substantiated by enhanced performance metrics over static prompting (Qu et al., 31 Dec 2025).
