Synthetic Batch-Level Pebble Measurements
- Synthetic batch-level pebble measurements are noise-augmented, low-dimensional proxies that aggregate discharged pebble data, providing a concise snapshot of reactor fuel behavior.
- They employ statistical binning, Normal distribution models, and calibrated Gaussian noise to emulate gamma spectroscopy observables in pebble bed reactors.
- This approach enables efficient LSTM-based predictive modeling of critical reactor parameters such as reactivity, flux evolution, and burnup distribution under operational constraints.
Synthetic batch-level pebble measurements are low-dimensional, noise-augmented constructs engineered to proxy the aggregate physical observables available from on-line gamma spectroscopy in pebble bed reactors (PBRs). Designed for use as features in machine learning models, particularly long short-term memory (LSTM) architectures, these measurements summarize the discharged pebble composition and activity over specified time windows. Their primary purpose is to provide physically interpretable, computationally tractable input for predictive modeling of reactivity and flux evolution, notably under operational scenarios where direct, fully resolved measurement or simulation of each pebble is infeasible.
1. Formal Definitions and Physical Assumptions
The construction of synthetic batch-level pebble measurements relies on strict physical and statistical assumptions to reduce the dimensionality of the problem while maintaining sensitivity to the phenomena of interest.
1.1 Discharged-Pebble Burnup Distribution
At each depletion timestep, typically corresponding to ≈ 6.525 days (during which around 25,000 pebbles are discharged), discharged pebbles are binned into burnup intervals (nominally 12), uniform in percentage of fissions per initial metal atom (%FIMA). The true burnup of pebbles within bin $i$ is modeled as a Normal distribution:

$$BU \mid \text{bin } i \sim \mathcal{N}\!\left(\mu_i,\ \sigma_i^2\right),$$

where $\mu_i$ is the volume-averaged central burnup of bin $i$ and $\sigma_i$ is the bin standard deviation, set by a width factor $f$ calibrated against high-fidelity (HxF) full-pebble depletion simulations. The fraction of pebbles in bin $i$ removed above a discard threshold $BU_d$ is then:

$$F_{d,i} = P\!\left(BU > BU_d \mid \text{bin } i\right) = 1 - \Phi\!\left(\frac{BU_d - \mu_i}{\sigma_i}\right),$$

where $\Phi$ denotes the standard Normal cumulative distribution function.
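A minimal sketch of this bin-level discard calculation under the Normal assumption is given below; the bin centers, width factor, discard threshold, and counts are illustrative placeholders, not values from the source.

```python
import numpy as np
from scipy.stats import norm

def discard_fraction(mu_i, sigma_i, bu_d):
    """Expected fraction of a bin's pebbles above the discard threshold,
    assuming the bin's true burnup is Normal(mu_i, sigma_i^2)."""
    return 1.0 - norm.cdf(bu_d, loc=mu_i, scale=sigma_i)

# Illustrative bin centers (%FIMA), width factor, threshold, and counts (all hypothetical)
mu = np.linspace(1.25, 21.25, 9)          # volume-averaged central burnups of 9 bins
f = 0.05                                   # width factor; calibrated against HxF in the source
sigma = f * mu
bu_d = 18.0                                # discard threshold in %FIMA (placeholder)
counts = np.array([4000, 3500, 3200, 3000, 2800, 2700, 2300, 2000, 1500])

frac = discard_fraction(mu, sigma, bu_d)   # per-bin discard fractions
n_discarded = float(np.sum(counts * frac)) # expected number of discarded pebbles
```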
1.2 Volume-Averaged Nuclide Concentrations
For each burnup bin at discharge, average nuclide concentrations are computed as pebble count-weighted means:

$$\bar{N}_{j,i} = \frac{\sum_{k} n_k\, N_{j,k}}{\sum_{k} n_k},$$

where $N_{j,k}$ is the concentration of nuclide $j$ in contributing pebble group $k$ and $n_k$ is the corresponding pebble count.
These bulk materials are then used as proxies for groupwise Serpent transport and depletion calculations, reducing the requirement for explicit tracking of millions of individual pebbles.
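A brief sketch of the count-weighted averaging is shown below; the nuclide-concentration array and pebble counts are hypothetical, and the group/nuclide layout is only one plausible arrangement.

```python
import numpy as np

def count_weighted_mean(concentrations, counts):
    """Pebble count-weighted mean nuclide concentrations for one burnup bin.

    concentrations : (groups, nuclides) array of per-group nuclide densities
    counts         : (groups,) pebble counts of the contributing groups
    """
    counts = np.asarray(counts, dtype=float)
    return counts @ np.asarray(concentrations) / counts.sum()

# Hypothetical example: 3 contributing groups, 2 nuclides (e.g., U-235 and Pu-239)
N = np.array([[1.0e-4, 2.0e-6],
              [8.0e-5, 4.0e-6],
              [6.0e-5, 6.0e-6]])
n = [1200, 900, 400]
N_bar = count_weighted_mean(N, n)   # bulk material used as the bin's proxy composition
```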
1.3 Defined Batch-Level Features
Within every timestep, five classes of dependent features are extracted:
- Burnup-bin counts $n_1$–$n_9$: number of discharged pebbles in nine fixed %FIMA intervals spanning 0–22.5 %FIMA.
- Average total discharge burnup $\overline{BU}_d$: mean burnup across all discharged pebbles.
- Number of discarded pebbles $N_{\text{disc}}$: computed by summing the expected number of pebbles in each bin with burnup above the discard threshold $BU_d$.
- Average last-pass burnup by radial zone $\overline{BU}_{lp,r}$: for radial zones $r = 1, \dots, 4$, the mean of the pebbles' last radial-pass burnup in zone $r$.
- Measurement noise model: Zero-mean Gaussian noise is added to each synthetic feature, with standard deviations determined by a prescribed Mean Absolute Percentage Error (MAPE).
| Feature type | MAPE (%) |
|---|---|
| Burnup-bin counts | 5 |
| Average discharge burnup | 2.5 |
| Number discarded pebbles | 5 |
| Avg. last-pass BU by radial zone | 10 |
The noise model is intended to emulate practical errors in gamma counting and inventory pathway assignment.
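A minimal sketch of the noise augmentation follows. It interprets each MAPE value as the expected absolute relative error of a zero-mean Gaussian perturbation; the $\sqrt{\pi/2}$ conversion from mean absolute error to standard deviation is an assumption about how the source scales $\sigma$, and a simpler $\sigma = \text{MAPE}\cdot|x|$ convention would also be consistent with the text.

```python
import numpy as np

rng = np.random.default_rng(42)

# Prescribed MAPE (%) per feature class, as listed in the table above
MAPE = {"bin_counts": 5.0, "avg_discharge_bu": 2.5,
        "n_discarded": 5.0, "lastpass_bu_radial": 10.0}

def add_measurement_noise(value, mape_percent, rng=rng):
    """Add zero-mean Gaussian noise whose standard deviation is scaled so that the
    expected absolute relative error equals the prescribed MAPE (sqrt(pi/2) converts
    the mean absolute deviation of a Gaussian to its standard deviation)."""
    value = np.asarray(value, dtype=float)
    sigma = (mape_percent / 100.0) * np.abs(value) * np.sqrt(np.pi / 2.0)
    return value + rng.normal(0.0, sigma)

noisy_counts = add_measurement_noise([4000, 3500, 1500], MAPE["bin_counts"])
noisy_bu = add_measurement_noise(9.8, MAPE["avg_discharge_bu"])
```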
2. Generation Practices and Calibration Procedures
The procedure for producing synthetic batch-level measurements is predicated on a zone-based simulation of the reactor core using PEARLSim, combined with binning and noise augmentation strategies.
2.1 Zone-Based Core Simulation
The 280 MW gFHR core is decomposed into 4 radial and 10 axial zones, yielding up to 480 unique material regions across 12 burnup groups. Serpent 2.2.0 calculates, at each step, $k_{\text{eff}}$, power/flux meshes, and burnup-group inventories. Pebble movement is modeled as an axial shift per interval, with top-zone pebbles rebinned upon discharge using the statistical burnup model described above.
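A schematic sketch of the axial-shift recirculation step is given below; the zone indexing and array contents are illustrative and do not reproduce the PEARLSim data model.

```python
import numpy as np

def advance_one_interval(core, inlet_layer):
    """Shift zone inventories up by one axial position per depletion interval.

    core        : (n_axial, n_radial) array of zone inventories; index 0 = bottom, -1 = top.
    inlet_layer : (n_radial,) inventory inserted at the bottom (fresh or recirculated pebbles).
    Returns (new_core, discharged_top_layer).
    """
    discharged = core[-1].copy()           # top-zone pebbles leave the core for discharge binning
    new_core = np.roll(core, 1, axis=0)    # every zone's contents move up one axial position
    new_core[0] = inlet_layer              # bottom zone is refilled at the inlet
    return new_core, discharged

core = np.arange(40, dtype=float).reshape(10, 4)   # 10 axial x 4 radial zones (illustrative)
core, discharged = advance_one_interval(core, inlet_layer=np.zeros(4))
```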
2.2 Calibration Against High-Fidelity Simulation
The width factor $f$ is empirically determined by adjusting the width of the assumed Normal in burnup space such that the aggregate discard histogram at the discard threshold (expressed in MWd/kgHM) matches the reference curve from HxF simulations. This empirical anchoring is necessary to compensate for model error induced by the reduction to a zoned, batch-level formulation.
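One way to implement this anchoring is sketched below as a least-squares fit of $f$ against a reference discard histogram; the objective, optimizer, bin data, and stand-in reference are assumptions for illustration, not the calibration procedure used in the source.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

def discard_counts(f, mu, counts, bu_d):
    """Expected discarded pebbles per bin for width factor f (sigma_i = f * mu_i assumed)."""
    return counts * (1.0 - norm.cdf(bu_d, loc=mu, scale=f * mu))

def calibrate_width_factor(mu, counts, bu_d, reference):
    """Choose f so the batch-model discard histogram best matches the HxF reference curve."""
    objective = lambda f: np.sum((discard_counts(f, mu, counts, bu_d) - reference) ** 2)
    return minimize_scalar(objective, bounds=(1e-3, 0.5), method="bounded").x

# Illustrative inputs; in practice the reference comes from HxF full-pebble depletion runs
mu = np.linspace(1.25, 21.25, 9)
counts = np.full(9, 2500.0)
hxf_ref = discard_counts(0.07, mu, counts, bu_d=18.0)   # stand-in "reference" histogram
f_cal = calibrate_width_factor(mu, counts, 18.0, hxf_ref)
```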
2.3 Observable Feature Extraction
At each batch (25,000 pebbles per window):
- Populate the bin counts $n_1$–$n_9$ for the 9 burnup bins.
- Compute $\overline{BU}_d$, $N_{\text{disc}}$, and $\overline{BU}_{lp,r}$ using batch means and bin statistics.
- Add Gaussian noise per the feature-specific MAPE, scaled in accord with the measurement error model.
3. Encoding for LSTM Model Input
The constructed synthetic measurements are concatenated with other known or controlled parameters to form the input feature vectors for recurrent neural network training.
3.1 Input Vector Specification
Each timestep yields a 21-dimensional feature vector:
- User-controlled: graphite pebble fraction, total core power, control rod depth, circulation rate, and burnup (discard) threshold $BU_d$.
- Synthetic batch features: $\overline{BU}_d$, $N_{\text{disc}}$, $n_1$–$n_9$, $\overline{BU}_{lp,1}$–$\overline{BU}_{lp,4}$, and average power per pebble.
3.2 Time Series Assembly and Training
Input sequences are windowed to 8 timesteps, yielding an input tensor of shape batch_size × 8 × 21. Six LSTM models, trained in parallel, target specific outputs: excess reactivity ($\rho_{\text{excess}}$), five principal components (PCs) each for the power and flux profiles, and the next-timestep values of the dependent input features themselves for operational forecasting.
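The windowing and one of the per-output regressors can be sketched as follows in PyTorch; the hidden size, training setup, and stand-in data are illustrative assumptions rather than the architecture reported in the source.

```python
import numpy as np
import torch
import torch.nn as nn

def make_windows(features, targets, window=8):
    """Slice a (T, 21) feature history into (N, 8, 21) input windows with
    next-step targets, matching the batch_size x 8 x 21 tensor shape."""
    X = np.stack([features[t - window:t] for t in range(window, len(features))])
    y = targets[window:]
    return torch.tensor(X, dtype=torch.float32), torch.tensor(y, dtype=torch.float32)

class BatchFeatureLSTM(nn.Module):
    """One of the per-output LSTM regressors (e.g., for excess reactivity)."""
    def __init__(self, n_features=21, hidden=64, n_outputs=1):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_outputs)

    def forward(self, x):                 # x: (batch, 8, 21)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # regress from the last hidden state

# Illustrative usage with random stand-in data
hist = np.random.rand(200, 21).astype(np.float32)   # 200 timesteps of features
rho = np.random.rand(200, 1).astype(np.float32)     # stand-in excess reactivity target
X, y = make_windows(hist, rho)
model = BatchFeatureLSTM()
pred = model(X)                                      # (192, 1) predictions
```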
4. Feature Importance and Predictive Sensitivity
Feature relevance for predictive model accuracy is established using a permutation-importance methodology.
4.1 Permutation Testing Protocol
For each trained model, a baseline mean absolute error (MAE) is computed; each feature is then independently permuted and the MAE recomputed. The resulting increase, $\Delta\text{MAE}$, provides a quantitative importance metric for each feature.
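A sketch of this protocol, reusing the model and tensor shapes from the illustrative PyTorch example above, is shown below; the number of repeats and the shuffling granularity (whole sequences rather than individual timesteps) are assumptions.

```python
import numpy as np
import torch

def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Per-feature Delta-MAE: increase in MAE when that feature is shuffled across samples.

    X : (N, 8, 21) input tensor, y : (N, n_outputs) target tensor.
    """
    rng = np.random.default_rng(seed)
    model.eval()
    with torch.no_grad():
        baseline = torch.mean(torch.abs(model(X) - y)).item()
        delta = np.zeros(X.shape[-1])
        for j in range(X.shape[-1]):
            scores = []
            for _ in range(n_repeats):
                Xp = X.clone()
                perm = torch.from_numpy(rng.permutation(X.shape[0]))
                Xp[:, :, j] = X[perm][:, :, j]         # shuffle feature j across sequences
                scores.append(torch.mean(torch.abs(model(Xp) - y)).item())
            delta[j] = np.mean(scores) - baseline      # Delta-MAE for feature j
    return delta
```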
4.2 Reactivity Forecasting Insights
For the primary reactivity model, the fuel-insertion fraction is among the three features with the highest impact on MAE. High-burnup bins (7–9) also demonstrate elevated importance, consistent with delayed-propagation reactivity phenomena in PBRs. Last-pass radial burnup attributes register low importance, plausibly due to collinearity with other bulk burnup metrics.
4.3 Dependency Patterns in Forecasting
Model predictions for next-timestep average discharge burnup are governed predominantly by the immediately preceding value (self-predictability). Likewise, next-step discard count predictions hinge on the populations of the high-burnup bins and the set discard threshold $BU_d$. Last-pass radial burnup is most sensitive to total core power and average power per pebble, reflecting flux distribution effects over short windows.
5. Limitations, Sources of Uncertainty, and Recommendations
Although the synthetic batch-level approach delivers tractable and interpretable proxies for physical observations, several important limitations and sources of uncertainty are recognized.
5.1 Impact of Coarse Zoning and Binning
Restricting the decomposition to 4 radial zones and 12 burnup groups obscures pebble-level variance; higher fidelity (finer zoning or pebble-wise depletion) would capture more detailed behavior at ≥10–100× computational cost.
5.2 Assumption of Normal Distribution
The discard-burnup distribution is approximated as Normal, whereas empirical distributions may deviate, even after calibration via HxF data, introducing residual bias.
5.3 Noise Model Simplifications
The imposed Gaussian noise, scaled by fixed MAPE, offers at best an approximation of practical sensor error, omitting potential count-rate non-Poisson characteristics and pathway-specific biases.
5.4 Methodological Recommendations
For accuracy:
- Increase tallying efforts (histories, cycles) in Serpent calculations to suppress mesh-induced statistical variability, especially for higher-order PCs.
- Average over multiple random shuffles per timestep to mitigate shuffle-induced variance in $k_{\text{eff}}$.
- Increase spatial and burnup binning granularity or calibrate bin widths using hybrid HxF→PEARLSim pipelines.
- Investigate alternative recurrent units (e.g., GRU) or physics-driven preprocessing (e.g., reduced-order modeling).
For deployment:
- Periodically retrain the synthetic model with gold-standard reference measurements (HxF, gamma counting) to correct drift.
- Utilize online permutation-importance metrics to monitor feature reliability and detect measurement drift (e.g., burnup-bin sensor bias).
6. Context, Significance, and Applications
Synthetic batch-level pebble measurements, as implemented in Kolaja et al. (2025), serve as a working compromise between detailed, computationally prohibitive modeling and the need for physically rooted observables for data-driven prediction. Several key implications emerge:
- The reduction to batch-level, noise-augmented proxies permits efficient LSTM training and operation without requiring full-core pebble tracking.
- The retained dependency of model accuracy on physically interpretable quantities (e.g., high-burnup bin population, mean discharged burnup) confirms that the approach respects the dominant physical drivers of reactivity evolution.
- The framework’s explicit allowance for user-controlled and measured (uncertain) input features aligns with practical online monitoring and operation in advanced reactor systems.
This suggests that the batch-level synthetic measurement methodology provides a viable and realistic path for the incorporation of machine-learning-based forecasting into PBR operational support, so long as limitations regarding aggregation, noise representation, and bin calibration are systematically addressed.