Local Temperature Scaling (LTS)

Updated 28 November 2025

Local Temperature Scaling (LTS) is a spatially adaptive method that recalibrates temperature-like parameters to improve model accuracy in fields including semantic segmentation, turbulence, and phase transitions.
It employs localized networks (e.g., compact CNNs or hierarchical filters) to predict per-pixel temperature scales, reducing calibration errors such as ECE, SCE, and ACE while preserving segmentation performance.
Empirical results across datasets (COCO, CamVid, LPBA40) and domains showcase LTS's superiority over global scaling methods, providing improved uncertainty calibration and robustness to spatial heterogeneity.

Local Temperature Scaling (LTS) encompasses a set of methodologies across disciplines that involve the spatially-dependent rescaling or adjustment of temperature or temperature-like parameters at local (pixel, voxel, spatial coordinate, or lattice site) levels. LTS is rigorously formulated for three domains: probability calibration in neural semantic segmentation (Ding et al., 2020), wall-modeling in high-speed turbulence (Chen et al., 2021), and crossover scaling at phase transitions under smooth temperature gradients (Bonati et al., 2014). In each, LTS yields demonstrably improved theoretical or empirical performance over global or homogeneous alternatives, leveraging locally-adaptive temperature fields to match spatial non-uniformity in data, physics, or model predictions. The sections below articulate the formalisms, methodologies, and empirical outcomes defining LTS in these areas.

1. Probability Calibration for Semantic Segmentation

Local Temperature Scaling (LTS) in neural network probability calibration generalizes standard Temperature Scaling (TS) by introducing spatial adaptiveness. For multi-class semantic segmentation, TS rescales logits $z$ by a global scalar $T>0$ , producing a calibrated softmax: $\sigma_{\mathrm{SM}}(z; T)^{(l)} = \frac{\exp(z^{(l)}/T)}{\sum_{j=1}^L \exp(z^{(j)}/T)}$ TS fits $T^*$ by minimizing negative log-likelihood (NLL) on a hold-out set. Image-based TS (IBTS) extends this to one temperature $T_i$ per image, predicted via a small CNN. LTS assigns a distinct temperature $T_i(x)$ to each pixel (or voxel) $x$ of image $i$ : $\sigma_{\mathrm{SM}}(z_i(x); T_i(x))^{(l)} = \frac{\exp(z_i(x)^{(l)}/T_i(x))}{\sum_j \exp(z_i(x)^{(j)}/T_i(x))}$ $T_i(x)$ is predicted by a compact CNN $H_\alpha$ that takes $(z_i, I_i, x)$ as input. All calibration methods operate strictly as post-processing steps, preserving argmax segmentation (and, therefore, Intersection over Union (IoU) and Dice coefficients), while potentially improving probability calibration (Ding et al., 2020).

2. Algorithms and Network Architecture for LTS (Segmentation Calibration)

The calibration network for LTS is a small “soft-tree” CNN, structured hierarchically with binary nodes. Each leaf node contains a convolutional filter (on logits or image channels); internal nodes contain gating filters controlling the weighted mixing of subtree outputs. The forward computation at position $x$ extracts local patches of logits and image channels, passing them through these hierarchical filters, culminating in $T(x) = \mathrm{ReLU}(H_\mathrm{root}) + \epsilon$ , with $\epsilon$ ensuring positivity. For IBTS, a global average-pooling at the root produces a single scalar $T_i$ . The entire network is optimized for cross-entropy with respect to ground-truth per-pixel labels using Adam for 100 epochs on the validation set (Ding et al., 2020).

3. Calibration Metrics and Quantitative Performance

Calibration is quantified using metrics that compare predicted confidences to actual accuracy:

Expected Calibration Error (ECE): the mean absolute difference between accuracy and confidence over bins of predicted probability.
Maximum Calibration Error (MCE): the largest bin-wise accuracy-confidence discrepancy.
Static Calibration Error (SCE), and Adaptive Calibration Error (ACE): generalizations accounting for class breakdown and adaptive binning.

Segmentation accuracy (IoU/Dice) is evaluated separately. For downstream tasks (e.g., multi-atlas segmentation), metrics like Volume Dice (VD), Average Surface Distance (ASD), Surface Dice (SD), and $95\%$ Maximum Distance (95MD) are also computed.

Empirically, LTS uniformly achieves the lowest ECE, SCE, and ACE across “All”, “Boundary”, and “Local” regions relative to TS and IBTS. On COCO, CamVid, and LPBA40, pixelwise ECE (All) is reduced from $12.44\%$ (uncalibrated) to $10.04\%$ with LTS for COCO, $7.79\% \to 3.40\%$ on CamVid, and $5.58\% \to 0.90\%$ on LPBA40. Boundary ECE and local ECE are also markedly diminished. Downstream, LTS improves label fusion metrics such as VD and ASD, with gains remaining significant under statistical tests (Wilcoxon, Benjamini–Hochberg, FDR=$0.05$) (Ding et al., 2020).

Dataset	Uncal ECE (All)	TS	IBTS	LTS (All)
COCO	12.44%	12.53%	11.92%	10.04%
CamVid	7.79%	3.45%	3.63%	3.40%
LPBA40	5.58%	1.43%	1.47%	0.90%

4. Theoretical Properties and Methodological Implications

LTS preserves the segmentation argmax due to the monotonicity of softmax w.r.t. positive scaling, thereby leaving accuracy metrics unchanged while directly targeting NLL minimization. By optimizing for spatially local miscalibration, LTS adapts to systematic variations (e.g., higher miscalibration at object boundaries or in textured regions) unattainable by global-scale approaches. The method is theoretically justified as it equilibriates NLL and entropy, increasing entropy if overconfident, decreasing it if underconfident.

Limitations include reliance on a hold-out validation set for training the LTS network (a practical constraint in data-scarce domains such as medical imaging), computational overhead at test time (forward pass for the LTS CNN), and the persistence of high MCE (due to annotation noise and rare boundary pixels). Extensions proposed include bin-wise or class-conditional temperature fields, regularization on $T(x)$ (e.g., total variation), and joint calibration-segmentation training (Ding et al., 2020).

5. LTS in Wall-Modeled Large Eddy Simulation (LES) of High-Speed Flows

In wall-modeled LES, LTS refers to the spatially local, property-dependent scaling of wall-normal coordinates and temperature, yielding “semi-local scaling.” The semi-local wall distance is

$y^* = y\,Re_{\tau}\sqrt{\frac{\rho(y)}{\rho_w}\frac{\mu_w}{\mu(y)}}$

and the effective wall temperature variable is

$\theta_T \approx \begin{cases} y^* &\text{(viscous sublayer)} \ \frac{1}{\kappa_T}\ln y^* + B' &\text{(log layer)} \end{cases}$

The eddy thermal diffusivity (eddy-conductivity) closure is derived as

$\frac{\alpha_t}{\mu}= \kappa_T y^* D_T(y^*), \qquad D_T(y^*) = [1 - \exp(-y^*/A_T)]^2, \quad A_T \approx 20$

At high Mach numbers, aerodynamic heating becomes non-negligible, breaking the semi-local data collapse. Incorporating an aerodynamic heating correction restores scaling integrity: $y^*_{AH} = y^* \sqrt{1 - \frac{\epsilon + \epsilon_t}{q_w}}$ Surprisingly, the eddy-conductivity formula remains unchanged after this correction, explaining its robustness in high-Mach regimes (Chen et al., 2021).

The method is implemented in wall-model solvers via local evaluation of $y^*$ (or $y^*_{AH}$ ), damping functions, and closure of eddy coefficients, iteratively updating viscous and thermal fluxes at all spatial wall points.

6. Universal Scaling at First-Order Phase Transitions under Spatial Temperature Gradients

In statistical physics, LTS describes the universal scaling of observables in systems with smooth, spatially-varying temperature profiles $T(x)$ , especially near critical surfaces $T(x) = T_c$ . The canonical example is the 2D $q$ -state Potts model with Hamiltonian

$H = -\frac12\sum_{x,y} J_x \sum_{\mu=\pm\hat x, \pm\hat y} \delta(s_{x,y},\,s_{x+\mu_x, y+\mu_y}), \quad J_x = 1/T_x$

For a smooth temperature gradient parameterized by $l_t$ , the observables near $x=0$ obey the scaling form

$e(x) \approx f_e(\zeta), \qquad m(x) \approx f_m(\zeta), \qquad \zeta = \frac{x}{l_t^\theta}$

The exponent $\theta = \frac{p}{d+p}$ for a power-law gradient $(T_x - T_c) \sim x|x|^{p-1}/l_t^p$ in $d$ dimensions. E.g., in $d=2$ , $p=1$ (linear gradient), $\theta=1/3$ ; for $p=2$ , $\theta=1/2$ . Correlation functions, e.g., $P_e(0,x) \approx l_t^\theta G_e(0,\zeta)$ , also scale accordingly.

Numerical results confirm the predicted data collapse for $q=10,20$ (Potts) and, with prefactors, for $q=2$ (Ising). The LTS formalism thus unifies the description of crossover regions at first-order and continuous transitions under smooth inhomogeneity, defining a new universal length $\xi \sim l_t^\theta$ and scaling functions $f_e,\,f_m$ that interpolate between bulk plateaux (Bonati et al., 2014).

7. Comparative Analysis and Applicability

Across the surveyed domains, LTS methods share a core design principle: the spatially local tuning of a “temperature” parameter to rectify non-uniformity, either in model confidence (segmentation), near-wall turbulence (LES), or equilibrium observables (statistical mechanics). In each context:

LTS is post-hoc and model-agnostic (segmentation), computationally explicit (LES), or emergent from scaling theory (phase transitions).
Local adaptation provides robustness to spatial heterogeneity that cannot be captured by global or image-level approaches.
The theoretical underpinnings justify universal, often data-driven, calibration laws.

LTS methods do require sufficient data or model access: a validation set for neural calibration, measurement or computation of local gradients/fluxes in LES, or the ability to simulate spatially inhomogeneous Hamiltonians in statistical physics. Potential extensions include regularized or hierarchical temperature fields, joint end-to-end learning, or richer local architectures for temperature field prediction (Ding et al., 2020).

LTS thus constitutes a key technical strategy for spatial calibration or scaling, relevant in modern deep learning, turbulence modeling, and critical phenomena. Its empirical and theoretical properties have been systematically established for these domains (Ding et al., 2020, Chen et al., 2021, Bonati et al., 2014).