UFC-MIL: Uncertainty-Focused Calibrated MIL

Updated 16 November 2025
  • The paper introduces UFC-MIL, a multi-resolution MIL framework that integrates patch-level uncertainty and entropy-based masking to boost diagnostic reliability.
  • It employs a Topological Neighbor Attention Module and Soft-Resolution Label Smoothing (SRLS) to achieve superior accuracy and calibrated confidence on multiple histopathology datasets.
  • Key methodologies include multi-resolution feature extraction with frozen CNN embeddings, entropy-guided patch selection, and cross-resolution fusion that mirrors expert clinical reasoning.

Uncertainty-Focused Calibrated Multiple Instance Learning (UFC-MIL) is a multi-resolution diagnostic framework for histopathological whole-slide image (WSI) analysis that targets both classification accuracy and calibration of model predictions. UFC-MIL explicitly models patch-level uncertainty to make bag-level predictions that more closely mirror clinical expert reasoning, enabling diagnostic support suitable for deployment in settings with high reliability requirements.

1. Model Architecture

UFC-MIL processes digital whole-slide images by extracting non-overlapping patches at $R$ distinct resolutions, commonly $2.0$, $1.0$, and $0.5$ microns-per-pixel (MPP). Each patch $x^{(r)}_{i,n}$ is embedded via a frozen, pre-trained CNN (e.g., ResNet) into $z^{(r)}_{i,n}\in\mathbb{R}^d$, yielding $Z^{(r)}_i = [z^{(r)}_{i,1},\ldots,z^{(r)}_{i,n_r}]$ per resolution.

A learnable class token $cls^{(r)}_i \in \mathbb{R}^d$ is prepended to the stack and processed through a Nyström-approximate self-attention module; this yields $\tilde{Z}^{(r)}_i \in \mathbb{R}^{(n_r+1)\times d}$, modeling intra-resolution contextual dependencies. To further inject spatial information, the Topological Neighbor Attention Module (TNAM) aggregates local context among patch neighbors defined by a 4- or 8-connectivity adjacency graph $A^{(r)}_i$, generating updated features $T^{(r)}_i$ via:

$$s^{(r)}_{i,n} = \frac{\exp\big(w^\top [\tanh(A_t\tilde{z}^{(r)}_{i,n}) \odot \sigma(A_s\tilde{z}^{(r)}_{i,n})]\big)}{\sum_{k\in\mathcal{N}^{(r)}_{i,n}} \exp\big(w^\top [\tanh(A_t\tilde{z}^{(r)}_{i,k}) \odot \sigma(A_s\tilde{z}^{(r)}_{i,k})]\big)}$$

$$t^{(r)}_{i,n} = \sum_{k\in\mathcal{N}^{(r)}_{i,n}} s^{(r)}_{i,k}\,\tilde{z}^{(r)}_{i,k}$$

These are added residually to $\tilde{Z}^{(r)}_i$, preserving the class token.
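As a concrete sketch, the gated neighbor attention above can be written in NumPy. The weights, dimensions, and the inclusion of each patch in its own neighborhood are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 16, 9                             # feature dim and 3x3 patch grid (illustrative)
Z = rng.standard_normal((n, d))          # post-attention patch features z-tilde
A_t = rng.standard_normal((d, d)) * 0.1  # tanh-branch weights (hypothetical init)
A_s = rng.standard_normal((d, d)) * 0.1  # sigmoid-gate weights (hypothetical init)
w = rng.standard_normal(d)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tnam_update(Z, neighbors):
    """Gated attention over topological neighbors, then residual sum."""
    # scalar score per patch: w^T [tanh(A_t z) (*) sigmoid(A_s z)]
    gated = np.tanh(Z @ A_t.T) * sigmoid(Z @ A_s.T)
    e = gated @ w
    T = np.zeros_like(Z)
    for i, nbrs in enumerate(neighbors):
        sc = np.exp(e[nbrs] - e[nbrs].max())   # softmax over the neighborhood
        sc /= sc.sum()
        T[i] = sc @ Z[nbrs]
    return Z + T                               # residual addition, as in the text

def grid_neighbors(h, w4):
    """4-connectivity neighborhoods on an h x w4 grid (self included)."""
    nb = []
    for r in range(h):
        for c in range(w4):
            idx = [r * w4 + c]
            if r > 0: idx.append((r - 1) * w4 + c)
            if r < h - 1: idx.append((r + 1) * w4 + c)
            if c > 0: idx.append(r * w4 + c - 1)
            if c < w4 - 1: idx.append(r * w4 + c + 1)
            nb.append(np.array(idx))
    return nb

Z_new = tnam_update(Z, grid_neighbors(3, 3))
```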

Unique to UFC-MIL, patch-wise entropy is computed and high-entropy patches are masked via a Gumbel-softmax differentiable binary mask $m^{(r)}_i\in\{0,1\}^{n_r}$, guiding cross-resolution feature fusion. Fine-resolution features are focused on uncertain regions through:

$$\big(1-\text{Repeat}(m^{(r)}_i, n_{r+1}/n_r)\big) \odot \tilde{Z}^{(r+1)}_i + \text{Repeat}\big(m^{(r)}_i\odot Z^{(r)}_i,\, n_{r+1}/n_r\big)$$

Classification at each resolution is performed from the class token, yielding $\hat{p}^{(r)}_i\in\mathbb{R}^C$; every patch is also projected via an MLP (an “identical dimension reduction network”) to $\hat{p}^{(r)}_{i,n}\in\mathbb{R}^C$.
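A minimal NumPy sketch of the masking and fusion step follows; it ignores the class token, and the entropy threshold and Gumbel temperature are illustrative assumptions rather than paper values:

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_sigmoid_mask(entropy, tau=0.5, threshold=0.5):
    """Binary mask with m ~ 1 on high-entropy patches.

    Simplified stand-in for the paper's Gumbel-softmax mask; threshold
    and temperature are illustrative assumptions.
    """
    logits = entropy - threshold
    g1 = -np.log(-np.log(rng.uniform(size=logits.shape)))
    g0 = -np.log(-np.log(rng.uniform(size=logits.shape)))
    soft = 1.0 / (1.0 + np.exp(-(logits + g1 - g0) / tau))
    return (soft > 0.5).astype(float)          # straight-through hard mask

def cross_resolution_fuse(m_coarse, Z_coarse, Z_fine):
    """(1 - Repeat(m)) (*) Z_fine + Repeat(m (*) Z_coarse), per the fusion rule."""
    factor = Z_fine.shape[0] // Z_coarse.shape[0]   # n_{r+1} / n_r
    m = np.repeat(m_coarse, factor)[:, None]
    Zc = np.repeat(m_coarse[:, None] * Z_coarse, factor, axis=0)
    return (1 - m) * Z_fine + Zc

H = np.array([0.1, 0.9, 0.95, 0.2])   # per-patch entropies at resolution r
Zc = rng.standard_normal((4, 8))      # coarse features
Zf = rng.standard_normal((16, 8))     # 4x more patches at resolution r+1
fused = cross_resolution_fuse(gumbel_sigmoid_mask(H), Zc, Zf)
```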

2. Mathematical Formulation

Patch-wise uncertainty quantification leverages Shannon entropy on softmax predictions:

$$H^{(r)}_i[n] = -\sum_{c=0,1} \hat{p}^{(r)}_{i,n}[c]\,\log_2 \hat{p}^{(r)}_{i,n}[c]$$

The uncertainty-focused patch-wise loss $L_{PW,i}^{(r)}$ is defined as:

$$L_{PW,i}^{(r)} = (1-Y_i)\,\frac{1}{n_r}\sum_{n=1}^{n_r} \text{ReLU}\big(\hat{p}^{(r)}_{i,n}[1]-\delta\big) + Y_i\,\text{ReLU}\big((1-\delta)-\max_n \hat{p}^{(r)}_{i,n}[1]\big)$$

where $Y_i\in\{0,1\}$ is the slide-level label and $\delta<0.5$ is a margin. This formulation preserves the MIL assumption: negative slides should contain no high-scoring positive patches, while positive slides should yield at least one high-confidence positive patch.
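The patch-wise loss is straightforward to sketch for a single slide and resolution; this is a minimal NumPy rendering using the default margin $\delta=0.49$ reported later in the article:

```python
import numpy as np

def patch_wise_loss(p_pos, Y, delta=0.49):
    """Uncertainty-focused patch-wise loss L_PW for one slide/resolution.

    p_pos : per-patch positive-class probabilities p_hat[1], shape (n_r,)
    Y     : slide-level label in {0, 1}
    """
    relu = lambda x: np.maximum(x, 0.0)
    # negatives: penalize any patch whose positive score exceeds delta
    neg_term = (1 - Y) * relu(p_pos - delta).mean()
    # positives: require at least one patch to exceed 1 - delta
    pos_term = Y * relu((1 - delta) - p_pos.max())
    return neg_term + pos_term

p = np.array([0.1, 0.3, 0.95])
loss_pos = patch_wise_loss(p, Y=1)   # zero: max patch 0.95 clears 1 - delta
loss_neg = patch_wise_loss(p, Y=0)   # positive: the 0.95 patch is penalized
```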

Classification loss at each resolution is standard cross-entropy:

$$L_{CE,i}^{(r)} = -\sum_{c=0,1} \mathbb{I}[c=Y_i]\, \log\hat{p}^{(r)}_i[c]$$

Total training loss accumulates resolution- and sample-wise terms, with patch-wise loss weight $\lambda$:

$$L_{total} = \sum_{i\in \text{dataset}} \sum_{r=1}^R \left[L_{CE,i}^{(r)} + \lambda\, L_{PW,i}^{(r)} \right]$$

3. Calibration Methodology (SRLS)

UFC-MIL employs Soft-Resolution Label Smoothing (SRLS) for calibration, leveraging patch-level uncertainty statistics inferred from the primary training run without extra inference iterations.

At a selected epoch, patch entropies $H^{(r)}_i[n]$ are aggregated over the training set:

$$\mu^{(r)}_i = \text{mean}_n\, H^{(r)}_i[n],\qquad \sigma^{(r)}_i = \text{std}_n\, H^{(r)}_i[n]$$

For each sample and resolution, these are min-max scaled over the training set to $\tilde\mu^{(r)}_i, \tilde\sigma^{(r)}_i \in [0,1]$. A smoothing factor is computed:

$$\epsilon^{(r)}_i = \frac{1}{2}\left(\tilde\mu^{(r)}_i + \tilde\sigma^{(r)}_i\right)\alpha$$

where $\alpha$ is a global temperature (empirically $\alpha=0.1$). The hard label $Y_i$ is replaced by a soft target:

$$\tilde Y^{(r)}_i = \big(1-\epsilon^{(r)}_i\big)\,Y_i + \frac{\epsilon^{(r)}_i}{C}$$

Over the final $K$ epochs, the model is fine-tuned using these targets, minimizing:

$$L_{calib} = \sum_i \sum_r \text{CE}\big(\hat{p}^{(r)}_i, \tilde Y^{(r)}_i\big)$$

No extra inference loops are required, making calibration efficient while exploiting the multi-resolution outputs.
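A simplified, single-resolution NumPy sketch of the SRLS target construction (per-slide entropy statistics, min-max scaling, and soft labels); treating the statistics per-sample before scaling is an interpretation of the description above:

```python
import numpy as np

rng = np.random.default_rng(0)

def entropy_bits(p):
    """Shannon entropy (bits) of per-patch softmax outputs, shape (n, C)."""
    return -(p * np.log2(p + 1e-12)).sum(-1)

def srls_targets(patch_probs, Y, C=2, alpha=0.1):
    """Soft targets from per-slide entropy statistics (one resolution).

    patch_probs : list of (n_i, C) arrays of patch predictions per slide
    Y           : (N,) hard slide-level labels
    """
    mu = np.array([entropy_bits(p).mean() for p in patch_probs])
    sd = np.array([entropy_bits(p).std() for p in patch_probs])
    # min-max scale each statistic over the training set
    scale = lambda x: (x - x.min()) / (x.max() - x.min() + 1e-12)
    eps = 0.5 * (scale(mu) + scale(sd)) * alpha        # epsilon in [0, alpha]
    onehot = np.eye(C)[Y]
    return (1 - eps)[:, None] * onehot + (eps / C)[:, None]

# five toy slides, twenty patches each, binary classification
probs = [rng.dirichlet(np.ones(2), size=20) for _ in range(5)]
targets = srls_targets(probs, np.array([0, 1, 1, 0, 1]))
```

Each soft target row still sums to one, since the mass removed from the true class, $\epsilon^{(r)}_i$, is redistributed uniformly over the $C$ classes.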

4. Training Regimen and Hyperparameterization

Optimization is performed with Adam (initial learning rate $1\times10^{-4}$, $\beta_1=0.9$, $\beta_2=0.999$) using cosine decay to zero over $T$ total epochs (typically $20$–$30$ for convergence, with SRLS applied in the last $5$–$10$). Due to memory constraints, batch size is usually one WSI. Default hyperparameters chosen by validation include $\delta=0.49$, $\alpha=0.1$, and equal loss weighting $\lambda=1.0$; ablations indicate no further gain from tuning $\lambda$.
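The cosine schedule can be sketched in a few lines; `lr0` matches the stated initial learning rate, while the per-epoch step granularity is an assumption:

```python
import numpy as np

def cosine_lr(epoch, total_epochs, lr0=1e-4):
    """Cosine decay from lr0 to zero over the training run."""
    return 0.5 * lr0 * (1.0 + np.cos(np.pi * epoch / total_epochs))

schedule = [cosine_lr(t, 30) for t in range(31)]   # a 30-epoch run
```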

5. Evaluation Metrics and Results

Evaluation employs:

$$\text{ECE} = \sum_{m=1}^M \frac{|B_m|}{N}\left|\text{Acc}(B_m) - \text{Conf}(B_m)\right|$$

where the bins $B_m$ partition samples by predicted confidence.
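A reference NumPy implementation of the binned ECE above, with equal-width bins (the bin count $M$ is left as a parameter):

```python
import numpy as np

def ece(conf, correct, M=10):
    """Expected Calibration Error with M equal-width confidence bins."""
    conf, correct = np.asarray(conf, float), np.asarray(correct, float)
    edges = np.linspace(0.0, 1.0, M + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if lo == 0.0:
            in_bin |= conf == 0.0          # include exact zeros in the first bin
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            total += in_bin.sum() / len(conf) * gap
    return total

# ten predictions at confidence 0.9, nine correct -> perfectly calibrated
calibrated = ece(np.full(10, 0.9), np.array([1] * 9 + [0]))
```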

Performance was assessed on three public datasets: CAMELYON16 ($n=400$ WSIs), DHMC ($n=143$), and BCNB ($n=1{,}058$). UFC-MIL with SRLS yields competitive or superior accuracy and notably improved calibration:

| Dataset | Model | Accuracy (Acc) | ECE |
|---|---|---|---|
| CAMELYON16 | UFC-MIL | $0.917\pm0.038$ | $0.086\pm0.037$ |
| CAMELYON16 | UFC-MIL★ | $0.941\pm0.011$ | $0.056\pm0.016$ |
| CAMELYON16 | Best SOTA | $0.909$ | $0.086$ |
| DHMC | UFC-MIL★ | $0.812\pm0.021$ | $0.189\pm0.021$ |
| DHMC | Best SOTA | $0.758$ | $0.206$ |
| BCNB | UFC-MIL★ | $0.820\pm0.028$ | $0.077\pm0.033$ |
| BCNB | Best SOTA | $0.800$ | $0.108$ |

AUC on CAMELYON16 is approximately $0.964$, with UFC-MIL★ (the SRLS-calibrated variant) reducing ECE by $30$–$40\%$ relative to the strongest prior baseline.

6. Context and Application Significance

UFC-MIL advances multi-resolution MIL by integrating uncertainty quantification at the patch level, yielding both high diagnostic fidelity and trustworthy confidence estimates. Conventional multi-resolution MILs (e.g., DS-MIL) focus exclusively on classification accuracy, whereas UFC-MIL additionally addresses calibration, a requirement critical for clinical decision support.

By leveraging attention-driven neighbor aggregation and entropy-masked resolution zooming, UFC-MIL more closely emulates the expert pathologist's workflow: regions of diagnostic ambiguity are examined at higher resolution. The patch-wise loss preserves MIL assumptions, tolerates ambiguous (“grey-zone”) regions, and explicitly encodes uncertainty, mitigating overconfidence on negative cases and enhancing interpretability.

The SRLS calibration approach obviates inference overhead and exploits multi-resolution predictions, providing a route for seamless calibration tuning in MIL systems. Its practical benefit is especially pronounced for deployment in environments with strict reliability constraints.

A plausible implication is the broader adoption of UFC-MIL-like architectures as calibration-aware MIL becomes a clinical requirement, with potential relevance in non-pathology domains requiring fine-grained uncertainty modeling.

7. Limitations and Directions for Further Research

While UFC-MIL demonstrates robust performance and calibration on multiple histopathology datasets, batch size is limited by GPU memory constraints and the approach requires patch extraction at multiple resolutions, increasing preprocessing burden. The architecture's reliance on a frozen feature extractor may influence adaptability across datasets with divergent statistics; fine-tuning or self-supervised pre-training represent natural extensions. Further, rigorous analysis of the calibration method's behavior under dataset shift remains an open question, particularly as SRLS is tied to entropy statistics at a specific checkpoint epoch.

Investigation into the integration of non-image modalities and expanded uncertainty quantification schemes may broaden UFC-MIL's applicability. Extending TNAM to model more sophisticated spatial relations or incorporating domain-specific priors could yield richer context modeling. A comparative study with Bayesian and deep-ensemble calibration methods would clarify UFC-MIL's theoretical and practical position in the landscape of calibrated MIL approaches.
