MDF: Multi-Defocus Fusion Methods
- MDF fuses multiple images acquired at different focal settings, using weighted selection maps to produce a sharp, all-in-focus output.
- MDF methods leverage boundary-aware cascades, α-matte modeling, and GAN-based approaches to precisely manage defocus spread and ambiguous boundaries.
- The approach employs realistic synthetic datasets and rigorous quantitative metrics to significantly enhance edge fidelity and reduce artifacts.
Multi-Defocus Fusion (MDF) methods are a diverse class of algorithms and architectures designed to combine multiple images (or image stacks) acquired at different focal settings into a single all-in-focus image or enhanced representation. MDF tackles the physical and algorithmic challenges associated with depth-of-field (DOF) limitations, defocus spread, and boundary ambiguity, and finds applications ranging from optical and electron microscopy to general photographic imaging. Contemporary MDF solutions span deep learning, optimization, decision-map, variational, and diffusion-based strategies, with recent approaches explicitly modeling boundary phenomena and dataset realism.
1. Formal Problem Definition and Mathematical Principles
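Let $\{I_k\}_{k=1}^{N}$ denote a set of perfectly registered source images, each acquired with a distinct focal setting. The objective is to generate a fused image $F$ such that it is locally identical to the sharpest available source at each position $x$:
$$F(x) = \sum_{k=1}^{N} w_k(x)\, I_k(x),$$
where $w_k$ is a selection or soft attention map, typically constrained by $w_k(x) \ge 0$ and $\sum_{k=1}^{N} w_k(x) = 1$. In two-image fusion, this often reduces to a binary decision map $D(x) \in \{0, 1\}$ or a continuous focus score $s(x) \in [0, 1]$; fusion then is
$$F(x) = D(x)\, I_1(x) + \bigl(1 - D(x)\bigr)\, I_2(x).$$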
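The weighted fusion rule above can be illustrated with a minimal NumPy sketch. The function names `fuse_weighted` and `fuse_pair` are hypothetical and chosen only for this illustration; the sketch assumes grayscale, perfectly registered inputs and simply renormalizes whatever weights it is given, rather than estimating them.

```python
import numpy as np

def fuse_weighted(sources, weights, eps=1e-12):
    """Fuse N registered images with per-pixel weights.

    sources, weights: array-likes of shape (N, H, W).
    Negative weights are clipped to zero and the weights are
    renormalized so that they sum to 1 at every pixel.
    """
    sources = np.asarray(sources, dtype=np.float64)
    weights = np.clip(np.asarray(weights, dtype=np.float64), 0.0, None)
    weights = weights / (weights.sum(axis=0, keepdims=True) + eps)
    return (weights * sources).sum(axis=0)

def fuse_pair(img_a, img_b, decision):
    """Two-image case: decision = 1 selects img_a, decision = 0 selects img_b.

    A soft map with values in [0, 1] blends the two sources instead.
    """
    d = np.asarray(decision, dtype=np.float64)
    return d * np.asarray(img_a) + (1.0 - d) * np.asarray(img_b)
```

With a binary decision map, `fuse_pair(a, b, d)` and `fuse_weighted([a, b], [d, 1 - d])` give the same result; the general form only matters once more than two sources or soft weights are involved.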
Defocus measurement, sharpness criteria, or learned focus-confidence are core to MDF. Recent frameworks explicitly address focus/defocus boundaries (FDBs) and model the defocus spread effect, where blur extends beyond object edges due to lens PSF and occlusions (Ma et al., 2019, Ma et al., 2019, Wang et al., 2020).
2. Algorithmic Architectures and Boundary-Awareness
The modern MDF landscape includes architectures purpose-built for the focus/defocus boundary problem:
- Boundary-Aware Cascades: Two-stage models deploy an initial fusion network to estimate global focus and a dedicated boundary-refinement network for ambiguous, mixed-focus pixels. For example, ResNet-56 is trained as both an “Initial Fusion Net” (global) and “Boundary Net” (localized only to boundary patches), with pixel classification into near- and far-FDB regions via running window averages of focus scores (Ma et al., 2019).
- α-Matte Based Boundary Modeling: These methods synthesize data and networks under physically plausible layered image formation models, where defocus spreads are modeled by blurred alpha mattes (e.g., Gaussian kernels convolved with binary transmission layers). Networks first produce a soft “guidance map” for boundaries, with residual refinement strictly on the boundary band by a specialized sub-net; output fusion is weighted according to these maps (Ma et al., 2019).
- Decision Map with Deep Feature Calibration: MDF methods such as GACN introduce a cascade that simultaneously predicts soft decision maps using deep spatial-frequency activations and produces fused images, eschewing empirical postprocessing. Decision-map calibration analytically refines ambiguous boundary regions using guided filtering or boundary masking, ensuring pixel-accurate assignment at boundaries (Ma et al., 2020).
- GAN and Diffusion Approaches: Generative models (MFIF-GAN, ReDiffuse) directly address defocus spread by training on α-matte synthesized pairs with adversarial and gradient-aware losses. ReDiffuse additionally incorporates rotation-group equivariant U-Nets to preserve geometric structure when fusing symmetric or repetitive patterns (Wang et al., 2020, Li et al., 22 Mar 2026).
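The idea of isolating a boundary band from running-window averages of focus scores, and refining only inside it, can be sketched as follows. This is an illustrative approximation, not the cited networks: the window size `k`, the thresholds `low` and `high`, and the function names are all assumptions introduced here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_mean(x, k):
    """Running-window mean over an odd k x k window with edge padding."""
    pad = k // 2
    padded = np.pad(np.asarray(x, dtype=np.float64), pad, mode="edge")
    return sliding_window_view(padded, (k, k)).mean(axis=(-2, -1))

def boundary_band(focus_score, k=5, low=0.1, high=0.9):
    """Flag pixels whose window-averaged focus score is mixed.

    An average near 0 or 1 means the neighbourhood agrees on one source
    (far from the focus/defocus boundary); intermediate values indicate
    a neighbourhood that straddles the boundary.
    """
    avg = box_mean(focus_score, k)
    return (avg > low) & (avg < high)

def refine_on_band(initial, refined, band):
    """Keep the initial fusion everywhere except inside the boundary band."""
    return np.where(band, refined, initial)
```

In a two-stage cascade, `initial` would come from the global fusion stage and `refined` from the boundary-specific stage; restricting the replacement to `band` leaves confidently classified regions untouched.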
3. Dataset Generation and Simulation Fidelity
Robust MDF depends crucially on the availability of realistic, diverse, labeled datasets.
- Synthetic Data with Realistic Boundaries: High-fidelity dataset generation involves combining matting cutouts with complex backgrounds, applying depth-dependent blurs only to occluded regions, and enforcing layered occlusion/blur logic. For instance, producing a source pair by compositing a foreground cutout over a background through sharp and blurred alpha mattes closely mimics optical DOF behavior around boundaries (Ma et al., 2019, Ma et al., 2019).
- Domain-Specific Simulation: In HAADF-STEM, MDF simulates multiple defocus images spanning the support thickness under multislice physics, considering probe convergence and collection angles to maximize elemental separability for atomic identification (Li et al., 7 May 2025).
- Large-Scale Blender/Cycles Datasets: For high-resolution fusion, datasets like MattingMFIF utilize Blender-rendered 4K scenes with optically plausible DOF, realistic object placement, and associated all-in-focus ground truth (Piano et al., 22 Oct 2025).
Empirical results demonstrate that dataset realism significantly lowers boundary classification error and enhances fusion fidelity, especially in the vicinity of challenging FDB regions.
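A layered, matte-based synthesis of a training pair might look like the sketch below. The exact compositing equations used in the cited works are not reproduced here; this version assumes a single grayscale foreground `fg`, background `bg`, a binary matte `alpha`, and one Gaussian blur level `sigma`, and all function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at 3 sigma and normalized to sum to 1."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur of a 2-D image with edge padding."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    out = np.pad(np.asarray(img, dtype=np.float64), r, mode="edge")
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, out)
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, out)
    return out

def synthesize_pair(fg, bg, alpha, sigma=2.0):
    """Return (foreground-focused, background-focused) images.

    Assumed layered model:
      - foreground in focus: sharp matte over a blurred background;
      - background in focus: the blurred foreground spreads past its
        edge, so the premultiplied foreground and the matte are blurred
        with the same kernel before compositing over the sharp background.
    """
    alpha = np.asarray(alpha, dtype=np.float64)
    fg_focused = alpha * fg + (1.0 - alpha) * blur(bg, sigma)
    bg_focused = blur(alpha * fg, sigma) + (1.0 - blur(alpha, sigma)) * bg
    return fg_focused, bg_focused
```

The design point is the asymmetry: in the foreground-focused image the boundary stays crisp, while in the background-focused image the blurred matte lets foreground content leak over the background, which is the defocus spread that naive "blur one region, keep the other" synthesis misses.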
4. Quantitative Metrics and Benchmarking
MDF evaluation typically relies on metrics that capture both global and boundary-region fusion quality. Established criteria include:
| Metric | Purpose | Higher-is-Better? |
|---|---|---|
| Q_NMI, Q_MI | Mutual information with input(s) | Yes |
| Q_G | Gradient-based sharpness consistency | Yes |
| Q_Y, Q_y | Structural similarity or SSIM variants | Yes |
| Q_CB | Human visual system–based assessment | Yes |
| MS-SSIM, EN | Multi-scale or entropy-based | Yes |
| MOS | Mean Opinion Score (human rating) | Yes |
| Edge/Frequency MI | Feature mutual information (edges/DCT) | Yes |
Boundary-aware methods consistently demonstrate improvements over classical transforms (NSCT, SR, DSIFT), often yielding best-in-class metric scores both globally and specifically at the FDB (Ma et al., 2019, Ma et al., 2019, Ma et al., 2020).
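As an illustration of the information-theoretic entries in the table, the following sketch computes histogram-based entropy (EN) and an unnormalized mutual-information score between the sources and the fused image. It assumes 8-bit grayscale intensities in [0, 255]; `mi_fusion_score` is a simplified, hypothetical variant and not the normalized Q_NMI definition used in published benchmarks.

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy (in bits) of the intensity histogram."""
    hist, _ = np.histogram(np.ravel(img), bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(a, b, bins=256):
    """Mutual information (in bits) from the joint intensity histogram."""
    joint, _, _ = np.histogram2d(
        np.ravel(a), np.ravel(b), bins=bins, range=[[0, 256], [0, 256]]
    )
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of b, shape (1, bins)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

def mi_fusion_score(src_a, src_b, fused):
    """Sum of the mutual information each source shares with the fused image."""
    return mutual_information(src_a, fused) + mutual_information(src_b, fused)
```

Higher values indicate that more of the sources' intensity statistics survive in the fused result; because the score is histogram-based, it says nothing about spatial placement, which is why it is usually reported alongside gradient- and structure-based metrics.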
5. Extensions: Multi-Image, Unsupervised, Physical and Domain-Specific Fusion
- Multi-Image Fusion and Decision Volume Calibration: MDF extensions to more than two inputs select among $N$ sources at each pixel using an analytically constructed “decision volume” built from pairwise decision maps and calibration formulas (expressed as a function of focus probabilities), yielding efficient and scalable multi-stack fusion (Ma et al., 2020, Piano et al., 22 Oct 2025).
- Unsupervised and Optimization-Based Strategies: Techniques such as MFNet bypass curated ground truth by maximizing local SSIM in a sliding window, selectively matching fused output to the most in-focus patch from multiple sources. Gradient-based optimization frameworks (as in MFF-SSIM) directly maximize a patchwise fusion quality index, robustly reducing halos in strong defocus spread regimes (Yan et al., 2018, Xu et al., 2020).
- Physical/Electron Microscopy Applications: In atomic-resolution STEM, MDF recovers Z-contrast by per-pixel maximum-intensity fusion over a defocus series and combines with LoG-based detection and Gaussian-mixture modeling to classify elements, achieving <5% classification error compared to ~50% for single-defocus frames (Li et al., 7 May 2025).
- Depth-Map Guided MDF: By incorporating explicit depth sensing, MDF can segment the scene into DOF-compliant regions, assigning each block to its optimal focal plane based on the closest match to camera focus/distance, enabling artifact-free, order-of-magnitude faster real-time all-in-focus imaging (Liu et al., 2018).
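To make the multi-image case more concrete, the sketch below builds a one-hot decision volume from pairwise soft decision maps by summing, for each source, the probabilities that it is sharper than every other source, and then taking the per-pixel winner. This voting rule is an assumption made for illustration and stands in for the calibration formulas of the cited work, which are not reproduced here; `decision_volume` and `fuse_stack` are hypothetical names.

```python
import numpy as np

def decision_volume(pairwise, n):
    """Build a one-hot (n, H, W) decision volume from pairwise maps.

    pairwise: dict mapping (i, j) with i < j to an (H, W) map in [0, 1],
              read as the probability that source i is sharper than j.
    """
    shape = np.asarray(next(iter(pairwise.values()))).shape
    score = np.zeros((n,) + shape)
    for (i, j), p in pairwise.items():
        p = np.asarray(p, dtype=np.float64)
        score[i] += p          # evidence that i beats j
        score[j] += 1.0 - p    # complementary evidence for j
    winner = score.argmax(axis=0)
    return np.moveaxis(np.eye(n)[winner], -1, 0)

def fuse_stack(sources, volume):
    """Select, at every pixel, the source chosen by the decision volume."""
    return (np.asarray(volume) * np.asarray(sources, dtype=np.float64)).sum(axis=0)
```

Only N(N-1)/2 pairwise maps are needed, and the volume is assembled with elementwise operations, so the cost of the combination step grows with the number of pairs rather than requiring a separate N-way network.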
6. Limitations, Scalability, and Research Outlook
- Boundary Fragility and Spread: Despite advances, MDF remains sensitive to accurate boundary delineation and defocus spread modeling; performance may degrade in the presence of severe FDBs, thick PSFs, or imperfect registration.
- Registration and Occlusion: The efficacy of latent-space or deep fusion approaches (e.g., VAEEDOF) assumes near-perfect alignment; methods may need further adaptation for dynamic or misaligned input bursts (Piano et al., 22 Oct 2025).
- Efficiency Considerations: Cascade and decision volume approaches yield runtime reductions of roughly 30–50%, with high-resolution diffusion models leveraging weight-sharing for further acceleration (Ma et al., 2020, Li et al., 22 Mar 2026).
- Potential Extensions: Future advances include the integration of rotation-equivariant group convolutions (for symmetry preservation), non-local attention to reinforce defocus-prone structures, light-field augmentation for better occlusion handling, and joint optimization of synthetic-real domain transfer.
7. Representative Results and State-of-the-Art Standing
Multiple MDF methods consistently outperform prior baselines across quantitative benchmarks, particularly in boundary fidelity, artifact suppression, and visual quality:
| Method | Main Feature | Perfect Boundary Score | Notable Advantages | Ref |
|---|---|---|---|---|
| Boundary Net MDF | 2-channel ResNet + FDB refine | Yes | Top in Q_NMI, Q_G, Q_Y, Q_CB | (Ma et al., 2019) |
| α-Matte MMF-Net | Cascaded boundary fusion | Yes | Superior edges, no halos | (Ma et al., 2019) |
| GACN Cascade | End-to-End Decision Map | Yes | 30–50% speedup, DM calibration | (Ma et al., 2020) |
| MFIF-GAN | α-matte GAN, DSE-matching | Yes | Pixel-accurate mask, SOTA metrics | (Wang et al., 2020) |
| VAEEDOF | Latent-space multi-image fusion | Yes (on synthetic) | Seamless 4K fusion, generative fill | (Piano et al., 22 Oct 2025) |
| Physical MDF (STEM) | Pixelwise max-intensity | N/A | <5% atom class error | (Li et al., 7 May 2025) |
| ReDiffuse | Rot.-equiv. diffusion (U-Net) | Yes (rotation) | Consistency for symmetric detail | (Li et al., 22 Mar 2026) |
In summary, MDF encapsulates a spectrum of mathematically grounded, boundary-sensitive, and increasingly domain-adaptive techniques that set the current state of the art for multi-focus and multi-defocus fusion problems across disciplines.