Ordered Subset Expectation Maximization
- OSEM is an accelerated iterative reconstruction framework that partitions measurement data into ordered subsets to significantly speed up convergence in tomography.
- It integrates partial EM updates with regularization techniques such as total variation and with system-model refinements such as PSF modeling to enhance image quality in modalities including PET, SPECT, and CT.
- Using a modest subset count (typically S≤5) balances convergence speed and noise, while recent variance-reduction extensions address limit-cycle behaviors.
Ordered Subset Expectation Maximization (OSEM) is an accelerated statistical iterative reconstruction framework widely adopted in emission and transmission tomography, as well as in multi-modal inverse problems. It reformulates the standard expectation-maximization (EM) update by partitioning measurement data into ordered subsets, thereby achieving considerable convergence acceleration while introducing complex behaviors regarding noise, convergence accuracy, and limit cycles. OSEM has also been adapted to regularized, penalized, and hybrid deep-learning settings. Its utility has been extensively documented in the context of PET, SPECT, CT, MVCT, and even radio interferometric calibration.
1. Derivation and Core Principles
OSEM is constructed as an ordered, block-sequential generalization of the classical maximum-likelihood EM (MLEM) algorithm for Poisson data. For a system matrix $A = (a_{ij})$, measured projection data $y = (y_i)$, and image $x = (x_j)$, the Poisson log-likelihood is (up to constants independent of $x$):

$$L(x) = \sum_i \left( y_i \log \bar{y}_i(x) - \bar{y}_i(x) \right),$$

where $\bar{y}_i(x) = \sum_j a_{ij} x_j$. The standard EM/MLEM update is:

$$x_j^{(k+1)} = \frac{x_j^{(k)}}{\sum_i a_{ij}} \sum_i a_{ij}\, \frac{y_i}{\bar{y}_i(x^{(k)})}.$$

OSEM partitions the measurement indices into $S$ disjoint subsets $S_1, \dots, S_S$, commonly by angular projection or detector geometry. At every outer iteration $k$ and subset step $m = 1, \dots, S$, OSEM applies a "partial" EM update restricted to data in $S_m$:

$$x_j^{(k,m)} = \frac{x_j^{(k,m-1)}}{\sum_{i \in S_m} a_{ij}} \sum_{i \in S_m} a_{ij}\, \frac{y_i}{\bar{y}_i(x^{(k,m-1)})}, \qquad x^{(k,0)} = x^{(k)}.$$

After iterating over all subsets, $x^{(k+1)} = x^{(k,S)}$. Subset cycling delivers an $S$-fold acceleration in early convergence, as each sub-iteration involves only $1/S$ of the data (Xin et al., 2014, Kacperski et al., 2018, Ozaki et al., 2019).
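The update above translates directly into a few lines of NumPy. The following is a minimal, self-contained sketch on a toy dense system matrix, with stride-based subsets standing in for the angular partitioning used on real scanners; the function name `osem` and the toy problem are illustrative, not drawn from the cited works.

```python
import numpy as np

def osem(A, y, n_subsets=5, n_iters=10, eps=1e-12):
    """Minimal OSEM for a dense system matrix A (n_rays x n_voxels)
    and Poisson projection data y (n_rays,)."""
    n_rays, n_vox = A.shape
    # Stride-based subsets: a stand-in for angle-interleaved partitioning.
    subsets = [np.arange(m, n_rays, n_subsets) for m in range(n_subsets)]
    x = np.ones(n_vox)                                    # non-negative start
    for _ in range(n_iters):                              # outer iterations
        for idx in subsets:                               # ordered subset sweep
            As = A[idx]
            ratio = y[idx] / (As @ x + eps)               # y_i / ybar_i(x)
            x *= (As.T @ ratio) / (As.sum(axis=0) + eps)  # partial EM update
    return x

rng = np.random.default_rng(0)
A = rng.uniform(size=(60, 16))                            # toy system matrix
x_true = rng.uniform(1.0, 5.0, size=16)
y = rng.poisson(A @ x_true).astype(float)                 # Poisson measurements
x_hat = osem(A, y, n_subsets=5, n_iters=20)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

Setting `n_subsets=1` recovers MLEM exactly, which is a convenient correctness check for any OSEM implementation.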
2. Acceleration, Noise, and Limit-Cycle Behavior
While OSEM achieves rapid initial convergence to high-likelihood solutions, it departs from the strictly monotonic likelihood ascent and global fixed-point guarantees of MLEM. As the number of subsets increases, OSEM exhibits:
- Accelerated convergence: one outer OSEM iteration delivers progress comparable to roughly $S$ MLEM iterations at the same computational cost (Kacperski et al., 2018, Ozaki et al., 2019).
- Noise amplification: as $S$ grows, reconstructed image noise (e.g., background-ROI coefficient of variation, COV) increases substantially; at high subset counts the COV rise reaches 50% and even 100%, a noise penalty equivalent to halving the effective scan time (Kacperski et al., 2018).
- Limit cycles: the iterates do not converge to a single point but enter a cycle of $S$ distinct points, one per subset, i.e., $x^{(k,m)} \to x^{\star,m}$ as $k \to \infty$ for each $m = 1, \dots, S$ (Kereta et al., 2021).
The practical implication is that $S \leq 5$ represents a "safe" regime: the COV rise stays modest (0.030 to 0.035 in the background-ROI measurements tabulated in Section 5), balancing reconstruction speed and noise for clinical protocols (Kacperski et al., 2018).
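The limit-cycle effect can be observed numerically. Below is a small, self-contained probe (illustrative, not from the cited papers): it runs the same style of toy problem with $S = 1$ (MLEM) and $S = 20$, and reports the norm of the change over the final outer sweep. For MLEM the change decays toward zero, while for aggressive subset counts it typically stalls at a nonzero level, the signature of a limit cycle.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.uniform(size=(60, 16))                 # toy system matrix
x_true = rng.uniform(1.0, 5.0, size=16)
y = rng.poisson(A @ x_true).astype(float)

def sweep(x, subsets, eps=1e-12):
    """One outer OSEM iteration: cycle once through all subsets."""
    for idx in subsets:
        As = A[idx]
        x = x * (As.T @ (y[idx] / (As @ x + eps))) / (As.sum(axis=0) + eps)
    return x

for S in (1, 20):                              # MLEM vs. aggressive OSEM
    subsets = [np.arange(m, len(y), S) for m in range(S)]
    x = np.ones(16)
    for _ in range(500):                       # plenty of outer iterations
        x_prev, x = x, sweep(x, subsets)
    # A per-sweep change that stops shrinking signals a limit cycle
    # rather than convergence to a single fixed point.
    print(f"S={S:2d}  ||x_k - x_(k-1)|| = {np.linalg.norm(x - x_prev):.2e}")
```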
3. Implementation in Advanced System Models and Regularization
OSEM is compatible with complex system matrices and advanced penalization:
- System matrix factorization and PSF incorporation: In PET, the system matrix $A$ can be decomposed into a product of a geometric projection factor $A_{\mathrm{geom}}$ and a blurring factor $A_{\mathrm{blur}}$, where $A_{\mathrm{blur}}$ explicitly models spatially-variant blurring using MC-simulated single-photon incidence responses and coincidence blurring factors. OSEM is then applied with on-the-fly multiplication of the geometry and precomputed blur factors, leading to significant improvements in radial resolution, contrast recovery, and noise uniformity (Xin et al., 2014); a schematic sketch follows this list.
- Attenuation and scatter compensation: Ordered subset EM has been adapted for list-mode data in SPECT, where the summation is over detected photons and per-event attributes, and the system model incorporates attenuation, energy-dependent scatter, and path sensitivity (Rahman et al., 2020).
- Total-variation (TV) and other penalties: OSEM can be combined with TV regularization (OSEM-CP), using primal-dual solvers such as Chambolle-Pock per subset, yielding better noise/artifact suppression and supporting unrolled hybrid architectures (e.g., OSEM-CPNN) (An et al., 2024, Ozaki et al., 2019).
- Block-sequential regularized EM (BSREM): OSEM's block-sequential nature allows MAP penalties such as the relative difference prior (RDP) to be applied per subset. Subiteration-dependent diagonal preconditioners and Nesterov-type momentum scheduling further accelerate convergence (SDP-BSREM) (Guo et al., 2021).
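As a concrete illustration of the factored system model, the sketch below applies a geometric projector `G` and an image-space blur factor `B` on the fly, never forming the combined matrix. The Gaussian `B`, the image-space placement of the blur, and all names are illustrative placeholders for the MC-derived, spatially-variant factors of Xin et al. (2014).

```python
import numpy as np

def gaussian_blur_matrix(n, sigma=1.0):
    """Dense stand-in for a (here shift-invariant) PSF blur factor B."""
    i = np.arange(n)
    B = np.exp(-0.5 * ((i[:, None] - i[None, :]) / sigma) ** 2)
    return B / B.sum(axis=1, keepdims=True)     # rows sum to 1

def psf_osem(G, B, y, n_subsets=5, n_iters=10, eps=1e-12):
    """OSEM with the factored model A = G @ B applied factor-by-factor."""
    n_rays, n_vox = G.shape
    subsets = [np.arange(m, n_rays, n_subsets) for m in range(n_subsets)]
    # Subset sensitivities: column sums of A restricted to each subset,
    # computed without ever forming A = G @ B explicitly.
    sens = [B.T @ G[idx].sum(axis=0) for idx in subsets]
    x = np.ones(n_vox)
    for _ in range(n_iters):
        for idx, s in zip(subsets, sens):
            fwd = G[idx] @ (B @ x)                            # blur, then project
            back = B.T @ (G[idx].T @ (y[idx] / (fwd + eps)))  # adjoint in reverse order
            x *= back / (s + eps)
    return x

rng = np.random.default_rng(2)
G = rng.uniform(size=(60, 16))
B = gaussian_blur_matrix(16, sigma=1.0)
x_true = rng.uniform(1.0, 5.0, size=16)
y = rng.poisson(G @ (B @ x_true)).astype(float)
print(psf_osem(G, B, y).round(2))
```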
4. Advanced Algorithmic Extensions and Variance Reduction
Recent work frames OSEM as a preconditioned stochastic gradient ascent, which motivates variance-reduction techniques to address noise and limit cycles (a schematic sketch follows this list):
- Stochastic variance reduced EM (SVREM): Maintains an auxiliary sufficient statistic, applying SVRG-style corrections during subset updates. This scheme retains OSEM's computational cost but converges to within a few percent of the true maximum likelihood solution without entering limit cycles, unlike standard OSEM/BSREM/SEM (Kereta et al., 2021).
- SAGA/SVRG for EM: Storing subset gradients or computing periodic anchors, these methods provide variance control and can utilize constant or over-relaxed step sizes, ensuring rapid, stable convergence on penalized objectives (Kereta et al., 2021).
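The SVRG-style correction is sketched below, framing the subset update as a preconditioned gradient-ascent step on the Poisson log-likelihood. This is a simplified sketch in the spirit of Kereta et al. (2021), not their exact algorithm: the preconditioner choice, step handling, and positivity floor are assumptions made for the toy setting.

```python
import numpy as np

def svrg_em(A, y, n_subsets=5, n_epochs=20, eps=1e-12, seed=0):
    """SVRG-corrected, OSEM-preconditioned ascent on the Poisson
    log-likelihood (schematic)."""
    rng = np.random.default_rng(seed)
    n_rays, n_vox = A.shape
    subsets = [np.arange(m, n_rays, n_subsets) for m in range(n_subsets)]
    col_sum = A.sum(axis=0) + eps                 # full sensitivities

    def grad(x, idx):                             # Poisson LL gradient on rows idx
        return A[idx].T @ (y[idx] / (A[idx] @ x + eps) - 1.0)

    x = np.ones(n_vox)
    for _ in range(n_epochs):
        anchor = x.copy()
        full_g = grad(anchor, np.arange(n_rays))  # periodic full-gradient anchor
        for _ in range(n_subsets):
            idx = subsets[rng.integers(n_subsets)]
            # Variance-reduced estimate: unbiased for the full gradient,
            # with variance vanishing as x approaches the anchor.
            g = n_subsets * (grad(x, idx) - grad(anchor, idx)) + full_g
            x = np.maximum(x + (x / col_sum) * g, eps)  # EM-style preconditioner
    return x

rng = np.random.default_rng(3)
A = rng.uniform(size=(60, 16))
x_true = rng.uniform(1.0, 5.0, size=16)
y = rng.poisson(A @ x_true).astype(float)
print(svrg_em(A, y).round(2))
```

Note that when `g` is the full gradient, the preconditioned step reduces exactly to an MLEM update, which is why this stochastic-gradient framing recovers OSEM-like behavior as a special case.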
5. Quantitative Performance and Computational Considerations
OSEM's performance is extensively documented across modalities:
| Metric | Standard OSEM | Enhanced OSEM (PSF, TV, etc.) |
|---|---|---|
| Radial FWHM, PET (mm) | 2.46–6.96 | 2.42–4.41 (PSF-OSEM) |
| Background-ROI COV (S=1 / S=5) | 0.030 / 0.035 | varies with penalty and regularization |
| Per-iteration cost | baseline | +20–50% (PSF-OSEM) |
| GPU acceleration (MVCT) | up to 130× over CPU | up to 130× over CPU |
Significant findings include:
- PSF-OSEM improves spatial resolution and reduces bias (radial FWHM up to 45% lower and radial shift up to 6 mm smaller) without post-filtering (Xin et al., 2014).
- OSEM with TV regularization and GPU achieves 512×512 reconstructions in seconds (Ozaki et al., 2019).
- Higher $S$ accelerates convergence, but noise rises rapidly; OSEM with $S=5$ nearly saturates the attainable speedup at a moderate noise penalty (Kacperski et al., 2018).
- Variance-reduced OSEM eliminates limit cycles and attains lower final errors than any fixed-subset OSEM variant (Kereta et al., 2021).
6. Applications Beyond Tomography
OSEM principles extend beyond emission/transmission imaging:
- Radio interferometric calibration: OS-LS and OS-SAGE apply ordered-subset updates to accelerate ML parameter estimation in radio array calibration, handling blocks of time/frequency visibilities while sustaining accuracy close to full-data SAGE/LS (Kazemi et al., 2013).
- Hybrid and deep learning integrations: OSEM-style ordered subsets have been exploited in learned unrolling of regularized reconstruction networks, notably OSEM-CPNN for low-dose CT (LDCT), which embeds view-by-view Chambolle-Pock updates as CNN blocks (An et al., 2024).
7. Practical Algorithmic Guidelines and Limitations
Researchers have converged on several practical recommendations:
- Employ modest subset counts ($S \leq 5$) to balance acceleration and noise (Kacperski et al., 2018).
- Combine OSEM with regularization (e.g., TV, RDP) or system matrix enhancements (PSF-blur, scatter/attenuation modeling) for superior noise/artifact control (Xin et al., 2014, An et al., 2024, Guo et al., 2021).
- For large-scale settings, exploit parallelization (multi-core/GPU) and subset cycling to maximize throughput; subset size and ordering can be adapted per application (see the partitioning helper after this list) (Ozaki et al., 2019, Rahman et al., 2020).
- Monitor for limit cycles and, if necessary, switch to variance-reduced methods to ensure convergence to the global optimum (Kereta et al., 2021).
- For applications with highly non-uniform data SNR (e.g., radio calibration), subset partitioning may require more sophisticated balancing (Kazemi et al., 2013).
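As a small practical aid, the helper below builds interleaved subsets of projection-view indices, a common heuristic for spreading angular information evenly across subsets; the function name and the interleaving rule are illustrative, since orderings vary by scanner and software package.

```python
import numpy as np

def interleaved_subsets(n_views, n_subsets):
    """Partition view indices 0..n_views-1 into n_subsets interleaved subsets.

    Interleaving maximizes the angular spread within each subset, which
    helps keep the per-subset updates well conditioned."""
    return [np.arange(m, n_views, n_subsets) for m in range(n_subsets)]

# 12 views, 3 subsets -> [0,3,6,9], [1,4,7,10], [2,5,8,11]
print([s.tolist() for s in interleaved_subsets(12, 3)])
```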
OSEM thus serves as a flexible, high-performance foundation for statistical iterative reconstruction and related inverse problems. Its core structure is broadly adaptable, but careful tuning of subset count, regularization, and algorithmic extensions is essential for stability and optimal image quality.