Radon Cumulative Distribution Transform (R-CDT)
- R-CDT is a nonlinear, invertible image representation that couples the Radon transform with the cumulative distribution transform to achieve linear separability.
- It effectively linearizes geometric deformations such as translations, scalings, and rotations, offering robust feature extraction even with limited or noisy data.
- Normalized and generalized R-CDT variants extend invariance to affine and non-Euclidean transformations, enhancing its applicability across diverse imaging tasks.
The Radon Cumulative Distribution Transform (R-CDT) is a nonlinear, invertible image representation formed by coupling the classical Radon transform with the one-dimensional cumulative distribution transform (CDT) along each projection angle. This transform provides a low-level, closed-form feature extraction method that enables linear separability of image classes generated via mass-preserving deformations, translations, scalings, and certain broader transformation models. R-CDT is closely related to the sliced Wasserstein metric and has seen accelerated theoretical and algorithmic development extending to normalizations for affine invariance, signed-data transforms, and generalizations to non-Euclidean domains. Its main application lies in image and signal classification, especially in regimes of small or corrupted data, where it yields significant advantages in separability, label-efficiency, and computational complexity over conventional methods (Kolouri et al., 2015, Shifat-E-Rabbi et al., 2020, Beckmann et al., 25 Nov 2024, Beckmann et al., 10 Jun 2025, Beckmann et al., 8 Dec 2025).
1. Mathematical Formulation and Core Principles
The R-CDT of a nonnegative image $I$ defined on $\mathbb{R}^2$ proceeds in two main steps:
- Radon Transform: The image $I$ is projected along lines parameterized by angle $\theta \in [0,\pi)$ and signed distance $t \in \mathbb{R}$,
$$\widetilde{I}(t,\theta) = \mathcal{R}I(t,\theta) = \int_{\mathbb{R}^2} I(x)\,\delta\bigl(t - x\cdot\omega_\theta\bigr)\,\mathrm{d}x, \qquad \omega_\theta = (\cos\theta,\sin\theta),$$
producing a family of 1D signals ("sinograms").
- Cumulative Distribution Transform (CDT) Along Each Angle: For a chosen positive reference image $I_0$ with projections $\widetilde{I}_0(\cdot,\theta)$, let $\widetilde{J}(\cdot,\theta)$ and $\widetilde{J}_0(\cdot,\theta)$ denote the CDFs of the (mass-normalized) projections $\widetilde{I}(\cdot,\theta)$ and $\widetilde{I}_0(\cdot,\theta)$. The monotone map $\widehat{f}(\cdot,\theta)$ satisfying
$$\widetilde{J}\bigl(\widehat{f}(t,\theta),\theta\bigr) = \widetilde{J}_0(t,\theta)$$
is computed for each $\theta$. The R-CDT representation is then
$$\widehat{I}(t,\theta) = \bigl(\widehat{f}(t,\theta) - t\bigr)\sqrt{\widetilde{I}_0(t,\theta)}.$$
The inverse is explicit: invert the 1D CDT for each projection, then apply analytic or discrete filtered backprojection for the Radon inverse (Kolouri et al., 2015, Long et al., 2023).
Invertibility is guaranteed provided all 1D projections are strictly positive, ensuring monotonicity of $\widehat{f}(\cdot,\theta)$ in $t$, and the Radon transform is invertible under standard decay/support conditions. The R-CDT is inherently nonlinear, as $\widehat{f}$ arises from a nonlinear CDF-matching condition per angle $\theta$.
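The per-angle CDF matching is straightforward to prototype. The following minimal sketch (not reference code from the cited papers) uses scikit-image's `radon` and NumPy interpolation; the normalized radial grid on $[0,1]$, the small `eps` regularizer enforcing strict positivity, and the same-shape reference are illustrative assumptions.

```python
import numpy as np
from skimage.transform import radon  # classical Radon transform

def rcdt(image, reference, angles=np.arange(180.0), eps=1e-8):
    """Radon-CDT sketch for a nonnegative 2D array `image` w.r.t. a same-shape,
    positive `reference`. Returns an array of shape (n_t, n_angles)."""
    sino = radon(image, theta=angles, circle=False)       # projections, shape (n_t, n_angles)
    sino0 = radon(reference, theta=angles, circle=False)
    n_t = sino.shape[0]
    t = np.linspace(0.0, 1.0, n_t)                        # common normalized radial grid
    out = np.empty_like(sino)
    for k in range(sino.shape[1]):
        p = sino[:, k] + eps                              # enforce strict positivity per angle
        p0 = sino0[:, k] + eps
        p, p0 = p / p.sum(), p0 / p0.sum()                # unit mass per projection
        J, J0 = np.cumsum(p), np.cumsum(p0)               # empirical CDFs
        f = np.interp(J0, J, t)                           # monotone map with J(f(t)) = J0(t)
        out[:, k] = (f - t) * np.sqrt(p0)                 # weighted displacement field
    return out
```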
2. Geometric and Theoretical Properties
2.1 Linearization of Image Transforms
R-CDT linearizes classes of transport and geometric operations:
- Translations: If $I_\tau(x) = I(x - \tau)$ for $\tau \in \mathbb{R}^2$, then
$$\widehat{I_\tau}(t,\theta) = \widehat{I}(t,\theta) + (\tau\cdot\omega_\theta)\,\sqrt{\widetilde{I}_0(t,\theta)}.$$
- Isotropic Scalings: $I_a(x) = a^2 I(ax)$ with $a > 0$ implies
$$\widehat{I_a}(t,\theta) = \frac{\widehat{I}(t,\theta)}{a} + \Bigl(\frac{1}{a} - 1\Bigr)\,t\,\sqrt{\widetilde{I}_0(t,\theta)}.$$
- Rotations: For rotations by angle $\phi$ and a circularly symmetric reference $I_0$, $\widehat{I_\phi}(t,\theta) = \widehat{I}(t,\theta - \phi)$, i.e., rotation becomes a shift in the angular variable (Kolouri et al., 2015, Shifat-E-Rabbi et al., 2020).
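As a brief check of the translation rule (a sketch using only the definitions in Section 1): shifting $I$ by $\tau$ shifts each projection by $\tau\cdot\omega_\theta$, so the CDF-matching map shifts additively and the transform picks up a term linear in $\tau$,
$$\mathcal{R}I_\tau(t,\theta)=\mathcal{R}I\bigl(t-\tau\cdot\omega_\theta,\theta\bigr)
\;\Longrightarrow\;
\widehat{f}_\tau(t,\theta)=\widehat{f}(t,\theta)+\tau\cdot\omega_\theta
\;\Longrightarrow\;
\widehat{I_\tau}(t,\theta)=\widehat{I}(t,\theta)+(\tau\cdot\omega_\theta)\,\sqrt{\widetilde{I}_0(t,\theta)}.$$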
2.2 Connection to Sliced Wasserstein Distance
The R-CDT space is isometric to the sliced Wasserstein-2 metric:
$$\|\widehat{I}_1 - \widehat{I}_2\|_{L^2}^2 = \int_0^{\pi}\!\!\int_{\mathbb{R}} \bigl(\widehat{f}_1(t,\theta) - \widehat{f}_2(t,\theta)\bigr)^2\,\widetilde{I}_0(t,\theta)\,\mathrm{d}t\,\mathrm{d}\theta = SW_2^2(I_1, I_2),$$
where $\widetilde{I}_0$ acts as a weighting determined by the reference (Shifat-E-Rabbi et al., 2020, Beckmann et al., 10 Jun 2025).
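This isometry is easy to sanity-check numerically. The sketch below reuses the hypothetical `rcdt()` from Section 1 and compares the squared R-CDT distance against a direct per-angle Wasserstein-2 computation from quantile functions; the random test images, quantile-grid resolution, and `eps` are illustrative choices.

```python
import numpy as np
from skimage.transform import radon

rng = np.random.default_rng(0)
img1, img2, ref = rng.random((3, 48, 48))                  # random nonnegative test images
angles, eps = np.arange(180.0), 1e-8

# (a) Squared distance in R-CDT space: sum over t of weighted displacements, averaged over angles.
h1, h2 = rcdt(img1, ref, angles, eps), rcdt(img2, ref, angles, eps)
d_rcdt_sq = ((h1 - h2) ** 2).sum(axis=0).mean()

# (b) Direct sliced Wasserstein-2: per-angle quantile matching on a uniform u-grid.
s1 = radon(img1, theta=angles, circle=False)
s2 = radon(img2, theta=angles, circle=False)
t = np.linspace(0.0, 1.0, s1.shape[0])
u = np.linspace(0.0, 1.0, 2000)
w2_sq = []
for k in range(s1.shape[1]):
    p1 = s1[:, k] + eps; p1 /= p1.sum()
    p2 = s2[:, k] + eps; p2 /= p2.sum()
    q1 = np.interp(u, np.cumsum(p1), t)                    # quantile (inverse CDF) functions
    q2 = np.interp(u, np.cumsum(p2), t)
    w2_sq.append(np.mean((q1 - q2) ** 2))

print(d_rcdt_sq, np.mean(w2_sq))                           # should agree up to discretization error
```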
2.3 Linear Separability and Convexification
A key theorem asserts that when image classes are constructed via mass-preserving deformations (satisfying closure, convexity, and non-intersection conditions), their R-CDT representations are linearly separable in the transform ($L^2$) space, independent of the reference image (Kolouri et al., 2015). Translation and scaling variabilities become additive (thus linearly separable) in R-CDT coordinates.
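A minimal worked case (translations only, a sketch under the assumptions stated above) illustrates the mechanism: each class becomes an affine set in transform space sharing one two-dimensional direction subspace,
$$\widehat{C}_k = \Bigl\{\, \widehat{I}^{(k)}(t,\theta) + (\tau\cdot\omega_\theta)\,\sqrt{\widetilde{I}_0(t,\theta)} \;:\; \tau \in \mathbb{R}^2 \,\Bigr\},\qquad
V = \operatorname{span}\Bigl\{ \cos\theta\,\sqrt{\widetilde{I}_0},\; \sin\theta\,\sqrt{\widetilde{I}_0} \Bigr\},$$
so $\widehat{C}_1$ and $\widehat{C}_2$ are parallel affine sets offset by $\widehat{I}^{(1)} - \widehat{I}^{(2)}$; they are disjoint, and hence separable by a hyperplane containing $V$, whenever $\widehat{I}^{(1)} - \widehat{I}^{(2)} \notin V$.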
3. Normalized and Generalized R-CDT Variants
3.1 Max-Normalized and Mean-Normalized R-CDT
Plain R-CDT is not invariant to arbitrary affine transformations. To address this, a two-step normalization is implemented:
- Zero-mean and unit-variance normalization:
$$\widehat{I}_{\mathrm{n}}(t,\theta) = \frac{\widehat{I}(t,\theta) - \mu_{\widehat{I}}(\theta)}{\sigma_{\widehat{I}}(\theta)},$$
with $\mu_{\widehat{I}}(\theta)$ and $\sigma_{\widehat{I}}(\theta)$ the mean and standard deviation of $\widehat{I}(\cdot,\theta)$ over $t$ for each angle.
- Angular Aggregation: By taking the pointwise maximum (max-normalization, mNRCDT) or angular mean (aNRCDT), affine invariance is achieved:
$$\mathrm{mNRCDT}[I](t) = \max_{\theta\in[0,\pi)} \widehat{I}_{\mathrm{n}}(t,\theta), \qquad \mathrm{aNRCDT}[I](t) = \frac{1}{\pi}\int_0^{\pi} \widehat{I}_{\mathrm{n}}(t,\theta)\,\mathrm{d}\theta.$$
These produce feature sets with provable invariance to affine transforms and robustness to certain non-affine deformations, with stability controlled by Wasserstein metrics (Beckmann et al., 25 Nov 2024, Beckmann et al., 10 Jun 2025, Beckmann et al., 8 Dec 2025).
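The two normalization steps map directly onto the output of the hypothetical `rcdt()` sketch from Section 1. The per-angle statistics and the two aggregators follow the formulas above; the function name, the `eps` guard, and the `mode` switch are illustrative choices.

```python
import numpy as np

def normalized_rcdt(h, mode="max", eps=1e-12):
    """Normalize an R-CDT array h of shape (n_t, n_angles) per angle, then aggregate over angles."""
    mu = h.mean(axis=0, keepdims=True)                 # per-angle mean over t
    sigma = h.std(axis=0, keepdims=True) + eps         # per-angle standard deviation
    h_norm = (h - mu) / sigma                          # zero-mean, unit-variance per angle
    if mode == "max":                                  # mNRCDT: pointwise max over angles
        return h_norm.max(axis=1)
    return h_norm.mean(axis=1)                         # aNRCDT: angular mean

# Usage (assuming rcdt() from Section 1):
# feat = normalized_rcdt(rcdt(img, ref), mode="max")
```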
3.2 Generalized Aggregation for Group Invariance
Beyond max or mean pooling, any aggregator $h$ that is permutation-invariant and boundedness-preserving yields an $h$-normalized R-CDT with similar invariance and linear separability properties. This approach generalizes the R-CDT to multi-dimensional (e.g., $\mathbb{R}^d$) and non-Euclidean settings (e.g., via generalized Radon transforms) (Beckmann et al., 8 Dec 2025). The effect is that all class members under group transformations collapse to a single feature representation per class, ensuring perfect linear separability if the respective template features are distinct.
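For instance, a smooth permutation-invariant aggregator such as log-mean-exp pooling over the angle index can stand in for the max or mean; the choice of this particular aggregator is ours for illustration, not prescribed by the cited work.

```python
import numpy as np

def h_normalized_rcdt(h_norm, beta=10.0):
    """Custom h-aggregation over angles of a per-angle-normalized R-CDT array (n_t, n_angles).
    Log-mean-exp pooling is permutation-invariant in the angle index and interpolates
    between the angular mean (beta -> 0) and the pointwise max (beta -> infinity)."""
    return np.log(np.mean(np.exp(beta * h_norm), axis=1)) / beta
```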
3.3 Signed R-CDT
The standard R-CDT presumes nonnegative, mass-normalized (i.e., probability measure) data. The signed R-CDT (RSCDT) generalizes to arbitrary images via the Jordan decomposition $I = I^{+} - I^{-}$, applying the CDT independently to $I^{+}$ and $I^{-}$ and tracking their respective norms. The feature representation is then a tuple, over each angle $\theta$, of positive/negative CDT transforms and norms, and in the aggregate yields an isometric embedding for the so-called signed sliced Wasserstein metric (Gong et al., 2023).
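Following the Jordan-decomposition recipe described above, a minimal sketch on top of the hypothetical `rcdt()` from Section 1; the flat-tuple packaging of the features and the handling of an empty positive or negative part are illustrative choices.

```python
import numpy as np

def signed_rcdt(image, reference, angles=np.arange(180.0)):
    """Signed R-CDT sketch: split the image into positive/negative parts,
    transform each with rcdt(), and keep their total masses (norms)."""
    pos = np.clip(image, 0.0, None)                    # I^+ in the Jordan decomposition
    neg = np.clip(-image, 0.0, None)                   # I^-
    m_pos, m_neg = pos.sum(), neg.sum()                # respective total masses
    h_pos = rcdt(pos, reference, angles) if m_pos > 0 else None
    h_neg = rcdt(neg, reference, angles) if m_neg > 0 else None
    return h_pos, h_neg, m_pos, m_neg
```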
4. Computational Algorithms and Complexity
A typical algorithmic pipeline for R-CDT-based analysis involves:
- Normalizing input images to unit mass.
- Computing discrete sinograms (Radon projections) over $N_\theta$ angles using nearest-neighbor or B-spline interpolation.
- For each angle $\theta$:
- Estimating empirical CDFs for both image and reference projections;
- Solving for the unique map $\widehat{f}(\cdot,\theta)$ via monotone interpolation or sorting;
- Constructing R-CDT slices as above.
- (Optional) Performing two-step normalization for affine invariance.
- For inversion, reconstructing projections using inverse 1D CDT, followed by standard filtered backprojection.
The computational complexity per image is typically $\mathcal{O}(N^2 N_\theta)$ for $N \times N$ images and $N_\theta$ projection angles, matching that of the Ridgelet transform. Discretization artifacts arise at sharp discontinuities and can be mitigated by smoothing, zero-padding, or post-processing (Kolouri et al., 2015, Long et al., 2023).
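For completeness, the inversion path can be sketched under the same conventions as the hypothetical `rcdt()` from Section 1 (normalized radial grid, unit mass per projection): undo the displacement weighting, recover each projection's CDF by interpolation, differentiate, and apply scikit-image's `iradon` filtered backprojection. Note that this recovers the mass-normalized image; the function name and grid choices are illustrative.

```python
import numpy as np
from skimage.transform import radon, iradon

def inverse_rcdt(h, reference, angles=np.arange(180.0), eps=1e-8):
    """Invert an R-CDT array h (n_t, n_angles) produced with the conventions of rcdt()."""
    sino0 = radon(reference, theta=angles, circle=False)
    t = np.linspace(0.0, 1.0, h.shape[0])
    sino_rec = np.empty_like(h)
    for k in range(h.shape[1]):
        p0 = sino0[:, k] + eps
        p0 /= p0.sum()
        J0 = np.cumsum(p0)
        f = h[:, k] / np.sqrt(p0) + t                  # recover the monotone map f
        # The target projection's CDF is known at the points f(t): J(f(t)) = J0(t).
        J = np.interp(t, f, J0)                        # resample the CDF onto the regular grid
        p = np.diff(J, prepend=0.0)                    # differentiate back to a projection
        sino_rec[:, k] = np.clip(p, 0.0, None)
    # Filtered backprojection of the reconstructed (unit-mass) sinogram.
    return iradon(sino_rec, theta=angles, circle=False, filter_name="ramp")
```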
5. Applications in Machine Learning, Classification, and Model Reduction
5.1 Image Classification
R-CDT features, flattened and optionally PCA-pruned, serve as input for linear or kernel-based classifiers (e.g., SVMs). Performance gains over pixel or conventional feature representations are substantial, particularly in regimes with limited data, strong geometric variability, or affine distortions. For small-sample or small-data settings, max-normalized or mean-normalized R-CDT variants achieve near-perfect or perfect separability for classes generated by affine transformations (Shifat-E-Rabbi et al., 2020, Beckmann et al., 25 Nov 2024, Beckmann et al., 10 Jun 2025, Beckmann et al., 8 Dec 2025).
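A minimal end-to-end sketch of such a pipeline, using scikit-learn; the dataset arrays, reference image, angle spacing, PCA dimension, and SVM hyperparameters are illustrative, and `rcdt()` is the hypothetical implementation from Section 1.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def rcdt_features(images, reference, angles=np.arange(0.0, 180.0, 4.0)):
    """Flatten R-CDT arrays into feature vectors, one row per image."""
    return np.stack([rcdt(im, reference, angles).ravel() for im in images])

# Hypothetical usage with arrays X_train, X_test of shape (n, H, W) and labels y_train, y_test:
# ref = np.ones_like(X_train[0])                       # simple uniform reference
# clf = make_pipeline(PCA(n_components=50), LinearSVC(C=1.0, max_iter=5000))
# clf.fit(rcdt_features(X_train, ref), y_train)
# print(clf.score(rcdt_features(X_test, ref), y_test))
```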
Empirical results span multiple domains:
- Facial expression and illumination-invariant face recognition: Nearest-subspace in local R-CDT domains achieves competitive or state-of-the-art accuracy under substantial lighting or pose changes (Zhuang et al., 2022).
- Medical and biological imaging: Significantly improved classification of pathological versus benign cell morphologies, outperforming standard deep nets on small annotated datasets (Shifat-E-Rabbi et al., 2020, Kolouri et al., 2015).
- Watermark recognition and filigranology: R-CDT-based recognition maintains performance under nontrivial affine measurement distortions (Beckmann et al., 25 Nov 2024, Beckmann et al., 10 Jun 2025).
- Handwritten character classification: With extremely limited training data, normalized R-CDT features enable accurate nearest-template or nearest-neighbor classification of characters and digits under broad affine and local deformations (Beckmann et al., 25 Nov 2024, Beckmann et al., 8 Dec 2025).
5.2 Reduced-Order Modeling for Advection-Dominated Systems
In high-dimensional dynamical systems governed by advection and transport, the physical-domain solution manifold is characteristically non-linear and poorly suited for linear model reduction (e.g., POD). R-CDT linearizes the effect of transport, enabling rapid decay of singular values in R-CDT space. POD applied in R-CDT space yields low-rank subspaces accurately capturing traveling features with drastically fewer modes, and supports accurate interpolation tasks that fail in the physical variable domain (Long et al., 2023).
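The effect can be illustrated with a toy transport problem: snapshots of a translating blob have slowly decaying singular values in pixel space, but since translation acts affinely in R-CDT space, the transformed snapshot matrix is nearly low-rank. A sketch under these assumptions (Gaussian blob, uniform reference, the hypothetical `rcdt()` from Section 1):

```python
import numpy as np

# Snapshots of a Gaussian blob advected across a 64x64 domain.
n, sigma = 64, 4.0
xx, yy = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
snapshots = [np.exp(-((xx - 20 - s) ** 2 + (yy - 32) ** 2) / (2 * sigma ** 2))
             for s in np.linspace(0, 24, 25)]

ref = np.ones((n, n))
angles = np.arange(0.0, 180.0, 4.0)
A_pix = np.stack([u.ravel() for u in snapshots], axis=1)             # pixel-space snapshot matrix
A_rcdt = np.stack([rcdt(u, ref, angles).ravel() for u in snapshots], axis=1)

sv_pix = np.linalg.svd(A_pix, compute_uv=False)
sv_rcdt = np.linalg.svd(A_rcdt, compute_uv=False)
print(sv_pix[:6] / sv_pix[0])     # slow decay: many POD modes needed in the physical domain
print(sv_rcdt[:6] / sv_rcdt[0])   # fast decay: a handful of modes suffice in R-CDT space
```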
5.3 Subspace Modeling and Robustness
Nearest-subspace and subspace modeling in R-CDT coordinates exploit the isometric embedding properties of the transform. Classes generated by smooth or mass-preserving transformations of templates produce low-dimensional or even one-dimensional subspaces in R-CDT space, yielding robust classification and favorable out-of-distribution generalization properties, with strong empirical performance for few-shot learning (Shifat-E-Rabbi et al., 2020, Beckmann et al., 8 Dec 2025).
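A nearest-subspace classifier in R-CDT coordinates is only a few lines: build an orthonormal basis per class from training features via SVD, then assign a test sample to the class with the smallest projection residual. The sketch below operates on flattened feature matrices such as those produced by the hypothetical `rcdt_features()` above; the function names and subspace dimension are illustrative.

```python
import numpy as np

def fit_class_subspaces(X, y, dim=8):
    """Per-class orthonormal bases from the top left singular vectors of the feature matrix."""
    bases = {}
    for c in np.unique(y):
        Xc = X[y == c].T                                  # class features as columns
        U, _, _ = np.linalg.svd(Xc, full_matrices=False)
        bases[c] = U[:, :dim]
    return bases

def nearest_subspace_predict(bases, X):
    """Assign each row of X to the class whose subspace leaves the smallest residual."""
    preds = []
    for x in X:
        residuals = {c: np.linalg.norm(x - U @ (U.T @ x)) for c, U in bases.items()}
        preds.append(min(residuals, key=residuals.get))
    return np.array(preds)
```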
6. Limitations, Extensions, and Current Challenges
While R-CDT offers a suite of theoretical and computational guarantees, several limitations and frontiers persist:
- Discretization: Nonlinear artifacts and errors are introduced around sharp fronts and edges, which can be partially mitigated by smoothing, padding, or median filtering but are not completely eliminated in current implementations (Long et al., 2023).
- Reference Dependence: Numerical stability and feature scaling can be sensitive to the choice of reference measure or image, despite theoretical separability being reference-invariant (Kolouri et al., 2015).
- Applicability Domain: R-CDT is best suited to problems where critical image deformations are transport-like and classes are generated by deformations of prototypes; natural imagery with highly non-template-structured variability yields diminished gains relative to deep convolutional approaches (Shifat-E-Rabbi et al., 2020).
- Extension to Non-Euclidean, Signed, and High-Dimensional Data: Generalized and normalized R-CDT methods extend the original theory to signed-data, multi-dimensional settings, and non-Euclidean manifolds, with ongoing work regarding the computational tractability and representation power in these contexts (Gong et al., 2023, Beckmann et al., 8 Dec 2025).
- Scalability and Integration: Current algorithms operate efficiently for moderate grid and angle sizes but further acceleration, discrete versions, and scalable implementations suitable for large-scale vision and physical simulation are active research areas (Kolouri et al., 2015, Beckmann et al., 8 Dec 2025).
7. Summary Table: Core R-CDT Variants and Properties
| Variant | Domain | Invariance | Key Reference |
|---|---|---|---|
| Plain R-CDT | Nonnegative images on $\mathbb{R}^2$ (maps to sinogram) | Translation, scaling, some rotation | (Kolouri et al., 2015, Shifat-E-Rabbi et al., 2020) |
| mNRCDT / aNRCDT | Nonnegative images on $\mathbb{R}^2$ (aggregated over angles) | Full affine (mNRCDT), partial affine (aNRCDT) | (Beckmann et al., 10 Jun 2025, Beckmann et al., 25 Nov 2024, Beckmann et al., 8 Dec 2025) |
| Signed R-CDT | Signed images | Deformations, sign-preserving | (Gong et al., 2023) |
| Generalized R-CDT | $\mathbb{R}^d$, non-Euclidean domains, etc. | Affine/group invariance under a defined aggregator $h$ | (Beckmann et al., 8 Dec 2025) |
| Local R-CDT | Patchwise gradients on images | Local affine (illumination) | (Zhuang et al., 2022) |
Key: mNRCDT (max aggregation), aNRCDT (mean aggregation); $h$-normalized allows custom aggregation; classification is performed in the corresponding normed ($L^2$) transform space.
R-CDT and its normalized and generalized variants offer a mathematically grounded, invertible, and computationally tractable framework for feature extraction and classification, robustly addressing geometric and structural image variability beyond the capacity of linear transforms and standard finite-dimensional embeddings (Kolouri et al., 2015, Beckmann et al., 10 Jun 2025, Beckmann et al., 8 Dec 2025).