
Radiomics Feature Extractor

Updated 26 December 2025
  • Radiomics Feature Extractor is a computational system that maps segmented medical images to high-dimensional quantitative feature vectors, summarizing morphology, intensity, and texture.
  • It integrates hand-crafted, deep learning, and hybrid pipelines to standardize feature extraction and boost reproducibility across diverse imaging cohorts.
  • Advanced methods such as wavelet filtering, GPU acceleration, and tensor radiomics enhance predictive power and address challenges like dimensionality and preprocessing sensitivity.

A radiomics feature extractor is a computational system that maps medical images, typically with segmented regions-of-interest (ROI), to high-dimensional quantitative feature vectors that encode morphological, histogram-based, and spatial texture biomarkers. The extractor operationalizes radiomics as an algorithmic bridge between complex medical imaging data and machine learning–ready tabular representations, with the goal of supporting clinical prediction, diagnosis, prognosis, retrieval, or biomarker discovery in large, heterogeneous imaging cohorts (Afshar et al., 2018). Feature extractors may be realized as modular hand-crafted pipelines, deep learning–driven models, or hybrid systems, and are foundational in standardized, reproducible imaging research and precision medicine.

1. Hand-Crafted Feature Extraction: Categories and Mathematical Formalism

Hand-crafted feature extraction pipelines operate on intensity-resolved medical images and well-defined ROIs (2D or 3D masks). Standardized systems such as PyRadiomics, SERA, and PySERA compute a battery of features, with formulae and implementation parameters harmonized by the Image Biomarker Standardization Initiative (IBSI) (Salmanpour et al., 20 Nov 2025, Primakov et al., 2022).

Principal feature groups:

  • First-order statistics: Mean, variance, skewness, kurtosis, energy, entropy, uniformity, min/max, percentiles, RMS, interquartile ranges. These summarize the intensity histogram inside the ROI, e.g., for intensity vector $\{x_i\}$ of $N$ voxels, $\mu = \frac{1}{N}\sum_i x_i$ and $\mathrm{Entropy} = -\sum_k p_k \log_2 p_k$, where $p_k$ is the normalized histogram (Salmanpour et al., 20 Nov 2025).
  • Shape and morphology: Volume, surface area, compactness, sphericity ($\Psi = \pi^{1/3}(6V)^{2/3}/A$, with volume $V$ and surface area $A$), maximum 3D/planar diameters, convexity, elongation (Salmanpour et al., 20 Nov 2025, Na et al., 11 Jul 2025).
  • Second-order texture features:
    • GLCM (Gray Level Co-occurrence Matrix): $P(i,j)$ is the frequency with which intensities $i$ and $j$ co-occur at a defined distance and direction, leading to contrast, energy (ASM), homogeneity, correlation, entropy (Vallières et al., 2017, Feng et al., 15 Oct 2025, Primakov et al., 2022).
    • GLRLM (Gray Level Run-Length Matrix): $R(i,\ell)$ counts the number of runs of length $\ell$ at gray level $i$, yielding short/long run emphasis and non-uniformity measures.
    • GLSZM (Gray Level Size Zone Matrix): $Z(i,s)$ is the count of connected zones of size $s$ at gray level $i$, leading to small/large zone emphasis and zone non-uniformity (Salmanpour et al., 20 Nov 2025).
    • NGTDM (Neighborhood Gray Tone Difference Matrix) and GLDM (Gray Level Dependence Matrix): compute local contrast, coarseness, and dependence statistics.
  • Higher-order and filtered features:
    • Filtered images: The input is filtered with wavelet decompositions (e.g., Haar, Daubechies) or Laplacian-of-Gaussian (LoG) kernels, and first- and second-order features are recomputed on each filtered sub-band (Depeursinge et al., 2020, Primakov et al., 2022).
    • Moment invariants: Based on Hu moments and central moments (e.g., $\phi_1 = \eta_{20} + \eta_{02}$), providing affine/rotation-invariant shape descriptors (Salmanpour et al., 20 Nov 2025).
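
The first-order and GLCM definitions above can be illustrated with a minimal NumPy sketch. It is not the PyRadiomics or PySERA implementation: the function names are hypothetical, the histogram is anchored at the ROI minimum, and the GLCM assumes a 2D patch that is already discretized to integer gray levels and lies entirely inside the ROI.

```python
import numpy as np

def first_order(values, bin_width=25.0):
    """Mean, variance and histogram entropy of the ROI intensities."""
    values = np.asarray(values, dtype=float)
    # Fixed-bin-width histogram, anchored at the ROI minimum (an assumption).
    bins = np.floor((values - values.min()) / bin_width).astype(int)
    p = np.bincount(bins) / bins.size
    p = p[p > 0]  # empty bins contribute nothing to the entropy
    return {
        "mean": float(values.mean()),
        "variance": float(values.var()),
        "entropy": float(-(p * np.log2(p)).sum()),
    }

def glcm(levels_img, n_levels, offset=(0, 1)):
    """Symmetric, normalized GLCM of a 2D integer image for one (dy, dx) offset."""
    img = np.asarray(levels_img, dtype=int)
    dy, dx = offset
    h, w = img.shape
    # a holds the reference pixels, b the neighbours displaced by (dy, dx).
    a = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    b = img[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)]
    P = np.zeros((n_levels, n_levels))
    np.add.at(P, (a.ravel(), b.ravel()), 1)  # count co-occurrences
    P = P + P.T                              # make the matrix symmetric
    return P / P.sum()

def glcm_contrast(P):
    """Contrast: sum over (i, j) of (i - j)^2 * P(i, j)."""
    i, j = np.indices(P.shape)
    return float(((i - j) ** 2 * P).sum())

patch = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]])
P = glcm(patch, n_levels=4)
contrast = glcm_contrast(P)
```

Production extractors additionally aggregate over many directions, respect the ROI mask when pairing voxels, and work in 3D.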

Parameter sweeps (voxel size, discretization, interpolation) are critical; tensorized multi-flavour approaches stack feature sets across multiple parameterizations to boost robustness and predictive power (Rahmim et al., 2022).
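
As a rough illustration of stacking "flavours", the sketch below recomputes two histogram features under several bin widths and stacks them into a (ROI, flavour, feature) array. The helper names and the choice of features are assumptions made for illustration; this is not the pipeline of Rahmim et al.

```python
import numpy as np

def histogram_features(values, bin_width):
    """Entropy and uniformity of a fixed-bin-width histogram."""
    values = np.asarray(values, dtype=float)
    bins = np.floor((values - values.min()) / bin_width).astype(int)
    p = np.bincount(bins) / bins.size
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum()
    uniformity = (p ** 2).sum()
    return np.array([entropy, uniformity])

def flavour_tensor(rois, bin_widths):
    """Stack features into an array of shape (n_rois, n_flavours, n_features)."""
    return np.stack([
        np.stack([histogram_features(roi, bw) for bw in bin_widths])
        for roi in rois
    ])

rois = [np.array([0.0, 10.0, 20.0, 30.0]), np.array([0.0, 0.0, 0.0, 40.0])]
T = flavour_tensor(rois, bin_widths=[10.0, 20.0, 40.0])
```

A downstream model can then consume the flavour axis directly or select the flavours whose features are most stable.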

2. Preprocessing and Standardization

Successful feature extraction requires a reproducible preprocessing pipeline that harmonizes data across sites, scanners, and protocols (Kozák, 23 Sep 2024, Primakov et al., 2022). Critical stages include:

  • Resampling to isotropic voxels (typically 1 × 1 × 1 mm, using trilinear or B-spline interpolation).
  • Bias-field correction (e.g., N4).
  • Intensity normalization: Z-score $(I-\mu)/\sigma$ or min-max scaling within the ROI or brain/tissue mask.
  • Intensity discretization: Fixed bin width (e.g., 25 HU for CT) or fixed bin count; affects downstream texture matrices (Salmanpour et al., 20 Nov 2025, Primakov et al., 2022).
  • Histogram matching and outlier removal: Standardizes intensity distributions across patients.
  • ROI extraction: Masked volumetric subsetting; in point-prompted pipelines (e.g., RadiomicsRetrieval), a "point prompt" segmentation produces the mask, and the volume is cropped to it (Na et al., 11 Jul 2025).
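
Two of the stages above, normalization and discretization, can be sketched in NumPy as follows. The function names are hypothetical, and anchoring the bins at the ROI minimum is an assumption; toolkits differ in where they place bin edges, which is one reason the setting needs to be reported.

```python
import numpy as np

def zscore_in_mask(image, mask):
    """Z-score the whole image using statistics of the voxels inside the mask."""
    roi = image[mask]
    mu, sigma = roi.mean(), roi.std()
    if sigma == 0:
        raise ValueError("ROI has constant intensity; z-score is undefined")
    return (image - mu) / sigma

def discretize_fixed_width(image, mask, bin_width=25.0):
    """Gray levels 1, 2, ... inside the mask (fixed bin width); 0 outside."""
    lo = image[mask].min()
    levels = np.floor((image - lo) / bin_width).astype(int) + 1
    return np.where(mask, levels, 0)

def discretize_fixed_count(image, mask, n_bins=32):
    """Gray levels 1..n_bins inside the mask (fixed bin count); 0 outside."""
    roi = image[mask]
    lo, hi = roi.min(), roi.max()
    if hi == lo:
        raise ValueError("ROI has constant intensity; cannot define bins")
    levels = np.floor(n_bins * (image - lo) / (hi - lo)).astype(int) + 1
    levels = np.clip(levels, 1, n_bins)  # the ROI maximum falls in the last bin
    return np.where(mask, levels, 0)

image = np.array([[-50.0, 0.0], [25.0, 49.0]])
mask = np.array([[False, True], [True, True]])
fbw = discretize_fixed_width(image, mask, bin_width=25.0)
fbc = discretize_fixed_count(image, mask, n_bins=4)
z = zscore_in_mask(image, mask)
```

Fixed bin width keeps the meaning of a gray level tied to physical units (useful for CT in HU), whereas fixed bin count adapts to each ROI's range.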

All preprocessing parameters (voxel spacing, discretization, normalization) must be logged for auditability and to comply with IBSI guidelines (Salmanpour et al., 20 Nov 2025, Depeursinge et al., 2020).

3. Deep Radiomics and Learned Feature Extractors

Beyond hand-crafted descriptors, deep radiomic feature extractors operationalize data-driven feature learning:

  • Discovery radiomics with StochasticNet radiomic sequencers: Randomly sparse CNNs (three convolutional layers, $5 \times 5$ kernels, $p = 0.5$ connection probability) trained on preprocessed patches (e.g., $32 \times 32$ from CT), followed by feature extraction as global average-pooled activations (64D per lesion) (Shafiee et al., 2015).
  • Hybrid models: Combine hand-crafted and deep features via vector concatenation or ensemble methods; often outperform pure approaches (Afshar et al., 2018).
  • Radiomics incorporation into DNNs: Radiomic feature maps (RFMs) are computed locally via a sliding kernel (e.g., $5^3$ voxels), followed by principal component dimension reduction; the reduced RFM volumes are provided as channels to U-Net architectures for segmentation and prediction (Chen et al., 2023).

These networks require rigorous normalization and fixed scaling, and their output descriptors can be appended to or replace hand-crafted feature vectors.
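
The hybrid concatenation strategy can be sketched as follows, assuming deep features are obtained by global average pooling of convolutional activations. The array shapes, function names, and random stand-in data are illustrative assumptions, not taken from the cited papers.

```python
import numpy as np

def global_average_pool(activations):
    """Reduce (n_lesions, channels, H, W) activations to (n_lesions, channels)."""
    return np.asarray(activations, dtype=float).mean(axis=(2, 3))

def zscore_columns(X, eps=1e-8):
    """Standardize each feature column so both blocks share a common scale."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / (X.std(axis=0) + eps)

def fuse_features(handcrafted, deep):
    """Concatenate standardized hand-crafted and deep features per lesion."""
    if len(handcrafted) != len(deep):
        raise ValueError("both blocks must describe the same lesions")
    return np.hstack([zscore_columns(handcrafted), zscore_columns(deep)])

rng = np.random.default_rng(0)
handcrafted = rng.normal(size=(5, 3))          # stand-in for radiomic features
activations = rng.normal(size=(5, 64, 4, 4))   # stand-in for last-layer CNN maps
fused = fuse_features(handcrafted, global_average_pool(activations))
```

In a real study the standardization statistics would be estimated on the training split only and reused for validation and test data, to avoid leakage.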

4. Extensions: Multiparametric, Spherical, Dynamic, and Tensor Radiomics

  • Multiparametric radiomics (MPRAD): Treats each voxel as an N-dimensional “tissue signature”—the vector of quantized intensities across modalities (Parekh et al., 2018). Extracted feature classes include joint entropy/uniformity (TSPM), spatial co-occurrence (TSCM), tissue-signature networks (TSRM/TSCIN), and nonlinear manifold embedding (Isomap) for downstream classification. This architecture enables true integration of multi-sequence data with demonstrated AUC improvement (e.g., breast mpMRI, AUC up to 0.87).
  • Spherical radiomics: Features are computed on concentric shells around a tumor centroid (radial bins), unwrapped onto 2D grids, and processed using standard PyRadiomics; analysis of radial transition slopes between zones is predictive of molecular status and survival in GBM (Feng et al., 15 Oct 2025).
  • Dynamic radiomics: Extracts the time evolution of features from longitudinal imaging. For $k$ time points and $q$ static features, the pipeline produces $q \times k$ matrices (static features per time point); modeling approaches include discrete pairwise changes, integrated summary statistics, and parametric curve fitting of feature trajectories. Dynamic features (e.g., relative change ratios, global trend statistics) outperform static radiomics in cancer therapy response and mutation prediction (Che et al., 2020).
  • Tensor radiomics (TR): Systematic stacking of features computed under multiple parameter flavours (bin sizes, segmentations, filters, modalities); downstream ML/DL can select robust, predictive feature sets via end-to-end architectures (TR-Net) or ensemble selection. TR achieves significant improvement in balanced accuracy and ICC (reproducibility) across multiple tasks (Rahmim et al., 2022).
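
For the dynamic radiomics item, the three modeling styles (pairwise changes, summary statistics, curve fitting) can be sketched on a $q \times k$ feature matrix. The function name and the specific statistics are assumptions for illustration, with a straight-line fit standing in for the parametric curve; Che et al. may define their dynamic features differently.

```python
import numpy as np

def dynamic_features(F, times):
    """Derive per-feature dynamics from a (q, k) matrix of static features.

    F[i, j] is static feature i at time point j; `times` has length k.
    """
    F = np.asarray(F, dtype=float)
    t = np.asarray(times, dtype=float)
    return {
        # Discrete pairwise changes between consecutive scans, shape (q, k-1).
        "pairwise_change": np.diff(F, axis=1),
        # Relative change from first to last scan (assumes nonzero baseline).
        "relative_change": (F[:, -1] - F[:, 0]) / F[:, 0],
        # Summary statistics over the whole trajectory.
        "temporal_mean": F.mean(axis=1),
        "temporal_std": F.std(axis=1),
        # Linear trend per feature: slope of a degree-1 least-squares fit.
        "slope": np.polyfit(t, F.T, 1)[0],
    }

F = np.array([[1.0, 2.0, 3.0],
              [4.0, 4.0, 4.0]])
dyn = dynamic_features(F, times=[0.0, 1.0, 2.0])
```

Each entry can be appended to the static feature vector, giving the classifier access to both the baseline state and its trajectory.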

5. Computational Acceleration and Software Implementations

  • PyRadiomics and PyRadiomics-cuda: PyRadiomics provides a widely adopted, IBSI-compliant Python interface; PyRadiomics-cuda offloads computational bottlenecks (mesh extraction, shape features) to the GPU via optimized CUDA kernels, yielding speedups of several orders of magnitude in shape feature extraction for large volumes (e.g., $\sim 2000\times$ for large ROIs on modern GPUs) (Lisowski et al., 3 Oct 2025).
  • PySERA: Implements 557 features (487 IBSI-compliant, 10 moment invariants, 60 diagnostics), with standardized preprocessing, parallel execution, and seamless integration with scikit-learn, PyTorch, TensorFlow, and MONAI. PySERA demonstrates $>94\%$ IBSI agreement and improved generalization versus PyRadiomics (Salmanpour et al., 20 Nov 2025).
  • RadiomicsRetrieval: A retrieval engine combining classic radiomics (14 shape, 18 first-order, 40 texture features via PyRadiomics) with promptable segmentation and anatomical position embeddings; radiomics-path features are aligned with deep embeddings for flexible, anatomy-aware retrieval (Na et al., 11 Jul 2025).
  • Precision-medicine-toolbox: Wraps PyRadiomics with robust curation, conversion, and EDA capabilities, maintaining full traceability and supporting parameterized YAML configurations (Primakov et al., 2022).

All toolkits emphasize auditability, precise parameter control, and compatibility with multi-core/GPU execution.

6. Quality Control, Reproducibility, and Reporting

Best practices dictate fixed YAML/dict parameterization, inclusion of IBSI reference tables, and documentation of all pipeline steps, including uncertainties and limits in robustness.
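
One way to realize fixed dict parameterization is to serialize the settings canonically and store a hash alongside the extracted features. The sketch below uses only the Python standard library; the key names are hypothetical and do not correspond to any specific toolkit's schema.

```python
import hashlib
import json

# Hypothetical extraction settings; the keys are illustrative only.
PARAMS = {
    "resampled_spacing_mm": [1.0, 1.0, 1.0],
    "interpolator": "bspline",
    "normalization": "zscore",
    "discretization": {"mode": "fixed_bin_width", "bin_width": 25},
    "filters": ["original", "log", "wavelet"],
}

def fingerprint(params):
    """SHA-256 of a canonical JSON dump, independent of key order."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def audit_record(params, feature_names):
    """Bundle the settings, their fingerprint, and the feature list for logging."""
    return {
        "params": params,
        "params_sha256": fingerprint(params),
        "features": sorted(feature_names),
    }

record = audit_record(PARAMS, ["glcm_contrast", "firstorder_entropy"])
report = json.dumps(record, indent=2)
```

Storing the fingerprint with every feature table makes it cheap to detect when two cohorts were extracted under different settings.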

7. Limitations and Future Directions

Despite rapid advances, several challenges persist, including high feature dimensionality and sensitivity to preprocessing choices.

Radiomics feature extractors thus remain a vibrant, evolving technical domain anchoring the quantitative translation of imaging into clinical and research-centric models. Ongoing advances in mathematical definition, computational scalability, standardization, and integration with deep learning workflows are central to the future of reproducible and clinically relevant imaging biomarkers.
