
An Artificial Intelligence Framework for Measuring Human Spine Aging Using MRI

Published 21 Nov 2025 in cs.CV | (2511.17485v1)

Abstract: The human spine is a complex structure composed of 33 vertebrae. It holds the body and is important for leading a healthy life. The spine is vulnerable to age-related degenerations that can be identified through magnetic resonance imaging (MRI). In this paper we propose a novel computer-vision-based deep learning method to estimate spine age using images from over 18,000 MRI series. Data are restricted to subjects with only age-related spine degeneration. Eligibility criteria are created by identifying common age-based clusters of degenerative spine conditions using uniform manifold approximation and projection (UMAP) and hierarchical density-based spatial clustering of applications with noise (HDBSCAN). Model selection is determined using a detailed ablation study on data size, loss, and the effect of different spine regions. We evaluate the clinical utility of our model by calculating the difference between actual spine age and model-predicted age, the spine age gap (SAG), and examining the association between these differences and spine degenerative conditions and lifestyle factors. We find that SAG is associated with conditions including disc bulges, disc osteophytes, spinal stenosis, and fractures, as well as lifestyle factors like smoking and physically demanding work, and thus may be a useful biomarker for measuring overall spine health.

Summary

  • The paper introduces a deep learning framework that segments the spine and predicts biological spine age using 3D MRI data from over 17,000 individuals.
  • The methodology employs nnU-Net-based segmentation, DCNN age prediction, and bias correction, achieving an R² of 0.85 and an MAE of approximately 3.67 years.
  • The findings demonstrate that the spine age gap is a clinically relevant biomarker linked to degenerative conditions and modulated by lifestyle factors.

Artificial Intelligence-Based Measurement of Human Spine Aging from MRI: Methods, Results, and Implications

Introduction

This work presents a deep learning framework for quantifying human spine aging using sagittal T2-weighted magnetic resonance imaging (MRI). The spine, comprising cervical, thoracic, and lumbar regions, exhibits both normal age-related degeneration and pathological changes, often discernible via imaging. Accurate estimation of biological spine age, distinct from chronological age, holds clinical promise for identifying individuals at higher risk for adverse spinal conditions and for tracking lifestyle or therapeutic interventions on spine health.

Methodology

A large-scale dataset of 18,070 3D whole-spine MRI series was compiled from 17,394 individuals (ages 25–84), sourced from 10 North American clinics over a 13-year span. Eligibility was filtered with a data-driven approach: spine structural and degenerative conditions were vectorized from radiology reports (yielding 215 pathology features per case), aggregated by region and severity, reduced in dimensionality via UMAP, and clustered with HDBSCAN, using a 15% population threshold to define "normal" versus "abnormal" aging profiles.
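The summary does not spell out how the 15% population threshold is applied. The sketch below shows one plausible reading, offered as an assumption rather than the authors' procedure: within an age bracket, a cluster counts as "normal" if it contains at least 15% of the subjects, and HDBSCAN noise points (label -1, the usual convention) are never "normal". The function names and the example labels are hypothetical.

```python
from collections import Counter

NOISE_LABEL = -1  # HDBSCAN's usual label for points assigned to no cluster


def normal_clusters(labels, min_share=0.15):
    """Return the cluster labels whose share of subjects is >= min_share.

    Assumption (not confirmed by the source): "normal" clusters are those
    covering at least `min_share` of the subjects in one age bracket.
    """
    counts = Counter(labels)
    total = len(labels)
    return {
        label
        for label, n in counts.items()
        if label != NOISE_LABEL and n / total >= min_share
    }


def is_normal(labels, min_share=0.15):
    """Flag each subject as normal (True) or abnormal (False)."""
    keep = normal_clusters(labels, min_share)
    return [label in keep for label in labels]


# Hypothetical cluster labels for 20 subjects in one age bracket:
# cluster 0 holds 60%, cluster 1 holds 20%, cluster 2 holds 10%, 10% are noise.
example_labels = [0] * 12 + [1] * 4 + [2] * 2 + [NOISE_LABEL] * 2
example_flags = is_normal(example_labels)
```

Under this reading, clusters 0 and 1 pass the threshold, while cluster 2 and the noise points are treated as "abnormal".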

The model pipeline consists of three principal stages: semantic segmentation of the spine using an nnU-Net-based architecture, masking to isolate spinal anatomy from surrounding tissues, and age prediction using a deep convolutional neural network (DCNN) composed of stacked 3D convolutional, batch-normalization, and pooling layers, followed by a dense prediction head.

Figure 1: Schematic of pipeline: nnU-Net-based segmentation, region masking, and DCNN-based age prediction with bias correction.

The predicted ages were subsequently bias-corrected using Cole’s method to mitigate regression-to-the-mean effects.
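As a minimal sketch of the bias correction, assuming the commonly described linear form of Cole's method: fit predicted age = slope × chronological age + intercept on held-out data, then invert that line so the corrected prediction is (predicted − intercept) / slope. Which split the authors fit the line on is not stated here, and the helper names and validation numbers below are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bias_correct(predicted, slope, intercept):
    """Invert the fitted line: corrected = (predicted - intercept) / slope."""
    return [(p - intercept) / slope for p in predicted]


# Hypothetical validation data showing regression to the mean:
# young subjects are over-predicted, old subjects under-predicted.
val_age = [30.0, 40.0, 50.0, 60.0, 70.0]
val_pred = [38.0, 44.0, 50.0, 56.0, 62.0]  # exactly 0.6 * age + 20

slope, intercept = fit_line(val_age, val_pred)
corrected = bias_correct(val_pred, slope, intercept)
```

In this toy case the fitted slope is 0.6 and the intercept is 20, so the correction maps the compressed predictions back onto the chronological ages. On real data the correction removes only the linear age-dependent trend, not the per-subject error.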

Experimental Protocol

The model was trained exclusively on "normal spine" MRI series, totaling 10,611 cases, with stratified splits by age and gender to form development, validation, and test cohorts. Model variants were compared in ablation studies varying training set size, region masking (cervical, thoracic, lumbar, or whole spine), and loss function (MSE vs. smooth-L1). Weighted mean absolute error (WMAE), MAE, and $R^2$ metrics quantified performance, with repeat-scan stability assessed via intraclass correlation.
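The metrics and loss named above can be sketched as follows. MAE and $R^2$ use their standard definitions, and `smooth_l1` follows the usual piecewise form with a transition point `beta`. The WMAE weighting is not defined in this summary, so `bracket_weighted_mae` is an assumption: it averages the MAE of each 10-year age bracket so that every bracket counts equally. The example ages and predictions are hypothetical.

```python
from collections import defaultdict


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def bracket_weighted_mae(y_true, y_pred, width=10):
    """ASSUMED WMAE: unweighted mean of the per-age-bracket MAEs."""
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errors[int(t // width)].append(abs(t - p))
    return sum(sum(e) / len(e) for e in errors.values()) / len(errors)


def smooth_l1(error, beta=1.0):
    """Quadratic below beta, linear above it (less outlier-sensitive than MSE)."""
    a = abs(error)
    return 0.5 * a * a / beta if a < beta else a - 0.5 * beta


# Hypothetical ages and predictions: three subjects in their 30s, one in their 70s.
ages = [30.0, 32.0, 34.0, 70.0]
preds = [31.0, 33.0, 35.0, 66.0]

example_mae = mae(ages, preds)
example_wmae = bracket_weighted_mae(ages, preds)
example_r2 = r_squared(ages, preds)
```

Here the plain MAE is 1.75 years, while the bracket-weighted variant is 2.5 years because the single, poorly predicted 70-year-old counts as much as the three well-predicted subjects in their 30s.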

Quantitative Results

The primary DCNN trained on the entire spine and maximal sample size achieved $R^2 = 0.85$ post-bias-correction, a substantial advance over prior classical ML approaches ($R^2 = 0.28$) (2511.17485). MAE and WMAE were 3.67 and 3.60 years, respectively. Increasing sample size yielded monotonic improvements in $R^2$, and lumbar-region-only models moderately lagged whole-spine models, underscoring the multi-regional nature of spine aging.

Repeat scans yielded an intraclass correlation coefficient of 0.73, indicating reasonable stability in predictions over time gaps averaging 1.6 years.
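The summary does not say which form of the intraclass correlation was used. As an illustration only, the sketch below computes the one-way random-effects ICC(1,1) from the between-subject and within-subject mean squares. The function name and the repeat-scan values are hypothetical.

```python
def icc_oneway(scans):
    """One-way random-effects ICC(1,1).

    `scans` is a list with one entry per subject, each a list of k repeated
    measurements (k must be the same for every subject).
    """
    n = len(scans)
    k = len(scans[0])
    grand_mean = sum(sum(s) for s in scans) / (n * k)
    subject_means = [sum(s) / k for s in scans]
    # Between-subject mean square (n - 1 degrees of freedom).
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square (n * (k - 1) degrees of freedom).
    ms_within = sum(
        (v - m) ** 2 for s, m in zip(scans, subject_means) for v in s
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical predicted spine ages for four subjects, two scans each.
repeat_predictions = [[40.0, 42.0], [55.0, 54.0], [63.0, 66.0], [30.0, 31.0]]
example_icc = icc_oneway(repeat_predictions)
```

An ICC near 1 means most of the variance lies between subjects rather than between a subject's repeat scans. Because the repeat scans in the study are separated by 1.6 years on average, some within-subject difference may reflect real aging rather than measurement noise.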

Figure 2: Distribution of male participants by age in training, validation, and test splits.

Figure 3: UMAP-based clustering of spine conditions in the 30-year-old age bracket—key to defining “normal” aging envelopes.

Figure 4: Absolute error of age prediction grouped by gender and chronological age bracket.

Model Interpretation

Grad-CAM visualizations demonstrate model focus on morphologic features such as disc bulges and vertebral curvatures, aligning with radiologic convention. The heatmaps highlight areas attended by the model for individual predictions, enabling both validation and nuanced error analysis. Expert review of large error cases revealed occasional segmentation failures and clinically plausible predictions for outliers.

Figure 5: Grad-CAM attention maps on middle MRI slices across four subjects, revealing focus on disc bulges and vertebral features.

Spine Age Gap as a Clinically Relevant Biomarker

The “spine age gap” (SAG), defined as the difference between biological and chronological age, emerges as a biomarker tied to clinically relevant conditions. Severe disc bulges, osteophytes, fractures, and stenosis all correlated with increased SAG, with linear regression indicating that subjects with severe lumbar disc bulge had an average SAG increase of 2.96 years. Lifestyle factors such as smoking and physically demanding occupations were associated with increased SAG, whereas vigorous exercise showed a significant negative correlation.

Figure 6: Odds ratios of lumbar degenerative and spinal structural conditions for subjects with large positive (> 5 years) versus large negative SAG.

Furthermore, the effect of physically demanding work on SAG reverses with age, suggesting that in older individuals, continued physical activity may be indicative of better spine health.

Figure 7: Mean SAG by chronological age, stratified by engagement in physically demanding work—a reversal in effect at older ages.
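The SAG and the odds-ratio comparison described in this section can be sketched as follows. Two assumptions are made. First, the sign convention is SAG = predicted age minus chronological age, which matches the statement that degenerative conditions correlate with increased SAG. Second, "large negative" is taken to mean below −5 years, mirroring the stated > 5 years cutoff; the summary does not give that threshold. The function names and the records are hypothetical.

```python
def spine_age_gap(predicted_age, chronological_age):
    """SAG under the assumed sign convention: positive means an 'older' spine."""
    return predicted_age - chronological_age


def condition_odds_ratio(records, threshold=5.0):
    """Odds ratio of a condition for large-positive vs large-negative SAG.

    `records` holds (predicted_age, chronological_age, has_condition) tuples.
    Subjects with |SAG| <= threshold are ignored. A zero count in the
    denominator raises ZeroDivisionError; this sketch applies no
    continuity correction.
    """
    pos_with = pos_without = neg_with = neg_without = 0
    for predicted, chronological, has_condition in records:
        sag = spine_age_gap(predicted, chronological)
        if sag > threshold:
            if has_condition:
                pos_with += 1
            else:
                pos_without += 1
        elif sag < -threshold:
            if has_condition:
                neg_with += 1
            else:
                neg_without += 1
    return (pos_with * neg_without) / (pos_without * neg_with)


# Hypothetical subjects, all aged 50: (predicted age, chronological age, condition?).
example_records = [
    (60.0, 50.0, True), (58.0, 50.0, True), (57.0, 50.0, True), (56.0, 50.0, False),
    (40.0, 50.0, True), (42.0, 50.0, False), (43.0, 50.0, False), (44.0, 50.0, False),
    (51.0, 50.0, True),  # |SAG| <= 5, excluded from both groups
]
example_or = condition_odds_ratio(example_records)
```

In this toy data the odds of the condition are 3:1 in the large-positive group and 1:3 in the large-negative group, giving an odds ratio of 9.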

Implications and Future Directions

This framework, using deep learning on large-scale MRI datasets, establishes a precise, automated measure of spine aging with strong associations to both structural pathology and lifestyle metrics. The demonstrated accuracy and generalizability suggest near-immediate applicability in large-scale studies, prospective screening, and possibly in clinical decision support for spine-related disorders.

Potential future research directions include:

  • Augmentation with rare/severe condition MRI datasets to enhance predictive robustness;
  • Evaluation of alternative model architectures (e.g., vision transformers) for further performance scaling;
  • Replacement of existing cluster-analysis normal/abnormal definitions with encoder-decoder dimensionality reduction;
  • Extension to estimation of biological age in other organs (e.g., prostate, kidney, liver) for comprehensive “organ age” biomarkers.

Conclusion

This study introduces a rigorously validated DCNN framework for spine age estimation from T2-weighted MRI, leveraging advanced image segmentation and population clustering. The large-scale analysis demonstrates accurate prediction of spine age and characterizes the SAG as a biomarker intricately linked with structural degenerative conditions and lifestyle exposures. The approach represents a robust technical advance for imaging-based biological age estimation and offers a foundation for future interdisciplinary biomarker development and precision medicine in spinal health.
