MATCH-AD: Adaptive Transport Clustering for AD
- The paper introduces MATCH-AD, which fuses deep representation learning, graph-based label propagation, and optimal transport clustering to extract disease-relevant signals from heterogeneous neuroimaging data.
- It achieves robust quantification of Alzheimer’s progression with strong theoretical guarantees and near-perfect diagnostic reliability under limited label availability.
- Extensive evaluations demonstrate that MATCH-AD significantly outperforms baseline methods in accuracy and Cohen's kappa, enhancing clinical interpretability and decision-making.
Multi-view Adaptive Transport Clustering for Heterogeneous Alzheimer’s Disease (MATCH-AD) is a semi-supervised learning framework designed to address diagnostic challenges in Alzheimer’s disease (AD) using heterogeneous neuroimaging datasets with limited ground truth annotations. By integrating deep representation learning, graph-based label propagation, and optimal transport clustering, MATCH-AD enables the extraction of disease-relevant structure, label-efficient classification, and explicit quantification of disease progression, with strong theoretical guarantees and empirical superiority over existing methods (Moayedikia et al., 19 Dec 2025).
1. Overview of MATCH-AD Framework
MATCH-AD is constructed as a unified, alternating-minimization pipeline comprising three tightly coupled modules:
- Deep Representation Learning: Heterogeneous input data, including structural MRI features, CSF biomarkers, and clinical/demographic variables, are preprocessed by kNN imputation (k=5) and robust scaling, then concatenated into a single input tensor. An autoencoder with an encoder-decoder architecture learns a latent representation designed to preserve disease-relevant manifold structure.
- Graph-based Label Propagation: A kNN graph is built over the latent representations, with an affinity matrix constructed from Gaussian kernels and normalized into a similarity matrix. This graph is used to propagate sparse clinical labels (29% labeled) to the unlabeled majority of the cohort.
- Optimal Transport Clustering: Using the propagated multi-class labels, samples are partitioned into disease stages. Entropically regularized Wasserstein distances between the empirical distributions of adjoining stages are computed with the Sinkhorn algorithm, providing a continuous, metric-based quantification of disease progression.
All modules contribute to a joint training objective, with alternating updates for representations, label distributions, and transport plans.
2. Mathematical Foundation
2.1 Input Views and Shared Embedding
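- Input views: the structural MRI, CSF biomarker, and clinical/demographic feature matrices, concatenated per subject into a single input vector
- Encoder: maps each concatenated input vector to a lower-dimensional latent vector
- Decoder: reconstructs the input vector from its latent vector
- Latent representation: the encoder outputs for all subjects, on which the graph and the transport distances are built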
2.2 Graph Construction
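- Pairwise Euclidean distance matrix in latent space: $D_{ij} = \lVert z_i - z_j \rVert_2$, where $z_i$ is the latent vector of subject $i$
- For the kNN graph, affinity weights are given by a Gaussian kernel of the latent-space distance between each sample and its $k$ nearest neighbors (zero elsewhere), with the kernel bandwidth set to the mean distance to a sample's $k$-th neighbor.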
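The graph construction described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `knn_gaussian_affinity`, the kernel form $\exp(-D_{ij}^2 / 2\sigma^2)$, the symmetrization rule, and the symmetric normalization $S = \Delta^{-1/2} W \Delta^{-1/2}$ (with $\Delta$ the diagonal degree matrix) are conventional choices that the text here does not confirm.

```python
import math

def knn_gaussian_affinity(Z, k):
    """Build a symmetric kNN affinity matrix W and a normalized similarity S.

    Z: list of latent vectors (lists of floats); k: number of neighbors.
    The kernel form and the normalization are assumed, not taken from the paper.
    """
    n = len(Z)
    # Pairwise Euclidean distances in latent space.
    D = [[math.dist(Z[i], Z[j]) for j in range(n)] for i in range(n)]
    # Indices of each sample's k nearest neighbors (excluding itself).
    nbrs = [sorted((j for j in range(n) if j != i), key=lambda j: D[i][j])[:k]
            for i in range(n)]
    # Bandwidth: distance to the k-th neighbor, averaged over all samples.
    sigma = sum(D[i][nbrs[i][-1]] for i in range(n)) / n or 1.0
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in nbrs[i]:
            w = math.exp(-D[i][j] ** 2 / (2.0 * sigma ** 2))
            W[i][j] = W[j][i] = w  # keep an edge if either endpoint selects it
    # Symmetric normalization; the fallback of 1.0 avoids division by zero.
    deg = [sum(row) or 1.0 for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    return W, S

# Two well-separated pairs of points: edges appear only within each pair.
W, S = knn_gaussian_affinity([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]], k=1)
```

The dense $O(n^2)$ distance matrix is adequate for illustration; a cohort-scale graph would normally use a spatial index or an approximate nearest-neighbor search.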
2.3 Label Propagation
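- Label matrix: labeled rows are one-hot, unlabeled rows are uniform ($1/c$ for $c$ classes)
- Propagation update: at each iteration, the soft labels of every sample are mixed between the similarity-weighted labels of its graph neighbors and its initial label row, with a propagation parameter $\alpha$ setting the trade-off.
This iteration admits a closed-form solution, so the fixed point can also be obtained directly by solving a linear system.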
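A minimal sketch of such a propagation step follows, assuming the widely used convention $F^{(t+1)} = \alpha S F^{(t)} + (1-\alpha) Y$ with fixed point $F^{*} = (1-\alpha)(I - \alpha S)^{-1} Y$. Whether MATCH-AD places $\alpha$ on the graph term or on the initial labels is not confirmed here, and the function name `propagate_labels` is hypothetical.

```python
def propagate_labels(S, Y, alpha=0.3, n_iter=200):
    """Iterate F <- alpha * S F + (1 - alpha) * Y using plain Python lists.

    S: n x n normalized similarity matrix; Y: n x c initial label matrix.
    The placement of alpha on the graph term is an assumed convention.
    """
    n, c = len(Y), len(Y[0])
    F = [row[:] for row in Y]
    for _ in range(n_iter):
        F = [[alpha * sum(S[i][j] * F[j][k] for j in range(n))
              + (1.0 - alpha) * Y[i][k] for k in range(c)]
             for i in range(n)]
    return F

# Three-node chain: node 0 labeled class 0, node 2 labeled class 1, node 1 unlabeled.
r = 2 ** -0.5  # symmetric normalization of a path graph with degrees 1, 2, 1
S_chain = [[0.0, r, 0.0], [r, 0.0, r], [0.0, r, 0.0]]
Y_chain = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
F_chain = propagate_labels(S_chain, Y_chain)
predicted = [max(range(2), key=lambda k: row[k]) for row in F_chain]
```

In this toy chain the two labeled endpoints keep their classes, while the middle node, which is equidistant from both, ends with equal scores for the two classes.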
2.4 Optimal Transport
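- Each pair of successive disease stages induces two empirical distributions over the latent space. Their 2-Wasserstein distance is computed as the minimum expected squared Euclidean transport cost over couplings of the two distributions, with an entropy regularizer on the coupling.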
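The entropically regularized transport problem above is typically solved with Sinkhorn iterations. The sketch below assumes uniform weights on both samples and a squared Euclidean cost; the function name `sinkhorn_plan` and the default values of `eps` and `n_iter` are illustrative choices, not values from the paper.

```python
import math

def sinkhorn_plan(X, Y, eps=0.5, n_iter=500):
    """Entropy-regularized OT between uniform empirical distributions on X and Y.

    Returns (cost, P): the transport cost <P, C> of the Sinkhorn plan under the
    squared Euclidean cost C, and the plan P itself.
    """
    n, m = len(X), len(Y)
    a, b = [1.0 / n] * n, [1.0 / m] * m
    C = [[math.dist(x, y) ** 2 for y in Y] for x in X]
    K = [[math.exp(-C[i][j] / eps) for j in range(m)] for i in range(n)]  # Gibbs kernel
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals a and b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    P = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    cost = sum(P[i][j] * C[i][j] for i in range(n) for j in range(m))
    return cost, P

# Two one-dimensional "stages"; the square root approximates a 2-Wasserstein distance.
cost, plan = sinkhorn_plan([[0.0], [1.0]], [[2.0], [3.0]])
stage_distance = math.sqrt(cost)
```

For this example the unregularized optimum is 4 (each point moves by 2), and the regularized plan costs slightly more because entropy spreads some mass onto the longer pairing. With large costs or small `eps`, the kernel entries underflow to zero; practical implementations work in the log domain to avoid this.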
3. Training Objectives and Optimization
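MATCH-AD is optimized by minimizing a composite loss that combines four terms:
- Autoencoder Loss: reconstruction error of the autoencoder
- Propagation Loss: the label propagation objective
- Optimal Transport Loss: sum of Wasserstein distances between adjoining disease stages
- Smoothness Loss: promotes local consistency of soft label distributions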
Algorithmic optimization proceeds by pretraining the autoencoder, followed by alternating updates of labels, transport plans, and representations until convergence.
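Assuming the four terms are combined as a weighted sum, which is the usual construction for such objectives but is not confirmed here, the composite loss can be written as below. The subscripts and the weights $\lambda_1, \lambda_2, \lambda_3$ are hypothetical placeholders rather than the paper's notation.

```latex
\mathcal{L}
  = \mathcal{L}_{\mathrm{AE}}
  + \lambda_{1}\,\mathcal{L}_{\mathrm{prop}}
  + \lambda_{2}\,\mathcal{L}_{\mathrm{OT}}
  + \lambda_{3}\,\mathcal{L}_{\mathrm{smooth}}
```

Under this reading, each alternating step holds two of the three variable blocks (representations, label distributions, transport plans) fixed and reduces $\mathcal{L}$ with respect to the third.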
4. Theoretical Analysis
MATCH-AD is accompanied by several theoretical guarantees:
- Convergence of Label Propagation: The label propagation update converges geometrically to its fixed point for a propagation parameter $\alpha$ strictly between 0 and 1.
- Label Consistency Bound: Under appropriate manifold assumptions, the expected label error is bounded in terms of the intrinsic dimension of the manifold and the geodesic approximation error.
- Stability of Wasserstein Distance: Empirical Wasserstein distances are stable under sampling.
- Global Convergence: Alternating minimization is guaranteed to reach a stationary point (under Lipschitz and boundedness assumptions).
- Sample Complexity: The analysis bounds the number of samples required to achieve $\epsilon$-accurate label propagation with probability at least $1-\delta$.
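To make the geometric convergence of label propagation concrete, the sketch below assumes the error contracts by a factor of $\alpha$ per iteration, as it does for the standard update $F \leftarrow \alpha S F + (1-\alpha) Y$ when the spectral norm of $S$ is at most 1. The paper's exact rate constant may differ, and `iterations_for_tolerance` is a hypothetical helper.

```python
import math

def iterations_for_tolerance(alpha, tol):
    """Smallest t with alpha**t <= tol, assuming a per-step contraction factor alpha."""
    if not 0.0 < alpha < 1.0 or not 0.0 < tol < 1.0:
        raise ValueError("alpha and tol must lie strictly between 0 and 1")
    return math.ceil(math.log(tol) / math.log(alpha))

# Iterations needed to shrink the initial error by a factor of one million.
steps_small_alpha = iterations_for_tolerance(0.3, 1e-6)
steps_large_alpha = iterations_for_tolerance(0.9, 1e-6)
```

Smaller values of $\alpha$ converge in far fewer iterations, at the price of propagating label information over shorter graph distances.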
5. Experimental Evaluation
5.1 Data and Preprocessing
- Cohort: subjects from the National Alzheimer’s Coordinating Center; 219 total features per subject after integration
- Semi-supervised regime: 29.1% labeled for training/testing, 70.9% unlabeled
- Class distribution (labeled subset): Normal (63.1%), Impaired-not-MCI (3.7%), MCI (19.9%), Dementia (13.3%)
- Preprocessing: features with more than 50% missing values are excluded; remaining missing values are imputed by kNN (k=5)
- Train/test split: 80/20 stratified on labeled data
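The missing-data handling listed above can be illustrated with a small pure-Python sketch. It assumes that missing entries are encoded as `None`, that exclusion applies to feature columns, and that neighbors are ranked by root-mean-square difference over jointly observed features. The function names are hypothetical, and production code would more likely rely on a library imputer.

```python
import math

def drop_sparse_features(rows, max_missing=0.5):
    """Keep only columns whose fraction of missing (None) values is <= max_missing."""
    n = len(rows)
    keep = [j for j in range(len(rows[0]))
            if sum(r[j] is None for r in rows) / n <= max_missing]
    return [[r[j] for j in keep] for r in rows], keep

def knn_impute(rows, k=5):
    """Fill each missing value with the mean of that feature over the k nearest rows."""
    def dist(a, b):
        # Root-mean-square difference over features observed in both rows.
        shared = [(x - y) ** 2 for x, y in zip(a, b) if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    filled = [r[:] for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is None:
                # Candidate donors: other rows in which feature j is observed.
                donors = sorted((dist(row, other), other[j])
                                for t, other in enumerate(rows)
                                if t != i and other[j] is not None)[:k]
                if donors:
                    filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled

raw = [[1.0, 10.0, None], [1.1, None, None], [5.0, 50.0, 3.0], [1.2, 12.0, None]]
kept, kept_columns = drop_sparse_features(raw)  # third column is 75% missing
imputed = knn_impute(kept, k=2)
```

In the toy data, the missing value in the second row is filled from the two rows closest to it on the first feature, not from the outlying third row.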
5.2 Performance under Label Scarcity
| Labeled Fraction | Accuracy (%) | Cohen’s κ |
|---|---|---|
| 5% (70) | 59.1 ± 1.8 | 0.159 |
| 30% | 72.6 ± 0.6 | 0.430 |
| 50% | 81.1 ± 0.7 | 0.614 |
| 80% | 91.9 ± 0.3 | 0.842 |
| 100% (test set) | 98.4 | 0.970 |
- MATCH-AD retains clinically meaningful agreement (κ > 0.4) with only 30% labels, reaching “almost perfect agreement” (κ > 0.8) at 80% (Moayedikia et al., 19 Dec 2025).
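Cohen's κ corrects observed agreement $p_o$ for the agreement $p_e$ expected by chance: $\kappa = (p_o - p_e)/(1 - p_e)$. The sketch below implements this standard definition in plain Python; the function name and the toy labels are illustrative and are not data from the study.

```python
from collections import Counter

def cohens_kappa(y_true, y_pred):
    """Cohen's kappa, (p_o - p_e) / (1 - p_e), for two equal-length label lists."""
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement from the product of the two marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when both label lists are one constant class
    return (p_o - p_e) / (1.0 - p_e)

# A classifier that always predicts the majority class on an imbalanced cohort.
y_true = ["Normal"] * 63 + ["Other"] * 37
y_pred = ["Normal"] * 100
kappa_majority = cohens_kappa(y_true, y_pred)
```

The majority-class predictor above is 63% accurate yet has κ = 0, because all of its agreement is explained by chance. This is why κ is reported alongside accuracy for imbalanced diagnostic classes.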
5.3 Baseline Comparisons and Ablations
- Best baseline (SelfTraining_SVM): 71.3% accuracy, κ=0.329
- MATCH-AD: 98.4% accuracy, κ=0.970, F1=0.976
- SelfTraining_SVM underperforms on minority classes, while MATCH-AD maintains F1 > 0.86 across all classes (“Normal”: 98.9%, “Impaired-not-MCI”: 90.9%, “MCI”: 86.2%, “Dementia”: 97.4%)
- Removal of the autoencoder collapses κ to zero/negative; removal of optimal transport impairs modeling of progression but weakly affects classification performance
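Per-class F1 scores such as those quoted above follow from the true-positive, false-positive, and false-negative counts of each class. A minimal sketch of the standard definition is shown here; the helper name and the toy labels are illustrative only.

```python
def per_class_f1(y_true, y_pred):
    """Return {label: F1} using F1 = 2*TP / (2*TP + FP + FN) for each class."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores

# One "MCI" case misclassified as "Normal" lowers the F1 of both classes.
f1_by_class = per_class_f1(["MCI", "MCI", "Normal", "Normal"],
                           ["MCI", "Normal", "Normal", "Normal"])
```

Reporting F1 per class, and not only as an average, exposes weak performance on rare classes that overall accuracy would hide.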
5.4 Hyperparameter Sensitivity
- Propagation parameter: peak performance for values up to about $0.3$
- Neighborhood size: a plateau of robust performance across a range of values
- Outer iterations: up to 20 for convergence
6. Clinical Relevance and Interpretability
- Accuracy alone understates performance under class imbalance; Cohen’s κ provides a stricter criterion
- MATCH-AD illustrates resource allocation tradeoffs: moderate agreement (κ > 0.4) is attainable with only ~30% labeled subjects, substantial agreement (κ > 0.6) at ~50–60%, and almost perfect agreement (κ > 0.8) at ~80%
- Minority/early-stage classes (Impaired-not-MCI) benefit most from the integrated representation+propagation approach
- The explicit transport-based progression quantifies state transition distances, potentially offering interpretable “disease trajectories” in clinical settings
7. Summary and Availability
MATCH-AD introduces a mathematically principled, scalable, and label-efficient framework for mining disease structure in heterogeneous neuroimaging datasets. It achieves nearly perfect diagnostic reliability (κ up to 0.97), robust performance across extreme label scarcity (≥5% labeled), and continuous quantification of progression via Wasserstein distances, with strong theoretical guarantees at every stage (Moayedikia et al., 19 Dec 2025). All code and data splits are provided at https://github.com/amoayedikia/brain-network.git.