Learnable Pathway Module in BioMorphNet
- Learnable Pathway Module is a trainable component within BioMorphNet that identifies gene subsets via a novel weight matrix and softmax normalization.
- It fuses morphological and transcriptomic signals using cross-attention transformers to enhance tissue classification and biomarker discovery.
- The module integrates clinical pathway features with image morphology, achieving state-of-the-art accuracy across spatial transcriptomics datasets.
BioMorphNet is a multimodal, patch-level deep learning framework designed for integrative tissue classification and biomarker discovery in whole-slide images (WSIs) paired with spatial transcriptomics. Its architecture leverages morphological similarity, gene expression, and biological pathway modeling to characterize tumor microenvironments and support differential analysis at high spatial resolution. BioMorphNet incorporates innovative graph-based and transformer-based structures, bridging image morphology and molecular profiling, and achieves superior classification and interpretability in cancer pathology contexts (Liu et al., 13 Jan 2026).
1. Architectural Overview and Core Design
BioMorphNet is constructed to address limitations in existing WSI–omics integration studies, which frequently neglect spatial transcriptomics and patch-based analysis. The model consists of three parallel computational branches:
- Morphological Graph Encoder: Each 224×224 px tissue patch is encoded by a pretrained histology backbone (UNI v1, 1024-D) and embedded in a k-NN (k=8) graph with spatial neighbors.
- Spatial Transcriptomic Encoder: Gene expression profiles for each patch are transformed via an MLP to generate a 512-D transcriptomic code.
- Morphology–Pathway Fusion: This branch aggregates predefined clinical pathway features from pathway databases and supplements them with a novel, learnable pathway module; both utilize cross-attention transformer blocks for alignment and fusion.
A final gating and classification head adaptively weights and fuses these three representations for tissue category prediction and confidence-weighted differential gene analysis.
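The adaptive weighting performed by the final head can be illustrated with a softmax gate over the three branch embeddings. This is a minimal sketch only: the source does not specify the form of the gate, so the softmax gate, the function name `gated_fusion`, and the toy gate scores are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(branches, gate_scores):
    """Fuse equally sized branch embeddings with softmax gate weights.

    branches: one embedding (list of floats) per branch.
    gate_scores: one unnormalized score per branch. These are hypothetical
    inputs; in a trained model they would come from a small learned layer.
    """
    if len(branches) != len(gate_scores):
        raise ValueError("need exactly one gate score per branch")
    weights = softmax(gate_scores)
    dim = len(branches[0])
    fused = [sum(w * b[d] for w, b in zip(weights, branches)) for d in range(dim)]
    return fused, weights


# Toy 3-D embeddings standing in for the 512-D branch codes.
morph = [1.0, 0.0, 2.0]
transcript = [0.0, 1.0, 2.0]
pathway = [1.0, 1.0, 2.0]
fused, weights = gated_fusion([morph, transcript, pathway], [0.0, 0.0, 0.0])
```

With equal gate scores the three branches contribute equally; raising one score shifts the fused embedding toward that branch.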
2. Modeling the Tumor Microenvironment with Graphs
BioMorphNet’s graph module models the local tumor microenvironment (TME) at patch resolution, capturing both morphological and molecular proximity:
- Node Representation: Central and neighbor patches (xᶜ, xᵢ) are represented by 1024-D features extracted by a deep histology encoder.
- Edge Weighting: The edge weight between the central patch and each neighbor is computed as a function of both morphological similarity (inverse MSE of the feature vectors) and molecular similarity (inverse MSE of the gene expression vectors).
This encodes TME context by weighting each neighbor according to how closely it resembles the central patch in both morphology and gene expression.
- Graph Convolutional Update: Node features are refined by a two-layer GCN that propagates them over the degree-normalized weighted adjacency matrix.
The central node’s final 512-D embedding serves as the core morphological code.
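The edge weighting and graph update described above can be sketched in plain Python. Several details here are assumptions, since the source gives only a verbal description: the inverse MSE is taken as `1 / (MSE + eps)`, the two similarities are combined by a product, and the GCN layer uses the common symmetric normalization with self-loops and a ReLU. All function names are hypothetical.

```python
import math


def mse(u, v):
    """Mean squared error between two equally sized vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def inverse_mse(u, v, eps=1e-8):
    """Similarity as inverse MSE; eps (an assumption) avoids division by zero."""
    return 1.0 / (mse(u, v) + eps)


def edge_weight(feat_c, feat_i, expr_c, expr_i):
    """Assumed combination: product of morphological and molecular similarity."""
    return inverse_mse(feat_c, feat_i) * inverse_mse(expr_c, expr_i)


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of a weighted adjacency."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(a_hat, h, w):
    """One GCN layer: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Toy star graph: node 0 is the central patch, nodes 1 and 2 are neighbors.
feats = [[1.0, 0.0], [1.0, 1.0], [3.0, 0.0]]   # stand-ins for 1024-D features
exprs = [[2.0, 2.0], [2.0, 0.0], [0.0, 2.0]]   # stand-ins for expression vectors
w01 = edge_weight(feats[0], feats[1], exprs[0], exprs[1])
w02 = edge_weight(feats[0], feats[2], exprs[0], exprs[2])
adj = [[0.0, w01, w02], [w01, 0.0, 0.0], [w02, 0.0, 0.0]]
a_hat = normalize_adjacency(adj)

identity = [[1.0, 0.0], [0.0, 1.0]]            # placeholder for learned weights
h1 = gcn_layer(a_hat, feats, identity)
h2 = gcn_layer(a_hat, h1, identity)
central_embedding = h2[0]                      # embedding of the central patch
```

In this toy graph, neighbor 1 is morphologically closer to the central patch than neighbor 2, so it receives the larger edge weight and contributes more to the central node's update.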
3. Biological Pathways: Clinical and Learnable Modules
BioMorphNet introduces dual modalities for pathway modeling:
- Clinical Pathways: Given a patch's gene expression vector, an activation score is computed for each predefined pathway, and the resulting score vector is passed through an MLP to produce a 512-D pathway code.
- Learnable Pathway Module: A trainable weight matrix (with a ≈ 200) identifies gene subsets; the top 5% of weights form a mask that is normalized via softmax, and the pathway activation is computed from these normalized weights and the gene expression vector.
The resulting activation vector is further encoded to generate novel pathway representations that map latent biological processes.
- Morphology–Pathway Fusion: The morphological embedding and the two pathway embeddings interact via parallel cross-attention transformers, yielding one clinical and one learnable fusion feature; adaptive gating then computes the weights used to fuse them into a single representation, which is further refined by two stacked transformer blocks.
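The learnable pathway computation can be sketched as a top-fraction mask followed by a softmax and a weighted sum. This is an illustrative reading of the description, not the published formulation: it assumes one weight row per learnable pathway, a mask applied per row, and an activation equal to the mask-weighted sum of gene expression. The function names and toy sizes are hypothetical.

```python
import math


def top_fraction_softmax(weights, fraction=0.05):
    """Keep the largest `fraction` of weights, softmax them, zero the rest."""
    k = max(1, round(fraction * len(weights)))
    top_idx = sorted(range(len(weights)), key=lambda g: weights[g], reverse=True)[:k]
    m = max(weights[g] for g in top_idx)
    exps = {g: math.exp(weights[g] - m) for g in top_idx}
    total = sum(exps.values())
    return [exps[g] / total if g in exps else 0.0 for g in range(len(weights))]


def pathway_activations(weight_matrix, expression, fraction=0.05):
    """One activation per learnable pathway (row of the weight matrix)."""
    activations = []
    for row in weight_matrix:
        gene_weights = top_fraction_softmax(row, fraction)
        activations.append(sum(w * x for w, x in zip(gene_weights, expression)))
    return activations


# Toy setup: 40 genes and 2 learnable pathways, so the top 5% is 2 genes each.
num_genes = 40
expression = [float(g) for g in range(num_genes)]
weight_matrix = [[0.0] * num_genes for _ in range(2)]
weight_matrix[0][3] = weight_matrix[0][7] = 5.0     # pathway 0 selects genes 3, 7
weight_matrix[1][10] = weight_matrix[1][20] = 1.0   # pathway 1 selects genes 10, 20

mask0 = top_fraction_softmax(weight_matrix[0])
acts = pathway_activations(weight_matrix, expression)
```

Because the softmax runs only over the retained genes, each learnable pathway reduces to a small, normalized gene set whose members can be read off directly from the nonzero mask entries.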
4. Training Objective and Optimization Protocol
BioMorphNet is trained with a weighted cross-entropy classification loss in which each class weight is inversely proportional to that class's sample frequency, so patches from rare tissue categories contribute more to the loss.
Optimization minimizes this loss over all trainable parameters using AdamW with weight decay under a fixed schedule (batch size 32, up to 60 epochs), with early stopping and 5-fold repeated train/val/test splits.
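A small worked example may help make the inverse-frequency weighting concrete. The sketch below assumes the weights are scaled as N / (C · n_c) and that the batch loss is divided by the sum of the applied weights; both are common conventions chosen here for illustration, and the source states only that weights are inversely proportional to sample frequency.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Class weights w_c = N / (C * n_c); classes with no samples get 0."""
    counts = Counter(labels)
    n = len(labels)
    return [n / (num_classes * counts[c]) if counts[c] else 0.0
            for c in range(num_classes)]


def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of -log p(true class), normalized by the weight sum."""
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        total += -class_weights[y] * math.log(max(p[y], 1e-12))
        weight_sum += class_weights[y]
    return total / weight_sum


# Toy batch: class 1 is three times rarer than class 0.
labels = [0, 0, 0, 1]
class_weights = inverse_frequency_weights(labels, num_classes=2)

uniform_probs = [[0.5, 0.5]] * 4
uniform_loss = weighted_cross_entropy(uniform_probs, labels, class_weights)

# Same model, but confidently wrong on the single rare-class patch.
skewed_probs = [[0.9, 0.1]] * 4
skewed_loss = weighted_cross_entropy(skewed_probs, labels, class_weights)
```

Here the rare class receives weight 2.0 against roughly 0.67 for the common class, so a model that always favors the common class is penalized more heavily than it would be under unweighted cross-entropy.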
5. Empirical Evaluation and Performance Metrics
BioMorphNet was benchmarked on three spatial transcriptomics datasets:
| Dataset | WSIs | Classes | Genes | Test Balanced Accuracy | Test AUROC | Balanced Accuracy Gain over Baselines | AUROC Gain over Baselines |
|---|---|---|---|---|---|---|---|
| Prostate | 7 | 4 | ~20,000 | 0.801 | 0.970 | +2.67 pp | +0.015 |
| Colorectal | 6 | 3 | ~22,000 | 0.752 | 0.932 | +5.48 pp | +0.048 |
| Breast | 8 | 2 | ~12,000 | 0.819 | 0.918 | +6.29 pp | +0.029 |
On all datasets, the mean AUROC exceeded 0.90, with confusion matrices indicating >90% recall on key tumor grades (e.g., Gleason grade 4 at 90.4%). Statistical significance was assessed with paired t-tests against six state-of-the-art morphology–gene fusion baselines.
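Balanced accuracy, the headline metric in the table, is the mean of the per-class recalls, which keeps large classes from dominating the score. The sketch below uses made-up labels purely to show the calculation; the function names are hypothetical and the numbers are unrelated to the reported results.

```python
def per_class_recall(y_true, y_pred, num_classes):
    """Recall for each class that appears in y_true (None if it is absent)."""
    recalls = []
    for c in range(num_classes):
        support = sum(1 for t in y_true if t == c)
        if support == 0:
            recalls.append(None)
            continue
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / support)
    return recalls


def balanced_accuracy(y_true, y_pred, num_classes):
    """Mean recall over the classes present in y_true."""
    present = [r for r in per_class_recall(y_true, y_pred, num_classes)
               if r is not None]
    return sum(present) / len(present)


# Made-up example: class 0 has 4 patches, class 1 has 2.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]

plain_acc = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred, num_classes=2)
```

In this example the recalls are 0.75 and 0.50, giving a balanced accuracy of 0.625, whereas plain accuracy is about 0.667 because the larger class carries more weight.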
6. Biomarker Discovery and Biological Interpretation
BioMorphNet supports interpretable biomarker discovery by restricting the analysis to test patches whose predictions pass a confidence threshold. Differential gene expression between tissue categories is then assessed using Wilcoxon rank-sum statistics, resulting in the identification of notable biomarkers:
- Prostate: PPFIA2 (upregulated in Gleason 4 cribriform), MT1G (downregulated; a suppressor of tumor growth).
- Breast: DDX5, CD24, ERBB2 (HER2; established breast cancer markers).
- Colorectal: ITLN1, PLA2G2A (downregulated in tumor regions; suppressors of neovascularization).
This approach links spatial morphopathology with gene-level molecular insights for refined biomarker localization.
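The confidence-filtered differential analysis can be sketched for a single gene as follows. This is an illustration under stated assumptions: the threshold of 0.9 is a hypothetical value (the source's threshold is not given here), the patch records are invented, and the rank-sum test uses the normal approximation without a tie correction.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(a, b):
    """Wilcoxon rank-sum z-score and two-sided p-value (normal approximation)."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n1])                              # rank sum of group a
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


def filter_confident(patches, threshold):
    """Keep patches whose predicted-class confidence meets the threshold."""
    return [p for p in patches if p["confidence"] >= threshold]


# Invented patch records: predicted label, confidence, expression of one gene.
patches = (
    [{"label": "tumor", "confidence": 0.95, "expr": e} for e in (5.0, 6.0, 7.0, 8.0)]
    + [{"label": "normal", "confidence": 0.97, "expr": e} for e in (1.0, 2.0, 3.0, 4.0)]
    + [{"label": "tumor", "confidence": 0.40, "expr": 0.5}]   # dropped by the filter
)

confident = filter_confident(patches, threshold=0.9)          # hypothetical threshold
tumor = [p["expr"] for p in confident if p["label"] == "tumor"]
normal = [p["expr"] for p in confident if p["label"] == "normal"]
z, p_value = rank_sum_test(tumor, normal)
```

A positive z-score indicates higher expression in the first group. In practice the test would be repeated per gene, with multiple-testing correction applied across genes.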
7. Limitations, Future Directions, and Morphogenetic Implications
Limitations include susceptibility to label imbalance (rare classes), limited spatial coverage of transcriptomic profiles in WSIs, and reliance on incomplete predefined pathway databases. The framework is constrained by its supervised training setup and potential spatial bias in edge weights.
Future research avenues are outlined as:
- Incorporation of synthetic data augmentation or self-supervised representation learning to address rare-class limitations.
- Multi-resolution graph construction to integrate larger spatial context.
- Integration of unsupervised pathway discovery modules (e.g., graph-based gene co-expression community detection).
- Scaling on larger transcriptomic cohorts and extension to survival or therapeutic response prediction.
Morphogenetic Perspective: According to Lucas et al. (Lucas et al., 7 Jan 2026), biological network architectures with optimized transport, robustness, and exploration properties can be achieved through simple, local branching–fusion–stopping rules. The emergence of hybrid tree–loop morphologies in BioMorphNet reflects these principles, indicating the utility of moderate branching and fusion in achieving Pareto-optimal trade-offs among clinical objectives. This suggests that key BioMorphNet parameters may be steered by modulating local rule sets, mirroring evolutionary strategies for network optimization in living tissues.
Summary
BioMorphNet provides a rigorously engineered framework for multimodal integration of tissue morphology, gene expression, and pathway structure at the WSI patch level, attaining state-of-the-art classification accuracy, robust biomarker discovery, and interpretable multimodal fusion. Its design draws inspiration from minimal branching–fusion morphogenetic models, and its modular approach enables extensibility to future spatial omics–driven pathology applications (Liu et al., 13 Jan 2026, Lucas et al., 7 Jan 2026).