LASAN: Lead-Aware Spatial Attention Networks
- LASAN is a neural architecture that integrates per-lead temporal encoding with anatomical priors, employing CNNs and Transformer-based attention for ECG analysis.
- It uses a two-stage attention mechanism—multi-head self-attention and lead importance aggregation—to quantify each ECG lead’s diagnostic contribution.
- Hybrid integration with foundation models and lead-group masking yields state-of-the-art, interpretable inherited arrhythmia classification performance.
Lead-Aware Spatial Attention Networks (LASAN) are neural architectures designed for multi-class and binary classification of multi-lead electrocardiograms (ECG), with strong emphasis on physiological interpretability and integration with large-scale foundation models. LASAN’s principal innovations are explicit modeling of per-lead temporal structure, incorporation of anatomical prior knowledge, inter-lead attention, and an interpretable attention aggregation that quantifies each lead’s diagnostic contribution. LASAN is reported to achieve state-of-the-art performance in inherited arrhythmia classification while exposing disease-specific ECG lead dependencies (Sigfstead et al., 12 Jan 2026).
1. Architecture and Lead-Aware Encoding
LASAN is designed to leverage the spatial organization of ECG leads while preserving clinically meaningful distinctions among them. The input consists of raw, artifact-free 8-lead ECG signals (I, II, V1–V6), preprocessed by up-sampling to 500 Hz, truncation to 5 s, and per-lead z-score normalization:
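$$\tilde{x}_\ell(t) = \frac{x_\ell(t) - \mu_\ell}{\sigma_\ell},$$
where $x_\ell$ is the signal for lead $\ell$, and $\mu_\ell$ and $\sigma_\ell$ denote its per-lead mean and standard deviation. This results in input tensors of shape $8 \times 2500$ per recording (8 leads, 5 s at 500 Hz).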
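The truncation and per-lead z-scoring described above can be sketched in NumPy as follows. This is a minimal illustration, assuming the recording has already been resampled to 500 Hz; the function name `preprocess` and the small `eps` stabilizer are assumptions, not part of the source.

```python
import numpy as np

LEADS = ["I", "II", "V1", "V2", "V3", "V4", "V5", "V6"]
FS_HZ = 500                       # target sampling rate from the text
DURATION_S = 5                    # truncation length from the text
N_SAMPLES = FS_HZ * DURATION_S    # 2500 samples per lead


def preprocess(ecg: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Truncate an (8, T) recording sampled at 500 Hz to 5 s and z-score each lead."""
    if ecg.shape[0] != len(LEADS) or ecg.shape[1] < N_SAMPLES:
        raise ValueError("expected shape (8, T) with T >= 2500")
    x = ecg[:, :N_SAMPLES].astype(np.float64)
    mu = x.mean(axis=1, keepdims=True)       # per-lead mean
    sigma = x.std(axis=1, keepdims=True)     # per-lead standard deviation
    return (x - mu) / (sigma + eps)          # eps (assumption) guards flat leads


rng = np.random.default_rng(0)
raw = rng.normal(loc=0.3, scale=2.0, size=(8, 3000))  # synthetic stand-in signal
x = preprocess(raw)
print(x.shape)
```

Normalizing each lead separately removes amplitude differences between leads, so downstream layers see comparable scales regardless of electrode placement.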
Each lead signal is independently processed by a shared 1D-CNN “temporal encoder” $f_\theta$, comprising four convolutional blocks (channels 32→64→128→256, kernel size 15, max-pool stride 2), outputting lead-level features $h_\ell = f_\theta(\tilde{x}_\ell)$ for the normalized signal $\tilde{x}_\ell$ of lead $\ell$. Anatomical position is encoded by adding a learnable vector $p_\ell$ to each $h_\ell$, such that
$$z_\ell = h_\ell + p_\ell.$$
This gives the model a structural basis for discriminating between anatomical lead groupings.
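To make the encoder's dimensions concrete, the short trace below follows the (channels, time) shape through the four blocks. It is a sketch under two assumptions that the text does not state: the convolutions preserve sequence length (for example, via "same" padding), and each max-pool uses a kernel equal to its stride. How the remaining time axis is reduced to one vector per lead is also not specified in the source and is left out here.

```python
def encoder_shapes(n_samples=2500, channels=(32, 64, 128, 256), pool_stride=2):
    """Trace (channels, time) after each conv block.

    Assumes length-preserving convolutions and a max-pool whose kernel
    equals its stride; both are illustrative assumptions.
    """
    shapes = []
    length = n_samples
    for c in channels:
        length = length // pool_stride   # max-pool shrinks the time axis (floor)
        shapes.append((c, length))
    return shapes


shapes = encoder_shapes()
for block, (c, t) in enumerate(shapes, start=1):
    print(f"block {block}: {c} channels x {t} time steps")
```

Because the encoder weights are shared across leads, the same trace applies to each of the 8 leads; only the added position vector differs per lead.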
2. Inter-Lead Attention and Aggregation
At the core of LASAN is a two-stage attention mechanism to model spatial dependencies and provide interpretability:
a) Multi-Head Self-Attention Across Leads:
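The set of per-lead feature vectors $\{z_\ell\}$ (one position-encoded vector per lead) undergoes multi-head self-attention with $H$ heads over $d$-dimensional features, implemented as a 3-layer Transformer encoder that refines inter-lead relationships. Each head computes standard scaled dot-product attention, $\mathrm{softmax}(QK^\top/\sqrt{d_k})\,V$, with head outputs concatenated and projected to form the refined per-lead features $\{z'_\ell\}$.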
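A single layer of multi-head self-attention over the lead axis can be sketched in NumPy as below. This is a simplified illustration, not the authors' implementation: it omits the residual connections, layer normalization, and feed-forward sublayers of a full Transformer encoder layer, and the feature dimension (16) and head count (4) are placeholder values chosen only for the example.

```python
import numpy as np


def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(z, wq, wk, wv, wo, n_heads):
    """z: (n_leads, d). Returns refined features (n_leads, d) and
    attention maps (n_heads, n_leads, n_leads)."""
    n_leads, d = z.shape
    d_head = d // n_heads

    def split(x):
        # (n_leads, d) -> (n_heads, n_leads, d_head)
        return x.reshape(n_leads, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(z @ wq), split(z @ wk), split(z @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # scaled dot product
    attn = softmax(scores, axis=-1)                       # each row sums to 1
    heads = attn @ v                                      # (n_heads, n_leads, d_head)
    concat = heads.transpose(1, 0, 2).reshape(n_leads, d) # concatenate heads
    return concat @ wo, attn                              # output projection


rng = np.random.default_rng(0)
n_leads, d, n_heads = 8, 16, 4    # d and n_heads are illustrative only
z = rng.normal(size=(n_leads, d))
wq, wk, wv, wo = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(4))
out, attn = multi_head_self_attention(z, wq, wk, wv, wo, n_heads)
print(out.shape, attn.shape)
```

Here the "sequence" is the set of 8 leads rather than time steps, so each attention map is an 8×8 matrix of lead-to-lead weights.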
b) Lead Importance Aggregator (Single-Head Attention):
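To yield a compact, interpretable representation, LASAN learns a lead-query vector $q$ and lead-wise biases $b_\ell$. With $z'_\ell$ denoting the refined feature of lead $\ell$ after self-attention, the unnormalized attention scores are
$$e_\ell = q^\top z'_\ell + b_\ell,$$
with normalized weights
$$\alpha_\ell = \frac{\exp(e_\ell)}{\sum_{k} \exp(e_k)}.$$
The aggregated ECG representation is
$$g = \sum_{\ell} \alpha_\ell\, z'_\ell.$$
The vector $\alpha = (\alpha_\ell)_\ell$ yields explicit lead-wise importance weights, directly reflecting each lead’s influence on the network output.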
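The aggregation step can be sketched as a single-query softmax pooling over leads. The sketch assumes the score for each lead is the dot product of a learned query with that lead's feature plus a per-lead bias; the function name `lead_importance_pool` and the random parameters are illustrative stand-ins for trained values.

```python
import numpy as np

LEADS = ["I", "II", "V1", "V2", "V3", "V4", "V5", "V6"]


def lead_importance_pool(z, q, b):
    """z: (n_leads, d) refined lead features; q: (d,) query; b: (n_leads,) biases.
    Returns the pooled embedding (d,) and the lead weights (n_leads,)."""
    scores = z @ q + b                 # one unnormalized score per lead
    scores = scores - scores.max()     # stabilize the exponentials
    alpha = np.exp(scores) / np.exp(scores).sum()
    g = alpha @ z                      # weighted sum of lead features
    return g, alpha


rng = np.random.default_rng(1)
d = 16                                 # illustrative feature size
z = rng.normal(size=(len(LEADS), d))
q = rng.normal(size=d)
b = np.zeros(len(LEADS))
g, alpha = lead_importance_pool(z, q, b)

# Rank leads by their share of the pooled representation.
for name, w in sorted(zip(LEADS, alpha), key=lambda p: -p[1]):
    print(f"{name}: {w:.3f}")
```

Because the weights are non-negative and sum to one, they can be read directly as each lead's share of the pooled embedding, which is what makes this stage interpretable.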
3. Classification Layer and Attention Integration
The aggregated embedding $g$ is input to a two-layer MLP classification head:
$$o = W_2\,\phi(W_1 g + b_1) + b_2,$$
where $\phi$ is the hidden-layer nonlinearity and $o$ the output logits. For multi-class problems ($C > 2$ classes), predictions are obtained by applying softmax to $o$; for binary cases, sigmoid. Dropout regularization is applied between layers. The learned attention weights modulate each lead’s contribution to the classification decision.
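A two-layer head with softmax and sigmoid outputs might look like the following at inference time. The ReLU activation, the hidden width of 32, and the random weights are assumptions for illustration; dropout is omitted because it acts as the identity at inference.

```python
import numpy as np


def mlp_head(g, w1, b1, w2, b2):
    """Two-layer MLP returning logits. ReLU is an assumed nonlinearity."""
    hidden = np.maximum(0.0, g @ w1 + b1)
    return hidden @ w2 + b2


def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()


def sigmoid(logit):
    return 1.0 / (1.0 + np.exp(-logit))


rng = np.random.default_rng(2)
d, hidden = 16, 32                     # illustrative sizes
g = rng.normal(size=d)                 # stand-in for the pooled ECG embedding
w1, b1 = rng.normal(scale=0.1, size=(d, hidden)), np.zeros(hidden)

# Multi-class case: three output logits (e.g. ARVC / LQTS / control).
w2_mc, b2_mc = rng.normal(scale=0.1, size=(hidden, 3)), np.zeros(3)
probs = softmax(mlp_head(g, w1, b1, w2_mc, b2_mc))

# Binary case: one output logit passed through a sigmoid.
w2_bin, b2_bin = rng.normal(scale=0.1, size=(hidden, 1)), np.zeros(1)
p_binary = float(sigmoid(mlp_head(g, w1, b1, w2_bin, b2_bin))[0])

print(probs, p_binary)
```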
4. Foundation Model Integration and Transfer Learning
LASAN was evaluated under several transfer learning regimes using four ECG foundation encoders (ECG-Founder, Deep-ECG-SSL, ECG-FM, HuBERT-ECG):
- Linear Probing: Encoder parameters are frozen; only a linear classifier is trained.
- Fine-Tuning: The full model, including the backbone, is trained end-to-end.
- Combined: Linear probing for an initial 50 epochs, followed by full fine-tuning (100 epochs).
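The three regimes differ only in when the backbone's parameters receive gradient updates, which can be expressed as a small scheduling helper. The function name `backbone_trainable`, the regime labels, and the zero-based epoch convention are assumptions made for this sketch; in a real training loop the returned flag would control whether the encoder's parameters are frozen.

```python
def backbone_trainable(regime: str, epoch: int, probe_epochs: int = 50) -> bool:
    """Return True if the foundation encoder should be updated at this epoch.

    regime: "linear_probe", "fine_tune", or "combined" (hypothetical labels).
    epoch:  zero-based epoch index.
    """
    if regime == "linear_probe":
        return False                   # encoder stays frozen throughout
    if regime == "fine_tune":
        return True                    # encoder trains end-to-end from the start
    if regime == "combined":
        return epoch >= probe_epochs   # frozen warm-up, then full fine-tuning
    raise ValueError(f"unknown regime: {regime!r}")


for regime in ("linear_probe", "fine_tune", "combined"):
    flags = [backbone_trainable(regime, e) for e in (0, 49, 50)]
    print(regime, flags)
```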
Three LASAN–foundation model integration strategies are distinguished:
- A. Standalone LASAN: LASAN encoder and head are trained from scratch.
- B. Foundation + LASAN Head: The LASAN attention head replaces the standard linear head atop a pretrained (optionally fine-tuned) foundation encoder.
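- C. Hybrid LASAN: The architecture has two branches: (1) a (frozen or fine-tuned) foundation encoder produces a global feature $f_{\mathrm{FM}}$, (2) a parallel LASAN encoder yields a lead-aware feature $f_{\mathrm{LASAN}}$. A learned gating unit (sigmoid-activated) merges the two, in the form
$$\gamma = \sigma\!\left(W_\gamma\,[f_{\mathrm{FM}};\, f_{\mathrm{LASAN}}] + b_\gamma\right), \qquad f_{\mathrm{hyb}} = \gamma \odot f_{\mathrm{FM}} + (1-\gamma) \odot f_{\mathrm{LASAN}},$$
and the fused feature $f_{\mathrm{hyb}}$ then feeds into the classification head.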
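One common way to realize a sigmoid-gated merge of two branches is an elementwise convex combination, sketched below. This form is an assumption, not confirmed by the source: it presumes both branch features have already been projected to a shared dimension, and the names `gated_fusion`, `w_gate`, and `b_gate` are hypothetical.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def gated_fusion(f_global, f_lead, w_gate, b_gate):
    """Blend two (d,) features with a learned elementwise gate in (0, 1)."""
    gate = sigmoid(np.concatenate([f_global, f_lead]) @ w_gate + b_gate)  # (d,)
    fused = gate * f_global + (1.0 - gate) * f_lead
    return fused, gate


rng = np.random.default_rng(3)
d = 16                                       # illustrative shared dimension
f_global = rng.normal(size=d)                # stand-in foundation-encoder feature
f_lead = rng.normal(size=d)                  # stand-in LASAN lead-aware feature
w_gate = rng.normal(scale=0.1, size=(2 * d, d))
b_gate = np.zeros(d)

fused, gate = gated_fusion(f_global, f_lead, w_gate, b_gate)
print(fused.shape, float(gate.min()), float(gate.max()))
```

With this form, each fused coordinate lies between the corresponding coordinates of the two branches, so the gate can be inspected to see which branch dominates per feature.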
5. Physiologic Interpretability: Lead-Group Masking
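To assess and quantify disease-specific lead dependence, systematic lead-group masking is applied. For a group $G$ of leads, input signals $x_\ell$ are set to zero for all $\ell \in G$, and AUROC is recomputed. The impact is
$$\Delta\mathrm{AUROC}_G = \mathrm{AUROC}_{\text{full}} - \mathrm{AUROC}_{\text{masked}(G)}.$$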
Key lead groups tested include right precordials (V1–V2, V1–V3), lateral (I, V5–V6), limb (I, II), and all precordials (V1–V6). Results demonstrate physiologic plausibility: masking V1–V3 reduces AUROC for ARVC by 4.54%, while masking lateral leads reduces AUROC for LQTS by 2.60%; masking all precordials results in a macro-AUROC drop of 6.45%.
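The masking procedure can be sketched end to end with a toy scorer. Everything model-related here is a stand-in: `toy_model` simply averages lead V1, the data are synthetic, and the sketch uses a binary rank-based AUROC rather than the macro-averaged multi-class AUROC reported in the source. The group names in `LEAD_GROUPS` are hypothetical labels for the groups listed above.

```python
import numpy as np

LEADS = ["I", "II", "V1", "V2", "V3", "V4", "V5", "V6"]
LEAD_GROUPS = {
    "right_precordial_V1_V2": ["V1", "V2"],
    "right_precordial_V1_V3": ["V1", "V2", "V3"],
    "lateral": ["I", "V5", "V6"],
    "limb": ["I", "II"],
    "all_precordial": ["V1", "V2", "V3", "V4", "V5", "V6"],
}


def mask_leads(x, group):
    """x: (n, 8, T). Return a copy with the leads in `group` set to zero."""
    idx = [LEADS.index(name) for name in group]
    masked = x.copy()
    masked[:, idx, :] = 0.0
    return masked


def binary_auroc(y_true, scores):
    """Rank-based AUROC (Mann-Whitney U); tied scores share their average rank."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores))
    i = 0
    while i < len(scores):
        j = i
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2 + 1   # average 1-based rank of the tie
        i = j + 1
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    return (ranks[y_true].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def masking_impact(model_fn, x, y, group):
    """AUROC on intact inputs minus AUROC with `group` zeroed."""
    return binary_auroc(y, model_fn(x)) - binary_auroc(y, model_fn(mask_leads(x, group)))


def toy_model(x):
    return x[:, LEADS.index("V1"), :].mean(axis=1)   # stand-in scorer using V1 only


rng = np.random.default_rng(4)
x = rng.normal(size=(40, 8, 100))
y = np.array([1] * 20 + [0] * 20)
x[y == 1, LEADS.index("V1"), :] += 10.0              # positives differ only in V1

impact_rp = masking_impact(toy_model, x, y, LEAD_GROUPS["right_precordial_V1_V3"])
impact_limb = masking_impact(toy_model, x, y, LEAD_GROUPS["limb"])
print(impact_rp, impact_limb)
```

In this toy setup, masking the right precordial leads removes all signal the scorer relies on, while masking the limb leads leaves it untouched, mirroring how a large drop points to the leads a classifier depends on.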
6. Empirical Performance
Performance metrics for inherited arrhythmia classification tasks (macro AUROC, mean ± SD where available), as reported by Sigfstead et al. (12 Jan 2026), are summarized below.
| Task | Standalone LASAN | Foundation+LASAN Head (Best) | Hybrid LASAN (Best) | Best Foundation Model Alone |
|---|---|---|---|---|
| Multi-class (ARVC/LQTS/CTL) | 0.911 ± 0.037 | 0.990 ± 0.003 (HuBERT-ECG) | 0.990 ± 0.002 | 0.910 ± 0.018 (ECG-Founder) |
| Binary ARVC vs Control | 0.974 | — | 0.999 (HuBERT-ECG) | — |
| Binary LQTS vs Control | 0.901 | — | 0.994 (HuBERT-ECG) | — |
| LQT1 vs LQT2 genotype | 0.920 | — | 0.948 (Deep-ECG-SSL) | — |
Fine-tuned (“end-to-end”) foundation models outperformed linear probing and combined strategies across all backbones. Both LASAN-head and hybrid strategies yielded near-ceiling AUROC performance, substantially exceeding previously published models for these tasks. A plausible implication is that lead-aware integration with foundation encoders is critical for maximizing both accuracy and interpretability in automated ECG analysis.
7. Clinical and Research Significance
LASAN explicitly models temporal and spatial structure per ECG lead, injects anatomical priors via position embeddings, and employs interpretable spatial attention to expose the physiological plausibility of its inferences. The network’s ability to quantify and visualize class-specific lead dependence—aligning with known disease electrophysiology (right-precordial leads for ARVC; lateral for LQTS)—addresses a central limitation of conventional deep-learning ECG classifiers.
LASAN’s hybrid variants fuse large-scale pretraining (“foundation model” features) with fine-grained, anatomy-informed lead-level reasoning. The model is reported to outperform earlier published approaches on both multi-class and binary inherited arrhythmia screening tasks. Lead-group masking and attention analyses provide interpretability that aligns with known electrophysiology, supporting the potential of these methods in automated ECG-based screening workflows, conditional on further external validation (Sigfstead et al., 12 Jan 2026).