Heartbeat-Aware Multi-prototype (HAM)
- HAM is an ECG enrollment strategy that creates multiple prototypes via clustering heartbeat embeddings to enhance biometric recognition and reduce noise impact.
- The method partitions each subject's enrollment embeddings with a clustering algorithm such as K-means; performance is typically best at K=3, which balances accuracy against computational cost.
- Experimental results on ECG datasets show significant improvements in identification accuracy and error reduction, validating HAM's robustness against physiological and acquisition noise.
Heartbeat-Aware Multi-prototype (HAM) is an enrollment strategy utilized in electrocardiography (ECG)-based biometric recognition systems to enhance identity authentication performance by mitigating the adverse impacts of heartbeat variability and signal noise. Rather than representing each subject by a single prototype vector derived from their physiological signals, HAM constructs a set of prototypes per subject, each capturing distinct modes within the individual's beat-level embedding distribution. This technique has demonstrated marked improvements in identification and verification accuracy within Hierarchical Phase-Aware Fusion (HPAF) systems and provides a robust approach to template construction in the presence of physiological and acquisition noise (Huang et al., 1 Jan 2026).
1. Motivation and Rationale
The primary motivation for HAM arises from the inherent variability and susceptibility to transient artifacts found in ECG signals. Single-prototype representations are sensitive to outlier heartbeats, which may be affected by transient noise sources such as electrode motion, muscle artifacts, or baseline wander. When a subject's entire enrollment set is condensed into a single vector, the negative influence of these atypical beats is not mitigated, resulting in prototype drift and degraded recognition performance.
HAM counters these issues by partitioning a subject's enrollment embeddings into disjoint clusters and constructing one prototype per cluster. This strategy enables each prototype to model a different physiological or noise-related mode of the heartbeats. During verification, a query heartbeat only needs to lie close to one of the prototypes, which reduces the likelihood that noise or rare events will dominate the matching process.
2. Mathematical Formulation
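For a subject $s$, let $N_s$ denote the number of heartbeats acquired for enrollment. The beat-level embeddings are represented as $E_s = \{u_{s,n}\}_{n=1}^{N_s}$, where each $u_{s,n}$ is produced by the GRF stage of the HPAF pipeline. Let $K$ be the chosen number of prototypes per subject.
Clustering: $E_s$ is partitioned into $K$ clusters using K-means or an alternative assignment algorithm, with $c_{s,n} \in \{1, \dots, K\}$ denoting the cluster assignment for each embedding $u_{s,n}$.
Prototype Construction: each prototype is the mean of the embeddings assigned to its cluster:
$$p_{s,k} = \frac{1}{|\{n : c_{s,n} = k\}|} \sum_{n \,:\, c_{s,n} = k} u_{s,n}, \qquad k = 1, \dots, K.$$
Query Matching: For a query embedding $u_q$, the distance to subject $s$ is
$$d_s(u_q) = \min_{k = 1, \dots, K} d(u_q, p_{s,k}),$$
where $d(\cdot, \cdot)$ denotes a metric such as Euclidean or cosine distance. The recognized subject is
$$s^* = \arg\min_{s} d_s(u_q).$$
For multiple query beats $\{u_q^m\}_{m=1}^{M}$, the average-min distance strategy computes
$$D(s) = \frac{1}{M} \sum_{m=1}^{M} \min_{k = 1, \dots, K} d(u_q^m, p_{s,k}).$$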
This allows robust verification by aggregating prototype matches across multiple observed beats.
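As a small worked illustration of the average-min rule, consider scalar embeddings with $d(a, b) = |a - b|$, two hypothetical subjects $A$ and $B$ with two prototypes each, and two query beats. All numbers are invented for the example and are not taken from the cited study.

```latex
\begin{align*}
\text{Prototypes: } & p_{A,1} = 0,\; p_{A,2} = 4, \qquad p_{B,1} = 10,\; p_{B,2} = 12 \\
\text{Query beats: } & u_q^1 = 1,\; u_q^2 = 5 \\
D(A) &= \tfrac{1}{2}\bigl[\min(|1-0|,\,|1-4|) + \min(|5-0|,\,|5-4|)\bigr] = \tfrac{1}{2}(1 + 1) = 1 \\
D(B) &= \tfrac{1}{2}\bigl[\min(|1-10|,\,|1-12|) + \min(|5-10|,\,|5-12|)\bigr] = \tfrac{1}{2}(9 + 5) = 7 \\
s^* &= \arg\min_{s \in \{A, B\}} D(s) = A
\end{align*}
```

The two query beats are matched to different prototypes of $A$. With a single mean prototype at $2$, the same beats would give $D(A) = \tfrac{1}{2}(1 + 3) = 2$, so the multi-prototype template fits the query more tightly in this toy case.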
3. Algorithmic Workflow
The HAM procedure consists of two primary phases: enrollment and verification. The following pseudocode outlines the process:
    Function ENROLL(subject s, embeddings E_s = {u_{s,n}}_{n=1..N_s}, K):
        1. Run K-means on E_s to produce assignments c_{s,n} ∈ {1…K}
        2. For k = 1…K, form prototype:
               p_{s,k} ← mean of { u_{s,n} | c_{s,n} = k }
        3. Store {p_{s,1}, …, p_{s,K}} as subject s's gallery templates
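The enrollment step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the cited implementation: the function names (`enroll`, `farthest_point_init`) and the toy 2-D embeddings are hypothetical, and a deterministic farthest-point seeding replaces random initialization purely so that the sketch is reproducible. A practical system would more likely call an existing K-means routine.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def farthest_point_init(embeddings, k):
    """Deterministic seeding (an assumption made for reproducibility):
    start from the first beat, then repeatedly add the beat that is
    farthest from all centres chosen so far."""
    centres = [list(embeddings[0])]
    while len(centres) < k:
        nxt = max(embeddings, key=lambda u: min(sq_dist(u, c) for c in centres))
        centres.append(list(nxt))
    return centres


def enroll(embeddings, k=3, max_iters=100):
    """Return K prototypes: the means of the K-means clusters of one
    subject's beat embeddings."""
    if len(embeddings) < k:
        raise ValueError("need at least k enrollment beats")
    centres = farthest_point_init(embeddings, k)
    for _ in range(max_iters):
        # Assignment step: each beat goes to the index of its nearest centre.
        assign = [min(range(k), key=lambda j: sq_dist(u, centres[j]))
                  for u in embeddings]
        # Update step: each centre becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [u for u, c in zip(embeddings, assign) if c == j]
            # An empty cluster keeps its previous centre (one simple policy).
            updated.append(mean_vector(members) if members else centres[j])
        if updated == centres:  # centres stopped moving
            break
        centres = updated
    return centres


# Toy enrollment set: nine 2-D "embeddings" forming three separated modes.
beats = [
    [0.0, 0.0], [0.2, 0.0], [0.0, 0.2],
    [5.0, 5.0], [5.2, 5.0], [5.0, 5.2],
    [9.0, 0.0], [9.2, 0.0], [9.0, 0.2],
]
prototypes = enroll(beats, k=3)
```

Each returned prototype is the mean of one mode, so an atypical group of beats ends up in its own prototype instead of shifting a single global mean.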
    Function VERIFY(query embeddings {u_q^m}_{m=1..M}, gallery = { p_{s,k} } for all s, k):
        1. For each enrolled s, compute:
               D(s) = (1/M) · Σ_{m=1}^M [ min_{k=1..K} d( u_q^m, p_{s,k} ) ]
        2. Return identity s* = arg min_s D(s)
The workflow is parameter-light: clustering runs once, offline, at enrollment, and no prototype updates are performed online or during verification.
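The matching step can be sketched in Python as below. The gallery layout (a dictionary from subject name to a list of prototype vectors), the subject names, and the function names are assumptions made for illustration; only the average-min rule itself comes from the pseudocode.

```python
import math


def subject_distance(query_beats, prototypes, dist=math.dist):
    """Average-min distance D(s): for each query beat take the distance to
    the nearest prototype of the subject, then average over the beats."""
    nearest = [min(dist(u, p) for p in prototypes) for u in query_beats]
    return sum(nearest) / len(nearest)


def identify(query_beats, gallery, dist=math.dist):
    """Return (s*, scores), where s* minimises D(s) over enrolled subjects."""
    scores = {s: subject_distance(query_beats, protos, dist)
              for s, protos in gallery.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical gallery with K = 3 prototypes per subject (2-D for readability).
gallery = {
    "alice": [[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]],
    "bob": [[10.0, 10.0], [12.0, 10.0], [10.0, 13.0]],
}
query = [[0.0, 1.0], [4.0, 3.0]]  # M = 2 query beats

best, scores = identify(query, gallery)
print(best, scores)
```

Passing a different callable as `dist` swaps the metric without changing the matching logic. `math.dist` computes Euclidean distance and requires Python 3.8 or later.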
4. Key Hyperparameters and Design Choices
Several parameters critically affect the efficacy and efficiency of HAM:
| Parameter | Typical Setting | Notes |
|---|---|---|
| Number of prototypes $K$ | 1 to 5 (best at $K=3$) | Improvement plateaus beyond $K=3$; chosen based on ablation studies |
| Distance metric | Euclidean (L2), cosine (with L2-norm) | Euclidean used consistently across the cited experiments |
| Clustering algorithm | K-means (random init, 100-200 iters) | Mini-batch variants are feasible |
| Update rate | Offline only | No online adaptation performed |
In the cited study, accuracy increased with $K$ up to $K=3$, after which further gains were negligible. Euclidean distance was consistently employed, and standard K-means was sufficient for cluster assignment (Huang et al., 1 Jan 2026).
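One general property is relevant to the choice of metric: for L2-normalized vectors, squared Euclidean distance equals twice the cosine distance, so both metrics rank prototypes in the same order. The short check below illustrates this identity. The helper names and the example vectors are illustrative and do not come from the cited study.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_distance(a, b):
    """1 - cosine similarity, computed on L2-normalized copies of a and b."""
    ua, ub = l2_normalize(a), l2_normalize(b)
    return 1.0 - sum(x * y for x, y in zip(ua, ub))


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


a, b = [3.0, 4.0], [4.0, 3.0]
cos_d = cosine_distance(a, b)
euc_sq = squared_euclidean(l2_normalize(a), l2_normalize(b))

# For unit vectors: ||a - b||^2 = 2 * (1 - cos(a, b)).
print(cos_d, euc_sq)
```

Because the relationship is monotonic, the minimum over prototypes and the arg min over subjects are unchanged when one metric is swapped for the other on normalized embeddings. Averaged scores and decision thresholds do change scale.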
5. Integration with Hierarchical Phase-Aware Fusion (HPAF)
The HAM strategy operates downstream of HPAF, which comprises three encoder modules: Intra-Phase Representation (IPR), Phase-Grouped Hierarchical Fusion (PGHF), and Global Representation Fusion (GRF). HPAF produces a fixed-dimensional embedding per heartbeat. During training, a margin-based contrastive loss is applied directly to these embeddings. At enrollment, instead of averaging all of a subject's embeddings into a single prototype, HAM clusters them into $K$ prototypes, which gives greater resilience to intra-class variability. At verification, query embeddings are matched by their minimum distance to the gallery prototypes, and the resulting score drives the subsequent accept/reject or identification decision.
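For the accept/reject case, one plausible reading is a simple threshold on the average-min distance to the claimed subject's prototypes. The sketch below assumes this rule. The source does not specify the decision rule, and the `threshold` value and function names are hypothetical.

```python
import math


def avg_min_distance(query_beats, prototypes):
    """Mean over query beats of the Euclidean distance to the nearest prototype."""
    return sum(min(math.dist(u, p) for p in prototypes)
               for u in query_beats) / len(query_beats)


def verify_claim(query_beats, claimed_prototypes, threshold):
    """Accept the identity claim when the score does not exceed `threshold`.
    The threshold is an assumed operating point; in practice it would be
    tuned on held-out genuine and impostor comparisons."""
    score = avg_min_distance(query_beats, claimed_prototypes)
    return score <= threshold, score


claimed = [[0.0, 0.0], [4.0, 0.0]]            # prototypes of the claimed subject
genuine_query = [[0.0, 1.0], [4.0, 1.0]]      # each beat lies near one prototype
impostor_query = [[10.0, 0.0]]                # far from every prototype

accepted, genuine_score = verify_claim(genuine_query, claimed, threshold=1.5)
rejected, impostor_score = verify_claim(impostor_query, claimed, threshold=1.5)
```

In an open-set identification setting, the same threshold could be applied to the best-matching subject's score so that unknown subjects are rejected instead of being assigned the nearest enrolled identity.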
6. Experimental Validation
HAM's effectiveness has been empirically validated across three public ECG datasets under open-set conditions:
- On ECGID, top-1 identification accuracy increased from ~87.9% (single prototype, $K=1$) to 94.47% ($K=3$), a gain of about 6.6 percentage points; Equal Error Rate (EER) fell from ~38.85% to 15.15%.
- On MIT-BIH, accuracy improved from ~90.08% to 98.22% and EER declined from ~17.82% to 8.04%.
- On the PTB dataset, accuracy rose from ~87.90% to 98.93% and EER fell from ~20.07% to 9.62%.
Cumulative Match Characteristic (CMC) and Receiver Operating Characteristic (ROC) analyses confirmed consistent improvements in rank-1 identification and overall discrimination. Ablation studies further established that most of HAM's benefit is realized when increasing $K$ from 1 to 3; additional prototypes yielded diminishing returns (Huang et al., 1 Jan 2026).
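For readers unfamiliar with the EER metric, the sketch below shows one common way to approximate it from distance scores: sweep a threshold and report the point where the false accept rate and false reject rate are closest. The exact evaluation protocol of the cited study is not described here, so the function and the toy scores are illustrative assumptions only.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER from distance scores (lower means more similar).

    genuine:  distances from same-subject comparisons
    impostor: distances from different-subject comparisons
    A comparison is accepted when its distance is <= the threshold.
    """
    best_gap, eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(g > t for g in genuine) / len(genuine)      # genuine rejected
        far = sum(i <= t for i in impostor) / len(impostor)   # impostor accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores, invented for the example.
genuine_scores = [0.1, 0.2, 0.3, 0.9]
impostor_scores = [0.25, 0.8, 1.0, 1.2]
eer = equal_error_rate(genuine_scores, impostor_scores)
print(eer)
```

With only a handful of scores the estimate is coarse. Evaluations on real data typically interpolate between thresholds on the ROC curve.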
7. Significance and Implications
HAM presents a parameter-light, computationally straightforward strategy to address the non-stationary and noisy characteristics of biological signals in biometric systems. By replacing single prototype templates with a multi-prototype representation, HAM robustly accommodates intra-class variations and mitigates noise-induced prototype drift, leading to substantial performance improvements in open-set biometric verification and identification. A plausible implication is that similar multi-prototype enrollment strategies may be beneficial across other biosignal-based recognition modalities subject to analogous sources of variability.