Heartbeat-Aware Multi-prototype (HAM)

Updated 8 January 2026

HAM is an ECG enrollment strategy that creates multiple prototypes via clustering heartbeat embeddings to enhance biometric recognition and reduce noise impact.
The method partitions each subject's enrollment embeddings with a clustering algorithm such as K-means; in the cited study, performance peaked at K=3, which balances accuracy against computational cost.
Experimental results on ECG datasets show significant improvements in identification accuracy and error reduction, validating HAM's robustness against physiological and acquisition noise.

Heartbeat-Aware Multi-prototype (HAM) is an enrollment strategy utilized in electrocardiography (ECG)-based biometric recognition systems to enhance identity authentication performance by mitigating the adverse impacts of heartbeat variability and signal noise. Rather than representing each subject by a single prototype vector derived from their physiological signals, HAM constructs a set of $K$ prototypes per subject, each capturing distinct modes within the individual's beat-level embedding distribution. This technique has demonstrated marked improvements in identification and verification accuracy within Hierarchical Phase-Aware Fusion (HPAF) systems and provides a robust approach to template construction in the presence of physiological and acquisition noise (Huang et al., 1 Jan 2026).

1. Motivation and Rationale

The primary motivation for HAM arises from the inherent variability and susceptibility to transient artifacts found in ECG signals. Single-prototype representations are sensitive to outlier heartbeats, which may be affected by transient noise sources such as electrode motion, muscle artifacts, or baseline wander. When a subject's entire enrollment set is condensed into a single vector, the negative influence of these atypical beats is not mitigated, resulting in prototype drift and degraded recognition performance.

HAM counters these issues by partitioning enrollment embeddings into $K$ disjoint clusters and constructing $K$ prototypes per individual. This strategy enables each prototype to model different physiological or noise-related modes in the heartbeats. During verification, a query heartbeat merely needs to correspond closely to one of the prototypes, thus reducing the likelihood that noise or rare events will dominate the matching process.

2. Mathematical Formulation

For a subject $s$, let $N_s$ denote the number of heartbeats acquired for enrollment. The beat-level embeddings are represented as $E_s = \{ u_{s,1}, u_{s,2}, \ldots, u_{s,N_s} \} \subset \mathbb{R}^D$, where each $u_{s,n}$ is produced by the GRF stage of the HPAF pipeline. Let $K$ be the chosen number of prototypes per subject.

Clustering: $E_s$ is partitioned into $K$ clusters using K-means or an alternative assignment algorithm, with $c_{s,n} \in \{1, ..., K\}$ denoting the cluster assignment for each embedding.

Prototype Construction:

$p_{s,k} = \frac{ \sum_{n=1}^{N_s} \mathbb{1}[c_{s,n}=k] \cdot u_{s,n} }{ \sum_{n=1}^{N_s} \mathbb{1}[c_{s,n}=k] } \quad \text{for } 1 \leq k \leq K$
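The prototype formula is a per-cluster mean, which the following minimal sketch makes concrete. The function name `build_prototypes` and the toy 2-D embeddings and labels are illustrative assumptions, not taken from the cited study; the cluster assignments are taken as given here.

```python
def build_prototypes(embeddings, assignments):
    """Average the embeddings of each cluster to get one prototype per cluster.

    embeddings:  list of equal-length float lists (the u_{s,n})
    assignments: list of cluster labels c_{s,n}, one per embedding
    Returns a dict mapping cluster label k -> prototype p_{s,k}.
    """
    sums = {}
    counts = {}
    for u, k in zip(embeddings, assignments):
        if k not in sums:
            sums[k] = [0.0] * len(u)
            counts[k] = 0
        sums[k] = [a + b for a, b in zip(sums[k], u)]
        counts[k] += 1
    # Only labels that actually occur get a prototype, so no division by zero.
    return {k: [v / counts[k] for v in sums[k]] for k in sums}


# Toy 2-D embeddings for one subject; values and labels are made up.
E_s = [[1.0, 0.0], [3.0, 0.0], [0.0, 4.0], [0.0, 6.0], [10.0, 10.0]]
c_s = [1, 1, 2, 2, 3]
prototypes = build_prototypes(E_s, c_s)
print(prototypes)
# {1: [2.0, 0.0], 2: [0.0, 5.0], 3: [10.0, 10.0]}
```

In this toy case the isolated beat at (10, 10) ends up in its own cluster, so it does not pull the other two prototypes away from the beats they represent.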

Query Matching: For a query embedding $u_q$, the distance to subject $s$ is:

$d(s\,|\,u_q) = \min_{1 \leq k \leq K} d(u_q, p_{s,k})$

where $d(\cdot,\cdot)$ denotes a distance function such as Euclidean or cosine distance. The recognized subject is:

$s^* = \arg\min_{s} d(s\,|\,u_q)$

For multiple query beats $\{ u_q^m \}_{m=1}^M$, the average-min distance strategy computes

$D(s) = \frac{1}{M} \sum_{m=1}^M \min_{k} d(u_q^m, p_{s,k}), \quad s^* = \arg\min_s D(s)$

This allows robust verification by aggregating prototype matches across multiple observed beats.
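A minimal sketch of the two matching rules, assuming Euclidean distance and a gallery stored as a dictionary from subject identifier to prototype list. The function names, subject identifiers, and toy vectors are hypothetical. With a single query beat ($M = 1$), `average_min_distance` reduces to the single-query rule.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def subject_distance(u_q, prototypes, dist=euclidean):
    """d(s | u_q): distance from one query beat to the closest prototype."""
    return min(dist(u_q, p) for p in prototypes)


def average_min_distance(queries, prototypes, dist=euclidean):
    """D(s): mean over query beats of the min-over-prototypes distance."""
    total = sum(subject_distance(u, prototypes, dist) for u in queries)
    return total / len(queries)


def identify(queries, gallery, dist=euclidean):
    """Return the subject with the smallest D(s)."""
    return min(gallery, key=lambda s: average_min_distance(queries, gallery[s], dist))


# Made-up gallery with K = 2 prototypes per subject.
gallery = {
    "alice": [[0.0, 0.0], [4.0, 0.0]],
    "bob": [[0.0, 5.0], [5.0, 5.0]],
}
queries = [[0.0, 1.0], [4.0, 1.0]]

print(average_min_distance(queries, gallery["alice"]))  # 1.0
print(identify(queries, gallery))                       # alice
```

Each query beat here sits next to a different prototype of the same subject, which is the situation the min-over-prototypes rule is designed for: a single averaged template at (2, 0) would be farther from both beats.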

3. Algorithmic Workflow

The HAM procedure consists of two primary phases: enrollment and verification. The following pseudocode outlines the process:

Function ENROLL(subject s, embeddings E_s = {u_{s,n}}_{n=1..N_s}, K):
    1. Run K-means on E_s to produce assignments c_{s,n} ∈ {1…K}
    2. For k=1…K, form prototype:
        p_{s,k} ← mean of { u_{s,n} | c_{s,n} = k }
    3. Store {p_{s,1},…,p_{s,K}} as subject-s’s gallery templates

Function VERIFY(query embeddings {u_q^m}_{m=1..M}, gallery = { p_{s,k} } for all s,k):
    1. For each enrolled s, compute:
        D(s) = (1/M) · Σ_{m=1}^M [ min_{k=1..K} d( u_q^m, p_{s,k} ) ]
    2. Return identity s* = arg min_s D(s)

This workflow is parameter-light: clustering runs once, offline, at enrollment, and the stored prototypes are not updated online or during verification.
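The pseudocode can be turned into a small runnable sketch. The version below uses a plain Lloyd's K-means written with the standard library only; the initialization, iteration cap, empty-cluster handling, and synthetic "beats" are all assumptions made for illustration and are not the cited study's implementation. In practice a library K-means would typically replace `enroll`'s inner loop.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def enroll(embeddings, k, iters=100, seed=0):
    """Cluster one subject's beat embeddings and return K prototypes."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(embeddings, k)]  # random init
    for _ in range(iters):
        # Assignment step: each embedding joins its nearest center.
        clusters = [[] for _ in range(k)]
        for u in embeddings:
            nearest = min(range(k), key=lambda j: euclidean(u, centers[j]))
            clusters[nearest].append(u)
        # Update step: recompute means; keep the old center if a cluster is empty.
        new_centers = [mean(c) if c else centers[j] for j, c in enumerate(clusters)]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers


def verify(queries, gallery):
    """Return (best subject, D(best)) under the average-min rule."""
    def avg_min(s):
        return sum(min(euclidean(u, p) for p in gallery[s]) for u in queries) / len(queries)
    best = min(gallery, key=avg_min)
    return best, avg_min(best)


# Synthetic stand-in for beat embeddings: noisy samples around a few modes.
rng = random.Random(1)

def beats(modes, n=20, noise=0.1):
    return [[m + rng.gauss(0, noise) for m in rng.choice(modes)] for _ in range(n)]

modes_a = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
modes_b = [[5.0, 5.0], [7.0, 5.0], [5.0, 7.0]]
gallery = {"A": enroll(beats(modes_a), k=3), "B": enroll(beats(modes_b), k=3)}

subject, score = verify(beats(modes_a, n=5), gallery)
print(subject)  # A
```

Only `enroll` touches the clustering step; `verify` reads the stored prototypes and never modifies them, matching the offline-only design described above.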

4. Key Hyperparameters and Design Choices

Several parameters critically affect the efficacy and efficiency of HAM:

Parameter	Typical Setting	Notes
Number of prototypes $K$	1 to 5 (best at $K=3$)	Improvement plateaus beyond $K=3$; chosen based on ablation studies
Distance metric	Euclidean (L2), cosine (with L2 normalization)	Used consistently across experiments
Clustering algorithm	K-means (random init, 100–200 iterations)	Mini-batch variants are feasible
Update rate	Offline only	No online adaptation performed

In the cited study, accuracy increased with $K$ up to $K=3$, after which further gains were negligible. Euclidean distance was consistently employed, and standard K-means was sufficient for cluster assignment (Huang et al., 1 Jan 2026).
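On the choice between the two listed distances: for L2-normalized vectors, squared Euclidean distance equals twice the cosine distance, $\lVert \hat a - \hat b \rVert^2 = 2\,(1 - \cos(a, b))$, so both produce the same nearest-prototype ranking. This is a general identity rather than a finding of the cited study; the short check below uses made-up vectors and hypothetical helper names.

```python
import math


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


a, b = [3.0, 4.0], [1.0, 0.0]
lhs = squared_euclidean(l2_normalize(a), l2_normalize(b))
rhs = 2.0 * cosine_distance(a, b)
print(math.isclose(lhs, rhs))  # True
```

Note that the equivalence holds only after normalization; on raw embeddings the two distances can rank prototypes differently.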

5. Integration with Hierarchical Phase-Aware Fusion (HPAF)

The HAM strategy operates downstream of HPAF, which comprises three encoder modules: Intra-Phase Representation (IPR), Phase-Grouped Hierarchical Fusion (PGHF), and Global Representation Fusion (GRF). HPAF produces a $D$-dimensional embedding $u_i$ per heartbeat. During training, a margin-based contrastive loss is applied directly to these embeddings. At enrollment, instead of averaging all of a subject's embeddings into a single prototype, HAM clusters them into $K$ prototypes, providing greater resilience against intra-class variability. At verification, each query embedding is scored by its minimum distance to a subject's prototypes in the gallery, and this score drives the subsequent accept/reject or identification decision.
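One way the minimum-distance score could feed an open-set decision is a simple distance threshold: return the closest enrolled subject, or reject if even the best match is too far. The sketch below is an assumption about how such a rule might look; the source does not specify the decision rule, and the function name, `threshold` value, and toy gallery are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def open_set_identify(u_q, gallery, threshold):
    """Return the closest enrolled subject, or None if the best match is too far.

    gallery maps subject id -> list of prototypes. `threshold` is a
    hypothetical operating point that would be tuned on held-out data.
    """
    scores = {s: min(euclidean(u_q, p) for p in protos) for s, protos in gallery.items()}
    best = min(scores, key=scores.get)
    return best if scores[best] <= threshold else None


gallery = {"s1": [[0.0, 0.0], [1.0, 1.0]], "s2": [[5.0, 5.0]]}
print(open_set_identify([0.9, 1.2], gallery, threshold=0.5))  # s1
print(open_set_identify([3.0, 3.0], gallery, threshold=0.5))  # None
```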

6. Experimental Validation

HAM's effectiveness has been empirically validated across three public ECG datasets under open-set conditions:

On ECGID, top-1 identification accuracy increased from ~87.9% (single prototype, $K=1$) to 94.47% ($K=3$), a gain of about 6.6 percentage points; Equal Error Rate (EER) fell from ~38.85% to 15.15%.
On MIT-BIH, accuracy improved from ~90.08% to 98.22% and EER declined from ~17.82% to 8.04%.
On the PTB dataset, accuracy rose from ~87.90% to 98.93% and EER fell from ~20.07% to 9.62%.

Cumulative Match Characteristic (CMC) and Receiver Operating Characteristic (ROC) analyses confirmed consistent improvements in rank-1 identification and overall discrimination. Ablation studies further established that most of HAM's benefit is realized when increasing $K$ from 1 to 3; additional prototypes yielded diminishing returns (Huang et al., 1 Jan 2026).
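For readers unfamiliar with the EER figures quoted above, the following sketch shows one common way to approximate an Equal Error Rate from distance scores: sweep a threshold and find the point where the false accept rate (FAR) and false reject rate (FRR) are closest. This is a generic illustration with made-up scores, not the evaluation protocol of the cited study.

```python
def error_rates(genuine, impostor, threshold):
    """FRR and FAR for distance scores, accepting when distance <= threshold."""
    frr = sum(d > threshold for d in genuine) / len(genuine)
    far = sum(d <= threshold for d in impostor) / len(impostor)
    return frr, far


def equal_error_rate(genuine, impostor):
    """Approximate EER by using every observed score as a candidate threshold.

    Returns (eer, threshold) at the point where |FAR - FRR| is smallest,
    reporting the mean of the two rates as the EER estimate.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        frr, far = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


genuine = [0.2, 0.3, 0.4, 0.9]   # made-up same-subject distances
impostor = [0.5, 0.8, 1.1, 1.3]  # made-up different-subject distances
eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)  # 0.25 0.5
```

With real data the sweep is usually done on a finer grid or by interpolating the ROC curve, since FAR and FRR rarely cross exactly at an observed score.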

7. Significance and Implications

HAM presents a parameter-light, computationally straightforward strategy to address the non-stationary and noisy characteristics of biological signals in biometric systems. By replacing single prototype templates with a multi-prototype representation, HAM robustly accommodates intra-class variations and mitigates noise-induced prototype drift, leading to substantial performance improvements in open-set biometric verification and identification. A plausible implication is that similar multi-prototype enrollment strategies may be beneficial across other biosignal-based recognition modalities subject to analogous sources of variability.


References (1)

Hear the Heartbeat in Phases: Physiologically Grounded Phase-Aware ECG Biometrics (2026)
