LubDubDecoder: Cardiac Monitoring via Hearables

Updated 20 September 2025
  • LubDubDecoder is a system that transforms everyday hearables into clinical-grade cardiac monitors using advanced signal processing and deep learning.
  • It employs a two-branch convolutional autoencoder with temporal self-attention, achieving high correlations (~0.95 SCG, ~0.94 GCG) across users and devices.
  • Its robust design effectively handles motion artifacts and device variability, enabling continuous cardiac diagnostics in clinical and consumer settings.

LubDubDecoder is a system for fine-grained monitoring of micro-mechanical cardiac vibrations, enabling everyday hearables—such as in-ear earbuds, over-ear headphones, and bone conduction devices—to function as clinical-grade cardiac monitoring instruments. The system repurposes the built-in transducer, typically a speaker, in hearables to capture the canonical “lub-dub” heart sounds generated by valve activity. By leveraging the shared temporal and spectral characteristics of these sounds with underlying mechanical vibrations, LubDubDecoder reconstructs high-fidelity seismocardiography (SCG) and gyrocardiography (GCG) waveforms, extracting precise timing of micro-cardiac events. The approach obviates the need for specialized electrodes or chest-mounted sensors, robustly generalizes across device types, and maintains high accuracy in various use settings, including repeated remounting and music playback.

1. System Architecture and Workflow

LubDubDecoder comprises several distinct processing modules to enable cardiac monitoring via hearables. The workflow begins with acoustic capture using the device’s native transducer. For in-ear earbuds, the occlusion effect enhances low-frequency sensitivity for cardiac sounds; for over-ear headphones equipped only with speakers, the system exploits acoustic reciprocity, allowing the speaker to function as a passive microphone for acquiring heart sounds.

During the initial calibration phase, the user concurrently wears the hearable and places a smartphone equipped with an inertial measurement unit (IMU) on their chest to obtain reference SCG and GCG signals. Paired data collection provides the basis for learning the mapping between ear-based cardiac sounds and mechanical waveforms.

The pipeline includes:

  • Motion artifact removal, utilizing Mel-frequency cepstral coefficients (MFCC) and a classifier with ROC AUC approaching 0.994, to excise segments corrupted by artifacts.
  • Signal conditioning and segmentation, where audio is resampled to 500 Hz and filtered (4th-order Butterworth, 5–45 Hz), then partitioned into 800-ms windows corresponding to cardiac cycles.
  • Deep encoder–decoder reconstruction, featuring a two-branch convolutional autoencoder (dilated 1D convolutions for local detail, large receptive fields for long-range dependency) augmented by temporal self-attention, residual connections, and pooling layers.
  • Fiducial point labeling for critical events—mitral valve closure, isovolumetric contraction, aortic opening, maximal acceleration, rapid ejection—anchored via peak detection on the aortic opening with heuristics over a 200-ms span.
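
The conditioning and segmentation step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes SciPy is available, reads "4th-order" as the order argument `N=4` passed to `scipy.signal.butter`, assumes zero-phase filtering via `sosfiltfilt`, and uses a hypothetical `pre_s` offset that places each window start 200 ms before its anchor peak. Only the 500 Hz rate, the 5–45 Hz band, and the 800-ms window length come from the description above.

```python
from math import gcd

import numpy as np
from scipy.signal import butter, resample_poly, sosfiltfilt

TARGET_FS = 500           # Hz, from the text
BAND_HZ = [5.0, 45.0]     # pass band in Hz, from the text
WINDOW_S = 0.8            # 800-ms cardiac-cycle window, from the text


def condition(audio, fs_in):
    """Resample to 500 Hz, then band-pass filter to 5-45 Hz."""
    # resample_poly needs integer up/down factors, so fs_in is assumed integer.
    fs_in = int(fs_in)
    g = gcd(TARGET_FS, fs_in)
    x = resample_poly(audio, TARGET_FS // g, fs_in // g)
    # N=4 is one reading of "4th-order"; for a band-pass design SciPy
    # returns a filter of order 2*N. Zero-phase filtering is an assumption.
    sos = butter(4, BAND_HZ, btype="bandpass", fs=TARGET_FS, output="sos")
    return sosfiltfilt(sos, x)


def segment(x, anchors, pre_s=0.2):
    """Cut 800-ms windows that start pre_s seconds before each anchor index.

    Anchors whose window would run past either end of the signal are skipped.
    """
    n = int(round(WINDOW_S * TARGET_FS))   # 400 samples at 500 Hz
    pre = int(round(pre_s * TARGET_FS))    # hypothetical lead-in before the anchor
    starts = [a - pre for a in anchors if a - pre >= 0 and a - pre + n <= len(x)]
    if not starts:
        return np.empty((0, n))
    return np.stack([x[s:s + n] for s in starts])
```

Calling `segment(condition(audio, 4000), anchors)` on a 4 kHz recording returns an array with one 400-sample row per usable anchor. Anchor indices would come from a separate peak detector, which is not shown here.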

The system maintains data quality by computing a signal-to-noise ratio (SNR) for each cardiac cycle: the power in a 400-ms signal region centered on the S1 peak is compared with the power in a noise region.

2. Signal Processing and Deep Learning Methodology

Heart sounds acquired at the ear are resampled and subjected to narrowband filtering to isolate cardiac frequencies. Segmentation uses anchor peaks (S1 for acoustic signals, aortic opening for mechanical signals) detected by algorithms assessing local prominence. SNR per cardiac cycle is calculated using the power formula $P = \frac{1}{N} \sum_{i=1}^N x[i]^2$, with SNR computed as $10 \log_{10}(P_\text{signal} / P_\text{noise})$.
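
The per-cycle SNR can be illustrated with a short sketch. The source does not say where the noise region lies, so this example assumes it is every sample of the cycle window outside the 400-ms signal region; the function names and the `s1_index` argument are hypothetical.

```python
import numpy as np


def mean_power(x):
    """P = (1/N) * sum(x[i]^2)."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(x ** 2))


def cycle_snr_db(cycle, s1_index, fs=500, signal_s=0.4):
    """SNR in dB for one cardiac cycle.

    Signal region: signal_s seconds centered on the S1 peak (from the text).
    Noise region: the rest of the cycle (an assumption of this sketch).
    """
    cycle = np.asarray(cycle, dtype=float)
    half = int(round(signal_s * fs / 2))
    lo = max(0, s1_index - half)
    hi = min(len(cycle), s1_index + half)
    in_signal = np.zeros(len(cycle), dtype=bool)
    in_signal[lo:hi] = True
    if in_signal.all():
        raise ValueError("no samples left for the noise region")
    p_signal = mean_power(cycle[in_signal])
    p_noise = mean_power(cycle[~in_signal])
    return 10.0 * np.log10(p_signal / p_noise)
```

A cycle whose signal region has 100 times the mean power of its noise region scores 20 dB. Cycles falling below a chosen threshold could then be discarded, although the threshold itself is not given in the source.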

The reconstruction model is a two-branch convolutional autoencoder:

  • The first branch leverages dilated convolutions for short- and medium-term temporal context.
  • The second branch uses large receptive fields for protracted dependencies.
  • Combined outputs feed into temporal self-attention, with skip connections and pooling for stable gradient propagation.
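
To make the contrast between the two branches concrete, the sketch below computes the receptive field of a stack of stride-1 dilated 1D convolutions, which is $1 + \sum_l (k - 1) d_l$ for kernel size $k$ and dilations $d_l$. The kernel sizes and dilation schedules are hypothetical values chosen for illustration; the source does not report the actual layer settings.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in samples, of stacked stride-1 dilated 1D convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


FS = 500  # Hz, sampling rate from the text

# Hypothetical layer settings, for illustration only.
local_branch = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8])
wide_branch = receptive_field(kernel_size=15, dilations=[1, 2, 4, 8])

for name, rf in [("local", local_branch), ("wide", wide_branch)]:
    print(f"{name}: {rf} samples = {1000 * rf / FS:.0f} ms")
# local: 31 samples = 62 ms
# wide: 211 samples = 422 ms
```

Under these assumed settings the first branch sees roughly one heart-sound component at a time, while the second spans about half of an 800-ms cycle window (400 samples at 500 Hz).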

For device generalization, frequency-domain equalization is incorporated:

  • Mean cardiac cycle signals from the reference device ($x_\text{ref}(t)$) and the target device ($x_\text{tgt}(t)$) are Fourier-transformed: $X_\text{ref}(f) = \mathcal{F}\{x_\text{ref}(t)\}$, $X_\text{tgt}(f) = \mathcal{F}\{x_\text{tgt}(t)\}$.
  • Mapping function: $H(f) = \frac{X_\text{ref}(f) \cdot X_\text{tgt}(f)^*}{|X_\text{tgt}(f)|^2 + \epsilon}$.
  • For each new cycle, the signal is adjusted: $\hat{x}_{\text{tgt}\rightarrow\text{ref}}(t) = \mathcal{F}^{-1}\{H(f) \cdot X_\text{tgt}(f)\}$, then normalized using the $L_2$ norm.

This normalization compensates for hardware variability, permitting “zero-effort” cross-device adaptation.
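
The equalization steps can be sketched with NumPy's real-input FFT. This is an illustrative reading of the formulas above rather than the authors' code: the function names and the default `eps` are assumptions, and every cycle is assumed to have the same length as the mean cycles used to fit $H(f)$.

```python
import numpy as np


def fit_equalizer(mean_ref, mean_tgt, eps=1e-8):
    """H(f) = X_ref(f) * conj(X_tgt(f)) / (|X_tgt(f)|^2 + eps)."""
    X_ref = np.fft.rfft(mean_ref)
    X_tgt = np.fft.rfft(mean_tgt)
    return X_ref * np.conj(X_tgt) / (np.abs(X_tgt) ** 2 + eps)


def equalize(cycle, H):
    """Map one target-device cycle toward the reference device, then L2-normalize."""
    n = len(cycle)
    y = np.fft.irfft(H * np.fft.rfft(cycle), n=n)
    norm = np.linalg.norm(y)
    return y / norm if norm > 0 else y
```

The `eps` term keeps $H(f)$ bounded at frequencies where the target device carries almost no energy. Because the output is $L_2$-normalized, an overall gain difference between devices has no effect on the result.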

3. Quantitative Performance and Validation

The system was evaluated in an IRB-approved study with 18 users. Key performance metrics include:

| Scenario | SCG Correlation | GCG Correlation | Fiducial Timing Error |
|---|---|---|---|
| Within-user | ~0.95 (±0.04) | ~0.94 (±0.04) | Median 0–2 ms; 95th percentile 4–20 ms |
| Cross-user | ~0.88 (±0.07) | ~0.89 (±0.05) | Similar timing range |
| Cross-device (zero-effort) | ~0.91 (±0.04) | -- | -- |

The motion artifact removal module achieved 97.7% accuracy. During remounting or music playback—where interference might be expected—reconstruction quality was preserved (correlations remained ~0.89–0.95). This suggests strong operational robustness.
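
The two kinds of metric reported above, waveform correlation and fiducial timing error, can be computed as in the sketch below. The function names are hypothetical, and the source does not state how errors were aggregated beyond reporting medians and 95th percentiles, so the absolute-error convention here is an assumption.

```python
import numpy as np


def pearson_r(reconstructed, reference):
    """Pearson correlation between a reconstructed and a reference waveform."""
    a = np.asarray(reconstructed, dtype=float)
    b = np.asarray(reference, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def timing_error_summary(pred_ms, true_ms):
    """Median and 95th percentile of absolute fiducial timing errors, in ms."""
    err = np.abs(np.asarray(pred_ms, dtype=float) - np.asarray(true_ms, dtype=float))
    return float(np.median(err)), float(np.percentile(err, 95))
```

Pearson correlation ignores offset and scale, so it measures agreement in waveform shape rather than amplitude. The timing errors capture the complementary question of whether individual events are placed at the right instant.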

4. Adaptation to User and Device Variability

LubDubDecoder includes mechanisms for adapting both to diverse user physiology and device-specific characteristics:

  • User adaptation is enabled by a brief calibration phase, requiring only five cardiac cycles (approx. 4 seconds) to tune the model for new heart-to-ear transmission profiles.
  • Device adaptation leverages frequency-domain equalization for cross-device normalization, supporting “zero-effort” conversion between hearable hardware without additional calibration from the user.

Performance tests indicate only marginal degradation after remounting, and reliable operation persists in realistic dynamic environments, e.g., during music playback.

5. Clinical and Consumer Applications

Potential applications of LubDubDecoder span clinical, consumer, and research domains:

  • Continuous ambulatory cardiac monitoring in non-clinical environments, with the capability to detect micro-cardiac valvular events and precisely measure relevant timings—critical for chronic disease tracking, arrhythmia monitoring, and acute episode identification.
  • Integration into consumer electronics, including hearables and hearing aids, providing accessible health insights through unobtrusive biosensing, thus enhancing fitness tracking, proactive health management, and emergency detection schemes.
  • Athletic and research use, for studying cardiac responses to exercise and investigating phenomena such as the “white-coat effect” in naturalistic settings. The system’s capacity for detailed event timing offers a new avenue for clinical-grade cardiac analytics without conventional medical instrumentation.

A plausible implication is that widespread deployment in consumer devices could democratize access to fine-grained cardiac diagnostics, allowing large-scale population studies and real-time health interventions.

6. Comparative Significance and Research Context

LubDubDecoder advances mobile cardiac monitoring by pairing generic wearable transducers with signal processing and deep learning. Its reconstructions closely track reference waveforms from chest-mounted sensors, with Pearson correlations of about 0.95 for SCG and 0.94 for GCG in within-user tests and 0.88–0.91 across users and devices. This positions LubDubDecoder as a bridge between consumer wearables and clinical biosensing, with non-invasive operation and high adaptability. By combining machine learning, acoustic physics, and physiologically anchored timing analysis, the system offers a comprehensive approach to next-generation cardiac monitoring.
