
WPD-CCA Method for Artifact Removal

Updated 5 January 2026
  • WPD-CCA is a hybrid method that combines wavelet packet decomposition with canonical correlation analysis to correct motion artifacts in single-channel EEG/fNIRS data.
  • It generates pseudo-multichannel representations from single-channel recordings, enabling efficient separation and removal of artifact components.
  • Performance metrics like ΔSNR and percentage artifact reduction illustrate its superiority over single-stage methods and traditional blind source separation techniques.

The WPD-CCA (Wavelet Packet Decomposition-Canonical Correlation Analysis) method offers a two-stage data-driven pipeline for correcting motion artifacts in single-channel electroencephalogram (EEG) and functional near-infrared spectroscopy (fNIRS) signals. Designed to address the non-stationarity and contamination issues inherent in wearable EEG/fNIRS measurements, WPD-CCA combines the time–frequency localization provided by wavelet packet decomposition with the source separation capacity of canonical correlation analysis. This hybrid approach circumvents the requirement for multi-channel inputs or physically separate artifact references, instead generating pseudo-multichannel representations from a single sensor. WPD-CCA demonstrates superior denoising efficacy compared to single-stage WPD and other blind source separation (BSS) methods, as evaluated by difference in signal-to-noise ratio (ΔSNR) and percentage reduction in motion artifacts (η) (Hossain et al., 2022).

1. Two-Stage Pipeline: Structure and Rationale

WPD-CCA is composed of two sequential operations:

  1. Wavelet Packet Decomposition (WPD): The input signal $x(n)$ is decomposed into $2^j$ frequency-localized sub-bands via wavelet packet basis functions at level $j$. Each sub-band isolates distinct spectral content, exploiting the empirical observation that motion artifacts produce large-magnitude coefficients localized sparsely across few sub-bands. This process generates a pseudo-multichannel dataset from the original single-channel input.
  2. Canonical Correlation Analysis (CCA): The WPD sub-bands are interpreted as channels. CCA is then applied between the (pseudo-multichannel) WPD sub-bands $X(t) \in \mathbb{R}^{M \times T}$ and their time-lagged versions $Y(t) = X(t-1) + X(t+1)$. By maximizing correlations between linear projections of $X$ and $Y$, CCA yields canonical variates ranked by autocorrelation strength. Artifact-dominated components typically possess the largest autocorrelation, and are identified for subsequent removal prior to reconstructing the artifact-suppressed signal (Hossain et al., 2022).

This two-stage approach is motivated by the observation that neither WPD nor CCA alone achieves optimal artifact suppression in single-channel settings. In WPD-CCA, WPD isolates artifact-rich sub-bands, while CCA further unmixes temporally autocorrelated artifact signals from the underlying neuronal or hemodynamic activity.

2. Mathematical Foundations

2.1 Wavelet Packet Decomposition

Given signal $x(n) = s(n) + v(n)$, with $s(n)$ the true underlying physiological signal and $v(n)$ the artifact, WPD at level $j$ produces basis functions $\psi_{j,i}(n)$ and expansion coefficients

$$X_{j,i}(k) = \langle x(n), \psi_{j,i}(n - 2^j k) \rangle$$

which decompose as $X_{j,i}(k) = S_{j,i}(k) + V_{j,i}(k)$. The basis recursion relations are

$$\psi_{j,2i}(n) = \sum_k h(k)\, \psi_{j-1,i}(n - 2^{j-1} k)$$

$$\psi_{j,2i+1}(n) = \sum_k g(k)\, \psi_{j-1,i}(n - 2^{j-1} k)$$

where $h$, $g$ are the low- and high-pass wavelet filters (e.g., Daubechies ‘db1’, ‘db2’, Fejér–Korovkin ‘fk4’, etc.).
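
To make the filter-bank recursion concrete, the following is a minimal NumPy-only sketch. It assumes the Haar ('db1') filter pair and a signal length divisible by $2^j$. The function names `haar_split`, `haar_merge`, and `wpd_subbands` are hypothetical and are not taken from Hossain et al. Each terminal node is reconstructed on its own with all other nodes zeroed, so the rows act as pseudo-channels that sum back to the input. Rows come out in natural filter-bank order, not sorted by frequency.

```python
import numpy as np

def haar_split(x):
    """One analysis step: low-pass (h) and high-pass (g) Haar branches."""
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return a, d

def haar_merge(a, d):
    """Inverse of haar_split: rebuild the signal from both branches."""
    x = np.empty(2 * a.size)
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def wpd_subbands(x, level):
    """Return a (2**level, len(x)) array of sub-band signals that sum to x."""
    x = np.asarray(x, dtype=float)
    if level == 0:
        return x[np.newaxis, :]
    if x.size % 2:
        raise ValueError("signal length must be divisible by 2**level")
    a, d = haar_split(x)
    zeros = np.zeros_like(a)
    # Split both branches further, then map each child back to this scale
    low = [haar_merge(c, zeros) for c in wpd_subbands(a, level - 1)]
    high = [haar_merge(zeros, c) for c in wpd_subbands(d, level - 1)]
    return np.vstack(low + high)

# Synthetic single-channel input (illustrative only)
rng = np.random.default_rng(0)
x = rng.standard_normal(256)
X = wpd_subbands(x, level=4)   # 16 pseudo-channels
```

Because every step is linear and invertible, `X.sum(axis=0)` recovers `x` up to floating-point error, which is the property the later sub-band summation relies on.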

2.2 Canonical Correlation Analysis

With $X(t)$ (size $M \times T$, $M = 2^j$) and $Y(t) = X(t-1) + X(t+1)$, CCA finds $w_x$, $w_y$ to maximize

$$\rho = \mathrm{corr}(w_x^T X, w_y^T Y)$$

Under unit-variance constraints on the two projections, this leads to the generalized eigenvalue problem

$$C_{XX}^{-1} C_{XY} C_{YY}^{-1} C_{YX} w_x = \rho^2 w_x$$

The canonical weights $w_x$, stacked as the rows of a matrix $W$, unmix $X$ into variates $S = W X$ ordered by autocorrelation. Artifact components yield high canonical correlations.
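
The eigenvalue problem above can be solved numerically in a few lines. The sketch below is one possible NumPy formulation, not the authors' implementation. It whitens with a Cholesky factor of $C_{XX}$ so that a symmetric eigen-solver can be used, and adds a small `ridge` term for numerical stability. The function name `cca_lagged`, the `ridge` parameter, and the synthetic two-source mixture are all assumptions made for illustration.

```python
import numpy as np

def cca_lagged(X, ridge=1e-9):
    """CCA between X(t) and Y(t) = X(t-1) + X(t+1).

    Returns W (rows are the weights w_x^T) and the canonical
    correlations rho, both sorted from largest to smallest rho.
    """
    X = np.asarray(X, dtype=float)
    Xc = X[:, 1:-1]                     # X(t) on interior samples
    Yc = X[:, :-2] + X[:, 2:]           # X(t-1) + X(t+1)
    Xc = Xc - Xc.mean(axis=1, keepdims=True)
    Yc = Yc - Yc.mean(axis=1, keepdims=True)
    M, n = Xc.shape
    Cxx = Xc @ Xc.T / n + ridge * np.eye(M)
    Cyy = Yc @ Yc.T / n + ridge * np.eye(M)
    Cxy = Xc @ Yc.T / n
    # Whitening: with Cxx = L L^T, the problem becomes symmetric in u = L^T w_x
    L = np.linalg.cholesky(Cxx)
    Linv = np.linalg.inv(L)
    K = Linv @ Cxy @ np.linalg.solve(Cyy, Cxy.T) @ Linv.T
    K = (K + K.T) / 2.0                 # remove round-off asymmetry
    evals, U = np.linalg.eigh(K)        # eigenvalues are rho^2, ascending
    order = np.argsort(evals)[::-1]
    rho = np.sqrt(np.clip(evals[order], 0.0, 1.0))
    W = (Linv.T @ U[:, order]).T
    return W, rho

# Illustrative mixture: one slow (highly autocorrelated) source plus white noise
rng = np.random.default_rng(1)
t = np.arange(2000)
slow = np.sin(2 * np.pi * t / 400.0)
noise = rng.standard_normal(t.size)
A = np.array([[1.0, 0.5], [0.3, 1.0]])  # hypothetical mixing matrix
X = A @ np.vstack([slow, noise])
W, rho = cca_lagged(X)
S = W @ X                               # canonical variates, most autocorrelated first
```

In this toy mixture the first row of `S` tracks the slow source and the second tracks the noise, which mirrors the ordering by autocorrelation described above.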

2.3 Reconstruction

Artifacts are identified by evaluating whether zeroing a canonical variate increases the correlation with a ground-truth (reference) signal or surpasses an autocorrelation threshold. Excluding artifact-dominated canonical variates forms $\tilde{S}$, and the cleaned pseudo-channels are reconstructed by $\tilde{X} = W^{-1} \tilde{S}$. Summing across sub-bands yields the final artifact-corrected signal.
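
As a rough illustration of the reconstruction step, the sketch below zeroes selected variates and maps the rest back through $W^{-1}$. The function name `remove_variates` is hypothetical, and the unmixing matrix used in the demo is simply the inverse of a known mixing matrix, standing in for CCA weights so that the expected result can be checked by hand.

```python
import numpy as np

def remove_variates(X, W, artifact_idx):
    """Zero the chosen canonical variates and return the summed, cleaned signal."""
    S = W @ X                                   # S = W X
    S_clean = S.copy()
    idx = np.asarray(list(artifact_idx), dtype=int)
    S_clean[idx, :] = 0.0                       # form S~ by zeroing artifact rows
    X_clean = np.linalg.solve(W, S_clean)       # X~ = W^{-1} S~
    return X_clean.sum(axis=0)                  # sum over pseudo-channels

# Toy example with a known mixing matrix (assumption for illustration)
t = np.linspace(0.0, 1.0, 500, endpoint=False)
Z = np.vstack([
    np.sign(np.sin(2 * np.pi * 2 * t)),         # stand-in "artifact" source
    np.sin(2 * np.pi * 10 * t),
    np.cos(2 * np.pi * 25 * t),
])
A = np.array([[1.0, 0.2, 0.1],
              [0.4, 1.0, 0.3],
              [0.2, 0.1, 1.0]])
X = A @ Z
W = np.linalg.inv(A)                            # stand-in for the CCA unmixing matrix
cleaned = remove_variates(X, W, artifact_idx=[0])
expected = (A[:, 1:] @ Z[1:]).sum(axis=0)       # contribution of the retained sources
```

Using `np.linalg.solve` avoids forming $W^{-1}$ explicitly; with an empty `artifact_idx` the function returns the plain sum of the pseudo-channels.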

3. Detailed Algorithmic Steps

3.1 Preprocessing

  • EEG: Downsample from 2048 Hz to 256 Hz, apply a 50 Hz notch filter, remove polynomial baseline drift.
  • fNIRS: Use 25 Hz sampling rate, apply similar notch/baseline removal.
  • Entire signals (~9 minutes duration) are processed as a single segment.
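
Of the preprocessing steps listed, baseline removal is the easiest to sketch without a filtering library. The example below is a NumPy-only illustration that fits and subtracts a low-order polynomial. The polynomial order (3 here) and the function name `remove_polynomial_baseline` are assumptions, since the source does not state the order used. Downsampling and notch filtering are not shown.

```python
import numpy as np

def remove_polynomial_baseline(x, fs, order=3):
    """Subtract a least-squares polynomial fit (order is an assumed value)."""
    x = np.asarray(x, dtype=float)
    t = np.arange(x.size) / fs
    # Polynomial.fit rescales t internally, which keeps the fit well conditioned
    baseline = np.polynomial.Polynomial.fit(t, x, deg=order)(t)
    return x - baseline

# Synthetic 10 s record at 256 Hz: a 2 Hz oscillation riding on a slow drift
fs = 256.0
t = np.arange(2560) / fs
sine = np.sin(2 * np.pi * 2.0 * t)
drift = 0.5 * t**2 - 2.0 * t + 3.0
detrended = remove_polynomial_baseline(sine + drift, fs)
```

A low order removes slow drift while leaving oscillatory content largely intact; higher orders risk fitting part of the signal of interest.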

3.2 WPD Stage

  • Decomposition level $j = 4$ (16 sub-bands).
  • Wavelet packet options: db1, db2, db3, sym4–6, coif1–3, fk4, fk6, fk8 (12 total).
  • Compute $x_i(n)$ for $i = 1, \dots, 16$.

3.3 WPD-Based Artifact Detection

  • In single-stage WPD, each sub-band $x_i(n)$ is dropped if its exclusion increases the sum’s correlation with a reference channel.
  • Artifact-reduced signal is obtained by summing the undropped sub-bands.
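
The drop rule for single-stage WPD can be sketched as a simple loop. The version below is a minimal illustration that assumes sub-bands are tested one at a time in index order and that the last remaining sub-band is never dropped. Neither detail is specified in the source, and the names `select_subbands` and `corr` are hypothetical.

```python
import numpy as np

def corr(a, b):
    """Pearson correlation between two 1-D signals."""
    return float(np.corrcoef(a, b)[0, 1])

def select_subbands(X, ref):
    """Return a boolean mask of sub-bands to keep.

    A sub-band is dropped when removing it raises the correlation between
    the sum of the kept sub-bands and the reference signal.
    """
    X = np.asarray(X, dtype=float)
    keep = np.ones(X.shape[0], dtype=bool)
    for i in range(X.shape[0]):
        if keep.sum() == 1:
            break                                # never drop the last sub-band
        current = corr(X[keep].sum(axis=0), ref)
        trial = keep.copy()
        trial[i] = False
        if corr(X[trial].sum(axis=0), ref) > current:
            keep = trial
    return keep

# Toy pseudo-channels: two clean components and one burst-like artifact
n = np.arange(1024)
x0 = np.sin(2 * np.pi * 5 * n / 1024)
x1 = np.sin(2 * np.pi * 40 * n / 1024)
artifact = np.zeros(1024)
artifact[300:330] = 8.0
X = np.vstack([x0, x1, artifact])
ref = x0 + x1                                    # reference (ground-truth) signal
keep = select_subbands(X, ref)
cleaned = X[keep].sum(axis=0)
```

In this toy case only the artifact row is dropped, so `cleaned` coincides with the reference.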

3.4 CCA Stage (WPD-CCA)

  • Stack the 16 sub-bands as $X(t)$.
  • Compute $Y(t) = X(t-1) + X(t+1)$.
  • Estimate covariance matrices $C_{XX}$, $C_{YY}$, $C_{XY}$.
  • Solve for canonical weights and project $S = W X$.
  • Identify and zero the artifact-related canonical variates (rows of $S$).
  • Reconstruct $\tilde{X} = W^{-1}\tilde{S}$, then sum sub-bands to yield the final signal.

Parameter selection is driven by trade-offs between frequency resolution and computational load (suggested $j = 4$), and best performance is empirically observed for db1–3 and fk4–8 wavelets.

4. Quantitative Performance and Evaluation Metrics

Performance is quantified using:

  • Difference in Signal-to-Noise Ratio (ΔSNR):

$$\Delta \mathrm{SNR} = 10 \log_{10}\left(\sigma^2/\delta_{after}^2\right) - 10 \log_{10}\left(\sigma^2/\delta_{before}^2\right) = 10 \log_{10}\left(\delta_{before}^2/\delta_{after}^2\right)$$

Here $\sigma^2$ denotes the variance of the clean signal, and $\delta_{before}^2$ and $\delta_{after}^2$ those of the corrupted and corrected signals, respectively.

  • Percentage Reduction in Motion Artifacts (η):

$$\eta = 100 \cdot \frac{\rho_{after} - \rho_{before}}{1 - \rho_{before}}$$

with $\rho_{before}$ and $\rho_{after}$ the correlations of the clean signal with the corrupted and corrected outputs.
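
Both metrics translate directly into code. The sketch below follows the definitions as stated here: variances of the corrupted and corrected signals for ΔSNR, and Pearson correlations with the clean signal for η. The function names and the synthetic signals are assumptions for illustration only.

```python
import numpy as np

def delta_snr(corrupted, corrected):
    """ΔSNR in dB; the clean-signal variance cancels out of the difference."""
    return 10.0 * np.log10(np.var(corrupted) / np.var(corrected))

def eta(clean, corrupted, corrected):
    """Percentage reduction in motion artifacts."""
    rho_before = np.corrcoef(clean, corrupted)[0, 1]
    rho_after = np.corrcoef(clean, corrected)[0, 1]
    return 100.0 * (rho_after - rho_before) / (1.0 - rho_before)

# Synthetic check: a sine corrupted by a rectangular burst
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
clean = np.sin(2 * np.pi * 5 * t)
artifact = np.zeros_like(t)
artifact[400:450] = 5.0
corrupted = clean + artifact

eta_perfect = eta(clean, corrupted, clean)        # ideal correction
eta_none = eta(clean, corrupted, corrupted)       # no correction applied
snr_gain = delta_snr(corrupted, clean)
```

By construction η reaches 100 when the corrected signal is perfectly correlated with the clean one and is 0 when nothing changes, which gives two quick sanity checks for any implementation.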

Summarized results (average across subjects):

| Modality | Single-stage WPD (best) | WPD-CCA (best) | Relative improvement |
|---|---|---|---|
| EEG | ΔSNR = 29.44 dB (db2); η = 51.4% | ΔSNR = 30.76 dB (db1); η = 59.51% | η ↑ by 11.3% |
| fNIRS | ΔSNR = 16.11 dB (fk4); η = 26.4% | ΔSNR = 12.41 dB (fk8); η = 41.40% | η ↑ by 56.8% |

The WPD-CCA method provides a higher percentage reduction in motion artifacts ($\eta$) and, in EEG, also a higher ΔSNR than its single-stage counterpart (Hossain et al., 2022).

5. Comparative Analysis: Advantages and Limitations

Compared to earlier single-channel motion artifact removal techniques, including DWT, EMD/EEMD, VMD, ICA, or CCA alone, WPD-CCA demonstrates several notable advantages:

  • Operates on single-channel recordings by generating pseudo-channels via WPD.
  • Jointly exploits the frequency localization properties of wavelet packets and the multivariate separation capabilities of CCA.
  • Does not require a physically separate uncorrupted reference at runtime.
  • Artifact bands and components are identified in a data-adaptive manner.
  • Robust performance across multiple wavelet bases.
  • Outperforms single-stage WPD, as well as several BSS strategies, in denoising effectiveness.

However, certain limitations persist:

  • Artifact component selection in CCA presently relies on a reference-correlation or autocorrelation threshold; universal, fully automatic thresholds remain undetermined.
  • As the WPD level and sub-band count grow, so does computational complexity.
  • In the absence of a reference signal, surrogate criteria (such as autocorrelation drops) must be used to detect artifact-related components.
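
One possible surrogate criterion for the reference-free case is a threshold on each variate's lag-1 autocorrelation. The sketch below only illustrates that idea. The threshold value of 0.9 and the names `lag1_autocorr` and `flag_artifacts` are hypothetical, and the source does not prescribe a specific rule.

```python
import numpy as np

def lag1_autocorr(s):
    """Lag-1 autocorrelation of a 1-D signal after mean removal."""
    s = s - s.mean()
    return float(np.dot(s[:-1], s[1:]) / np.dot(s, s))

def flag_artifacts(S, threshold=0.9):
    """Flag variates (rows of S) whose lag-1 autocorrelation exceeds the threshold."""
    return np.array([lag1_autocorr(row) > threshold for row in S])

# Illustrative variates: a smooth, slowly varying component and white noise
rng = np.random.default_rng(2)
t = np.arange(2000)
S = np.vstack([
    np.sin(2 * np.pi * t / 400.0),
    rng.standard_normal(t.size),
])
flags = flag_artifacts(S)
```

A fixed threshold like this will also flag slow physiological components, which is one reason the choice of a universal, automatic threshold remains open.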

6. Context, Applicability, and Future Considerations

WPD-CCA is particularly suited to biological signal modalities (EEG, fNIRS) where motion artifacts are spectrally sparse yet temporally autocorrelated, and where multi-channel acquisition is impractical or infeasible. Its reliance on pseudo-multichannel analysis supports broader application in single-sensor wearable systems. Further advances may come from more sophisticated or fully unsupervised artifact identification schemes, which could reduce dependence on reference correlation or hand-tuned thresholds. Optimization of the decomposition level and wavelet basis, together with parallel implementation strategies, could further enhance practical deployment (Hossain et al., 2022).
