WPD-CCA Method for Artifact Removal

Updated 5 January 2026

WPD-CCA is a hybrid method that combines wavelet packet decomposition with canonical correlation analysis to correct motion artifacts in single-channel EEG/fNIRS data.
It generates pseudo-multichannel representations from single-channel recordings, enabling efficient separation and removal of artifact components.
Performance metrics like ΔSNR and percentage artifact reduction illustrate its superiority over single-stage methods and traditional blind source separation techniques.

The WPD-CCA (Wavelet Packet Decomposition–Canonical Correlation Analysis) method offers a two-stage data-driven pipeline for correcting motion artifacts in single-channel electroencephalogram (EEG) and functional near-infrared spectroscopy (fNIRS) signals. Designed to address the non-stationarity and contamination issues inherent in wearable EEG/fNIRS measurements, WPD-CCA combines time–frequency localization provided by wavelet packet decomposition with the source separation capacity of canonical correlation analysis. This hybrid approach circumvents the requirement for multi-channel inputs or physically separate artifact references, instead generating pseudo-multichannel representations from a single sensor. WPD-CCA demonstrates superior denoising efficacy compared to single-stage WPD and other blind-source separation (BSS) methods, as evaluated by difference in signal-to-noise ratio (ΔSNR) and percentage reduction in motion artifacts (η) (Hossain et al., 2022).

1. Two-Stage Pipeline: Structure and Rationale

WPD-CCA is composed of two sequential operations:

Wavelet Packet Decomposition (WPD): The input signal $x(n)$ is decomposed into $2^j$ frequency-localized sub-bands via wavelet packet basis functions at level $j$. Each sub-band isolates distinct spectral content, exploiting the empirical observation that motion artifacts produce large-magnitude coefficients concentrated in only a few sub-bands. This process generates a pseudo-multichannel dataset from the original single-channel input.
Canonical Correlation Analysis (CCA): The WPD sub-bands are interpreted as channels. CCA is then applied between the (pseudo-multichannel) WPD sub-bands $X(t) \in \mathbb{R}^{M \times T}$ and their time-lagged versions $Y(t) = X(t-1) + X(t+1)$. By maximizing correlations between linear projections of $X$ and $Y$, CCA yields canonical variates ranked by autocorrelation strength. Artifact-dominated components typically possess the largest autocorrelation and are identified for removal before the artifact-suppressed signal is reconstructed (Hossain et al., 2022).

This two-stage approach is motivated by the observation that neither WPD nor CCA alone achieves optimal artifact suppression in single-channel settings. In WPD-CCA, WPD isolates artifact-rich sub-bands, while CCA further unmixes temporally autocorrelated artifact signals from the underlying neuronal or hemodynamic activity.

2. Mathematical Foundations

2.1 Wavelet Packet Decomposition

Given signal $x(n) = s(n) + v(n)$, with $s(n)$ the true underlying physiological signal and $v(n)$ the artifact, WPD at level $j$ produces basis functions $\psi_{j,i}(n)$ and expansion coefficients

$X_{j,i}(k) = \langle x(n), \psi_{j,i}(n-2^j k) \rangle$

which decompose as $X_{j,i}(k) = S_{j,i}(k) + V_{j,i}(k)$. The basis recursion relations are

$\psi_{j,2i}(n) = \sum_k h(k) \psi_{j-1,i}(n - 2^{j-1}k)$

$\psi_{j,2i+1}(n) = \sum_k g(k) \psi_{j-1,i}(n - 2^{j-1}k)$

where $h$ and $g$ are the low-pass and high-pass wavelet filters (e.g., Daubechies ‘db1’ and ‘db2’, or Fejér–Korovkin ‘fk4’).
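The recursion can be illustrated with a minimal NumPy sketch that builds the $2^j$ pseudo-channels from one signal. It assumes the Haar (db1) filter pair, a signal length that is a multiple of $2^j$, and natural (not frequency-sorted) node ordering; the function names `haar_split`, `haar_merge`, and `wpd_subbands` are hypothetical and not taken from the cited work.

```python
import numpy as np

def haar_split(c):
    """One analysis step: Haar low-pass (h) and high-pass (g) filtering, then downsampling by 2."""
    a = (c[0::2] + c[1::2]) / np.sqrt(2.0)
    d = (c[0::2] - c[1::2]) / np.sqrt(2.0)
    return a, d

def haar_merge(a, d):
    """Inverse of haar_split: interleave the reconstructed even and odd samples."""
    out = np.empty(2 * a.size)
    out[0::2] = (a + d) / np.sqrt(2.0)
    out[1::2] = (a - d) / np.sqrt(2.0)
    return out

def wpd_subbands(x, level=4):
    """Return a (2**level, len(x)) array of time-domain sub-band signals (pseudo-channels)."""
    x = np.asarray(x, dtype=float)
    if x.size % (2 ** level):
        raise ValueError("signal length must be a multiple of 2**level")
    # Full wavelet packet tree: every node is split at every level.
    nodes = [x]
    for _ in range(level):
        nodes = [part for c in nodes for part in haar_split(c)]
    # Reconstruct each terminal node on its own, with all other nodes set to zero.
    bands = []
    for i in range(len(nodes)):
        keep = [c if k == i else np.zeros_like(c) for k, c in enumerate(nodes)]
        for _ in range(level):
            keep = [haar_merge(keep[k], keep[k + 1]) for k in range(0, len(keep), 2)]
        bands.append(keep[0])
    return np.vstack(bands)

rng = np.random.default_rng(0)
x = rng.standard_normal(256)      # stand-in for a single-channel recording
X = wpd_subbands(x, level=4)      # 16 pseudo-channels
```

Because the transform is linear and invertible, the rows of `X` sum back to `x`, which is what allows the cleaned sub-bands to be summed into a single output signal later in the pipeline.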

2.2 Canonical Correlation Analysis

With $X(t)$ (size $M \times T$, $M=2^j$) and $Y(t) = X(t-1) + X(t+1)$, CCA finds weight vectors $w_x$ and $w_y$ that maximize

$\rho = \text{corr}(w_x^T X, w_y^T Y)$

Subject to unit-variance constraints on both projections, this leads to the eigenvalue problem

$C_{XX}^{-1} C_{XY} C_{YY}^{-1} C_{YX} w_x = \rho^2 w_x$

Stacking the canonical weight vectors $w_x^T$ as the rows of $W$ unmixes $X$ into variates $S = WX$, ordered by autocorrelation. Artifact components yield high canonical correlations.
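A minimal NumPy sketch of this step follows. It solves the same eigenvalue problem through Cholesky whitening and an SVD, which is a numerically convenient route rather than necessarily the one used in the cited work. The function name `cca_unmix`, the small `ridge` regularizer, and the toy mixture are assumptions made for illustration.

```python
import numpy as np

def cca_unmix(X, ridge=1e-10):
    """CCA between X(t) and Y(t) = X(t-1) + X(t+1).

    Returns W (rows are w_x^T), variates S = W X (centered), canonical
    correlations rho in descending order, and the removed channel means.
    """
    X = np.asarray(X, dtype=float)
    Xc = X[:, 1:-1]                      # samples that have both neighbours
    Yc = X[:, :-2] + X[:, 2:]            # X(t-1) + X(t+1)
    mean_x = Xc.mean(axis=1, keepdims=True)
    Xc = Xc - mean_x
    Yc = Yc - Yc.mean(axis=1, keepdims=True)
    n = Xc.shape[1]
    eye = np.eye(X.shape[0])
    Cxx = Xc @ Xc.T / n + ridge * eye    # ridge is an assumed safeguard
    Cyy = Yc @ Yc.T / n + ridge * eye
    Cxy = Xc @ Yc.T / n
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    K = np.linalg.solve(Lx, Cxy)         # Lx^{-1} Cxy
    K = np.linalg.solve(Ly, K.T).T       # Lx^{-1} Cxy Ly^{-T}
    U, rho, _ = np.linalg.svd(K)         # singular values = canonical correlations
    W = np.linalg.solve(Lx.T, U).T       # w_x = Lx^{-T} u, stacked as rows
    S = W @ Xc
    return W, S, rho, mean_x

# Toy pseudo-channels: a slow sinusoid, smoothed noise, and white noise, linearly mixed.
rng = np.random.default_rng(0)
T = 2000
t = np.arange(T)
sources = np.vstack([
    np.sin(2 * np.pi * 0.01 * t),
    np.convolve(rng.standard_normal(T + 4), np.ones(5) / 5, mode="valid"),
    rng.standard_normal(T),
])
A = rng.standard_normal((3, 3))
X = A @ sources
W, S, rho, mean_x = cca_unmix(X)
```

In this toy case the strongly autocorrelated sinusoid ends up in the first variate and the white-noise source in the last, mirroring the ordering by autocorrelation described above.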

2.3 Reconstruction

Artifacts are identified by evaluating whether zeroing a canonical variate increases the correlation with a ground-truth (reference) signal or surpasses an autocorrelation threshold. Excluding artifact-dominated canonical variates forms $\tilde{S}$, and the cleaned pseudo-channels are reconstructed by $\tilde{X} = W^{-1} \tilde{S}$. Summing across sub-bands yields the final artifact-corrected signal.
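The reference-based variant of this step can be sketched as follows. The helper names `reconstruct_clean` and `select_by_reference` are hypothetical, each variate is tested on its own (an assumption, since the text does not say whether the test is sequential), and the two-source toy data stand in for real canonical variates.

```python
import numpy as np

def reconstruct_clean(W, S, mean_x, artifact_idx):
    """Zero the selected variates, map back with W^{-1}, and sum the pseudo-channels."""
    S_tilde = S.copy()
    S_tilde[np.asarray(list(artifact_idx), dtype=int), :] = 0.0
    X_tilde = np.linalg.solve(W, S_tilde) + mean_x   # W^{-1} S~ plus the removed means
    return X_tilde.sum(axis=0)

def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])

def select_by_reference(W, S, mean_x, reference):
    """Flag variate i when zeroing it alone raises the correlation with the reference."""
    baseline = corr(reconstruct_clean(W, S, mean_x, []), reference)
    return [i for i in range(S.shape[0])
            if corr(reconstruct_clean(W, S, mean_x, [i]), reference) > baseline]

# Toy example: variate 0 is the physiological signal, variate 1 a motion-like boxcar.
t = np.arange(500)
clean = np.sin(2 * np.pi * t / 50)
artifact = 5.0 * (np.abs(t - 250) < 20)
S = np.vstack([clean, artifact])
A = np.array([[1.0, 0.6], [0.5, 0.4]])    # assumed mixing into two pseudo-channels
W = np.linalg.inv(A)                      # so that S = W X
mean_x = np.zeros((2, 1))
reference = A[:, 0].sum() * clean         # artifact-free version of the summed signal

artifact_idx = select_by_reference(W, S, mean_x, reference)
cleaned = reconstruct_clean(W, S, mean_x, artifact_idx)
```

When no reference is available, the same `reconstruct_clean` step applies, with `artifact_idx` chosen by an autocorrelation threshold instead.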

3. Detailed Algorithmic Steps

3.1 Preprocessing

EEG: Downsample from 2048 Hz to 256 Hz, apply a 50 Hz notch filter, remove polynomial baseline drift.
fNIRS: Use 25 Hz sampling rate, apply similar notch/baseline removal.
Entire signals (~9 minutes duration) are processed as a single segment.
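The EEG preprocessing chain can be sketched with SciPy as below. The notch quality factor, the polynomial order, and the decimation filter type are not given in the text and are assumed values; `preprocess_eeg` is a hypothetical name.

```python
import numpy as np
from scipy import signal

def preprocess_eeg(x, fs_in=2048, fs_out=256, notch_hz=50.0, notch_q=30.0, poly_order=3):
    """Downsample, notch-filter mains interference, and remove a polynomial baseline."""
    q = fs_in // fs_out                                   # 2048 / 256 = 8
    # Anti-aliased, zero-phase decimation.
    x_ds = signal.decimate(np.asarray(x, dtype=float), q, ftype="iir", zero_phase=True)
    # Zero-phase 50 Hz notch at the new sampling rate (Q is an assumed value).
    b, a = signal.iirnotch(notch_hz, notch_q, fs=fs_out)
    x_notch = signal.filtfilt(b, a, x_ds)
    # Polynomial baseline fit on a scaled time axis (order is an assumed value).
    t = np.linspace(-1.0, 1.0, x_notch.size)
    trend = np.polyval(np.polyfit(t, x_notch, poly_order), t)
    return x_notch - trend

# Synthetic test signal: 10 Hz activity, 50 Hz interference, and a slow quadratic drift.
fs = 2048
t = np.arange(4 * fs) / fs
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 50 * t) + 0.5 * t ** 2
y = preprocess_eeg(x)
```

For fNIRS at 25 Hz the decimation step would be skipped, and the filter settings would need to be chosen for that sampling rate.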

3.2 WPD Stage

Decomposition level $j=4$ ($16$ sub-bands).
Wavelet packet options: db1, db2, db3, sym4–6, coif1–3, fk4, fk6, fk8 (12 total).
Compute the sub-band signals $x_i(n)$ for $i = 1, \dots, 16$.

3.3 WPD-Based Artifact Detection

In single-stage WPD, each sub-band $x_i(n)$ is dropped if excluding it increases the correlation between the summed signal and a reference channel.
The artifact-reduced signal is obtained by summing the retained sub-bands.
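A minimal sketch of this selection rule is given below. It assumes each sub-band is tested independently against the sum of all sub-bands; whether the original procedure tests sub-bands independently or sequentially is not stated. The name `drop_subbands_by_reference` and the three toy sub-bands are illustrative.

```python
import numpy as np

def drop_subbands_by_reference(bands, reference):
    """Drop sub-band i when excluding it raises the correlation of the sum with the reference."""
    bands = np.asarray(bands, dtype=float)
    total = bands.sum(axis=0)
    baseline = np.corrcoef(total, reference)[0, 1]
    keep = []
    for i in range(bands.shape[0]):
        without_i = total - bands[i]
        if np.corrcoef(without_i, reference)[0, 1] > baseline:
            continue                      # exclusion helps, so this sub-band is dropped
        keep.append(i)
    cleaned = bands[keep].sum(axis=0)
    return cleaned, keep

# Toy sub-bands: two physiological components and one motion-like bump.
t = np.arange(400)
bands = np.vstack([
    np.sin(2 * np.pi * t / 40),
    0.3 * np.sin(2 * np.pi * t / 7),
    3.0 * np.exp(-((t - 200) / 15.0) ** 2),
])
reference = bands[0] + bands[1]
cleaned, keep = drop_subbands_by_reference(bands, reference)
```

Here only the bump sub-band is dropped, so `cleaned` coincides with the reference.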

3.4 CCA Stage (WPD-CCA)

Stack the 16 sub-bands as $X(t)$.
Compute $Y(t) = X(t-1) + X(t+1)$.
Estimate covariance matrices $C_{XX}, C_{YY}, C_{XY}$.
Solve for canonical weights and project $S = W X$.
Identify and zero the artifact-related canonical variates (rows of $S$).
Reconstruct $\tilde{X} = W^{-1}\tilde{S}$, then sum the sub-bands to yield the final signal.

Parameter selection is driven by trade-offs between frequency resolution and computational load (suggested $j=4$), and best performance is empirically observed for db1–3 and fk4–8 wavelets.

4. Quantitative Performance and Evaluation Metrics

Performance is quantified using:

Difference in Signal-to-Noise Ratio (ΔSNR):

$\Delta \mathrm{SNR} = 10 \log_{10}(\sigma^2/\delta_{after}^2) - 10 \log_{10}(\sigma^2/\delta_{before}^2) = 10 \log_{10}(\delta_{before}^2/\delta_{after}^2)$

$\sigma^2$ denotes variance of the clean signal, $\delta_{before}^2$ and $\delta_{after}^2$ those of the corrupted and corrected signals, respectively.

Percentage Reduction in Motion Artifacts (η):

$\eta = 100 \cdot (\rho_{after} - \rho_{before}) / (1 - \rho_{before})$

with $\rho_{before}$ and $\rho_{after}$ the correlations of the clean signal with the corrupted and corrected outputs.
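Both metrics can be computed with a few lines of NumPy. In this sketch the $\delta^2$ terms are taken as variances of the residual relative to the clean signal, which is an assumed reading; if the variances of the corrupted and corrected signals themselves are intended, the two `np.var` arguments change accordingly. The function names are hypothetical.

```python
import numpy as np

def delta_snr_db(clean, corrupted, corrected):
    """Delta SNR = 10 log10(delta_before^2 / delta_after^2), with residual variances (assumed)."""
    var_before = np.var(corrupted - clean)
    var_after = np.var(corrected - clean)
    return 10.0 * np.log10(var_before / var_after)

def eta_percent(clean, corrupted, corrected):
    """Percentage reduction in motion artifacts from the before/after correlations."""
    rho_before = np.corrcoef(clean, corrupted)[0, 1]
    rho_after = np.corrcoef(clean, corrected)[0, 1]
    return 100.0 * (rho_after - rho_before) / (1.0 - rho_before)

# Synthetic check: the correction scales the artifact amplitude down by a factor of 10.
rng = np.random.default_rng(1)
t = np.arange(2000)
clean = np.sin(2 * np.pi * t / 100)
noise = rng.standard_normal(t.size)
corrupted = clean + noise
corrected = clean + 0.1 * noise

dsnr = delta_snr_db(clean, corrupted, corrected)
eta = eta_percent(clean, corrupted, corrected)
```

Scaling the residual amplitude by 0.1 scales its variance by 0.01, so `dsnr` is 20 dB by construction; a perfect correction would give $\eta = 100$.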

Summarized results (average across subjects):

Modality	Single-stage WPD (best)	WPD-CCA (best)	Relative improvement in η
EEG	ΔSNR = 29.44 dB (db2), η = 51.4%	ΔSNR = 30.76 dB (db1), η = 59.51%	11.3%
fNIRS	ΔSNR = 16.11 dB (fk4), η = 26.4%	ΔSNR = 12.41 dB (fk8), η = 41.40%	56.8%

The WPD-CCA method provides a higher percentage reduction in motion artifacts ($\eta$) and, in EEG, also a higher ΔSNR than its single-stage counterpart (Hossain et al., 2022).

5. Comparative Analysis: Advantages and Limitations

Compared to earlier single-channel motion artifact removal techniques, including DWT, EMD/EEMD, VMD, ICA, or CCA alone, WPD-CCA demonstrates several notable advantages:

Operates on single-channel recordings by generating pseudo-channels via WPD.
Jointly exploits the frequency localization properties of wavelet packets and the multivariate separation capabilities of CCA.
Does not require a physically separate uncorrupted reference at runtime.
Artifact bands and components are identified in a data-adaptive manner.
Robust performance across multiple wavelet bases.
Outperforms single-stage WPD, as well as several BSS strategies, in denoising effectiveness.

However, certain limitations persist:

Artifact component selection in CCA presently relies on a reference-correlation or autocorrelation threshold; universal, fully automatic thresholds remain undetermined.
As the WPD level and sub-band count grow, so does computational complexity.
In the absence of a reference signal, surrogate criteria (such as autocorrelation drops) must be used to detect artifact-related components.

6. Context, Applicability, and Future Considerations

WPD-CCA is particularly suited to biological signal modalities (EEG, fNIRS) where motion artifacts are spectrally sparse yet temporally autocorrelated, and multi-channel acquisition is impractical or infeasible. Its reliance on pseudo-multichannel analysis supports broader application in single-sensor wearable systems. A plausible implication is that further advances may be realized by integrating more sophisticated or fully unsupervised artifact identification schemes, potentially reducing dependence on reference correlation or hand-tuned thresholds. Optimization of decomposition level and wavelet basis, and parallel implementation strategies, could further enhance practical deployment (Hossain et al., 2022).


References

Hossain et al. (2022). Motion Artifacts Correction from Single-Channel EEG and fNIRS Signals using Novel Wavelet Packet Decomposition in Combination with Canonical Correlation Analysis.
