Quaternion Tensor DCT (QTDCT)
- QTDCT is a multi-dimensional transform for 3D quaternion tensors that preserves inter-channel correlations and spatial-temporal coherence in color videos.
- It leverages non-commutative quaternion algebra with DCT matrices to achieve effective energy compaction and decorrelation along RGB and temporal modes.
- An ADMM-based optimization framework integrates QTDCT sparsity regularization with low-rank constraints, yielding superior recovery metrics like PSNR and SSIM.
The Quaternion Tensor Discrete Cosine Transform (QTDCT) is a domain-specific linear transform defined for tensors with quaternion-valued entries. It enables structure-preserving and multi-modal sparsity regularization for color video recovery tasks, specifically in the context of low-rank and sparse tensor completion algorithms. QTDCT leverages the non-commutative algebraic properties of quaternions to maintain native inter-channel correlations and spatial-temporal coherence within color video data.
1. Algebraic Definition and Computation of QTDCT
QTDCT operates on third-order quaternion tensors: each pixel of an RGB video frame is encoded as a pure quaternion $\dot{q} = R\,i + G\,j + B\,k$ (the imaginary parts carry the R, G, B channels; the real part is zero), and frames are stacked along the temporal mode to construct the tensor $\dot{\mathcal{Q}} \in \mathbb{H}^{N_1 \times N_2 \times N_3}$.
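This pure-quaternion encoding maps directly onto the Cayley-Dickson pair representation used below for computation. A minimal sketch (the function name and the `(T, H, W, 3)` array layout are illustrative assumptions):

```python
import numpy as np

def rgb_to_quaternion(video):
    """Encode an RGB video of shape (T, H, W, 3) as a pure quaternion tensor,
    stored as a Cayley-Dickson pair (Qa, Qb) with Q = Qa + Qb * j.
    Since q = 0 + R*i + G*j + B*k = (0 + R*i) + (G + B*i)*j,
    we get Qa = 1j * R (zero real part) and Qb = G + 1j * B.
    """
    r, g, b = video[..., 0], video[..., 1], video[..., 2]
    return 1j * r, g + 1j * b
```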
The transform is defined in two algebraic forms due to quaternion non-commutativity, but the left-handed form is adopted for practical use:
- Left-handed QTDCT:
  $$\dot{\mathcal{D}} = \mu \left( \dot{\mathcal{Q}} \times_1 C^{(1)} \times_2 C^{(2)} \times_3 C^{(3)} \right),$$
  where $\mu$ is a pure unit quaternion ($\mu^2 = -1$, e.g. $\mu = (i+j+k)/\sqrt{3}$), the $C^{(n)}$ are real DCT matrices (Discrete Cosine Transform along each tensor mode), and $\times_n$ denotes the $n$-mode tensor-matrix product.
- Inverse QTDCT:
  $$\dot{\mathcal{Q}} = -\mu \left( \dot{\mathcal{D}} \times_1 C^{(1)\top} \times_2 C^{(2)\top} \times_3 C^{(3)\top} \right),$$
  using $\mu^{-1} = -\mu$ for a pure unit quaternion and the orthogonality of the DCT matrices ($C^{(n)\top} C^{(n)} = I$).
Implementation via Cayley-Dickson Representation:
- Decompose $\dot{\mathcal{Q}}$ via Cayley-Dickson into two complex tensors: $\dot{\mathcal{Q}} = \mathcal{Q}_a + \mathcal{Q}_b\,j$, with $\mathcal{Q}_a, \mathcal{Q}_b \in \mathbb{C}^{N_1 \times N_2 \times N_3}$.
- Apply the multidimensional DCT to each: $\mathcal{D}_a = \mathrm{DCT}(\mathcal{Q}_a)$ and $\mathcal{D}_b = \mathrm{DCT}(\mathcal{Q}_b)$.
- Recombine: $\mathcal{D} = \mathcal{D}_a + \mathcal{D}_b\,j$.
- Left-multiply by $\mu$ to yield $\dot{\mathcal{D}} = \mu\,\mathcal{D}$.
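The steps above can be sketched in code. The concrete axis $\mu = (i+j+k)/\sqrt{3}$ and the orthonormal DCT normalization are assumptions chosen so that the transform is energy-preserving and exactly invertible:

```python
import numpy as np
from scipy.fft import dctn, idctn

# Assumed transform axis: mu = (i + j + k)/sqrt(3), stored as the
# Cayley-Dickson pair (p, s) with mu = p + s*j.
MU = (1j / np.sqrt(3.0), (1.0 + 1j) / np.sqrt(3.0))

def qmul_left(mu, q):
    """Left-multiply a quaternion tensor q = (a, b) by mu = (p, s):
    (p + s j)(a + b j) = (p a - s conj(b)) + (p b + s conj(a)) j."""
    p, s = mu
    a, b = q
    return p * a - s * np.conj(b), p * b + s * np.conj(a)

def cdctn(x, inverse=False):
    """Orthonormal multidimensional DCT of a complex tensor,
    applied to the real and imaginary parts separately."""
    f = idctn if inverse else dctn
    return f(x.real, norm='ortho') + 1j * f(x.imag, norm='ortho')

def qtdct(q):
    """Left-handed QTDCT of a quaternion tensor given as a Cayley-Dickson pair."""
    a, b = q
    return qmul_left(MU, (cdctn(a), cdctn(b)))

def iqtdct(d):
    """Inverse QTDCT, using mu^{-1} = -mu for a pure unit quaternion."""
    a, b = qmul_left((-MU[0], -MU[1]), d)
    return cdctn(a, inverse=True), cdctn(b, inverse=True)
```

Because $|\mu| = 1$ and the DCT matrices are orthonormal, this sketch satisfies both exact invertibility and the Parseval property discussed below.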
2. Mathematical Properties and Structural Advantages
QTDCT retains inter-channel and multi-modal dependencies that are intrinsic to color videos:
- Structure-Preserving: It processes the tensor as a whole, rather than decomposing channels, thereby preserving chromatic and spatial relationships.
- Multi-dimensional Decorrelation: DCT applied along each mode achieves effective energy compaction and decorrelation, natively across RGB and time axes.
- Compatibility with Quaternion Algebra: The left- and right-handed transform definitions and their inverses are stated consistently with the non-commutativity of quaternion multiplication.
- Parseval’s Theorem: Energy in the QTDCT domain is preserved, permitting direct transfer of norm-based constraints and regularization.
3. Role in Low-Rank Quaternion Tensor Completion Framework
QTDCT is integral to the framework for color video recovery under missing data scenarios:
- The completion model solves:
  $$\min_{\dot{\mathcal{X}}} \; \|\dot{\mathcal{X}}\|_{\mathrm{TQt\text{-}TNN}} + \lambda\,\|\mathrm{QTDCT}(\dot{\mathcal{X}})\|_1 \quad \text{s.t.} \quad P_\Omega(\dot{\mathcal{X}}) = P_\Omega(\dot{\mathcal{O}}),$$
  where $\|\cdot\|_{\mathrm{TQt\text{-}TNN}}$ is the truncated nuclear norm based on TQt-rank, enforcing global low-rank structure; $P_\Omega$ restricts to the observed entries $\Omega$ of the data $\dot{\mathcal{O}}$; $\|\cdot\|_1$ is the $\ell_1$-norm of the QTDCT coefficients, promoting sparsity; and $\lambda$ is a weighting parameter.
- Sparsity regularization in the QTDCT domain exploits the empirical observation that most QTDCT coefficients of natural color video are close to zero: image and video information concentrates in a few transform coefficients, enabling recovery that avoids over-smoothing while preserving local detail and texture.
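The energy-compaction behavior this regularizer relies on is easy to illustrate with the real-valued core of the transform. The following sketch applies an orthonormal multidimensional DCT to a smooth synthetic 3-D tensor (an assumed stand-in for one color channel of video, not data from the referenced work) and measures how much energy falls in the top 1% of coefficients:

```python
import numpy as np
from scipy.fft import dctn

# Smooth separable 3-D test signal standing in for one channel of video.
t = np.linspace(0.0, 2.0 * np.pi, 32)
x = (np.sin(t)[:, None, None]
     * np.cos(t)[None, :, None]
     * np.cos(0.5 * t)[None, None, :])

c = dctn(x, norm='ortho')                       # orthonormal 3-D DCT
energy = np.sort(np.abs(c).ravel())[::-1] ** 2  # coefficient energies, descending
top1pct = int(0.01 * energy.size)
frac = energy[:top1pct].sum() / energy.sum()
print(f"energy in top 1% of DCT coefficients: {frac:.4f}")
```

For smooth content, nearly all of the energy lands in a small fraction of coefficients, which is exactly what an $\ell_1$ penalty in the transform domain exploits.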
4. Optimization via ADMM: Details of QTDCT-Sparse Recovery
ADMM is applied to solve the above model by splitting low-rank and sparsity regularization:
- An auxiliary variable $\dot{\mathcal{Z}}$, constrained by $\dot{\mathcal{Z}} = \dot{\mathcal{X}}$, enables separate handling of rank and sparsity.
- In each iteration, the subproblem for $\dot{\mathcal{Z}}$ minimizes
  $$\lambda\,\|\mathrm{QTDCT}(\dot{\mathcal{Z}})\|_1 + \frac{\rho}{2}\,\big\|\dot{\mathcal{Z}} - \dot{\mathcal{X}} + \dot{\mathcal{Y}}/\rho\big\|_F^2,$$
  which admits a closed-form solution via soft-thresholding in the QTDCT domain:
  $$\dot{\mathcal{Z}} = \mathrm{QTDCT}^{-1}\Big(\mathcal{S}_{\lambda/\rho}\big(\mathrm{QTDCT}(\dot{\mathcal{X}} - \dot{\mathcal{Y}}/\rho)\big)\Big),$$
  where $\mathcal{S}_{\tau}$ applies element-wise soft-thresholding with threshold $\tau$, $\rho$ is the ADMM penalty parameter, and $\dot{\mathcal{Y}}$ is the Lagrange multiplier.
- Updates alternate between low-rank optimization and QTDCT-sparse steps, with the inverse QTDCT applied as required to revert to spatial/color domain for the next iterate.
- Parseval’s theorem ensures consistent regularization magnitude across domains, maintaining algorithmic stability during decoupling.
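For quaternion-valued coefficients, one standard way to define the soft-thresholding operator (an assumption here, generalizing the real scalar case) shrinks each coefficient's quaternion magnitude while preserving its direction. A sketch, with coefficients stored as a Cayley-Dickson pair:

```python
import numpy as np

def qsoft(a, b, tau):
    """Element-wise quaternion soft-thresholding on a Cayley-Dickson pair
    (a, b): shrink each coefficient's magnitude |d| = sqrt(|a|^2 + |b|^2)
    by tau, keeping its direction; coefficients below tau are zeroed."""
    mag = np.sqrt(np.abs(a) ** 2 + np.abs(b) ** 2)
    scale = np.where(mag > tau, 1.0 - tau / np.maximum(mag, 1e-30), 0.0)
    return a * scale, b * scale
```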
5. Effects of QTDCT-Sparsity on Color Video Recovery Performance
QTDCT-based sparsity regularization leads to several documented benefits in color video tensor completion:
- Detail Preservation: Suppresses noise while retaining high-frequency components critical for texture and edge recovery.
- Artifact Reduction: Mitigates common visual artifacts induced by naive low-rank approximations.
- Multi-modal Consistency: Regularizes across spatial, chromatic, and temporal axes, yielding reconstructions with realistic color and motion continuity.
- Experimental results in the referenced work demonstrate strong quantitative improvements (higher PSNR/SSIM) and visually superior reconstructions compared with methods lacking explicit QTDCT sparsity terms, most notably at low observation rates.
6. Pipeline Summary Table: Stages of QTDCT-Based Color Video Recovery
| Step | Action | Formula/Remark |
|---|---|---|
| 1 | RGB video → pure quaternion tensor | $\dot{q} = R\,i + G\,j + B\,k$ (zero real part) |
| 2 | Compute QTDCT | $\dot{\mathcal{D}} = \mu\,(\dot{\mathcal{Q}} \times_1 C^{(1)} \times_2 C^{(2)} \times_3 C^{(3)})$ |
| 3 | Impose QTDCT sparsity | $\lambda\,\|\mathrm{QTDCT}(\dot{\mathcal{X}})\|_1$ added to objective |
| 4 | ADMM QTDCT step | Soft-thresholding, then inverse QTDCT to update tensor |
| 5 | Alternate with low-rank TQt-SVD update | See corresponding ADMM step |
| 6 | Iterate to convergence | Jointly enforce low-rank global structure and local sparse texture |
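The pipeline is easiest to see end to end in a drastically simplified, real-valued analogue: matrix completion with a nuclear-norm term plus DCT-domain $\ell_1$ sparsity, solved by the same ADMM splitting (singular value thresholding alternating with transform-domain soft-thresholding). Everything below (problem size, parameters, plain real matrices in place of quaternion tensors and TQt-SVD) is an illustrative assumption, not the algorithm of the referenced work:

```python
import numpy as np
from scipy.fft import dctn, idctn

def svt(M, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ (np.maximum(s - tau, 0.0)[:, None] * Vt)

def complete(O, mask, lam=0.01, rho=1.0, iters=300):
    """Toy ADMM for: min ||X||_* + lam * ||DCT(X)||_1  s.t.  X = O on mask."""
    X = np.where(mask, O, 0.0)
    Z = X.copy()
    U = np.zeros_like(X)
    for _ in range(iters):
        X = svt(Z - U, 1.0 / rho)                                 # low-rank step
        C = dctn(X + U, norm='ortho')                             # to DCT domain
        C = np.sign(C) * np.maximum(np.abs(C) - lam / rho, 0.0)   # soft-threshold
        Z = idctn(C, norm='ortho')                                # back to pixels
        Z[mask] = O[mask]                                         # enforce data
        U += X - Z                                                # dual update
    return Z
```

In the full method the SVT step is replaced by the truncated TQt-SVD update and the DCT pair by the forward/inverse QTDCT, but the alternation and the role of each step are the same.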
7. Concluding Remarks
QTDCT provides a multi-dimensional, quaternion-valued DCT framework engineered for color video tensor recovery applications. By integrating quaternion algebra with transform-domain sparsity regularization, the method advances completion frameworks in preserving both global structure and fine texture, especially under challenging sampling conditions. Its efficient implementation via Cayley-Dickson and DCT matrix computations, and empirical superiority in restoration metrics, underscore its utility in multidimensional visual information recovery scenarios (Yang et al., 2022).