Quaternion Tensor DCT (QTDCT)
- QTDCT is a multi-dimensional transform for 3D quaternion tensors that preserves inter-channel correlations and spatial-temporal coherence in color videos.
- It leverages non-commutative quaternion algebra with DCT matrices to achieve effective energy compaction and decorrelation along RGB and temporal modes.
- An ADMM-based optimization framework integrates QTDCT sparsity regularization with low-rank constraints, yielding superior recovery metrics like PSNR and SSIM.
The Quaternion Tensor Discrete Cosine Transform (QTDCT) is a domain-specific linear transform defined for tensors with quaternion-valued entries. It enables structure-preserving and multi-modal sparsity regularization for color video recovery tasks, specifically in the context of low-rank and sparse tensor completion algorithms. QTDCT leverages the non-commutative algebraic properties of quaternions to maintain native inter-channel correlations and spatial-temporal coherence within color video data.
1. Algebraic Definition and Computation of QTDCT
QTDCT operates on third-order quaternion tensors: each pixel of an RGB video frame is encoded as a pure quaternion $\dot{q} = R\,i + G\,j + B\,k$ (the imaginary parts carry the R, G, B channels; the real part is zero), and frames are stacked along the temporal mode to construct the tensor $\dot{\mathcal{Q}} \in \mathbb{H}^{N_1 \times N_2 \times N_3}$.
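This pure-quaternion encoding maps directly onto the Cayley-Dickson pair representation used below for computation. A minimal sketch (the function name and the `(T, H, W, 3)` array layout are illustrative assumptions):

```python
import numpy as np

def rgb_to_quaternion(video):
    """Encode an RGB video of shape (T, H, W, 3) as a pure quaternion tensor,
    stored as a Cayley-Dickson pair (Qa, Qb) with Q = Qa + Qb * j.
    Since q = 0 + R*i + G*j + B*k = (0 + R*i) + (G + B*i)*j,
    we get Qa = 1j * R (zero real part) and Qb = G + 1j * B.
    """
    r, g, b = video[..., 0], video[..., 1], video[..., 2]
    return 1j * r, g + 1j * b
```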
The transform is defined in two algebraic forms due to quaternion non-commutativity, but the left-handed form is adopted for practical use:
- Left-handed QTDCT:
  $$\dot{\mathcal{D}} = \mu \left( \dot{\mathcal{Q}} \times_1 C^{(1)} \times_2 C^{(2)} \times_3 C^{(3)} \right),$$
  where $\mu$ is a pure unit quaternion ($\mu^2 = -1$, e.g. $\mu = (i+j+k)/\sqrt{3}$), the $C^{(n)}$ are real DCT matrices (Discrete Cosine Transform along each tensor mode), and $\times_n$ denotes the $n$-mode tensor-matrix product.
- Inverse QTDCT:
  $$\dot{\mathcal{Q}} = -\mu \left( \dot{\mathcal{D}} \times_1 C^{(1)\top} \times_2 C^{(2)\top} \times_3 C^{(3)\top} \right),$$
  using $\mu^{-1} = -\mu$ for a pure unit quaternion and the orthogonality of the DCT matrices ($C^{(n)\top} C^{(n)} = I$).
Implementation via Cayley-Dickson Representation:
- Decompose $\dot{\mathcal{Q}}$ via Cayley-Dickson into two complex tensors: $\dot{\mathcal{Q}} = \mathcal{Q}_a + \mathcal{Q}_b\,j$, with $\mathcal{Q}_a, \mathcal{Q}_b \in \mathbb{C}^{N_1 \times N_2 \times N_3}$.
- Apply the multidimensional DCT to each: $\mathcal{D}_a = \mathrm{DCT}(\mathcal{Q}_a)$ and $\mathcal{D}_b = \mathrm{DCT}(\mathcal{Q}_b)$.
- Recombine: $\mathcal{D} = \mathcal{D}_a + \mathcal{D}_b\,j$.
- Left-multiply by $\mu$ to yield $\dot{\mathcal{D}} = \mu\,\mathcal{D}$.
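The steps above can be sketched in code. The concrete axis $\mu = (i+j+k)/\sqrt{3}$ and the orthonormal DCT normalization are assumptions chosen so that the transform is energy-preserving and exactly invertible:

```python
import numpy as np
from scipy.fft import dctn, idctn

# Assumed transform axis: mu = (i + j + k)/sqrt(3), stored as the
# Cayley-Dickson pair (p, s) with mu = p + s*j.
MU = (1j / np.sqrt(3.0), (1.0 + 1j) / np.sqrt(3.0))

def qmul_left(mu, q):
    """Left-multiply a quaternion tensor q = (a, b) by mu = (p, s):
    (p + s j)(a + b j) = (p a - s conj(b)) + (p b + s conj(a)) j."""
    p, s = mu
    a, b = q
    return p * a - s * np.conj(b), p * b + s * np.conj(a)

def cdctn(x, inverse=False):
    """Orthonormal multidimensional DCT of a complex tensor,
    applied to the real and imaginary parts separately."""
    f = idctn if inverse else dctn
    return f(x.real, norm='ortho') + 1j * f(x.imag, norm='ortho')

def qtdct(q):
    """Left-handed QTDCT of a quaternion tensor given as a Cayley-Dickson pair."""
    a, b = q
    return qmul_left(MU, (cdctn(a), cdctn(b)))

def iqtdct(d):
    """Inverse QTDCT, using mu^{-1} = -mu for a pure unit quaternion."""
    a, b = qmul_left((-MU[0], -MU[1]), d)
    return cdctn(a, inverse=True), cdctn(b, inverse=True)
```

Because $|\mu| = 1$ and the DCT matrices are orthonormal, this sketch satisfies both exact invertibility and the Parseval property discussed below.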
2. Mathematical Properties and Structural Advantages
QTDCT retains inter-channel and multi-modal dependencies that are intrinsic to color videos:
- Structure-Preserving: It processes the tensor as a whole, rather than decomposing channels, thereby preserving chromatic and spatial relationships.
- Multi-dimensional Decorrelation: DCT applied along each mode achieves effective energy compaction and decorrelation, natively across RGB and time axes.
- Compatibility with Quaternion Algebra: The left- and right-handed transform definitions and their inverses are stated consistently with the non-commutativity of quaternion multiplication.
- Parseval’s Theorem: Energy in the QTDCT domain is preserved, permitting direct transfer of norm-based constraints and regularization.
3. Role in Low-Rank Quaternion Tensor Completion Framework
QTDCT is integral to the framework for color video recovery under missing data scenarios:
- The completion model solves:
  $$\min_{\dot{\mathcal{X}}} \; \|\dot{\mathcal{X}}\|_{\mathrm{TQt\text{-}TNN}} + \lambda\,\|\mathrm{QTDCT}(\dot{\mathcal{X}})\|_1 \quad \text{s.t.} \quad P_\Omega(\dot{\mathcal{X}}) = P_\Omega(\dot{\mathcal{O}}),$$
  where $\|\cdot\|_{\mathrm{TQt\text{-}TNN}}$ is the truncated nuclear norm based on TQt-rank, enforcing global low-rank structure; $P_\Omega$ restricts to the observed entries $\Omega$ of the data $\dot{\mathcal{O}}$; $\|\cdot\|_1$ is the $\ell_1$-norm of the QTDCT coefficients, promoting sparsity; and $\lambda$ is a weighting parameter.
- Sparsity regularization in the QTDCT domain exploits the empirical observation that most QTDCT coefficients of natural color video are close to zero: image and video information concentrates in a few transform coefficients, enabling recovery that avoids over-smoothing while preserving local detail and texture.
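The energy-compaction behavior this regularizer relies on is easy to illustrate with the real-valued core of the transform. The following sketch applies an orthonormal multidimensional DCT to a smooth synthetic 3-D tensor (an assumed stand-in for one color channel of video, not data from the referenced work) and measures how much energy falls in the top 1% of coefficients:

```python
import numpy as np
from scipy.fft import dctn

# Smooth separable 3-D test signal standing in for one channel of video.
t = np.linspace(0.0, 2.0 * np.pi, 32)
x = (np.sin(t)[:, None, None]
     * np.cos(t)[None, :, None]
     * np.cos(0.5 * t)[None, None, :])

c = dctn(x, norm='ortho')                       # orthonormal 3-D DCT
energy = np.sort(np.abs(c).ravel())[::-1] ** 2  # coefficient energies, descending
top1pct = int(0.01 * energy.size)
frac = energy[:top1pct].sum() / energy.sum()
print(f"energy in top 1% of DCT coefficients: {frac:.4f}")
```

For smooth content, nearly all of the energy lands in a small fraction of coefficients, which is exactly what an $\ell_1$ penalty in the transform domain exploits.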
4. Optimization via ADMM: Details of QTDCT-Sparse Recovery
ADMM is applied to solve the above model by splitting low-rank and sparsity regularization:
- An auxiliary variable $\dot{\mathcal{Z}}$, constrained by $\dot{\mathcal{Z}} = \dot{\mathcal{X}}$, enables separate handling of rank and sparsity.
- In each iteration, the subproblem for $\dot{\mathcal{Z}}$ minimizes
  $$\lambda\,\|\mathrm{QTDCT}(\dot{\mathcal{Z}})\|_1 + \frac{\rho}{2}\,\big\|\dot{\mathcal{Z}} - \dot{\mathcal{X}} + \dot{\mathcal{Y}}/\rho\big\|_F^2,$$
  which admits a closed-form solution via soft-thresholding in the QTDCT domain:
  $$\dot{\mathcal{Z}} = \mathrm{QTDCT}^{-1}\Big(\mathcal{S}_{\lambda/\rho}\big(\mathrm{QTDCT}(\dot{\mathcal{X}} - \dot{\mathcal{Y}}/\rho)\big)\Big),$$
  where $\mathcal{S}_{\tau}$ applies element-wise soft-thresholding with threshold $\tau$, $\rho$ is the ADMM penalty parameter, and $\dot{\mathcal{Y}}$ is the Lagrange multiplier.
- Updates alternate between low-rank optimization and QTDCT-sparse steps, with the inverse QTDCT applied as required to revert to spatial/color domain for the next iterate.
- Parseval’s theorem ensures consistent regularization magnitude across domains, maintaining algorithmic stability during decoupling.
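For quaternion-valued coefficients, one standard way to define the soft-thresholding operator (an assumption here, generalizing the real scalar case) shrinks each coefficient's quaternion magnitude while preserving its direction. A sketch, with coefficients stored as a Cayley-Dickson pair:

```python
import numpy as np

def qsoft(a, b, tau):
    """Element-wise quaternion soft-thresholding on a Cayley-Dickson pair
    (a, b): shrink each coefficient's magnitude |d| = sqrt(|a|^2 + |b|^2)
    by tau, keeping its direction; coefficients below tau are zeroed."""
    mag = np.sqrt(np.abs(a) ** 2 + np.abs(b) ** 2)
    scale = np.where(mag > tau, 1.0 - tau / np.maximum(mag, 1e-30), 0.0)
    return a * scale, b * scale
```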
5. Effects of QTDCT-Sparsity on Color Video Recovery Performance
QTDCT-based sparsity regularization leads to several documented benefits in color video tensor completion:
- Detail Preservation: Suppresses noise while retaining high-frequency components critical for texture and edge recovery.
- Artifact Reduction: Mitigates common visual artifacts induced by naive low-rank approximations.
- Multi-modal Consistency: Regularizes across spatial, chromatic, and temporal axes, yielding reconstructions with realistic color and motion continuity.
- Experimental results in the referenced work demonstrate strong quantitative improvements (higher PSNR/SSIM) and visually superior reconstructions compared with methods lacking explicit QTDCT sparsity terms, most notably at low observation rates.
6. Pipeline Summary Table: Stages of QTDCT-Based Color Video Recovery
| Step | Action | Formula/Remark |
|---|---|---|
| 1 | RGB video → pure quaternion tensor | $\dot{q} = R\,i + G\,j + B\,k$ (zero real part) |
| 2 | Compute QTDCT | $\dot{\mathcal{D}} = \mu\,(\dot{\mathcal{Q}} \times_1 C^{(1)} \times_2 C^{(2)} \times_3 C^{(3)})$ |
| 3 | Impose QTDCT sparsity | $\lambda\,\|\mathrm{QTDCT}(\dot{\mathcal{X}})\|_1$ added to objective |
| 4 | ADMM QTDCT step | Soft-thresholding, then inverse QTDCT to update tensor |
| 5 | Alternate with low-rank TQt-SVD update | See corresponding ADMM step |
| 6 | Iterate to convergence | Jointly enforce low-rank global structure and local sparse texture |
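The pipeline is easiest to see end to end in a drastically simplified, real-valued analogue: matrix completion with a nuclear-norm term plus DCT-domain $\ell_1$ sparsity, solved by the same ADMM splitting (singular value thresholding alternating with transform-domain soft-thresholding). Everything below (problem size, parameters, plain real matrices in place of quaternion tensors and TQt-SVD) is an illustrative assumption, not the algorithm of the referenced work:

```python
import numpy as np
from scipy.fft import dctn, idctn

def svt(M, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ (np.maximum(s - tau, 0.0)[:, None] * Vt)

def complete(O, mask, lam=0.01, rho=1.0, iters=300):
    """Toy ADMM for: min ||X||_* + lam * ||DCT(X)||_1  s.t.  X = O on mask."""
    X = np.where(mask, O, 0.0)
    Z = X.copy()
    U = np.zeros_like(X)
    for _ in range(iters):
        X = svt(Z - U, 1.0 / rho)                                 # low-rank step
        C = dctn(X + U, norm='ortho')                             # to DCT domain
        C = np.sign(C) * np.maximum(np.abs(C) - lam / rho, 0.0)   # soft-threshold
        Z = idctn(C, norm='ortho')                                # back to pixels
        Z[mask] = O[mask]                                         # enforce data
        U += X - Z                                                # dual update
    return Z
```

In the full method the SVT step is replaced by the truncated TQt-SVD update and the DCT pair by the forward/inverse QTDCT, but the alternation and the role of each step are the same.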
7. Concluding Remarks
QTDCT provides a multi-dimensional, quaternion-valued DCT framework engineered for color video tensor recovery applications. By integrating quaternion algebra with transform-domain sparsity regularization, the method advances completion frameworks in preserving both global structure and fine texture, especially under challenging sampling conditions. Its efficient implementation via Cayley-Dickson and DCT matrix computations, and empirical superiority in restoration metrics, underscore its utility in multidimensional visual information recovery scenarios (Yang et al., 2022).