Structure-Preserving Nonlinear Sufficient Dimension Reduction for Tensors

Published 23 Dec 2025 in math.ST, stat.ME, and stat.ML | (2512.20057v1)

Abstract: We introduce two nonlinear sufficient dimension reduction methods for regressions with tensor-valued predictors. Our goal is two-fold: the first is to preserve the tensor structure when performing dimension reduction, particularly the meaning of the tensor modes, for improved interpretation; the second is to substantially reduce the number of parameters in dimension reduction, thereby achieving model parsimony and enhancing estimation accuracy. Our two tensor dimension reduction methods echo the two commonly used tensor decomposition mechanisms: one is the Tucker decomposition, which reduces a larger tensor to a smaller one; the other is the CP-decomposition, which represents an arbitrary tensor as a sequence of rank-one tensors. We developed the Fisher consistency of our methods at the population level and established their consistency and convergence rates. Both methods are easy to implement numerically: the Tucker-form can be implemented through a sequence of least-squares steps, and the CP-form can be implemented through a sequence of singular value decompositions. We investigated the finite-sample performance of our methods and showed substantial improvement in accuracy over existing methods in simulations and two data applications.

Summary

  • The paper presents nonlinear SDR formulations using Tucker and CP decompositions to preserve tensor structure and reduce parameter count.
  • It leverages reproducing kernel Hilbert spaces built on tensor singular vectors, supporting robust estimation and improved interpretability.
  • Empirical results on image quality and EEG data demonstrate superior prediction accuracy and reliable sufficient predictor recovery over traditional methods.

Overview and Motivation

This paper presents a comprehensive theoretical and computational framework for nonlinear sufficient dimension reduction (SDR) in regression models with tensor-valued predictors, emphasizing structure preservation and model parsimony (2512.20057). Traditional SDR approaches, designed for vector-valued predictors, either ignore the tensor structure by flattening the data or rely on heavily parameterized representations that sacrifice interpretability and predictive accuracy. The proposed methods maintain the tensor modes throughout dimension reduction, substantially decreasing the parameter count and improving interpretability and estimation efficiency.

The construction follows two established tensor decomposition paradigms: the Tucker decomposition and the CANDECOMP/PARAFAC (CP) decomposition. Both linear and nonlinear SDR variants are considered. The nonlinear extensions are formulated in reproducing kernel Hilbert spaces (RKHSs), leveraging kernel-based feature maps defined on tensor singular vectors. The framework is applicable to both matrix and higher-order tensor data, covering use cases in neuroimaging, computer vision, and structured signal processing.

Linear versus Nonlinear SDR for Tensors

The paper systematically contrasts classical linear SDR approaches for tensors—which either flatten the data (leading to high parameterization and poor interpretability) or apply multilinear mappings to each mode—with their nonlinear counterparts. Linear dimension folding (Tucker-style), canonical polyadic-based SDR (CP-style), and envelope concepts are extended to nonlinear SDR using RKHS embeddings.

Nonlinear SDR captures more complex dependencies between predictors and response by operating on nonlinear functions of mode vectors derived from singular value decompositions. The essential technical advance is the definition of tensor-feature maps leveraging kernels on singular vectors, followed by identification of low-dimensional subspaces in the resulting tensor-product RKHS with conditional independence properties.
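
As a concrete illustration of this feature-map construction, the sketch below forms Gaussian-kernel Gram matrices on the mode-wise singular vectors of matrix-valued predictors. The kernel choice, bandwidth, and all variable names here are illustrative assumptions, not the paper's specification.

```python
# Minimal sketch (assumed setup, not the authors' code): Gaussian-kernel
# Gram matrices built on the singular vectors of matrix-valued predictors,
# the raw ingredients of the tensor-product RKHS described above.
import numpy as np

def gaussian_gram(U, V, gamma=1.0):
    """Gram matrix with entries k(u_i, v_j) = exp(-gamma * ||u_i - v_j||^2)."""
    sq = (np.sum(U**2, axis=1)[:, None] + np.sum(V**2, axis=1)[None, :]
          - 2.0 * U @ V.T)
    return np.exp(-gamma * sq)

def mode_singular_vectors(X, r):
    """Leading r left/right singular vectors of one matrix predictor."""
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :r], Vt[:r, :].T

# toy sample: n matrix predictors of size p x q, reduced rank r per mode
rng = np.random.default_rng(0)
n, p, q, r = 50, 8, 6, 2
pairs = [mode_singular_vectors(X, r) for X in rng.standard_normal((n, p, q))]

# stack mode-wise singular vectors across the sample; one Gram matrix per mode
lefts = np.vstack([U.T for U, _ in pairs])    # (n*r, p) mode-1 vectors
rights = np.vstack([V.T for _, V in pairs])   # (n*r, q) mode-2 vectors
K1 = gaussian_gram(lefts, lefts)              # mode-1 kernel features
K2 = gaussian_gram(rights, rights)            # mode-2 kernel features
```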

Proposed Methods: Tucker-NTSDR and CP-NTSDR

Two main nonlinear tensor SDR formulations are developed:

  • Tucker-form Nonlinear SDR (Tucker-NTSDR): Reduces the tensor predictor via mode-wise projections in RKHS, analogous to Tucker decomposition. The sufficient predictors are formed by inner products in tensor-product RKHSs, preserving mode interpretation and producing substantial parameter savings.
  • CP-form Nonlinear SDR (CP-NTSDR): Identifies rank-one directions in each tensor mode such that the projections collectively capture all information about the response. This aligns with CP decomposition logic and leverages mode-wise orthogonalization to construct efficient, interpretable sufficient predictors.

Both methods rely critically on structure-preserving feature mapping, Fisher-consistent population operators, and sample-level implementations using optimized coordinate representations and Gram matrices. Explicit iterative algorithms are provided, exploiting least-squares and SVD-based updates for Tucker form and iterative eigen-problem solutions for CP form.
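
The sketch below illustrates the flavor of these alternating updates with the classical HOOI (higher-order orthogonal iteration) scheme on a raw 3-way tensor: each mode's factor is refreshed by an SVD after projecting out the other modes. This is an assumed analog for intuition; the paper's actual iterations operate on RKHS coordinate representations rather than on the raw tensor.

```python
# Schematic analog (not the paper's algorithm): alternating SVD-based
# Tucker reduction in the HOOI style, mirroring the least-squares/SVD
# update pattern described for the Tucker form.
import numpy as np

def tucker_hooi(X, ranks, n_iter=20):
    """Alternating mode-wise SVD updates; returns (core, factor list)."""
    m = X.ndim
    # initialize each factor from a mode-wise SVD (HOSVD initialization)
    factors = []
    for k in range(m):
        unfold = np.moveaxis(X, k, 0).reshape(X.shape[k], -1)
        U, _, _ = np.linalg.svd(unfold, full_matrices=False)
        factors.append(U[:, :ranks[k]])
    for _ in range(n_iter):
        for k in range(m):
            # project every mode except k onto its current factor ...
            Y = X
            for j in range(m):
                if j != k:
                    Yj = np.tensordot(factors[j].T, np.moveaxis(Y, j, 0), axes=1)
                    Y = np.moveaxis(Yj, 0, j)
            # ... then refresh factor k from the SVD of the projected unfolding
            unfold = np.moveaxis(Y, k, 0).reshape(Y.shape[k], -1)
            U, _, _ = np.linalg.svd(unfold, full_matrices=False)
            factors[k] = U[:, :ranks[k]]
    # core tensor: X multiplied by every factor transpose
    core = X
    for j in range(m):
        core = np.moveaxis(
            np.tensordot(factors[j].T, np.moveaxis(core, j, 0), axes=1), 0, j)
    return core, factors

# usage: reduce a 10 x 8 x 6 tensor to a 3 x 2 x 2 core
core, factors = tucker_hooi(
    np.random.default_rng(0).standard_normal((10, 8, 6)), ranks=(3, 2, 2))
```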

Tensor Regression Operator and Theoretical Guarantees

A central contribution is the definition and analysis of the tensor regression operator, mapping from the response RKHS to the tensor-feature RKHS. The regression operator is shown to be Fisher-consistent under mild conditions, with its range generating the same central σ-field as that of the sufficient predictors. The envelope subspaces (Tucker-envelope and CP-envelope) are rigorously defined as minimal subspaces (within certain closure properties) containing the sufficient predictors.

Under additional regularity and smoothness conditions, the estimators are shown to be consistent, with convergence rates scaling as $O(\epsilon_n^\beta + \epsilon_n n^{-1/2})$ under Tikhonov regularization and Hilbert-Schmidt operator assumptions.
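
For orientation, the conditional independence property that these guarantees target can be written as follows; this is the standard nonlinear SDR formulation, reconstructed from the summary rather than quoted from the paper.

```latex
% Nonlinear SDR target: the response Y is independent of the tensor
% predictor X once the d sufficient predictors f_1(X), ..., f_d(X),
% elements of the tensor-product RKHS, are given.
Y \perp\!\!\!\perp X \;\big|\; \bigl( f_1(X), \ldots, f_d(X) \bigr)
```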

Implementation, Tuning, and Generalization

A detailed sample-level implementation strategy is presented, including construction of spanning systems and Gram matrices for RKHS representations, optimization via coordinate mapping, and computational regularization via generalized cross validation. The methods support scalability to high-dimensional tensors through clever restriction of the coordinate representations and feature selection strategies.
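
A minimal sketch of the generalized cross validation step is given below: it scores a grid of Tikhonov regularizers for a kernel system through the effective degrees of freedom of the resulting smoother. The smoother form, the grid, and all names are assumptions for illustration, not the paper's code.

```python
# Minimal GCV sketch (assumed form): pick the Tikhonov regularizer for a
# kernel ridge system by minimizing the generalized cross validation score.
import numpy as np

def gcv_ridge(K, y, eps_grid):
    """Return the eps in eps_grid minimizing GCV for S = K (K + n*eps*I)^{-1}."""
    n = K.shape[0]
    best_eps, best_score = None, np.inf
    for eps in eps_grid:
        S = K @ np.linalg.inv(K + n * eps * np.eye(n))   # smoother matrix
        resid = y - S @ y
        score = np.mean(resid**2) / (1.0 - np.trace(S) / n) ** 2
        if score < best_score:
            best_eps, best_score = eps, score
    return best_eps

# usage with a toy Gaussian Gram matrix
rng = np.random.default_rng(1)
Z = rng.standard_normal((40, 3))
K = np.exp(-np.sum((Z[:, None, :] - Z[None, :, :])**2, axis=-1))
y = np.sin(Z[:, 0]) + 0.1 * rng.standard_normal(40)
eps_hat = gcv_ridge(K, y, np.logspace(-6, 0, 13))
```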

Kernel parameter tuning and regularization are carefully designed to balance fitting accuracy with numerical stability and interpretability. Extensions to general $m$-way tensors are included, with recursive operator definitions for each mode and analogous minimization strategies.

Empirical Evaluation: Simulations and Applications

Extensive simulation studies are performed under varied signal regimes (nonlinear, correlated, and high-dimensional), demonstrating the superiority of both Tucker-NTSDR and CP-NTSDR over competing SDR methods such as GSIR and kernel SIR in prediction accuracy and sufficient predictor recovery. Results consistently show higher distance correlation between the estimated and true sufficient predictors.
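
For reference, the distance correlation metric behind these comparisons can be computed directly from pairwise distance matrices. The sketch below implements the standard biased (V-statistic) sample definition; it is a generic reimplementation, not the paper's evaluation code.

```python
# Sample distance correlation between two samples, each with n observations.
import numpy as np

def _centered_dist(A):
    """Doubly centered Euclidean distance matrix of a sample (n x p, or n,)."""
    A = np.asarray(A, float).reshape(len(A), -1)
    D = np.sqrt(np.sum((A[:, None, :] - A[None, :, :])**2, axis=-1))
    return D - D.mean(axis=0) - D.mean(axis=1)[:, None] + D.mean()

def distance_correlation(X, Y):
    A, B = _centered_dist(X), _centered_dist(Y)
    dcov2 = (A * B).mean()                     # squared distance covariance
    return np.sqrt(dcov2 / np.sqrt((A * A).mean() * (B * B).mean()))

# a value near 1 indicates the estimated sufficient predictor tracks the truth
x = np.random.default_rng(2).standard_normal(200)
print(distance_correlation(x, np.sin(x)))   # high despite the nonlinearity
```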

Real data applications—most notably on the CSIQ image quality dataset—show that the proposed CP-NTSDR outperforms GSIR in the majority of cases, with substantial gains in generalization to unseen data and interpretability of learned features. The methods also demonstrate effectiveness on high-dimensional EEG data, and their robust performance in classification settings is documented in supplementary materials.

Implications and Future Directions

The framework directly addresses the interpretability, scalability, and accuracy limitations of prior tensor SDR approaches. By leveraging tensor-preserving nonlinear mappings, the methodology ensures that estimated sufficient predictors retain domain-relevant structure, facilitate parsimonious models, and enhance the quality of statistical inference in high-dimensional tensor settings.

Practically, these methods are suited for high-throughput imaging, biomedical signal analysis, multi-modal sensor fusion, and any context where complex, structured data modalities necessitate nonlinear predictive modeling while preserving physical interpretability.

Theoretically, the generalization to higher-order tensors via kernelized CP decomposition opens avenues for further research in computational optimization, statistical theory (e.g., rates, envelope structure identification), and transfer learning for multi-task tensor regression. There is also potential for variants exploiting universal or characteristic kernels for robust conditional independence testing.

Conclusion

The paper advances SDR methodology for tensor-valued predictors by marrying structure-preserving decomposition principles with nonlinear kernel-based mappings, resulting in methods that are Fisher-consistent, computationally tractable, and empirically superior for both regression and classification tasks involving tensors. The theoretical developments and sample algorithms collectively establish a solid foundation for future exploration of structure-aware, nonlinear dimension reduction in big-data contexts (2512.20057).
