- The paper introduces an innovative Riemannian deep network that transforms SPD matrices using specialized BiMap, ReEig, and LogEig layers to preserve manifold geometry.
- It employs a tailored backpropagation method with SGD on Stiefel manifolds, ensuring optimization respects the intrinsic geometric structure.
- Empirical evaluations on emotion recognition, action recognition, and face verification tasks demonstrate the model’s superior performance over traditional SPD learning methods.
A Riemannian Network for SPD Matrix Learning
The paper, by Zhiwu Huang and Luc Van Gool, introduces a deep learning framework built on Riemannian geometry and designed specifically for Symmetric Positive Definite (SPD) matrix learning. It addresses non-linear SPD matrix learning within deep neural networks, a setting that has predominantly been explored with Euclidean geometry.
SPD matrices are integral to a wide array of applications, particularly in image and video processing, because they robustly encode statistical representations while living on a Riemannian manifold. Traditional approaches handle these matrices with shallow learning techniques, often relying on tangent-space approximations or reproducing kernel Hilbert spaces, both of which only approximate the manifold's geometry. This work shifts the focus to deep learning, presenting a Riemannian network that preserves the SPD structure across successive layers.
The backbone of the proposed SPD matrix network (SPDNet) consists of three main components (a minimal sketch of all three follows the list):
- Bilinear Mapping Layers (BiMap): These transform input SPD matrices into new, typically lower-dimensional, SPD matrices via bilinear maps of the form X_k = W_k X_{k-1} W_k^T, where the weight W_k is semi-orthogonal so that outputs remain on an SPD manifold.
- Eigenvalue Rectification Layers (ReEig): Inspired by the rectified linear units (ReLU) common in deep networks, these layers introduce non-linearity by rectifying the eigenvalues of the SPD matrices: eigenvalues below a small threshold are clipped up to that threshold, which keeps the matrices SPD (and well conditioned).
- Eigenvalue Logarithm Layers (LogEig): These layers perform the final Riemannian computation by flattening the SPD manifold: the matrix logarithm maps SPD matrices into a Euclidean space, where standard output layers (e.g., fully connected plus softmax) can be applied.
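As a concrete illustration, here is a minimal NumPy sketch of forward passes through the three layer types. The function names (`bimap`, `reeig`, `logeig`), the column-orthonormal weight convention W^T X W (the transpose of the paper's W X W^T notation), and the threshold `eps` are illustrative choices, not the authors' code:

```python
import numpy as np

def bimap(X, W):
    """BiMap layer: Y = W^T X W with a column-orthonormal W (d_in x d_out).
    Full-rank SPD inputs map to SPD outputs of size d_out x d_out."""
    return W.T @ X @ W

def reeig(X, eps=1e-4):
    """ReEig layer: ReLU-like rectification of eigenvalues.
    X = U diag(s) U^T  ->  U diag(max(s, eps)) U^T, which stays SPD."""
    s, U = np.linalg.eigh(X)
    return (U * np.maximum(s, eps)) @ U.T

def logeig(X):
    """LogEig layer: matrix logarithm via eigendecomposition,
    mapping the SPD manifold into a flat (Euclidean) space."""
    s, U = np.linalg.eigh(X)
    return (U * np.log(s)) @ U.T

# One pass through BiMap -> ReEig -> LogEig on a random SPD input.
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 10))
X = A @ A.T + 1e-3 * np.eye(10)                    # 10x10 SPD matrix
W = np.linalg.qr(rng.standard_normal((10, 5)))[0]  # 10x5, orthonormal columns
Y = logeig(reeig(bimap(X, W)))                     # 5x5 symmetric output
```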
Backpropagation for this architecture uses a variant of stochastic gradient descent (SGD) on Stiefel manifolds: because the BiMap weights must stay semi-orthogonal, each Euclidean gradient is projected onto the tangent space of the manifold, and the resulting update is mapped back onto it with a retraction, so that orthogonality and the other geometric constraints are maintained throughout training.
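Here is a minimal sketch of one such constrained update, assuming the tangent projection associated with the canonical Stiefel metric and a QR-based retraction (one common choice); the function name and learning rate are illustrative:

```python
import numpy as np

def stiefel_sgd_step(W, G, lr=0.01):
    """One SGD step constrained to the Stiefel manifold (W^T W = I).

    W: current weight with orthonormal columns (n x p).
    G: Euclidean gradient of the loss w.r.t. W (n x p).
    """
    G_tan = G - W @ G.T @ W              # project G onto the tangent space at W
    Q, R = np.linalg.qr(W - lr * G_tan)  # retract the update back onto the manifold
    d = np.sign(np.diag(R))              # fix column signs so the step is continuous
    d[d == 0] = 1.0
    return Q * d                         # still satisfies W^T W = I
```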
Empirical evaluations demonstrate the efficacy of SPDNet across three visual classification tasks: emotion recognition, action recognition, and face verification. SPDNet consistently outperforms state-of-the-art SPD matrix learning methods such as Covariance Discriminative Learning (CDL) and Log-Euclidean Metric Learning (LEML), as well as Euclidean deep learning baselines such as DeepO2P networks. Notably, deeper configurations of SPDNet (stacking multiple BiMap and ReEig layers) yield marked accuracy improvements, validating the value of depth, geometry-aware mappings, and non-linear activations within the SPD framework.
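Reusing the layer sketches above, a deeper configuration is simply a stack of BiMap-ReEig blocks followed by a single LogEig; the layer sizes below are illustrative, not the paper's:

```python
def spdnet_forward(X, Ws, eps=1e-4):
    """Forward pass through a stack of BiMap -> ReEig blocks and a final
    LogEig (reuses bimap, reeig, logeig from the sketch above)."""
    for W in Ws:
        X = reeig(bimap(X, W), eps)
    return logeig(X)

# Two stacked blocks shrinking 10x10 inputs to a 5x5 log-domain feature
# (rng and X as defined in the earlier sketch).
Ws = [np.linalg.qr(rng.standard_normal(shape))[0]
      for shape in [(10, 8), (8, 5)]]
F = spdnet_forward(X, Ws)  # flat feature, ready for a standard classifier
```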
The paper suggests several avenues for further research: integrating additional layer types, such as pooling and normalization layers that respect SPD geometry; stacking SPDNet on top of existing convolutional networks for enriched learning from raw image data; and extending the methodology to more general Riemannian manifolds for broader applicability in manifold learning and neural network optimization.
This work contributes significantly to Riemannian learning by bridging deep learning methodologies with non-Euclidean data structures, demonstrating a promising paradigm for SPD matrix learning and its applications.