
Class-Specific Multilinear Discriminant Analysis

Updated 20 March 2026
  • The paper introduces MCSDA, extending class-specific discriminant analysis to tensor data with mode-wise linear projections.
  • It employs an alternating optimization strategy to update projection matrices for optimal in-class clustering and out-of-class separation.
  • Empirical results show that MCSDA is substantially faster than vector-based CSDA and more accurate than multilinear LDA in facial verification, and achieves the highest average F1 among the compared methods in stock prediction.

Multilinear Class-Specific Discriminant Analysis (MCSDA) is a tensor-based subspace learning technique that generalizes traditional class-specific discriminant analysis to high-order data. MCSDA maximizes the discrimination of each individual class in a feature space defined by low-dimensional multilinear projections, while maintaining the spatial and structural integrity of the original tensor representations. The method is formulated as an iterative optimization that alternately updates mode-wise projection matrices to achieve optimal class separation, with applications demonstrated in facial image verification and stock price prediction (Tran et al., 2017).

1. Problem Formulation and Objective

MCSDA addresses the challenge of transferring discriminant analysis techniques from vectorized data to high-order tensors. Given $N$ labeled samples $\{\mathcal{X}_j \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_K},\ l_j\}_{j=1}^N$ with $K$ modes, the task is to learn, for each class $c \in \{1, \dots, C\}$, a set of mode-specific linear projection matrices $\{U_c^{(n)} \in \mathbb{R}^{I_n \times I_n'}\}_{n=1}^K$ satisfying $I_n' < I_n$.

Samples of class $c$ are treated as the positive class; all others are negative. The mean tensor for each class is $\mathcal{M}_c = \frac{1}{n_c} \sum_{j: l_j = c} \mathcal{X}_j$, where $n_c$ is the number of samples in class $c$. Each tensor is projected via

$$\mathcal{Y}_j = \mathcal{X}_j \times_1 (U_c^{(1)})^\top \times_2 (U_c^{(2)})^\top \cdots \times_K (U_c^{(K)})^\top \in \mathbb{R}^{I_1' \times \cdots \times I_K'}$$

with the goal that in-class projected tensors cluster tightly around the projected mean, while out-of-class samples are maximally separated. This one-versus-rest construction yields a set of class-specific models, each optimized for its respective class in the tensor feature space.
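
The projection above can be sketched in NumPy as a sequence of mode-n products. This is a minimal illustration rather than the authors' implementation: the helper names `mode_n_product` and `project`, the tensor size, and the reduced sizes (5 and 4) are assumptions chosen for the example.

```python
import numpy as np

def mode_n_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` of shape (new_dim, old_dim)."""
    # Contract the matrix's second axis with the tensor's `mode` axis.
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    # tensordot places the new axis first; move it back to position `mode`.
    return np.moveaxis(out, 0, mode)

def project(tensor, factors):
    """Compute X x_1 U1^T x_2 U2^T ..., where factors[n] has shape (I_n, I_n')."""
    out = tensor
    for mode, U in enumerate(factors):
        out = mode_n_product(out, U.T, mode)
    return out

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 30))  # one second-order sample (hypothetical data)
# Random matrices with orthonormal columns stand in for learned projections.
factors = [np.linalg.qr(rng.standard_normal((40, 5)))[0],
           np.linalg.qr(rng.standard_normal((30, 4)))[0]]
Y = project(X, factors)
print(Y.shape)  # (5, 4)
```

For a second-order tensor this reduces to the familiar two-sided product `U1.T @ X @ U2`, which is a convenient sanity check for the general routine.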

2. Multilinear Class-Specific Scatter and Optimization Criterion

For a fixed class $c$, MCSDA defines the following distances:

  • The in-class (within-class) distance

$$D_I^{(c)} = \sum_{j: l_j = c} \left\| \mathcal{X}_j \times_{n=1}^K (U_c^{(n)})^\top - \mathcal{M}_c \times_{n=1}^K (U_c^{(n)})^\top \right\|_F^2$$

  • The out-of-class distance

$$D_O^{(c)} = \sum_{j: l_j \neq c} \left\| \mathcal{X}_j \times_{n=1}^K (U_c^{(n)})^\top - \mathcal{M}_c \times_{n=1}^K (U_c^{(n)})^\top \right\|_F^2$$

These are rewritten in the trace-ratio form:

$$J_c \left(U_c^{(1)}, \dots, U_c^{(K)}\right) = \frac{\operatorname{tr}\left(U_c^{(1)\top} \cdots U_c^{(K)\top} S_B^{(c)} U_c^{(K)} \cdots U_c^{(1)} \right)}{\operatorname{tr}\left(U_c^{(1)\top} \cdots U_c^{(K)\top} S_W^{(c)} U_c^{(K)} \cdots U_c^{(1)} \right)}$$

where $S_W^{(c)}$ aggregates in-class scatter and $S_B^{(c)}$ aggregates out-of-class scatter, with orthonormality constraints on each $U_c^{(n)}$. The objective is to maximize $J_c$ for each class, ensuring that the projected out-of-class variance is maximized relative to the in-class variance.
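
To make the two distances concrete, the following sketch evaluates the in-class and out-of-class distances for fixed projections and reports their ratio. It is only an illustration under stated assumptions: the function names, the toy data (a second class shifted by 5), and the truncated-identity projections are all hypothetical, and the ratio of the two distances is used here as a simple stand-in for the criterion.

```python
import numpy as np

def project(tensor, factors):
    """Apply U^T along every mode; factors[n] has shape (I_n, I_n')."""
    out = tensor
    for mode, U in enumerate(factors):
        out = np.moveaxis(np.tensordot(U.T, out, axes=(1, mode)), 0, mode)
    return out

def class_distances(X, labels, c, factors):
    """Return (D_I, D_O) for class c; X has shape (N, I_1, ..., I_K)."""
    is_c = labels == c
    proj_mean = project(X[is_c].mean(axis=0), factors)
    # Squared Frobenius distance of every projected sample to the projected class mean.
    sq = np.array([np.sum((project(x, factors) - proj_mean) ** 2) for x in X])
    return sq[is_c].sum(), sq[~is_c].sum()

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 4, 3))          # hypothetical toy tensors
labels = np.array([0] * 10 + [1] * 10)
X[labels == 1] += 5.0                        # push the other class away
# Truncated identities: orthonormal columns that keep the top-left 2 x 2 block.
factors = [np.eye(4)[:, :2], np.eye(3)[:, :2]]
D_I, D_O = class_distances(X, labels, 0, factors)
print(D_I, D_O, D_O / D_I)
```

A larger ratio indicates that out-of-class samples lie farther from the projected class mean than in-class samples do, which is the behavior the criterion rewards.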

3. Alternating Mode-Wise Optimization Algorithm

Due to the coupling of all mode-wise matrices in $J_c$, optimization is performed via an alternating mode-wise strategy. At each iteration, for a given mode $n$, two scatter matrices are computed after projecting along all modes except $n$, followed by mode-$n$ unfolding:

  • In-class scatter:

$$S_{I}^{\,n(c)} = \sum_{j: l_j = c} \left[ (\mathcal{X}_j - \mathcal{M}_c) \prod_{q \neq n} \times_q (U_c^{(q)})^\top \right]_{(n)} \left[ \cdot \right]_{(n)}^\top$$

  • Out-of-class scatter:

$$S_{O}^{\,n(c)} = \sum_{j: l_j \neq c} \left[ (\mathcal{X}_j - \mathcal{M}_c) \prod_{q \neq n} \times_q (U_c^{(q)})^\top \right]_{(n)} \left[ \cdot \right]_{(n)}^\top$$

The optimization subproblem for each mode reduces to:

$$\max_{U_c^{(n)}} \frac{\operatorname{tr}(U_c^{(n)\top} S_{O}^{\,n(c)} U_c^{(n)})}{\operatorname{tr}(U_c^{(n)\top} S_{I}^{\,n(c)} U_c^{(n)})},\quad \text{subject to } (U_c^{(n)})^\top U_c^{(n)} = I$$

This is solved by computing the leading generalized eigenvectors of $S_{O}^{\,n(c)} u = \lambda S_{I}^{\,n(c)} u$. The process iterates across all modes until convergence, sequentially updating the projections for each mode.
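
A single mode-wise update can be sketched as follows. The helper names, the toy data, and the small ridge term `reg` added to the in-class scatter are assumptions made for this example; the generalized eigenproblem is reduced to a symmetric one by Cholesky whitening, which is one standard route and not necessarily the one used in the paper.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: rows index the chosen mode, columns index everything else."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def partial_project(tensor, factors, skip):
    """Project along every mode except `skip`."""
    out = tensor
    for mode, U in enumerate(factors):
        if mode != skip:
            out = np.moveaxis(np.tensordot(U.T, out, axes=(1, mode)), 0, mode)
    return out

def mode_scatter(diffs, factors, mode):
    """Sum of Z Z^T over mean-centered samples, with Z the mode-n unfolding."""
    dim = diffs.shape[1 + mode]
    S = np.zeros((dim, dim))
    for d in diffs:
        Z = unfold(partial_project(d, factors, skip=mode), mode)
        S += Z @ Z.T
    return S

def top_generalized_eigvecs(S_O, S_I, k, reg=1e-6):
    """Leading k solutions of S_O u = lam (S_I + reg*I) u via Cholesky whitening."""
    L_inv = np.linalg.inv(np.linalg.cholesky(S_I + reg * np.eye(S_I.shape[0])))
    C = L_inv @ S_O @ L_inv.T                  # symmetric whitened problem
    vals, vecs = np.linalg.eigh((C + C.T) / 2)
    order = np.argsort(vals)[::-1][:k]         # largest eigenvalues first
    return L_inv.T @ vecs[:, order], vals[order]

rng = np.random.default_rng(0)
X = rng.standard_normal((12, 6, 5))            # hypothetical toy tensors
labels = np.array([0] * 6 + [1] * 6)
diffs = X - X[labels == 0].mean(axis=0)        # center on the class-0 mean
factors = [np.linalg.qr(rng.standard_normal((6, 3)))[0],
           np.linalg.qr(rng.standard_normal((5, 2)))[0]]
S_I = mode_scatter(diffs[labels == 0], factors, mode=0)
S_O = mode_scatter(diffs[labels == 1], factors, mode=0)
U0, lams = top_generalized_eigvecs(S_O, S_I, k=3)
print(U0.shape, lams)
```

The eigenvectors returned this way are orthonormal with respect to the regularized in-class scatter rather than the identity, so an extra orthonormalization step (for example a QR factorization) would be needed to satisfy the constraint exactly.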

4. Structural Preservation and Computational Properties

MCSDA preserves the intrinsic spatial and structural information of tensor data by avoiding vectorization. Alternate projections along each tensor mode retain multilinear dependencies, essential for modeling high-order structures seen in images (spatial $\times$ spatial) or time series (features $\times$ time). The parameter count is reduced from a multiplicative scale (product of mode sizes in vectorization) to an additive one (sum of $I_n I_n'$). This reduction mitigates the curse of dimensionality and avoids small-sample-size pitfalls inherent in conventional vector approaches.
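
The difference in parameter count is easy to check with a small calculation. The numbers below are illustrative assumptions only: a 40 by 30 input, a 20-dimensional vector subspace, and reduced mode sizes of 5 and 4 are hypothetical choices, not settings reported in the paper.

```python
from math import prod

def vectorized_param_count(mode_sizes, out_dim):
    """Entries in one projection matrix acting on the flattened tensor."""
    return prod(mode_sizes) * out_dim

def multilinear_param_count(mode_sizes, reduced_sizes):
    """Entries summed over the mode-wise matrices: sum of I_n * I_n'."""
    return sum(i * r for i, r in zip(mode_sizes, reduced_sizes))

print(vectorized_param_count((40, 30), 20))       # 24000
print(multilinear_param_count((40, 30), (5, 4)))  # 320
```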

5. Empirical Evaluation

MCSDA's utility is demonstrated in two domains:

  • Face verification (ORL, CMU-PIE): Each image is a $40 \times 30$ tensor. Competing methods include vector-based CSDA, multilinear LDA (MDA), and enriched variants with HOG features. The metric is mean Average Precision (mAP) across class splits and train/test ratios. Findings show that MCSDA is $10$–$100\times$ faster than CSDA with only a slight mAP drop, consistently outperforms MDA, and benefits from feature enrichment when data are abundant.
  • Stock price prediction (FI-2010 limit-order-book): Each sample is a $144 \times 10$ tensor. Baselines include linear ridge regression, single-layer neural nets, BoF, neural BoF, CSDA, and MDA. With metrics such as per-class F1 and overall accuracy, MCSDA achieves the highest average F1 ($\approx 46.7\%$), surpassing even deep bag-of-words variants, effectively capturing both feature and temporal multilinear structures.

6. Algorithmic Outline

The MCSDA learning procedure for a chosen class $c$ proceeds as follows:

Algorithm MCSDA(Class c)
Input:
    • Training tensors {X_j, l_j}
    • Target subspace sizes I_1' × ... × I_K'
    • Max iterations τ, threshold ε
Initialization:
    For each mode n, set U_c^{(n)} ← all-ones or random orthonormal
For t = 1 to τ:
    For n = 1 to K:
        1. Compute in-class scatter S_I^{n(c)} via mode-n unfolding
        2. Compute out-of-class scatter S_O^{n(c)} likewise
        3. Solve S_O^{n(c)} u = λ S_I^{n(c)} u
           and set U_c^{(n)} to top I_n' eigenvectors
    End For
    If sum_{n=1}^K || U_c^{(n)}(t) U_c^{(n)}(t-1)^T - I ||_F ≤ ε: stop
End For
Output: {U_c^{(n)}}

Repeat for each class c to obtain class-specific projections.
The full algorithm performs independent one-vs-rest training for each class, yielding a collection of discriminative multilinear subspaces suitable for high-order data classification tasks (Tran et al., 2017).
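
The outline can be turned into a compact NumPy sketch for one class. Several details here are assumptions rather than the authors' choices: the function name `fit_mcsda_class`, the random orthonormal initialization, the ridge term `reg`, the Cholesky-based eigen-solver, the QR step that restores orthonormal columns, and the stopping rule, which measures the change in the projectors $U U^\top$ between sweeps instead of the criterion written in the outline.

```python
import numpy as np

def unfold(t, mode):
    """Mode-n unfolding of a tensor."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def project_except(t, factors, skip):
    """Apply U^T along every mode except `skip`."""
    for mode, U in enumerate(factors):
        if mode != skip:
            t = np.moveaxis(np.tensordot(U.T, t, axes=(1, mode)), 0, mode)
    return t

def scatter(diffs, factors, mode):
    """Mode-n scatter of mean-centered samples after partial projection."""
    S = np.zeros((diffs.shape[mode + 1],) * 2)
    for d in diffs:
        Z = unfold(project_except(d, factors, mode), mode)
        S += Z @ Z.T
    return S

def fit_mcsda_class(X, labels, c, reduced, max_iter=20, eps=1e-6, reg=1e-6, seed=0):
    """Alternating mode-wise updates for class c; X has shape (N, I_1, ..., I_K)."""
    rng = np.random.default_rng(seed)
    dims = X.shape[1:]
    # Assumed initialization: random matrices with orthonormal columns.
    factors = [np.linalg.qr(rng.standard_normal((i, r)))[0]
               for i, r in zip(dims, reduced)]
    is_c = labels == c
    diffs = X - X[is_c].mean(axis=0)           # center every sample on the class mean
    for _ in range(max_iter):
        previous = [U.copy() for U in factors]
        for n in range(len(dims)):
            S_I = scatter(diffs[is_c], factors, n) + reg * np.eye(dims[n])
            S_O = scatter(diffs[~is_c], factors, n)
            # Solve S_O u = lam S_I u by whitening with the Cholesky factor of S_I.
            L_inv = np.linalg.inv(np.linalg.cholesky(S_I))
            C = L_inv @ S_O @ L_inv.T
            vals, vecs = np.linalg.eigh((C + C.T) / 2)
            top = vecs[:, np.argsort(vals)[::-1][:reduced[n]]]
            # QR restores orthonormal columns while keeping the same subspace.
            factors[n] = np.linalg.qr(L_inv.T @ top)[0]
        # Assumed stopping rule: total change of the subspace projectors.
        change = sum(np.linalg.norm(U @ U.T - P @ P.T)
                     for U, P in zip(factors, previous))
        if change <= eps:
            break
    return factors

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 6, 5))            # hypothetical toy tensors
labels = np.repeat([0, 1, 2], 10)
factors = fit_mcsda_class(X, labels, c=0, reduced=(2, 2))
print([U.shape for U in factors])  # [(6, 2), (5, 2)]
```

Calling `fit_mcsda_class` once per class label would give the one-vs-rest collection of projections described above.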

References (1)
