Ordinal Partial Least Squares (OPLS)
- Ordinal Partial Least Squares (OPLS) adapts PLS to ordinal data by applying the Thurstone threshold model and polychoric correlations.
- It corrects the negative bias that arises in traditional PLS with few-category Likert-type scales and estimates latent constructs through improved correlation measures.
- A second technique sharing the abbreviation, Orthonormalized PLS, extends to multi-view, regularized, and deep learning settings, achieving strong performance in classification and retrieval tasks.
Ordinal Partial Least Squares (OPLS) refers to two distinct classes of modeling techniques derived from the Partial Least Squares (PLS) paradigm. The first, introduced by Cantaluppi (Cantaluppi, 2012), extends PLS to the analysis of datasets with ordinal manifest variables, enabling structural equation modeling when variables are measured on ordinal (e.g., Likert-type) scales. The second, as in Wang, Li, and Lin (Wang et al., 2020), denotes Orthonormalized Partial Least Squares, which introduces an orthonormality constraint to PLS for optimal subspace learning and extends it to multi-view, regularized, and deep neural settings. This entry focuses on the fundamental principles, methodological innovations, mathematical details, and empirical findings for both formulations as represented in the cited works.
1. Theoretical Foundation of Ordinal PLS
The classical PLS algorithm assumes interval-scaled indicators, leveraging sample covariances or Pearson correlations throughout its iterative procedures. In practice, many domains such as the social sciences and customer-satisfaction analysis rely on ordinal data, often recorded on scales with as few as 4–7 categories. Treating such ordinal data with standard PLS introduces pronounced negative bias into the estimated path coefficients, particularly with a small number of categories. Cantaluppi's OPLS addresses this by adopting the Thurstone threshold model: each observed ordinal variable $x$ with $K$ categories is conceptualized as a discretization of an underlying continuous latent variable $x^*$, usually assumed to be standard Gaussian. Thresholds $\tau_0 < \tau_1 < \dots < \tau_K$ are defined so that

$$P(x = k) = \Phi(\tau_k) - \Phi(\tau_{k-1}), \qquad k = 1, \dots, K,$$

with $\tau_0 = -\infty$, $\tau_K = +\infty$, and $\Phi$ the cumulative distribution function of the standard normal distribution. The OPLS algorithm utilizes the polychoric correlation matrix, estimated from these latent variables and thresholds, as a substitute for the Pearson correlation matrix within the PLS workflow (Cantaluppi, 2012).
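To make the threshold mapping concrete, here is a minimal Python sketch (the function name and the coding of categories as 0..K-1 are illustrative assumptions, not from Cantaluppi, 2012) that recovers one item's thresholds from its empirical marginal proportions via standard-normal quantiles:

```python
import numpy as np
from scipy.stats import norm

def estimate_thresholds(x, n_categories):
    """Thurstone threshold mapping: convert the cumulative marginal
    proportions of an ordinal item (coded 0..K-1) into standard-normal
    quantiles, i.e., the cut points of the latent Gaussian variable."""
    counts = np.bincount(x, minlength=n_categories)
    cum_props = np.cumsum(counts)[:-1] / counts.sum()  # first K-1 cumulative proportions
    return norm.ppf(cum_props)  # tau_1, ..., tau_{K-1}; tau_0 and tau_K are -inf, +inf

# Example: a skewed 4-category item
x = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 3])
print(estimate_thresholds(x, 4))  # approx. [-0.842, 0.0, 1.282]
```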
2. Mathematical Formulation and Algorithmic Steps
The OPLS workflow retains the central features of reflective measurement (Mode A) PLS modeling but replaces standard computations with those arising from the latent Gaussian framework. The key steps are as follows (Cantaluppi, 2012):
- Threshold mapping: Convert observed ordinal variables to intervals on the latent continuous scale using empirical marginal probabilities and standard normal quantiles.
- Computation of polychoric correlation: Estimate the correlation matrix of the latent variables via polychoric estimation (a pairwise sketch follows this list).
- Outer model (composite construction): Latent constructs are modeled as weighted sums of the de-meaned indicators on the latent scale.
- Composite covariance standardization: Rescale the outer weights so that each composite has unit variance, i.e., the composite correlation matrix has a unit diagonal.
- Structural (inner) model: Reduced-form equations for path modeling are estimated by ordinary least squares on the composite correlation matrix.
- Iterative weight refinement: Block weights are iteratively updated using sign-corrected adjacencies and instrumental variable correlations, continuing until convergence.
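The polychoric step admits a compact pairwise illustration: fix the thresholds from the marginals (two-step estimation, reusing `estimate_thresholds` from the sketch above) and choose the latent bivariate-normal correlation that maximizes the likelihood of the observed contingency table. This is a generic sketch of the standard two-step estimator, not necessarily Cantaluppi's exact routine:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import multivariate_normal

def polychoric(x, y, kx, ky):
    """Two-step polychoric correlation of two ordinal items (coded 0..K-1)."""
    # Pad thresholds with large finite values standing in for -inf/+inf.
    tx = np.r_[-8.0, estimate_thresholds(x, kx), 8.0]
    ty = np.r_[-8.0, estimate_thresholds(y, ky), 8.0]
    table = np.zeros((kx, ky))
    for a, b in zip(x, y):
        table[a, b] += 1

    def neg_loglik(rho):
        F = lambda u, v: multivariate_normal.cdf(
            [u, v], mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
        ll = 0.0
        for i in range(kx):
            for j in range(ky):
                # Probability of the (i, j) cell rectangle under the latent normal.
                p = (F(tx[i + 1], ty[j + 1]) - F(tx[i], ty[j + 1])
                     - F(tx[i + 1], ty[j]) + F(tx[i], ty[j]))
                ll += table[i, j] * np.log(max(p, 1e-12))
        return -ll

    return minimize_scalar(neg_loglik, bounds=(-0.999, 0.999), method="bounded").x
```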
Final model extraction includes standardized weights, inner-model OLS coefficients, and outer loadings. Categories for subject-level latent composite scores are predicted via weighted threshold composition, with assignment by the mode, median, or mean of the overlapping intervals.
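The standardization and inner-model steps reduce to simple linear algebra on the polychoric matrix. A minimal sketch, assuming a polychoric matrix `R`, a `blocks` dict mapping each composite name to its indicator indices, and raw outer `weights` (all names hypothetical):

```python
import numpy as np

def composite_correlations(R, blocks, weights):
    """Correlations among latent composites implied by the polychoric
    matrix R; `blocks` maps composite name -> indicator indices and
    `weights` maps composite name -> raw outer weights."""
    w = {}
    for b, idx in blocks.items():
        v = np.asarray(weights[b], dtype=float)
        w[b] = v / np.sqrt(v @ R[np.ix_(idx, idx)] @ v)  # unit-variance composites
    names = list(blocks)
    C = np.empty((len(names), len(names)))
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            C[i, j] = w[a] @ R[np.ix_(blocks[a], blocks[b])] @ w[b]
    return C, w

def inner_paths(C, exog, endo):
    """OLS path coefficients of composite `endo` on composites `exog`,
    computed directly from the composite correlation matrix C."""
    return np.linalg.solve(C[np.ix_(exog, exog)], C[exog, endo])
```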
3. Practical Considerations and Empirical Results
OPLS is most beneficial when the number of ordinal categories is small, as standard PLS markedly underestimates path coefficients in such scenarios. As the number of categories grows, the method converges to classical PLS, since polychoric and Pearson correlations become nearly identical. For reliable polychoric estimation, a sample size of at least 200–300 is recommended. Zero counts in contingency tables should be addressed by adding a small constant or by algorithmically enforcing positive-definiteness (Cantaluppi, 2012).
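One pragmatic repair for a non-positive-definite polychoric matrix (an illustrative option; shrinkage toward the identity is an alternative the source would equally admit) is to clip negative eigenvalues and restore the unit diagonal:

```python
import numpy as np

def nearest_pd_clip(R, eps=1e-6):
    """Crude positive-definiteness repair: symmetrize, clip eigenvalues
    at a small positive floor, and rescale back to unit diagonal."""
    vals, vecs = np.linalg.eigh((R + R.T) / 2)
    R_pd = vecs @ np.diag(np.clip(vals, eps, None)) @ vecs.T
    d = np.sqrt(np.diag(R_pd))
    return R_pd / np.outer(d, d)
```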
Empirical evaluation on real customer-satisfaction data (N=250, 24 indicators, originally on 10-point scales but with substantial skew reducing the effective number of categories) reveals that OPLS yields higher inner path coefficients (e.g., 0.58 under OPLS versus a lower standard-PLS estimate), with negligible change in outer weights. The subject-level latent scores show a 70–80% exact category match between OPLS and PLS, with over 90% agreement within one category. Simulation studies indicate that OPLS reduces the negative path-coefficient bias by 40–60% for scales with few categories; with many categories, the bias is negligible for both methods. Reliability coefficients for ordinal scales should be computed on the polychoric matrix.
4. Orthonormalized PLS, Multi-view, and Deep Extensions
Orthonormalized Partial Least Squares (also abbreviated OPLS) introduces an orthonormality constraint on the latent projections. The objective is to find a projection $U$ and regression weights $V$ such that

$$\min_{U,\,V} \; \lVert Y - X U V^\top \rVert_F^2 \quad \text{subject to} \quad U^\top X^\top X\, U = I,$$

where $X$ and $Y$ are centered data matrices. The solution is given by the generalized eigenvalue problem

$$X^\top Y\, Y^\top X\, u = \lambda\, X^\top X\, u.$$

The top $d$ eigenvectors form the projection $U$, and the optimal $V$ follows from the normal equations. Multi-class classification, multivariate regression, and relationships to LDA or CCA are unified under this formulation.
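A minimal NumPy/SciPy sketch of the linear solution; the small ridge term `reg` is an added numerical-stability assumption, not part of the formulation above:

```python
import numpy as np
from scipy.linalg import eigh

def opls_fit(X, Y, d, reg=1e-6):
    """Orthonormalized PLS: top-d generalized eigenvectors of
    X'Y Y'X u = lam X'X u, then V from the normal equations."""
    X = X - X.mean(axis=0)                    # center both blocks
    Y = Y - Y.mean(axis=0)
    Sxx = X.T @ X + reg * np.eye(X.shape[1])  # ridge keeps Sxx positive definite
    Sxy = X.T @ Y
    vals, vecs = eigh(Sxy @ Sxy.T, Sxx)       # ascending generalized eigenvalues
    U = vecs[:, np.argsort(vals)[::-1][:d]]   # top-d directions; U'SxxU = I
    T = X @ U                                 # latent scores
    V = np.linalg.lstsq(T, Y, rcond=None)[0]  # regression weights (normal equations)
    return U, V                               # predict with X_new @ U @ V
```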
For multi-view learning, the method extends to learn view-specific projections $U_v$ on data blocks $X_v$ together with shared regression weights $V$, optimizing a joint least-squares objective with a common latent space. Regularizers may be imposed on model parameters, decision values, or latent projections; examples include Tikhonov (ridge), mean-matching, kernel weight consistency, HSIC, and CCA- or LDA-style alignment. Nonlinear deep extensions are realizable by replacing $X_v$ with learned deep embeddings $f_v(X_v)$ and updating all network parameters via backpropagation on a trace-maximization loss derived from the top $d$ eigenvalues of the relevant matrices (Wang et al., 2020).
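A hedged PyTorch sketch of such a trace-maximization loss (the exact loss and regularization in Wang et al., 2020 may differ; `reg` and the Cholesky route are implementation choices). The sum of the top-$d$ eigenvalues of $S_{zy}^\top S_{zz}^{-1} S_{zy}$ equals the OPLS trace objective on the embeddings and is differentiable, so autograd can push gradients back through the network:

```python
import torch

def opls_trace_loss(Z, Y, d, reg=1e-4):
    """Negative sum of the top-d eigenvalues of Szy' Szz^{-1} Szy,
    where Z are deep embeddings f(X) and Y are (encoded) targets."""
    Z = Z - Z.mean(dim=0, keepdim=True)        # center within the batch
    Y = Y - Y.mean(dim=0, keepdim=True)
    Szz = Z.T @ Z + reg * torch.eye(Z.shape[1], dtype=Z.dtype, device=Z.device)
    Szy = Z.T @ Y
    L = torch.linalg.cholesky(Szz)
    M = torch.cholesky_solve(Szy, L)           # Szz^{-1} Szy
    eigvals = torch.linalg.eigvalsh(Szy.T @ M) # ascending eigenvalues
    return -eigvals[-d:].sum()                 # minimizing this maximizes the trace
```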
5. Summary of Applications and Empirical Findings
Empirical studies of ordinal PLS (Cantaluppi, 2012) emphasize its utility in questionnaire research and customer-satisfaction modeling where variable categoricity is low. For Orthonormalized PLS and its deep, multi-view variants (Wang et al., 2020), experiments span nine datasets in classification and cross-modal retrieval, with comparisons against established subspace methods. Key findings:
- Supervised OPLS variants significantly outperform unsupervised multi-view methods.
- PCA pre-processing enhances linear versions of OPLS.
- Deep OPLS variants exhibit top-ranked performance in both classification accuracy and retrieval mean average precision (mAP).
- Model performance exhibits robustness to regularization and layer depth (optimal 3–4 layers, depending on activation).
Orthonormalized OPLS unifies multiple subspace-learning settings, readily accommodates explicit regularization, and is extensible to deep architectures. Ordinal OPLS, in turn, preserves the PLS “soft-modeling” framework while correcting for bias due to discrete measurement.
6. Relationship to Other Methodologies
Ordinal PLS is directly related to general latent variable modeling under the threshold model, as used extensively in psychometrics. Its practical distinction is the systematic use of polychoric correlations in both measurement and structural model estimation, setting it apart from ad hoc ordinal-to-interval transformations. Orthonormalized and deep OPLS bridge classical dimension reduction (LDA, CCA, MCCA) and modern deep subspace learning, with a least-squares unification and wide-ranging regularization/topology choices. This suggests that multi-view OPLS frameworks may subsume many legacy and contemporary multivariate methods, offering a flexible platform for supervised and semi-supervised problems where alignment across heterogeneous feature sets is critical (Wang et al., 2020).
7. Limitations and Future Directions
OPLS for ordinal data is sensitive to the estimation of polychoric correlations; unstable estimates can arise with small samples or sparse category combinations, necessitating pragmatic regularization or shrinkage approaches. For both ordinal and orthonormalized OPLS, computational cost is dominated by matrix estimation and eigen-decomposition steps, potentially limiting scalability to very large item banks or high-dimensional multi-view scenarios. Ongoing work in both domains continues to refine estimation procedures, address regularization in small-sample or highly imbalanced settings, and extend nonlinear capabilities via scalable deep learning architectures (Cantaluppi, 2012; Wang et al., 2020). A plausible implication is that future research will further consolidate OPLS variants, perhaps integrating them with probabilistic graphical models or Gaussian processes to better exploit structure and uncertainty in complex ordinal or multi-view data.