Group Soft-Impute SVD for Matrix Completion
- Group Soft-Impute SVD is a matrix completion technique that incorporates a pseudo-user to capture aggregated group preferences in sparse rating matrices.
- It iteratively applies soft-thresholded SVD to recover low-rank structures while balancing fidelity to observed ratings with nuclear-norm regularization.
- Empirical results on datasets like Goodbooks and Movielens demonstrate improved recall and efficient rank recovery compared to traditional group recommendation methods.
Group Soft-Impute SVD (GSI-SVD) is a nuclear-norm regularized matrix completion technique designed to enhance group recommendations by modeling collective user preferences within sparse, high-dimensional user–item rating datasets. This approach appends group-aggregated preferences as a weighted pseudo-user row to the rating matrix and iteratively performs singular value thresholding to recover low-rank structure, thereby providing robust recommendations for groups of varying sizes (Ibrahim et al., 14 Nov 2025).
1. Problem Formulation and Notation
GSI-SVD operates on the user–item rating matrix $R \in \mathbb{R}^{m \times n}$, where $m$ is the number of users and $[n] = \{1, \dots, n\}$ denotes the set of items. Observed ratings reside in the subset $\Omega \subseteq [m] \times [n]$, with projection operator $[P_\Omega(R)]_{ij} = R_{ij}$ if $(i,j) \in \Omega$, $0$ otherwise.
For a target group $G$ and recommendation size $K$, the system aims to select the $K$ items that best align with the group’s collective taste. Aggregated group ratings are computed for each item $j$ rated by any member,

$$g_j = \frac{1}{|G_j|} \sum_{u \in G_j} R_{uj}, \qquad G_j = \{\, u \in G : (u, j) \in \Omega \,\},$$

weighted by a factor $w_g$, yielding the entry $\tilde{R}_{m+1,\,j} = w_g\, g_j$. An extended matrix $\tilde{R} \in \mathbb{R}^{(m+1) \times n}$ incorporates this pseudo-user. The group recommendation problem is thus recast as a low-rank matrix completion task.
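To make the construction concrete, the following minimal NumPy sketch appends the weighted pseudo-user row; the mean aggregator over rating members and the names `append_group_pseudo_user` and `w_g` are illustrative assumptions rather than the paper's exact specification.

```python
import numpy as np

def append_group_pseudo_user(R, mask, group, w_g=1.0):
    """Append a weighted pseudo-user row aggregating a group's observed ratings.

    R     : (m, n) ratings, zeros at unobserved entries
    mask  : (m, n) boolean, True on the observed set Omega
    group : row indices of the group's members
    w_g   : pseudo-user weight (assumed hyperparameter)
    """
    sub, sub_mask = R[group], mask[group]
    counts = sub_mask.sum(axis=0)                        # members rating each item
    rated = counts > 0                                   # items rated by any member
    g = np.zeros(R.shape[1])
    g[rated] = sub.sum(axis=0)[rated] / counts[rated]    # mean over rating members
    R_ext = np.vstack([R, w_g * g])                      # extended matrix with pseudo-user
    mask_ext = np.vstack([mask, rated])                  # pseudo-user entries observed where rated
    return R_ext, mask_ext
```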
2. Nuclear-Norm Regularized Matrix Completion
The GSI-SVD objective seeks the completion $\hat{X}$ minimizing

$$\min_{X} \; \tfrac{1}{2} \left\| P_\Omega(\tilde{R}) - P_\Omega(X) \right\|_F^2 + \lambda \|X\|_*,$$

where the first term denotes fidelity to known ratings and the nuclear norm $\|X\|_* = \sum_i \sigma_i(X)$ encourages low-rank solutions. The trade-off parameter $\lambda > 0$ regularizes the rank of the predicted ratings.
This convex program addresses the dual challenges of sparse observations and high ambient dimensions, leveraging the nuclear norm as the tightest convex relaxation of matrix rank for structure recovery. The aggregation of group preferences directly into $\tilde{R}$ allows individual and collective tastes to jointly inform completion.
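As a concreteness check on the formulation, the objective is straightforward to evaluate for a candidate completion; this sketch assumes a dense $\tilde{R}$ with a boolean observation mask, and the function name is hypothetical.

```python
import numpy as np

def gsi_svd_objective(X, R_ext, mask, lam):
    """Evaluate (1/2) * ||P_Omega(R~) - P_Omega(X)||_F^2 + lam * ||X||_*."""
    resid = np.where(mask, R_ext - X, 0.0)                  # fidelity on observed entries only
    nuclear = np.linalg.svd(X, compute_uv=False).sum()      # sum of singular values
    return 0.5 * np.linalg.norm(resid, "fro") ** 2 + lam * nuclear
```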
3. Soft-Impute SVD Algorithm
The core iterative algorithm fills in missing values by alternately projecting onto the observed locations and shrinking singular values. At iteration $t$:
- Construct $Y^{(t)} = P_\Omega(\tilde{R}) + P_\Omega^{\perp}(X^{(t)})$, where $P_\Omega^{\perp}$ fills missing entries with the current estimate.
- Compute the truncated SVD $Y^{(t)} = U \Sigma V^\top$, $\Sigma = \mathrm{diag}(\sigma_1, \dots, \sigma_r)$, retaining only singular values above the current $\lambda$.
- Apply soft-thresholding $S_\lambda(\Sigma) = \mathrm{diag}\big((\sigma_1 - \lambda)_+, \dots, (\sigma_r - \lambda)_+\big)$, with $(x)_+ = \max(x, 0)$.
- Update $X^{(t+1)} = U\, S_\lambda(\Sigma)\, V^\top$.
Convergence is determined by the relative Frobenius norm change $\|X^{(t+1)} - X^{(t)}\|_F^2 \,/\, \|X^{(t)}\|_F^2 < \epsilon$. A decreasing grid of $\lambda$ values is used from $\lambda_{\max}$ down to $\lambda_{\min}$, with warm starting to accelerate iterative refinement (a runnable sketch follows the summary table below).
| Step | Operation | Purpose |
|---|---|---|
| Aggregation | Compute $g_j$ and weight by $w_g$; append pseudo-user row to $\tilde{R}$ | Capture group collective taste |
| Imputation | Iterative SVD and shrinkage over the extended matrix | Low-rank recovery |
| Convergence | Relative norm criterion, warm start for each $\lambda$ | Efficient optimization |
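A minimal dense-NumPy sketch of the loop above: it uses a full SVD for clarity where the paper's method would use a truncated or partial SVD, and the tolerance and iteration-cap defaults are illustrative.

```python
import numpy as np

def group_soft_impute(R_ext, mask, lambdas, tol=1e-4, max_iter=200):
    """Soft-Impute over a decreasing lambda grid with warm starts (minimal sketch)."""
    X = np.zeros_like(R_ext, dtype=float)
    for lam in sorted(lambdas, reverse=True):            # lambda_max -> lambda_min
        for _ in range(max_iter):
            Y = np.where(mask, R_ext, X)                 # observed entries + current imputations
            U, s, Vt = np.linalg.svd(Y, full_matrices=False)
            X_new = (U * np.maximum(s - lam, 0.0)) @ Vt  # singular value soft-thresholding
            denom = max(np.linalg.norm(X, "fro") ** 2, 1e-12)
            done = np.linalg.norm(X_new - X, "fro") ** 2 / denom < tol
            X = X_new                                    # warm start carries into the next lambda
            if done:
                break
    return X
```

Applied to the output of the aggregation step, `group_soft_impute(R_ext, mask_ext, lambdas)` returns the completed matrix whose final row carries the group predictions used in the next section.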
4. Incorporation of Group Preferences
The group preference vector, embedded as a pseudo-user row weighted by $w_g$, integrates individual and aggregate ratings in a unified framework. The nuclear-norm completion exploits correlations across both users and groups to inform missing values.
Post-convergence, the final row of the completed matrix contains predicted group ratings across all items, enabling high-fidelity group recommendations even when data is exceedingly sparse or dimensionally broad. The alignment of individual and group signals within the same low-rank structure is key to robustness.
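Reading recommendations off the completed matrix then amounts to ranking the pseudo-user row; excluding items the group has already rated is a common convention assumed here, not stated in the source.

```python
import numpy as np

def recommend_for_group(X_hat, mask_ext, K):
    """Top-K items for the group from the final (pseudo-user) row of the completed matrix."""
    scores = X_hat[-1].copy()
    scores[mask_ext[-1]] = -np.inf       # assumed: skip items the group already rated
    return np.argsort(scores)[::-1][:K]  # indices of the K highest predicted ratings
```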
5. Computational Complexity and Convergence Analysis
Each algorithm iteration involves a rank-$r$ partial SVD on an $(m+1) \times n$ matrix, costing $O((m+1)\,n\,r)$. The projections $P_\Omega$ and $P_\Omega^{\perp}$ scale as $O(|\Omega|)$. Overall per-iteration runtime is $O((m+1)\,n\,r + |\Omega|)$, feasible for large $m$ and $n$ when $r \ll \min(m, n)$.
Soft-thresholding on singular values is non-expansive in the Frobenius norm, $\|S_\lambda(A) - S_\lambda(B)\|_F \le \|A - B\|_F$, giving rise empirically to geometric (linear) convergence once iterates enter the neighborhood of the low-rank solution. Convergence to accuracy $\epsilon$ then requires $O(\log(1/\epsilon))$ iterations, as observed in the exponential decay of the error on a log scale.
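The non-expansiveness property is easy to sanity-check numerically; this standalone snippet, with assumed shapes and seed, verifies $\|S_\lambda(A) - S_\lambda(B)\|_F \le \|A - B\|_F$ on random matrices.

```python
import numpy as np

def svt(M, lam):
    """Singular value soft-thresholding S_lambda(M)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt

rng = np.random.default_rng(0)
A, B = rng.normal(size=(60, 40)), rng.normal(size=(60, 40))
lhs = np.linalg.norm(svt(A, 1.0) - svt(B, 1.0), "fro")
rhs = np.linalg.norm(A - B, "fro")
assert lhs <= rhs + 1e-10                # non-expansive in the Frobenius norm
```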
6. Empirical Evaluation and Results
Experiments utilized three datasets:
- Goodbooks-10K: a subsample with $200$ books, at high sparsity.
- Movielens (100K): $943$ users $\times$ $500$ items, at high sparsity.
- Synthetic: $200$ items, with only a small fraction of entries observed.
Group sizes evaluated were $5$, $15$, $20$, and $25$. Baseline methods included WBF (“weighted before factorization”, a matrix factorization aggregator) and AF (“after factorization”, a latent-factor aggregator). Metrics were precision@K, recall@K, and F1@K at fixed recommendation sizes $K$.
| Dataset | Group Size | GSI-SVD Recall | Baseline Recall | Rank Recovery |
|---|---|---|---|---|
| Goodbooks | 5 | Higher | Lower | Much lower rank |
| Goodbooks | 15–25 | Comparable | Comparable | Lower rank |
| Movielens | 5 | Higher | Lower | Much lower rank |
| Synthetic | All | Highest | Lower | Lower rank |
GSI-SVD outperformed the baselines in recall for small groups on the real datasets, retained comparability for larger groups, and yielded the highest precision, recall, and F1 scores on synthetic data. A key observation is that GSI-SVD achieves completion at a substantially lower effective rank as the ambient dimension increases, whereas WBF/AF remain closer to full rank. This suggests improved capacity for structure recovery in high-dimensional, sparse scenarios.
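For reference, the reported top-K metrics can be computed per group as below; treating held-out highly rated items as the relevant set is an assumption for illustration.

```python
def precision_recall_f1_at_k(recommended, relevant, K):
    """precision@K, recall@K, and F1@K for one group's ranked recommendations."""
    top_k = set(recommended[:K])
    hits = len(top_k & set(relevant))        # relevant items appearing in the top K
    precision = hits / K
    recall = hits / max(len(relevant), 1)
    f1 = 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```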
7. Practical Considerations and Implementation Notes
Optimal performance of GSI-SVD depends on initialization and selection of hyperparameters:
- the pseudo-user weight $w_g$ and the regularization range $[\lambda_{\min}, \lambda_{\max}]$;
- a grid of up to $20$ values of $\lambda$, with a convergence tolerance $\epsilon$.
Warm starting and tracking rank decay via a logarithmic grid are effective. Implementations benefit notably from batched partial SVD via Lanczos-type solvers (e.g., PROPACK, ARPACK) or randomized algorithms, and from GPU acceleration (such as PyTorch). Retaining only the nonzero singular components at each step provides memory and computational efficiency.
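A sketch of the partial-SVD variant of the thresholding step, using SciPy's ARPACK-backed `svds`; the rank cap `r` is an assumed implementation parameter that must upper-bound the post-threshold rank.

```python
import numpy as np
from scipy.sparse.linalg import svds     # ARPACK-backed partial SVD

def partial_svt(Y, lam, r):
    """Rank-capped singular value soft-thresholding of Y."""
    U, s, Vt = svds(Y, k=r)              # top-r singular triplets (ascending order)
    shrunk = np.maximum(s - lam, 0.0)
    keep = shrunk > 0                    # retain only nonzero components
    return (U[:, keep] * shrunk[keep]) @ Vt[keep]
```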
Batched singular-value thresholding and the integration of both individual and group-level aggregation make GSI-SVD well suited for settings exhibiting extreme sparsity and high dimensionality.
This suggests broader applicability in group recommendation domains where rating matrices are large and incomplete, and a plausible implication is enhanced rank adaptivity compared to standard MF approaches.
8. Summary and Context
Group Soft-Impute SVD offers a principled convex optimization method for group recommendation in sparse, high-dimensional environments, leveraging group-aggregated preference rows and nuclear-norm regularization to enable low-rank recovery and robust recommendations. Its empirical effectiveness is confirmed on multiple real-world and synthetic datasets, outperforming matrix factorization-based group aggregators in recall for small groups and achieving favorable trade-offs in recall and precision while automatically recovering lower matrix ranks (Ibrahim et al., 14 Nov 2025).