Informativeness-based Selective Imputation Multi-View Clustering (ISMVC)

Updated 12 December 2025
  • The paper introduces ISMVC, a framework that selectively imputes missing multi-view entries based on an informativeness score computed from intra-view similarity and cross-view consistency.
  • It employs a mixture-of-Gaussians VAE with a Product-of-Experts strategy to achieve robust latent representation learning and explicit uncertainty modeling.
  • Empirical evaluations demonstrate that ISMVC consistently outperforms state-of-the-art methods in accuracy and scalability under high and unbalanced missingness.

Informativeness-based Selective Imputation Multi-View Clustering (ISMVC) is a framework for clustering incomplete multi-view datasets, where each instance may have multiple modalities or views, but many are missing due to unbalanced or stochastic observation patterns. Unlike indiscriminate imputation, which can introduce noise when available information is insufficient, ISMVC employs a data-driven, selective imputation algorithm. It quantifies the informativeness of potential imputations by combining intra-view similarity and cross-view consistency and performs imputation only at positions with sufficient evidential support. Integration is achieved via a mixture-of-Gaussians variational autoencoder (MoG-VAE), enabling robust latent representation learning and explicit uncertainty modeling. ISMVC is presented as a plug-in module, delivering consistent performance gains and efficient scalability in challenging, highly incomplete settings (Xu et al., 11 Dec 2025).

1. Problem Formulation and Theoretical Basis

An incomplete multi-view clustering (IMC) task is defined on datasets $\{(x_i^1, \dots, x_i^V)\}_{i=1}^N$, where view $v$ has feature dimension $d_v$ and some $x_i^v$ are missing. Completeness is encoded by a mask $M \in \{0,1\}^{N \times V}$, with $M_{i,v} = 1$ if $x_i^v$ is observed. For each sample $i$, the set of observed views is $V_i = \{\, v \mid M_{i,v} = 1 \,\}$. The goal is unsupervised clustering into $K$ classes, assigning each sample a label $y_i \in \{1, \dots, K\}$, with true labels unknown.

Missingness is unbalanced—each view may be absent with different probabilities per sample, simulating practical scenarios like sensor dropouts or incomplete annotation.
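
For concreteness, here is a toy sketch of this data layout; the sizes, dimensions, and drop rates are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, V, dims = 6, 3, [4, 8, 5]                     # feature dimension d_v per view
views = [rng.normal(size=(N, d)) for d in dims]  # x_i^v stacked per view
# unbalanced missingness: each view is dropped with its own probability
drop_rates = [0.1, 0.3, 0.5]
M = np.stack([rng.random(N) > p for p in drop_rates], axis=1).astype(int)
M[M.sum(axis=1) == 0, 0] = 1                     # keep at least one view per sample
V_i = [np.flatnonzero(M[i]) for i in range(N)]   # observed views of each sample
```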

2. Informativeness-Driven Selective Imputation

ISMVC calculates an informativeness score $\mathrm{Info}(i,v)$ for each missing position $(i,v)$. Imputation is performed exclusively where $\mathrm{Info}(i,v) > \tau$, limiting imputations to entries with well-supported contextual evidence.

The score construction is as follows:

  • Support set: for each $(i,v)$, $S_{i,v} = \{\, x_j \mid M_{j,v} = 1 \,\wedge\, V_i \cap V_j \neq \emptyset \,\}$.
  • View correlation: encoders $f^v$ are pretrained to map $x_i^v$ to latent means $\mu_i^v \in \mathbb{R}^{d_z}$. Canonical correlation analysis (CCA) is run between latent representations across views $u, v$ to yield $\mathrm{corr}^{uv} \in (0,1]$, with $\mathrm{corr}^{vv} = 1$.
  • Intra-view similarity: for view $u$ and samples $i, j$, $\mathrm{sim}^u_{ij} = \big(1 - \|x_i^u - x_j^u\|_2 / \max_{k \neq \ell} \|x_k^u - x_\ell^u\|_2\big)^2 \in (0,1]$. For a missing target view $v$, this is estimated by averaging co-observed similarities weighted by view correlation: $\mathrm{sim}_{ij}^v = \sum_{u \in V_i \cap V_j} \mathrm{sim}_{ij}^u \, \mathrm{corr}^{uv} \big/ \sum_{u \in V_i \cap V_j} \mathrm{corr}^{uv}$.
  • Score composition:

$$\mathrm{Info}(i,v) = \sum_{j \in S_{i,v}} \sum_{u=1}^{V} \mathrm{sim}_{ij}^u \cdot \mathrm{corr}^{uv} \cdot M^s_{j,u}$$

Imputation is performed only when this aggregate score exceeds the threshold $\tau$.

This dual dependence on intra-view and cross-view evidence ensures that only positions with high support—across both observed neighbors and correlated modalities—are imputed, avoiding artefactual imputations in information-poor regions.
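
The following minimal NumPy sketch computes the score above, interpreting $M^s_{j,u}$ as the observation mask of sample $j$ and assuming the similarity tensor already contains the cross-view estimates for pairs not co-observed in a view; names, shapes, and the naive $O(N^2 V)$ loops are illustrative, not the authors' implementation.

```python
import numpy as np

def informativeness_scores(sim, corr, mask):
    """Compute Info(i, v) at every missing position (i, v).

    sim  : (V, N, N) intra-view similarities sim^u_ij (cross-view estimates
           already filled in where a pair is not co-observed in view u)
    corr : (V, V) CCA view correlations corr^{uv}, with corr[v, v] == 1
    mask : (N, V) observation mask M, M[i, v] == 1 if x_i^v is observed
    """
    V, N, _ = sim.shape
    info = np.full((N, V), np.nan)               # NaN at observed positions
    for i in range(N):
        Vi = set(np.flatnonzero(mask[i]))        # observed views of sample i
        for v in np.flatnonzero(mask[i] == 0):   # each missing position (i, v)
            score = 0.0
            for j in range(N):
                Vj = set(np.flatnonzero(mask[j]))
                # support set S_{i,v}: j observes view v and shares a view with i
                if j == i or mask[j, v] == 0 or not (Vi & Vj):
                    continue
                for u in Vj:                     # M^s_{j,u} read as "u observed for j"
                    score += sim[u, i, j] * corr[u, v]
            info[i, v] = score
    return info

# imputation is then restricted to positions where info > tau
```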

3. MoG-VAE Backbone and Distribution-Level Imputation

ISMVC’s representation learning backbone is a multi-view, mixture-of-Gaussians VAE:

  • Latent cluster assignments: $c_i \sim \mathrm{Categorical}(\pi_1, \dots, \pi_K)$.
  • Latent code: $z_i \mid c_i = k \sim \mathcal{N}(\mu_k, \mathrm{diag}(\sigma_k^2))$.
  • Each view: $x_i^v \mid z_i \sim p_{\theta_v}(x_i^v \mid z_i)$, typically Gaussian or Bernoulli.

The approximate posterior per sample factorizes as $q(z_i, c_i \mid \text{observed}) = q(z_i \mid \cdot)\, q(c_i \mid \cdot)$, with per-view posteriors $q_{\phi_v}(z_i \mid x_i^v) = \mathcal{N}(\mu_i^v, (\sigma_i^v)^2 I)$. Observed views are aggregated via a Product-of-Experts (PoE) rule:

$$\mu_i = \left( \sum_{v \in V_i} \frac{\mu_i^v}{(\sigma_i^v)^2} \right) \Big/ \left( \sum_{v \in V_i} \frac{1}{(\sigma_i^v)^2} \right), \qquad \sigma_i^2 = 1 \Big/ \left( \sum_{v \in V_i} \frac{1}{(\sigma_i^v)^2} \right)$$
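
A minimal sketch of this precision-weighted fusion for diagonal Gaussian experts; the array shapes and names are assumptions, not the paper's code:

```python
import numpy as np

def poe_aggregate(mus, vars_):
    """Fuse the per-view posteriors N(mu_i^v, diag((sigma_i^v)^2)) of one sample.

    mus, vars_: (|V_i|, d_z) arrays, one row per observed view.
    Returns the PoE mean and variance (precision-weighted average).
    """
    prec = 1.0 / vars_                 # per-view precisions
    total_prec = prec.sum(axis=0)
    mu = (mus * prec).sum(axis=0) / total_prec
    return mu, 1.0 / total_prec
```

Confident (low-variance) views dominate the fused posterior, which is what later lets the PoE rule down-weight imputed posteriors whose variance has been augmented.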

For selected missing positions, ISMVC performs distribution-level imputation:

  • Nearest neighbors $K_{i,v}$ are determined in latent space based on the 2-Wasserstein distance between aggregated posteriors.
  • The "imputed" latent posterior parameters are weighted averages of neighbors' view-$v$ posteriors, with an explicit variance augmentation capturing disagreement.
  • The aggregated representation is recomputed over all observed and imputed views, stabilizing the latent posterior and preventing overconfident reconstructions.

This mechanism integrates imputation uncertainty and naturally down-weights unreliable synthetic entries in the PoE aggregation.
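
A minimal sketch of these two ingredients follows: the closed-form 2-Wasserstein distance between diagonal Gaussians used for the neighbor search, and a weighted posterior average whose variance gains a disagreement term. The augmentation form shown is one plausible reading of the description above, not the paper's exact formula.

```python
import numpy as np

def w2_sq(mu1, sigma1, mu2, sigma2):
    """Squared 2-Wasserstein distance between N(mu1, diag(sigma1^2))
    and N(mu2, diag(sigma2^2)); closed form for diagonal Gaussians."""
    return np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2)

def impute_posterior(neighbor_mus, neighbor_vars, weights):
    """Impute a missing view-v posterior from k neighbors' view-v posteriors.

    neighbor_mus, neighbor_vars: (k, d_z) arrays; weights: (k,) non-negative.
    The variance receives an extra term for disagreement between neighbor
    means, so unreliable imputations are down-weighted in the PoE aggregation.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    mu_hat = (w[:, None] * neighbor_mus).sum(axis=0)
    var_hat = (w[:, None] * neighbor_vars).sum(axis=0) \
            + (w[:, None] * (neighbor_mus - mu_hat) ** 2).sum(axis=0)
    return mu_hat, var_hat
```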

4. Optimization Objective and Algorithmic Workflow

ISMVC jointly optimizes a variational objective comprising:

$$L_{\text{ELBO}} = \sum_{i=1}^N \left\{ \mathbb{E}_{q(z_i \mid \cdot)} \left[ \sum_{v \in V_i} \log p_{\theta_v}(x_i^v \mid z_i) \right] - \mathbb{E}_{q(c_i \mid \cdot)} \left[ \mathrm{KL}\big(q(z_i \mid \cdot) \,\|\, p(z_i \mid c_i)\big) \right] - \mathrm{KL}\big(q(c_i \mid \cdot) \,\|\, p(c_i)\big) \right\}$$

  • Cross-view coherence regularization:

$$L_{\text{CH}} = -\sum_{i=1}^N \frac{1}{|V_i|} \sum_{v \in V_i} \mathrm{KL}\big( q(z_i \mid \text{all views}) \,\|\, q(z_i \mid x_i^v) \big)$$

  • The final loss is $L = L_{\text{ELBO}} + \alpha L_{\text{CH}}$, with $\alpha > 0$ tuning the tradeoff.
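
The Gaussian KL terms in both losses have closed forms; below is a minimal sketch of that building block for diagonal Gaussians, not the authors' training code:

```python
import numpy as np

def kl_diag_gauss(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, diag(var_q)) || N(mu_p, diag(var_p)) ) in closed form."""
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

# e.g. the coherence term for one sample i:
#   L_CH_i = -(1 / len(V_i)) * sum(kl_diag_gauss(mu_i, var_i, mu_i_v, var_i_v)
#                                  for each observed view v)
```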

The procedural workflow:

  1. Pretrain encoders with a reconstruction loss.
  2. Compute all pairwise view correlations via CCA.
  3. Calculate Info(i,v)\mathrm{Info}(i,v) for all missing positions.
  4. For the main epochs:
    • Aggregate observed posteriors per sample, impute selected missing posteriors as described, re-aggregate, and sample ziz_i.
    • Assign cluster responsibilities and reconstruct inputs.
    • Optimize the composite loss by backpropagation.

After training, cluster assignment is by maximum cluster responsibility: $\arg\max_k q(c_{ik} = 1 \mid z_i)$ (Xu et al., 11 Dec 2025).
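
A minimal NumPy sketch of this rule, computing MoG responsibilities in closed form from the learned prior parameters $\pi_k, \mu_k, \sigma_k^2$; an illustration under the diagonal-covariance assumption, not the released code:

```python
import numpy as np

def cluster_responsibilities(z, pi, mu, var):
    """q(c = k | z) proportional to pi_k * N(z; mu_k, diag(var_k)).

    z: (d_z,), pi: (K,), mu: (K, d_z), var: (K, d_z).
    """
    log_lik = -0.5 * np.sum(np.log(2.0 * np.pi * var) + (z - mu) ** 2 / var, axis=1)
    log_post = np.log(pi) + log_lik
    log_post -= log_post.max()        # subtract max for numerical stability
    resp = np.exp(log_post)
    return resp / resp.sum()

# final label for sample i:  y_i = cluster_responsibilities(z_i, pi, mu, var).argmax()
```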

5. Comparative Evaluation and Empirical Performance

Extensive experiments are reported on four datasets under severe, unbalanced missingness ($\eta \in \{0.1, \dots, 0.5\}$):

  • Caltech7-5V: $N = 1{,}474$, $K = 7$.
  • Scene-15: $N = 4{,}485$, $K = 15$.
  • COIL100: $N = 7{,}200$, $K = 100$.
  • Multi-Fashion: $N = 10{,}000$, $K = 10$.

Baseline comparators include imputation-based (PMIMC, CPSPAN), imputation-free (GIMVC, DIMVC, DVIMC), and cautious imputation methods (DCP, DSIMVC). Metrics include accuracy (ACC), normalized mutual information (NMI), and adjusted Rand index (ARI), averaged over 10 runs.

Selected results at $\eta = 0.3$ are summarized:

| Dataset | Best Baseline ACC | ISMVC ACC (Δ) |
|---|---|---|
| Caltech7-5V | DVIMC: 0.890 | 0.898 (+0.008) |
| Scene-15 | DVIMC: 0.418 | 0.445 (+0.027) |
| COIL100 | DVIMC: 0.742 | 0.757 (+0.015) |
| Multi-Fashion | DSIMVC: 0.786 | 0.840 (+0.054) |

ISMVC delivers consistent improvements (2–6% ACC) over all baselines and exhibits the slowest accuracy degradation as $\eta$ increases, indicating robustness in high-missingness regimes. This suggests particular effectiveness at capturing cross-view structure where most methods collapse.

6. Ablation, Sensitivity, and Plug-in Generality

Ablation experiments visualize and quantify the informativeness mechanism:

  • Heatmaps of $\mathrm{Info}(i,v)$: reveal heterogeneity and show that high cross-view support yields high scores.
  • Violin plots: indicate a shift from unimodal high informativeness at low missingness to bimodal support as $\eta$ grows, supporting the need for selective thresholding.
  • Selection-ratio sensitivity: at high $\eta$, imputing 30–50% of entries maximizes accuracy; over-imputation reintroduces noise.
  • Regularization coefficient sweep: ACC peaks for $\alpha \in [5, 10]$ at high missingness, with lower values under-enforcing coherence and higher ones suppressing reconstruction.
  • Plug-in generality: the same informativeness-based selection and imputation (IBSI) module, without the latent VAE, improves ACC by 4–7% for GIMVC and 3–6% for DCP at $\eta = 0.5$.

The overhead of IBSI is minimal (<5% of total training time), requiring only offline calculation of $\mathrm{Info}(i,v)$, CCA, and k-NN search in latent space. No changes to decoder architecture or bi-level optimization are needed, supporting fast integration into existing IMC frameworks.

7. Computational Complexity and Limitations

The extra per-epoch computational cost arises from two sources:

  • Offline calculation of $\mathrm{Info}(i,v)$: $O(N \cdot |S| \cdot V)$.
  • k-NN searches in the latent space for imputed missing entries: $O(m \log N)$, where $m$ is the number of missing positions imputed.

Overall runtime impact is negligible: 3–5 s extra on medium-size datasets and roughly 18 s on Multi-Fashion, small relative to end-to-end training times. Practical deployment only requires encoder pretraining and CCA, avoiding major architectural modifications.

8. Summary of Contributions

ISMVC systematically quantifies per-position imputation support by leveraging both intra-view similarity and cross-view evidence, applies selective imputation at the posterior-distribution level in a mixture-of-Gaussians VAE, and explicitly models uncertainty to avoid overconfident and erroneous imputations. The approach achieves state-of-the-art clustering outcomes and has practical utility as a scalable, plug-in addition to a broad family of IMC methods, especially in settings with severe, unbalanced or heterogeneous missingness (Xu et al., 11 Dec 2025).
