Informativeness-based Selective Imputation Multi-View Clustering (ISMVC)

Updated 12 December 2025
  • The paper introduces ISMVC, a framework that selectively imputes missing multi-view entries based on an informativeness score computed from intra-view similarity and cross-view consistency.
  • It employs a mixture-of-Gaussians VAE with a Product-of-Experts strategy to achieve robust latent representation learning and explicit uncertainty modeling.
  • Empirical evaluations demonstrate that ISMVC consistently outperforms state-of-the-art methods in accuracy and scalability under high and unbalanced missingness.

Informativeness-based Selective Imputation Multi-View Clustering (ISMVC) is a framework for clustering incomplete multi-view datasets, where each instance may have multiple modalities or views, but many are missing due to unbalanced or stochastic observation patterns. Unlike indiscriminate imputation, which can introduce noise when available information is insufficient, ISMVC employs a data-driven, selective imputation algorithm. It quantifies the informativeness of potential imputations by combining intra-view similarity and cross-view consistency and performs imputation only at positions with sufficient evidential support. Integration is achieved via a mixture-of-Gaussians variational autoencoder (MoG-VAE), enabling robust latent representation learning and explicit uncertainty modeling. ISMVC is presented as a plug-in module, delivering consistent performance gains and efficient scalability in challenging, highly incomplete settings (Xu et al., 11 Dec 2025).

1. Problem Formulation and Theoretical Basis

An incomplete multi-view clustering (IMC) task is defined on datasets $\{(x_i^1, \dots, x_i^V)\}_{i=1}^N$, where view $v$ has feature dimension $d_v$ and some $x_i^v$ are missing. Completeness is encoded by a mask $M \in \{0,1\}^{N \times V}$, with $M_{i,v} = 1$ if $x_i^v$ is observed. For each sample $i$, the set of observed views is $V_i = \{\, v \mid M_{i,v} = 1 \,\}$. The goal is unsupervised clustering into $K$ classes, assigning each sample a label $y_i \in \{1, \dots, K\}$, with true labels unknown.

Missingness is unbalanced—each view may be absent with different probabilities per sample, simulating practical scenarios like sensor dropouts or incomplete annotation.
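
For concreteness, here is a toy sketch of this data layout; the sizes, dimensions, and drop rates are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, V, dims = 6, 3, [4, 8, 5]                     # feature dimension d_v per view
views = [rng.normal(size=(N, d)) for d in dims]  # x_i^v stacked per view
# unbalanced missingness: each view is dropped with its own probability
drop_rates = [0.1, 0.3, 0.5]
M = np.stack([rng.random(N) > p for p in drop_rates], axis=1).astype(int)
M[M.sum(axis=1) == 0, 0] = 1                     # keep at least one view per sample
V_i = [np.flatnonzero(M[i]) for i in range(N)]   # observed views of each sample
```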

2. Informativeness-Driven Selective Imputation

ISMVC calculates an informativeness score $\mathrm{Info}(i,v)$ for each missing position $(i,v)$. Imputation is performed exclusively where $\mathrm{Info}(i,v) > \tau$, limiting imputations to entries with well-supported contextual evidence.

The score construction is as follows:

  • Support set: for each $(i,v)$, $S_{i,v} = \{\, x_j \mid M_{j,v} = 1 \,\wedge\, V_i \cap V_j \neq \emptyset \,\}$.
  • View correlation: encoders $f^v$ are pretrained to map $x_i^v$ to latent means $\mu_i^v \in \mathbb{R}^{d_z}$. Canonical correlation analysis (CCA) is run between latent representations across views $u, v$ to yield $\mathrm{corr}^{uv} \in (0,1]$, with $\mathrm{corr}^{vv} = 1$.
  • Intra-view similarity: for view $u$ and samples $i, j$, $\mathrm{sim}^u_{ij} = \big(1 - \|x_i^u - x_j^u\|_2 / \max_{k \neq \ell} \|x_k^u - x_\ell^u\|_2\big)^2 \in (0,1]$. For a missing target view $v$, this is estimated by averaging co-observed similarities weighted by view correlation: $\mathrm{sim}_{ij}^v = \sum_{u \in V_i \cap V_j} \mathrm{sim}_{ij}^u \, \mathrm{corr}^{uv} \big/ \sum_{u \in V_i \cap V_j} \mathrm{corr}^{uv}$.
  • Score composition:

$$\mathrm{Info}(i,v) = \sum_{j \in S_{i,v}} \sum_{u=1}^{V} \mathrm{sim}_{ij}^u \cdot \mathrm{corr}^{uv} \cdot M^s_{j,u}$$

Imputation is performed only when this aggregate score exceeds the threshold $\tau$.

This dual dependence on intra-view and cross-view evidence ensures that only positions with high support—across both observed neighbors and correlated modalities—are imputed, avoiding artefactual imputations in information-poor regions.
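
The following minimal NumPy sketch computes the score above, interpreting $M^s_{j,u}$ as the observation mask of sample $j$ and assuming the similarity tensor already contains the cross-view estimates for pairs not co-observed in a view; names, shapes, and the naive $O(N^2 V)$ loops are illustrative, not the authors' implementation.

```python
import numpy as np

def informativeness_scores(sim, corr, mask):
    """Compute Info(i, v) at every missing position (i, v).

    sim  : (V, N, N) intra-view similarities sim^u_ij (cross-view estimates
           already filled in where a pair is not co-observed in view u)
    corr : (V, V) CCA view correlations corr^{uv}, with corr[v, v] == 1
    mask : (N, V) observation mask M, M[i, v] == 1 if x_i^v is observed
    """
    V, N, _ = sim.shape
    info = np.full((N, V), np.nan)               # NaN at observed positions
    for i in range(N):
        Vi = set(np.flatnonzero(mask[i]))        # observed views of sample i
        for v in np.flatnonzero(mask[i] == 0):   # each missing position (i, v)
            score = 0.0
            for j in range(N):
                Vj = set(np.flatnonzero(mask[j]))
                # support set S_{i,v}: j observes view v and shares a view with i
                if j == i or mask[j, v] == 0 or not (Vi & Vj):
                    continue
                for u in Vj:                     # M^s_{j,u} read as "u observed for j"
                    score += sim[u, i, j] * corr[u, v]
            info[i, v] = score
    return info

# imputation is then restricted to positions where info > tau
```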

3. MoG-VAE Backbone and Distribution-Level Imputation

ISMVC’s representation learning backbone is a multi-view, mixture-of-Gaussians VAE:

  • Latent cluster assignments: $c_i \sim \mathrm{Categorical}(\pi_1, \dots, \pi_K)$.
  • Latent code: $z_i \mid c_i = k \sim \mathcal{N}(\mu_k, \mathrm{diag}(\sigma_k^2))$.
  • Each view: $x_i^v \mid z_i \sim p_{\theta_v}(x_i^v \mid z_i)$, typically Gaussian or Bernoulli.

The approximate posterior per sample factorizes as $q(z_i, c_i \mid \text{observed}) = q(z_i \mid \cdot)\, q(c_i \mid \cdot)$, with per-view posteriors $q_{\phi_v}(z_i \mid x_i^v) = \mathcal{N}(\mu_i^v, (\sigma_i^v)^2 I)$. Observed views are aggregated via a Product-of-Experts (PoE) rule:

$$\mu_i = \left( \sum_{v \in V_i} \frac{\mu_i^v}{(\sigma_i^v)^2} \right) \Big/ \left( \sum_{v \in V_i} \frac{1}{(\sigma_i^v)^2} \right), \qquad \sigma_i^2 = 1 \Big/ \left( \sum_{v \in V_i} \frac{1}{(\sigma_i^v)^2} \right)$$
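
A minimal sketch of this precision-weighted fusion for diagonal Gaussian experts; the array shapes and names are assumptions, not the paper's code:

```python
import numpy as np

def poe_aggregate(mus, vars_):
    """Fuse the per-view posteriors N(mu_i^v, diag((sigma_i^v)^2)) of one sample.

    mus, vars_: (|V_i|, d_z) arrays, one row per observed view.
    Returns the PoE mean and variance (precision-weighted average).
    """
    prec = 1.0 / vars_                 # per-view precisions
    total_prec = prec.sum(axis=0)
    mu = (mus * prec).sum(axis=0) / total_prec
    return mu, 1.0 / total_prec
```

Confident (low-variance) views dominate the fused posterior, which is what later lets the PoE rule down-weight imputed posteriors whose variance has been augmented.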

For selected missing positions, ISMVC performs distribution-level imputation:

  • Nearest neighbors $K_{i,v}$ are determined in latent space based on the 2-Wasserstein distance between aggregated posteriors.
  • The "imputed" latent posterior parameters are weighted averages of neighbors' view-$v$ posteriors, with an explicit variance augmentation capturing disagreement.
  • The aggregated representation is recomputed over all observed and imputed views, stabilizing the latent posterior and preventing overconfident reconstructions.

This mechanism integrates imputation uncertainty and naturally down-weights unreliable synthetic entries in the PoE aggregation.
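
A minimal sketch of these two ingredients follows: the closed-form 2-Wasserstein distance between diagonal Gaussians used for the neighbor search, and a weighted posterior average whose variance gains a disagreement term. The augmentation form shown is one plausible reading of the description above, not the paper's exact formula.

```python
import numpy as np

def w2_sq(mu1, sigma1, mu2, sigma2):
    """Squared 2-Wasserstein distance between N(mu1, diag(sigma1^2))
    and N(mu2, diag(sigma2^2)); closed form for diagonal Gaussians."""
    return np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2)

def impute_posterior(neighbor_mus, neighbor_vars, weights):
    """Impute a missing view-v posterior from k neighbors' view-v posteriors.

    neighbor_mus, neighbor_vars: (k, d_z) arrays; weights: (k,) non-negative.
    The variance receives an extra term for disagreement between neighbor
    means, so unreliable imputations are down-weighted in the PoE aggregation.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    mu_hat = (w[:, None] * neighbor_mus).sum(axis=0)
    var_hat = (w[:, None] * neighbor_vars).sum(axis=0) \
            + (w[:, None] * (neighbor_mus - mu_hat) ** 2).sum(axis=0)
    return mu_hat, var_hat
```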

4. Optimization Objective and Algorithmic Workflow

ISMVC jointly optimizes a variational objective comprising:

$$L_{\text{ELBO}} = \sum_{i=1}^N \left\{ \mathbb{E}_{q(z_i \mid \cdot)} \left[ \sum_{v \in V_i} \log p_{\theta_v}(x_i^v \mid z_i) \right] - \mathbb{E}_{q(c_i \mid \cdot)} \left[ \mathrm{KL}\big(q(z_i \mid \cdot) \,\|\, p(z_i \mid c_i)\big) \right] - \mathrm{KL}\big(q(c_i \mid \cdot) \,\|\, p(c_i)\big) \right\}$$

  • Cross-view coherence regularization:

$$L_{\text{CH}} = -\sum_{i=1}^N \frac{1}{|V_i|} \sum_{v \in V_i} \mathrm{KL}\big( q(z_i \mid \text{all views}) \,\|\, q(z_i \mid x_i^v) \big)$$

  • The final loss is $L = L_{\text{ELBO}} + \alpha L_{\text{CH}}$, with $\alpha > 0$ tuning the tradeoff.
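
The Gaussian KL terms in both losses have closed forms; below is a minimal sketch of that building block for diagonal Gaussians, not the authors' training code:

```python
import numpy as np

def kl_diag_gauss(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, diag(var_q)) || N(mu_p, diag(var_p)) ) in closed form."""
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

# e.g. the coherence term for one sample i:
#   L_CH_i = -(1 / len(V_i)) * sum(kl_diag_gauss(mu_i, var_i, mu_i_v, var_i_v)
#                                  for each observed view v)
```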

The procedural workflow:

  1. Pretrain encoders with a reconstruction loss.
  2. Compute all pairwise view correlations via CCA.
  3. Calculate Info(i,v)\mathrm{Info}(i,v) for all missing positions.
  4. For the main epochs:
    • Aggregate observed posteriors per sample, impute selected missing posteriors as described, re-aggregate, and sample ziz_i.
    • Assign cluster responsibilities and reconstruct inputs.
    • Optimize the composite loss by backpropagation.

After training, cluster assignment is by maximum cluster responsibility: $\arg\max_k q(c_{ik} = 1 \mid z_i)$ (Xu et al., 11 Dec 2025).
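
A minimal NumPy sketch of this rule, computing MoG responsibilities in closed form from the learned prior parameters $\pi_k, \mu_k, \sigma_k^2$; an illustration under the diagonal-covariance assumption, not the released code:

```python
import numpy as np

def cluster_responsibilities(z, pi, mu, var):
    """q(c = k | z) proportional to pi_k * N(z; mu_k, diag(var_k)).

    z: (d_z,), pi: (K,), mu: (K, d_z), var: (K, d_z).
    """
    log_lik = -0.5 * np.sum(np.log(2.0 * np.pi * var) + (z - mu) ** 2 / var, axis=1)
    log_post = np.log(pi) + log_lik
    log_post -= log_post.max()        # subtract max for numerical stability
    resp = np.exp(log_post)
    return resp / resp.sum()

# final label for sample i:  y_i = cluster_responsibilities(z_i, pi, mu, var).argmax()
```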

5. Comparative Evaluation and Empirical Performance

Extensive experiments are reported on four datasets under severe, unbalanced missingness ($\eta \in \{0.1, \dots, 0.5\}$):

  • Caltech7-5V: $N = 1{,}474$, $K = 7$.
  • Scene-15: $N = 4{,}485$, $K = 15$.
  • COIL100: $N = 7{,}200$, $K = 100$.
  • Multi-Fashion: $N = 10{,}000$, $K = 10$.

Baseline comparators include imputation-based (PMIMC, CPSPAN), imputation-free (GIMVC, DIMVC, DVIMC), and cautious imputation methods (DCP, DSIMVC). Metrics include accuracy (ACC), normalized mutual information (NMI), and adjusted Rand index (ARI), averaged over 10 runs.

Selected results at $\eta = 0.3$ are summarized:

| Dataset | Best Baseline ACC | ISMVC ACC (Δ) |
|---|---|---|
| Caltech7-5V | DVIMC: 0.890 | 0.898 (+0.008) |
| Scene-15 | DVIMC: 0.418 | 0.445 (+0.027) |
| COIL100 | DVIMC: 0.742 | 0.757 (+0.015) |
| Multi-Fashion | DSIMVC: 0.786 | 0.840 (+0.054) |

ISMVC delivers consistent improvements (2–6% ACC) over all baselines and exhibits the slowest accuracy degradation as $\eta$ increases, indicating robustness in high-missingness regimes. This suggests particular effectiveness at capturing cross-view structure where most methods collapse.

6. Ablation, Sensitivity, and Plug-in Generality

Ablation experiments visualize and quantify the informativeness mechanism:

  • Heatmaps of $\mathrm{Info}(i,v)$: reveal heterogeneity and show that high cross-view support yields high scores.
  • Violin plots: indicate a shift from unimodal high informativeness at low missingness to bimodal support as $\eta$ grows, supporting the need for selective thresholding.
  • Selection-ratio sensitivity: at high $\eta$, imputing 30–50% of entries maximizes accuracy; over-imputation reintroduces noise.
  • Regularization coefficient sweep: ACC peaks for $\alpha \in [5, 10]$ at high missingness, with lower values under-enforcing coherence and higher ones suppressing reconstruction.
  • Plug-in generality: the same informativeness-based selection and imputation (IBSI) module, without the latent VAE, improves ACC by 4–7% for GIMVC and 3–6% for DCP at $\eta = 0.5$.

The overhead of IBSI is minimal (<5% of total training time), requiring only offline calculation of $\mathrm{Info}(i,v)$, CCA, and k-NN search in latent space. No changes to decoder architecture or bi-level optimization are needed, supporting fast integration into existing IMC frameworks.

7. Computational Complexity and Limitations

The extra per-epoch computational cost arises from two sources:

  • Offline calculation of $\mathrm{Info}(i,v)$: $O(N \cdot |S| \cdot V)$.
  • k-NN searches in the latent space for imputed missing entries: $O(m \log N)$, where $m$ is the number of missing positions imputed.

Overall runtime impact is negligible: 3–5 s extra on medium-size datasets and roughly 18 s on Multi-Fashion, small relative to end-to-end training times. Practical deployment only requires encoder pretraining and CCA, avoiding major architectural modifications.

8. Summary of Contributions

ISMVC systematically quantifies per-position imputation support by leveraging both intra-view similarity and cross-view evidence, applies selective imputation at the posterior-distribution level in a mixture-of-Gaussians VAE, and explicitly models uncertainty to avoid overconfident and erroneous imputations. The approach achieves state-of-the-art clustering outcomes and has practical utility as a scalable, plug-in addition to a broad family of IMC methods, especially in settings with severe, unbalanced or heterogeneous missingness (Xu et al., 11 Dec 2025).
