
Bayesian Robust Tensor Factorization for Incomplete Multiway Data (1410.2386v2)

Published 9 Oct 2014 in cs.CV and cs.LG

Abstract: We propose a generative model for robust tensor factorization in the presence of both missing data and outliers. The objective is to explicitly infer the underlying low-CP-rank tensor capturing the global information and a sparse tensor capturing the local information (also considered as outliers), thus providing the robust predictive distribution over missing entries. The low-CP-rank tensor is modeled by multilinear interactions between multiple latent factors on which the column sparsity is enforced by a hierarchical prior, while the sparse tensor is modeled by a hierarchical view of Student-$t$ distribution that associates an individual hyperparameter with each element independently. For model learning, we develop an efficient closed-form variational inference under a fully Bayesian treatment, which can effectively prevent the overfitting problem and scales linearly with data size. In contrast to existing related works, our method can perform model selection automatically and implicitly without need of tuning parameters. More specifically, it can discover the groundtruth of CP rank and automatically adapt the sparsity inducing priors to various types of outliers. In addition, the tradeoff between the low-rank approximation and the sparse representation can be optimized in the sense of maximum model evidence. The extensive experiments and comparisons with many state-of-the-art algorithms on both synthetic and real-world datasets demonstrate the superiorities of our method from several perspectives.

Citations (200)

Summary

  • The paper introduces a Bayesian framework that automatically infers tensor rank and decomposes incomplete multiway data into low-rank and sparse components.
  • It employs variational inference for efficient, linear-scaling computation that outperforms traditional techniques in tasks like video background subtraction and facial denoising.
  • The method leverages hierarchical sparsity and Student-t distributions to prevent overfitting and reliably recover missing data entries.

Bayesian Robust Tensor Factorization for Incomplete Multiway Data

In this paper, the authors introduce a Bayesian approach to tensor factorization aimed at handling incomplete multiway data. The approach, termed Bayesian Robust Tensor Factorization (BRTF), addresses the challenging problem of modeling data that is both incomplete and contaminated by outliers. Part of its novelty lies in explicitly decomposing the observed tensor into two distinct components: a low-CP-rank tensor that captures global structure, and a sparse tensor that captures local information, essentially treating outliers as a separate component.
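As a rough illustration, the generative structure described above (a low-CP-rank tensor plus a sparse outlier tensor plus dense noise) can be sketched in NumPy. The dimensions, rank, outlier count, and noise level below are illustrative choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and true CP rank (not values from the paper).
I, J, K, R = 20, 20, 20, 3

# Latent factor matrices, one per mode.
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))

# Low-CP-rank tensor: sum of R rank-one outer products.
low_rank = np.einsum('ir,jr,kr->ijk', A, B, C)

# Sparse outlier tensor: a few large-magnitude entries.
sparse = np.zeros((I, J, K))
idx = rng.choice(I * J * K, size=40, replace=False)
sparse.flat[idx] = 10.0 * rng.standard_normal(40)

# Observed tensor = global structure + local outliers + dense noise.
Y = low_rank + sparse + 0.01 * rng.standard_normal((I, J, K))
```

In BRTF the factor matrices, the outlier tensor, and the noise precision are all latent variables to be inferred, and missing entries of `Y` are simply left unobserved.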

The central element of this method is the Bayesian framework applied to CP tensor factorization, in which the dimensionality of the latent space, i.e., the tensor rank, is inferred automatically. This is achieved through column-wise sparsity enforced by a hierarchical prior over the latent factors. The sparse tensor is separately modeled with a hierarchical Student-t distribution that grants each element an independent hyperparameter. This arrangement allows the model to adapt naturally to diverse outlier characteristics without manual parameter tuning. The fully Bayesian treatment also guards against overfitting, a common pitfall of traditional models, especially on sparse data.
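The practical effect of the column-wise sparsity prior is that whole components are shrunk toward zero simultaneously across all modes, so the effective CP rank emerges from inference. That effect can be mimicked with a simple post-hoc pruning rule: drop rank-one components whose factor columns have negligible power. The helper below is a hypothetical sketch of this idea, not the paper's actual variational update:

```python
import numpy as np

def prune_cp_rank(factors, tol=1e-6):
    """ARD-style pruning sketch (hypothetical helper): drop CP
    components whose factor columns carry negligible power."""
    # Product of per-mode column norms ~ magnitude of each rank-one term.
    power = np.ones(factors[0].shape[1])
    for F in factors:
        power *= np.linalg.norm(F, axis=0)
    keep = power > tol
    return [F[:, keep] for F in factors], int(keep.sum())

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 5))
B = rng.standard_normal((12, 5))
C = rng.standard_normal((8, 5))

# Zero out two components, as the hierarchical prior would shrink them.
for F in (A, B, C):
    F[:, 3:] = 0.0

factors, rank = prune_cp_rank([A, B, C])
# rank is 3: the two shrunken components are pruned automatically.
```

In BRTF itself this pruning is implicit: the hierarchical prior drives the posterior of redundant columns to zero during variational inference, with no threshold to tune.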

For robust and efficient model learning, variational inference with closed-form updates is employed. This ensures that BRTF scales linearly with data size, making it applicable to large datasets. In contrast to several previous methodologies that require pre-defined tensor ranks or heuristic parameter tuning, BRTF determines the model configuration automatically from the available data alone.

Model validation is detailed through extensive simulations on both synthetic and real-world data, demonstrating the method's effectiveness across a wide range of scenarios. Specifically, the paper highlights applications in video background subtraction and facial image denoising, areas where handling sparse and noisy data is pivotal. The model proves superior to well-established methods such as CP-ALS and HORPCA, among others, in terms of both predictive accuracy and computational feasibility, even when the competing methods are tuned using ground-truth information.

The authors also discuss scenarios involving complete tensors, where the model can be simplified, further emphasizing the versatility of BRTF. Additionally, the method infers missing entries through predictive distributions that provide a measure of uncertainty, promising improved robustness in tensor completion tasks.
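Under a fully factorized Gaussian posterior over the individual factor entries (a simplification; the paper's variational posterior retains row-wise covariances), the predictive mean and variance of a single tensor entry have a closed form, since independent rank-one terms add in both mean and variance. The function below is a hypothetical sketch under that assumption, with `tau` denoting the inferred noise precision:

```python
import numpy as np

def predictive_moments(means, variances, tau=100.0):
    """Predictive mean/variance of one tensor entry under a fully
    factorized Gaussian posterior (simplified sketch). `means` and
    `variances` hold one length-R vector per tensor mode."""
    means = [np.asarray(m, float) for m in means]
    variances = [np.asarray(v, float) for v in variances]
    # E[sum_r prod_n a_nr] = sum_r prod_n E[a_nr]
    first = np.prod(np.stack(means), axis=0)
    mean = first.sum()
    # Second moment of each rank-one term: prod_n (m^2 + v)
    second = np.prod(
        np.stack([m**2 + v for m, v in zip(means, variances)]), axis=0
    )
    # Independent terms: variances add, plus observation noise 1/tau.
    var = (second - first**2).sum() + 1.0 / tau
    return mean, var

m, v = predictive_moments([[1.0, 2.0], [1.0, 0.5]],
                          [[0.0, 0.0], [0.0, 0.0]])
# With zero posterior variance, the predictive mean is the plain CP
# value 1*1 + 2*0.5 = 2, and the variance reduces to the noise 1/tau.
```

The predictive variance is what distinguishes this from point-estimate completion: entries supported by well-determined factors get tight predictions, while poorly constrained entries are flagged by large variance.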

By advancing tensor factorization through Bayesian means, this research opens pathways for further exploration in handling complex, multi-modal data. The methods described might well direct future advancements in the theoretical aspects of rank determination in higher-order tensor spaces, with implications extending into real-time processing and adaptive applications in artificial intelligence and machine learning domains.