
Algorithms for nonnegative matrix factorization with the beta-divergence

Published 8 Oct 2010 in cs.LG (arXiv:1010.1763v3)

Abstract: This paper describes algorithms for nonnegative matrix factorization (NMF) with the beta-divergence (beta-NMF). The beta-divergence is a family of cost functions parametrized by a single shape parameter beta that takes the Euclidean distance, the Kullback-Leibler divergence and the Itakura-Saito divergence as special cases (beta = 2,1,0, respectively). The proposed algorithms are based on a surrogate auxiliary function (a local majorization of the criterion function). We first describe a majorization-minimization (MM) algorithm that leads to multiplicative updates, which differ from standard heuristic multiplicative updates by a beta-dependent power exponent. The monotonicity of the heuristic algorithm can however be proven for beta in (0,1) using the proposed auxiliary function. Then we introduce the concept of majorization-equalization (ME) algorithm which produces updates that move along constant level sets of the auxiliary function and lead to larger steps than MM. Simulations on synthetic and real data illustrate the faster convergence of the ME approach. The paper also describes how the proposed algorithms can be adapted to two common variants of NMF: penalized NMF (i.e., when a penalty function of the factors is added to the criterion function) and convex-NMF (when the dictionary is assumed to belong to a known subspace).

Citations (795)

Summary

  • The paper introduces novel algorithms for nonnegative matrix factorization using beta-divergence to flexibly model diverse noise distributions.
  • It develops both a majorization-minimization approach with tailored multiplicative updates and a novel majorization-equalization algorithm that accelerates convergence.
  • Simulations on synthetic and real audio data demonstrate faster convergence and improved transcription accuracy in practical signal processing applications.

Algorithms for Nonnegative Matrix Factorization with the Beta-Divergence

The paper "Algorithms for nonnegative matrix factorization with the beta-divergence" (1010.1763) explores the development of algorithms specifically designed for Nonnegative Matrix Factorization (NMF) utilizing the β\beta-divergence as the metric for determining the fit between the data matrix VV and the factorized product WHWH. The β\beta-divergence generalizes several well-known divergence measures, including the Euclidean distance, Kullback-Leibler divergence, and Itakura-Saito divergence, by varying the parameter β\beta. This flexibility allows modeling a range of noise distributions that can be applied to various signal processing tasks.

Description of Methodologies

Majorization-Minimization (MM) Algorithm

The MM algorithm is based on finding a surrogate auxiliary function, G(h|h̃), that locally majorizes the original criterion function. Minimizing this auxiliary function leads to multiplicative updates that differ from the standard heuristic multiplicative updates by a β-dependent power exponent. This update scheme ensures monotonic descent of the criterion for every value of β. The practical implementation iteratively produces new matrices W^(i) and H^(i) from multiplicative update rules built from the positive and negative parts of the gradient of the β-divergence.
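To make the update concrete, here is a minimal NumPy sketch of one MM update of H (W can be updated by symmetry, transposing the problem). This is an illustrative sketch of the update form described in the paper, not the authors' reference code; γ(β) is the β-dependent power exponent referred to above.

import numpy as np

def gamma(beta):
    # MM exponent: 1/(2 - beta) for beta < 1, 1 on [1, 2], 1/(beta - 1) for beta > 2.
    if beta < 1:
        return 1.0 / (2.0 - beta)
    if beta <= 2:
        return 1.0
    return 1.0 / (beta - 1.0)

def mm_update_H(V, W, H, beta, eps=1e-12):
    # One majorization-minimization update of H for beta-NMF.
    WH = W @ H + eps                       # current approximation, kept positive
    num = W.T @ (WH ** (beta - 2) * V)     # negative part of the gradient of D_beta
    den = W.T @ (WH ** (beta - 1)) + eps   # positive part of the gradient of D_beta
    return H * (num / den) ** gamma(beta)

# Toy usage: K = 5 components, beta = 1 (Kullback-Leibler).
rng = np.random.default_rng(0)
V = rng.random((50, 100)) + 1e-3
W = rng.random((50, 5))
H = rng.random((5, 100))
for _ in range(200):
    H = mm_update_H(V, W, H, beta=1.0)
    W = mm_update_H(V.T, H.T, W.T, beta=1.0).T   # W update by transposition symmetry

For β ∈ [1, 2] the exponent is 1, so the MM update coincides with the classical heuristic multiplicative update (e.g., the Lee-Seung rules at β = 2 and β = 1).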

Majorization-Equalization (ME) Algorithm

The ME algorithm is introduced as a novel scheme in which updates move along constant level sets of the auxiliary function, producing larger steps than MM while preserving monotonic descent. This typically accelerates convergence. ME updates act much like over-relaxation in classical optimization. For certain values of β, the update rules reduce to solving simple polynomial equations, which keeps the updates computationally cheap.
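In symbols, with criterion C(h) and auxiliary function G(h | h̃) ≥ C(h) satisfying G(h̃ | h̃) = C(h̃): MM picks the minimizer h' = argmin_h G(h | h̃), whereas ME picks the other solution of G(h' | h̃) = C(h̃) with h' ≠ h̃, i.e., the far point on the level set through h̃. Since C(h') ≤ G(h' | h̃) = C(h̃), descent is still guaranteed, and h' lies beyond the MM minimizer, hence the larger step.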

Heuristic Multiplicative Updates

The heuristic multiplicative updates, i.e., the MM updates without the β-dependent power exponent, are rigorously analyzed. Their monotonicity is proven for β ∈ (0,1) using the proposed auxiliary function; for β ∈ [1,2] the heuristic and MM updates coincide, so monotonic descent holds there as well. Although originally derived from gradient-based heuristics, these guarantees justify the widespread practical use of the heuristic updates over this range of β.
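For comparison with the sketch above, the heuristic update is the same gradient ratio with the exponent fixed to 1:

def heuristic_update_H(V, W, H, beta, eps=1e-12):
    # Heuristic multiplicative update: the MM ratio without the gamma(beta) exponent.
    WH = W @ H + eps
    return H * (W.T @ (WH ** (beta - 2) * V)) / (W.T @ (WH ** (beta - 1)) + eps)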

Simulations and Results

Simulation studies on synthetic data and on real audio data illustrate the advantages of ME updates over MM updates. The ME algorithm consistently converges faster; on a polyphonic music transcription task, where the β-divergence is well suited to audio spectrograms, this speed-up also yields improved transcription accuracy. The synthetic experiments confirm the robust convergence behavior, while the real-data experiments demonstrate the practical signal processing benefits.

Variants and Applications

The proposed algorithms extend to two common variants of NMF: penalized NMF and convex-NMF. In penalized NMF, a regularization term on the factors is added to the objective and handled within the same auxiliary-function framework; this covers, for example, ℓ1-norm penalties used to enforce sparsity. In convex-NMF, the dictionary W is constrained to lie in a known subspace, which is relevant when the learned dictionary must follow structured priors, e.g., harmonicity in audio processing.
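To illustrate how a penalty slots into the multiplicative form, the sketch below adds an ℓ1 penalty λ·Σ H to the criterion; since H ≥ 0, the penalty's gradient is the constant λ, which joins the positive (denominator) part of the split gradient. This follows the common heuristic pattern for penalized multiplicative updates, not necessarily the paper's exact penalized derivation.

def l1_penalized_update_H(V, W, H, beta, lam, eps=1e-12):
    # Heuristic multiplicative update for beta-NMF with an l1 penalty lam * sum(H).
    WH = W @ H + eps
    num = W.T @ (WH ** (beta - 2) * V)
    den = W.T @ (WH ** (beta - 1)) + lam + eps   # penalty gradient joins the denominator
    return H * num / den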

Conclusion

This research provides a comprehensive framework for leveraging the β-divergence in NMF, enabling robust and flexible algorithmic solutions for domains requiring scalable and accurate factorization. The extensions to penalized and structured (convex) variants broaden its utility for those developing machine learning systems in signal processing, image analysis, and other areas where NMF is deployed. Future work could pursue deeper theoretical insight into convergence over broader ranges of β, as well as adaptive strategies for task-specific selection of the divergence parameter.
