Semi-Nonnegative Matrix Factorization
- SNMF is a matrix factorization technique that approximates a sign-indefinite matrix as the product of an unconstrained real matrix and a nonnegative matrix.
- It employs block coordinate descent and tailored initialization (e.g., SVD and bisection heuristics) to ensure rapid convergence and effective approximation.
- The approach is backed by theoretical rank bounds and NP-hardness results, guiding its practical application in signal processing, computer vision, and data analysis.
Semi-Nonnegative Matrix Factorization (SNMF) is an extension of classical matrix factorization techniques, designed to approximate a given matrix—potentially containing both positive and negative entries—using a product of an unconstrained (real-valued) matrix and a nonnegative matrix. SNMF achieves a compromise between the expressiveness of unconstrained factorizations (such as SVD) and the interpretability and parts-based properties of Nonnegative Matrix Factorization (NMF), where all factors are required to be nonnegative.
1. Formal Definition and Key Properties
Given a (potentially sign-indefinite) matrix $M \in \mathbb{R}^{m \times n}$ and a target rank $r$, SNMF seeks matrices $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{r \times n}$ such that

$$\min_{U,\; V \ge 0} \|M - UV\|_F^2,$$

where $\|\cdot\|_F$ denotes the Frobenius norm. The essential constraint is that $V$ is entrywise nonnegative, whereas $U$ is unconstrained and may have negative entries.
Semi-Nonnegative Rank
The semi-nonnegative rank of $M$, denoted $\operatorname{rank}_s(M)$, is defined as the smallest integer $r$ for which there exist $U \in \mathbb{R}^{m \times r}$, $V \in \mathbb{R}^{r \times n}_{+}$ such that $M = UV$.
A fundamental relationship holds between these rank concepts: $\operatorname{rank}(M) \le \operatorname{rank}_s(M)$ and, more tightly, $\operatorname{rank}(M) \le \operatorname{rank}_s(M) \le \operatorname{rank}(M) + 1$.
A corollary of this result is that, for any real matrix, the factorization rank required for an exact semi-nonnegative factorization exceeds the standard rank by at most one.
Approximation Error Bound
For any rank $r \ge 2$, the SNMF approximation error at rank $r$ is no greater than the error of the best unconstrained (SVD-based) approximation of rank $r-1$:

$$\min_{U,\; V \ge 0} \|M - UV\|_F \;\le\; \|M - M_{r-1}\|_F,$$

where $M_{r-1}$ is the best rank-$(r-1)$ unconstrained approximation of $M$.
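This bound has a constructive flavor: any rank-$k$ factorization can be turned into an exact semi-NMF with inner dimension $k+1$. Below is a minimal numpy sketch of one such augmentation (the function name and the particular shift-and-compensate construction are illustrative assumptions, not necessarily the paper's exact procedure): shift the rows of the right factor to be nonnegative, and cancel the shift with one extra rank-one term.

```python
import numpy as np

def semi_nmf_rank_plus_one(B, C):
    """Given A = B @ C (B: m x k, C: k x n), return (U, V) with
    A = U @ V, V >= 0, and inner dimension k + 1."""
    # Per-row shift that makes C nonnegative.
    z = np.maximum(0.0, -C.min(axis=1))            # length k
    V_top = C + z[:, None]                         # C + z 1^T >= 0
    V = np.vstack([V_top, np.ones(C.shape[1])])    # append a row of ones
    # The extra column -B @ z cancels the shift, so U @ V = B @ C exactly.
    U = np.hstack([B, -(B @ z)[:, None]])
    return U, V

# Small check on a mixed-sign factorization.
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 3))
C = rng.standard_normal((3, 8))
U, V = semi_nmf_rank_plus_one(B, C)
assert np.allclose(U @ V, B @ C) and (V >= 0).all()
```

Applied to the best rank-$(r-1)$ approximation $M_{r-1}$, this yields a feasible rank-$r$ semi-NMF whose error equals the rank-$(r-1)$ SVD error, which is exactly the bound stated above.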
2. Algorithms for SNMF
Block Coordinate Descent
A practical algorithm alternates between:
- Solving for $U$ given $V$: This is an unconstrained least squares problem with the closed-form solution $U = M V^\top (V V^\top)^{-1}$ (more generally, $U = M V^{\dagger}$).
- Solving for $V$ given $U$: This reduces to a set of nonnegative least squares (NNLS) problems over the rows of $V$; updating one row at a time, with the others fixed, admits the closed-form update

  $$V_{k,:} \;\leftarrow\; \max\!\left(0,\; \frac{U_{:,k}^\top \big(M - \sum_{l \neq k} U_{:,l} V_{l,:}\big)}{\|U_{:,k}\|_2^2}\right).$$
This alternating scheme converges to a stationary point of the objective.
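A compact numpy sketch of this alternating scheme (the function name, random warm start, and fixed iteration count are illustrative assumptions; the row update is the exact coordinate rule quoted above):

```python
import numpy as np

def semi_nmf_bcd(M, r, n_iter=50, seed=0, eps=1e-12):
    """Block coordinate descent for min ||M - U V||_F^2 subject to V >= 0."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    V = np.abs(rng.standard_normal((r, n)))   # simple nonnegative warm start
    U = np.zeros((m, r))
    for _ in range(n_iter):
        # U-update: unconstrained least squares with a closed-form solution.
        U = np.linalg.lstsq(V.T, M.T, rcond=None)[0].T
        # V-update: exact closed-form update of each row, others held fixed.
        for k in range(r):
            uk = U[:, k]
            residual_k = M - U @ V + np.outer(uk, V[k, :])
            V[k, :] = np.maximum(0.0, uk @ residual_k / (uk @ uk + eps))
    return U, V

# Example on a small mixed-sign matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((20, 30))
U, V = semi_nmf_bcd(M, r=5)
print("relative error:", np.linalg.norm(M - U @ V) / np.linalg.norm(M))
```

In practice the random warm start would be replaced by one of the initialization schemes described next.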
Initialization Schemes
Effective initialization is crucial due to the nonconvexity of SNMF.
- SVD-based Initialization: Construct $U$ and $V$ from the best rank-$(r-1)$ SVD of $M$, then augment to rank $r$ using a construction that guarantees $V \ge 0$ (see the sketch following this list). This approach ensures the initial error is no worse than that of the best unconstrained rank-$(r-1)$ approximation.
- Random and K-means Initialization: Random initialization, or using k-means cluster centers as initial factors, is often competitive for small ranks or high-noise cases.
- Bisection-based Heuristic: For matrices where the SVD solution is not semi-nonnegative, a bisection heuristic is used to produce an initial factorization as close to semi-nonnegative as possible.
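A sketch of the SVD-based initialization under the same assumptions as before (it reuses the illustrative helper `semi_nmf_rank_plus_one` defined earlier; the routine and its name are not the paper's exact algorithm):

```python
import numpy as np

def svd_init(M, r):
    """Initial (U, V) with V >= 0 built from the best rank-(r-1) SVD of M.
    Assumes r >= 2."""
    Um, s, Vt = np.linalg.svd(M, full_matrices=False)
    k = r - 1
    B = Um[:, :k] * s[:k]   # m x (r-1), columns scaled by singular values
    C = Vt[:k, :]           # (r-1) x n, so B @ C is the best rank-(r-1) approximation
    return semi_nmf_rank_plus_one(B, C)   # exact rank-r semi-NMF of B @ C
```

By construction, the starting error $\|M - UV\|_F$ equals the best rank-$(r-1)$ SVD error, matching the bound from Section 1.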
Exact Algorithm for Special Classes
When all nonzero columns of $M$ lie in the interior of a single half-space (i.e., there exists $d \in \mathbb{R}^m$ such that $d^\top m_j > 0$ for every nonzero column $m_j$ of $M$), an exact SNMF can be constructed with $r = \operatorname{rank}(M)$. Whether such a direction $d$ exists is determined by solving a linear feasibility problem.
For cases not satisfying this, the bisection heuristic is used as initialization before further refinement.
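A sketch of that feasibility check as a linear program (rescaling the strict inequality $d^\top m_j > 0$ to $d^\top m_j \ge 1$, which is equivalent for finitely many columns; the use of `scipy.optimize.linprog` and the function name are implementation assumptions):

```python
import numpy as np
from scipy.optimize import linprog

def find_halfspace_direction(M, tol=1e-12):
    """Return d with d^T m_j >= 1 for every nonzero column m_j of M,
    or None if no such direction exists."""
    cols = M[:, np.linalg.norm(M, axis=0) > tol]   # drop (near-)zero columns
    m = M.shape[0]
    # Feasibility LP: find d with cols.T @ d >= 1, i.e. -cols.T @ d <= -1.
    res = linprog(c=np.zeros(m),
                  A_ub=-cols.T, b_ub=-np.ones(cols.shape[1]),
                  bounds=[(None, None)] * m)
    return res.x if res.status == 0 else None
```

If a direction is found, the half-space condition holds and an exact semi-NMF with $r = \operatorname{rank}(M)$ exists; otherwise the bisection heuristic provides the starting point.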
3. Theoretical Insights
Characterization via Half-Spaces and Geometry
The condition for the semi-nonnegative rank to equal the matrix rank ($\operatorname{rank}_s(M) = \operatorname{rank}(M)$) is that all nonzero columns of $M$ lie in the interior of some half-space. If this is not satisfied, the semi-nonnegative rank is higher (by exactly one, given the bound above), effectively reflecting a geometric obstruction to a low-rank, semi-nonnegative factorization.
Computational Hardness
While checking whether $M$ admits an exact semi-NMF of a given rank can be done in polynomial time (by linear programming or equivalent), the general problem of finding the best approximate semi-NMF (minimizing $\|M - UV\|_F$ over $U$ and $V \ge 0$) is NP-hard even in the rank-one case. Specifically, for $r = 1$, the problem reduces to maximizing a quadratic form over the intersection of the nonnegative orthant and the unit sphere:

$$\min_{u \in \mathbb{R}^m,\; v \in \mathbb{R}^n_{+}} \|M - u v^\top\|_F^2 \;=\; \|M\|_F^2 \;-\; \max_{x \ge 0,\; \|x\|_2 = 1} x^\top M^\top M x.$$

This result distinguishes SNMF from standard NMF, where the rank-one case is tractable for nonnegative $M$.
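A quick numerical check of this reformulation (the random test matrix and the nonnegative trial vector are arbitrary choices): for any fixed $v \ge 0$, the optimal $u$ is $Mv / \|v\|_2^2$, and the resulting residual matches $\|M\|_F^2 - v^\top M^\top M v / \|v\|_2^2$.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 4))
v = np.abs(rng.standard_normal(4))     # an arbitrary nonnegative vector

u = M @ v / (v @ v)                    # optimal u for this fixed v
residual = np.linalg.norm(M - np.outer(u, v), "fro") ** 2
reformulated = np.linalg.norm(M, "fro") ** 2 - (v @ M.T @ M @ v) / (v @ v)
assert np.isclose(residual, reformulated)
```

Maximizing the quadratic term over all unit-norm nonnegative $v$ therefore recovers the best rank-one semi-NMF, and it is this maximization that is NP-hard in general.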
Ill-posedness
There exist pathological instances where the infimum of the objective is not attained: although sequences of feasible pairs $(U, V)$ exist with objective values approaching the infimum, no actual minimizer lies within the feasible set. Such ill-posedness arises when the cone spanned by the columns of $M$ cannot be represented as a conical combination of nonnegative rays, a situation typical when the data lie on the boundary of the minimal half-space. While this is a measure-zero phenomenon for random data, it requires care in practical algorithm design.
4. Empirical Performance and Practical Considerations
Numerical Experiments
Empirical studies have established the following practical guidelines:
- Bisection-based initialization (A3) is generally fastest and most effective when SVD factorizations are semi-nonnegative or nearly so.
- For higher levels of noise or when the data deviate strongly from semi-nonnegative structure, random and k-means-based initializations may yield better solutions for small ranks.
- On real, nonnegative datasets (e.g., face images), the bisection/SVD strategies are optimal.
- In cases with mixed-sign data, bisection-based initialization remains competitive as the allowed rank increases.
- Convergence is typically very fast with appropriate initialization, often requiring fewer than 10 iterations.
Computational Requirements
- Each coordinate update for a row of $V$ is an NNLS problem with a closed-form solution.
- SVD or eigenvalue decompositions are required for initialization.
- Although exact SNMF factorization can sometimes be checked quickly, the best approximation is, in general, computationally intractable, necessitating careful use of heuristic algorithms.
5. Algorithmic and Theoretical Summary Table
Aspect | SNMF Result |
---|---|
Rank bound | $\operatorname{rank}(M) \le \operatorname{rank}_s(M) \le \operatorname{rank}(M) + 1$ |
Algorithm | Block coordinate descent; SVD-based and bisection-based initializations; exact LP feasibility for certain $M$ |
Complexity | Best approximation is NP-hard even for rank $r = 1$ if $M$ has mixed signs; exact semi-NMF feasible for special $M$ by LP |
Ill-posedness | Optimum may not be attained in rare cases |
Empirical notes | Bisection/SVD-based methods generally outperform; random/k-means valuable in noisy settings or small $r$ |
6. Applications and Broader Significance
SNMF serves as a principled intermediate between the high expressivity of unconstrained low-rank approximations (e.g., SVD) and the interpretability and parts-based learning of NMF. It is especially valuable in applications where the data matrix contains negative entries (e.g., after centering or standardization, or in certain signal processing and computer vision tasks), but where interpretability or nonnegativity in one factor is desired.
The geometric characterization provided allows practitioners to assess the feasibility and appropriateness of SNMF for their data. The empirical behavior of the algorithms, particularly the rapid convergence when initialization is well-tuned, informs practical deployment. The NP-hardness result guides expectations about scalability and the limits of exact computation, while the potential for ill-posedness, albeit rare, underscores the importance of algorithmic caution for edge cases.
7. Principal Mathematical Formulations
- Semi-NMF optimization: $\min_{U \in \mathbb{R}^{m \times r},\; V \in \mathbb{R}^{r \times n}_{+}} \|M - UV\|_F^2$
- Rank bounds: $\operatorname{rank}(M) \le \operatorname{rank}_s(M) \le \operatorname{rank}(M) + 1$
- NP-hardness (rank 1): $\min_{u \in \mathbb{R}^m,\; v \in \mathbb{R}^n_{+}} \|M - u v^\top\|_F^2 = \|M\|_F^2 - \max_{x \ge 0,\; \|x\|_2 = 1} x^\top M^\top M x$
- Half-space feasibility for exact SNMF: find $d \in \mathbb{R}^m$ such that $d^\top m_j > 0$ for every nonzero column $m_j$ of $M$
References
Main results, theorems, and algorithms are attributed as in the cited paper's sections, including Theorem 1 (bound on semi-nonnegative rank), Theorem 2 (half-space characterization), and Theorem 5 (NP-hardness of rank-one semi-NMF). For detailed algorithmic steps, performance metrics, and empirical analysis, see Sections 2–5 of the source.
Semi-Nonnegative Matrix Factorization thus offers both theoretical and practical foundations for decomposing general matrices using interpretably constrained factors, with algorithmic strategies supported by rigorous analysis and empirical evaluation.