Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk (1705.07576v3)

Published 22 May 2017 in cs.IT, cs.LG, math.IT, math.OC, and math.PR

Abstract: We examine the theoretical properties of enforcing priors provided by generative deep neural networks via empirical risk minimization. In particular we consider two models, one in which the task is to invert a generative neural network given access to its last layer and another in which the task is to invert a generative neural network given only compressive linear observations of its last layer. We establish that in both cases, under suitable regimes of network layer sizes and a randomness assumption on the network weights, the non-convex objective function given by empirical risk minimization does not have any spurious stationary points. That is, we establish that with high probability, at any point away from small neighborhoods around two scalar multiples of the desired solution, there is a descent direction. Hence, there are no local minima, saddle points, or other stationary points outside these neighborhoods. These results constitute the first theoretical guarantees which establish the favorable global geometry of these non-convex optimization problems, and they bridge the gap between the empirical success of enforcing deep generative priors and a rigorous understanding of non-linear inverse problems.

Citations (137)

Summary

  • The paper shows that, with high probability, the non-convex empirical risk objective for enforcing deep generative priors has no spurious stationary points outside small neighborhoods of two scalar multiples of the true solution.
  • It gives explicit conditions on network expansivity and on the sampling matrices under which the objective landscape is favorable for inversion.
  • The results imply that deep generative priors can enable signal reconstruction from fewer measurements than traditional sparsity-based methods.

Overview of the Paper on Global Guarantees for Enforcing Deep Generative Priors

The paper by Paul Hand and Vladislav Voroninski investigates the theoretical properties of enforcing generative priors from deep neural networks through empirical risk minimization. The authors focus on two scenarios: inverting a generative neural network given access to its final layer, and inverting it given only compressive linear observations of that layer. Their primary contribution is showing that, under specific conditions, the empirical risk objective, despite its non-convexity, harbors no spurious stationary points. This marks a significant step toward understanding why enforcing deep generative priors works so well in non-linear inverse problems.
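In notation consistent with the abstract (the symbols $G$, $x_*$, and $A$ and the ReLU parameterization below are standard for this line of work and are stated here as an assumption, not quoted from the paper), the two tasks amount to minimizing the empirical risks

$$ f(x) = \tfrac{1}{2}\,\lVert G(x) - G(x_*) \rVert^2 \qquad \text{and} \qquad f(x) = \tfrac{1}{2}\,\lVert A\,G(x) - A\,G(x_*) \rVert^2, $$

where $G(x) = \mathrm{relu}(W_d \cdots \mathrm{relu}(W_1 x) \cdots)$ is a $d$-layer generative network with random weights $W_i$, $x_* \in \mathbb{R}^k$ is the latent code to be recovered, and $A \in \mathbb{R}^{m \times n_d}$ is the compressive measurement matrix with $m \ll n_d$.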

Theoretical Contributions

  1. Non-Convex Objective without Spurious Minima: The paper establishes that, under suitable regimes of network layer sizes and randomness assumptions on the network weights, the non-convex empirical risk objective has no spurious stationary points with high probability: outside small neighborhoods around two scalar multiples of the desired solution, there is always a descent direction. Hence, outside these neighborhoods, there are no local minima, saddle points, or other stationary points (a minimal numerical sketch follows this list).
  2. Conditions for Network Structure and Sampling Matrix: It provides conditions under which the objective function's favorable geometry is ensured. Specifically, the neural networks must be sufficiently expansive, and the measurement matrices must satisfy particular randomness assumptions, aligning with scenarios practical in machine learning applications.
  3. Generative Priors over Sparsity: While traditional compressed sensing relies on sparsity priors for signal recovery, this paper shows that deep generative priors can permit substantially greater compression: the number of measurements needed scales with the dimensionality of the latent code rather than with a sparsity level. This gives a promising theoretical basis for more efficient signal reconstruction.
  4. Application to Inverse Problems: By situating the inversion problem within the generative modeling context—where the priors ascribe hierarchical signal structures—the research lays the groundwork for extending these methods toward other imaging and signal processing fields.
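The following is a minimal, self-contained sketch of the optimization the theory addresses: inverting a random two-layer ReLU generator by plain (sub)gradient descent on the empirical risk. It is an illustration written for this summary, not the authors' code; the layer sizes, step size, and iteration count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Expansive two-layer ReLU generator with i.i.d. Gaussian weights,
# mirroring the paper's random-weight model (sizes are illustrative).
k, n1, n2 = 10, 200, 800
W1 = rng.normal(size=(n1, k)) / np.sqrt(n1)
W2 = rng.normal(size=(n2, n1)) / np.sqrt(n2)

def G(x):
    # G(x) = relu(W2 relu(W1 x))
    return np.maximum(W2 @ np.maximum(W1 @ x, 0.0), 0.0)

x_star = rng.normal(size=k)  # latent code to recover
y = G(x_star)                # observed last layer

def grad(x):
    # Subgradient of f(x) = 0.5 * ||G(x) - y||^2 through the ReLUs.
    h1 = W1 @ x
    a1 = np.maximum(h1, 0.0)
    h2 = W2 @ a1
    a2 = np.maximum(h2, 0.0)
    g2 = (a2 - y) * (h2 > 0)
    g1 = (W2.T @ g2) * (h1 > 0)
    return W1.T @ g1

x = rng.normal(size=k)       # random initialization
for _ in range(2000):
    x -= 0.5 * grad(x)       # plain (sub)gradient descent

# The landscape result predicts convergence to a neighborhood of x_star
# or of a negative scalar multiple of it; a sign-flip restart handles
# the latter case.
print("relative error:", np.linalg.norm(x - x_star) / np.linalg.norm(x_star))
```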

Numerical Highlights

  • For networks with sufficiently expansive Gaussian layers and Gaussian compressive sampling matrices, the sample complexity is essentially linear in the latent code dimensionality rather than in the signal's sparsity level (sketched in the code below).
  • The framework indicates that deep generative priors, appropriately applied, may enable lower sample complexity and improved signal-to-noise behavior compared to traditional sparsity-based methods.
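Extending the sketch above to the compressive setting (again with illustrative values chosen for this summary, not taken from the paper), the number of Gaussian measurements m can be a small multiple of the latent dimension k, far below the ambient dimension n2:

```python
# Compressive variant: observe only m random Gaussian projections of the
# generator output, with m a small multiple of k (and m << n2).
m = 60
A = rng.normal(size=(m, n2)) / np.sqrt(m)
y_c = A @ G(x_star)

def grad_c(x):
    # Subgradient of f(x) = 0.5 * ||A G(x) - y_c||^2.
    h1 = W1 @ x
    a1 = np.maximum(h1, 0.0)
    h2 = W2 @ a1
    a2 = np.maximum(h2, 0.0)
    g2 = (A.T @ (A @ a2 - y_c)) * (h2 > 0)
    g1 = (W2.T @ g2) * (h1 > 0)
    return W1.T @ g1

x = rng.normal(size=k)
for _ in range(5000):
    x -= 0.2 * grad_c(x)

print("relative error:", np.linalg.norm(x - x_star) / np.linalg.norm(x_star))
```

Despite observing only m = 60 linear measurements of an 800-dimensional signal, recovery of the 10-dimensional latent code is plausible here because the effective number of unknowns is k, not n2; as in the first sketch, descent may instead stall near a negative scalar multiple of x_star, in which case a restart is needed.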

Implications and Future Research Directions

The theoretical advances invite applications across the imaging sciences, notably in problems that defy linear sparsity models, such as phase retrieval and MR image reconstruction. In phase retrieval in particular, generative priors may achieve sample complexities believed to be out of reach for polynomial-time algorithms under sparsity priors.

The paper also opens lines of inquiry into non-linear inverse problems more broadly: future work could establish robustness to noise and outliers for generative priors, paralleling the compressed sensing literature. It further suggests a baseline for studying how generative modeling might mitigate adversarial examples, a question pivotal to neural network security.

In sum, this research deepens our understanding of using deep generative models for inverse problem regularization, proposing a new theoretical foundation that holds promising practical implications in improving signal acquisition strategies across scientific and technological domains.
