- The paper presents Generalized Divisive Normalization (GDN), a transformation that Gaussianizes natural image data and reduces mutual information among components more effectively than ICA or radial Gaussianization (RG).
- Experimental results show that, used as an image prior, the model supports Bayesian denoising with competitive PSNR and SSIM scores.
- The transformation's differentiability and modular design open a path toward deeper unsupervised architectures and improved computational imaging.
The paper, by Johannes Ballé, Valero Laparra, and Eero P. Simoncelli, develops Generalized Divisive Normalization (GDN), a parametric nonlinear transformation designed to Gaussianize data drawn from natural images. The method is unsupervised: its parameters are fit so that the transformed data follow a Gaussian distribution, yielding a density model that reduces the mutual information between components more effectively than Independent Component Analysis (ICA) or Radial Gaussianization (RG).
The key innovation is a highly parameterized GDN transform that extends well beyond the divisive normalization previously used in computational models of sensory neurons. The transformation proceeds in two stages (a code sketch follows the list):
- Linear transformation: the input vector $x$ is first mapped through a learned filter matrix, $y = Hx$.
- Normalization by pooled activity: each linear response is divided by a pooled measure of the rectified and exponentiated responses, summed with a constant; in the paper's parameterization, $z_i = y_i \big/ \big(\beta_i + \sum_j \gamma_{ij} |y_j|^{\alpha_{ij}}\big)^{\varepsilon_i}$.
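A minimal NumPy sketch of these two stages, following the parameterization summarized above (the shapes, parameter values, and the function name `gdn_forward` are illustrative assumptions, not the authors' code):

```python
import numpy as np

def gdn_forward(x, H, beta, gamma, alpha, eps):
    """GDN: a linear transform followed by divisive normalization.

    x     : (n,) input vector (e.g. a flattened image patch)
    H     : (m, n) learned linear filter matrix
    beta  : (m,) positive constants added to each pool
    gamma : (m, m) nonnegative pooling weights
    alpha : (m, m) exponents applied to the rectified responses
    eps   : (m,) exponents applied to the pooled denominator
    """
    y = H @ x                                                   # stage 1: linear responses
    pooled = beta + (gamma * np.abs(y)[None, :] ** alpha).sum(axis=1)
    return y / pooled ** eps                                    # stage 2: divisive normalization

# Illustrative call with random, untrained parameters.
rng = np.random.default_rng(0)
n = m = 16
z = gdn_forward(rng.standard_normal(n), rng.standard_normal((m, n)),
                np.ones(m), np.full((m, m), 0.1),
                np.full((m, m), 2.0), np.full(m, 0.5))
```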
All parameters are optimized jointly, by directly minimizing the negentropy of the responses over a large database of natural images, rather than fitting the linear and normalization stages separately. Because the transformation is differentiable and invertible, it induces a continuous density model over images, which is essential for applications such as image denoising.
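Concretely, writing $g$ for the full transformation (notation assumed here) and taking a standard normal as the target distribution, the induced image density follows the change-of-variables identity:

$$
\log p_x(x) \;=\; \log \mathcal{N}\!\big(g(x);\, 0,\, I\big) \;+\; \log\left|\det \frac{\partial g(x)}{\partial x}\right|.
$$

It is this log-density, available in closed form, that can serve as a prior in downstream inference.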
Experimental Results and Implications
The authors provide compelling numerical evidence for the effectiveness of GDN. Notably, when applied to wavelet coefficients of natural images, it leaves significantly less mutual information between components than ICA or RG, underscoring GDN's superior ability to reduce statistical dependencies relative to these alternatives.
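For reference, the quantity being compared is the mutual information among the components of a vector $z$ (the multi-information), which vanishes exactly when the components are independent:

$$
I(z) \;=\; \sum_i H(z_i) \,-\, H(z) \;=\; D_{\mathrm{KL}}\!\left(p(z)\,\middle\|\,\prod_i p(z_i)\right).
$$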
In practical terms, the model serves as an image prior for Bayesian denoising, yielding competitive peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) scores. Moreover, because the transformation is invertible, samples drawn from the model reproduce the distribution of natural images, indicating its usefulness across diverse imaging tasks.
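A minimal sketch of how such denoising results are typically scored, using scikit-image's metric functions; the denoiser passed in is a hypothetical stand-in for the paper's GDN-prior inference:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_denoiser(clean, noisy, denoise_fn):
    """Score a denoiser on one grayscale image with values in [0, 1]."""
    denoised = denoise_fn(noisy)
    psnr = peak_signal_noise_ratio(clean, denoised, data_range=1.0)
    ssim = structural_similarity(clean, denoised, data_range=1.0)
    return psnr, ssim

# Example with additive Gaussian noise (noise level assumed, not from the paper).
rng = np.random.default_rng(0)
clean = rng.random((64, 64))                      # placeholder image
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
print(evaluate_denoiser(clean, noisy, lambda x: np.clip(x, 0.0, 1.0)))
```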
Theoretical and Practical Implications
The theoretical framework laid out by this paper raises several interesting implications for the future of image modeling and representation in neural networks. Because the GDN transform is differentiable and can be optimized end to end without supervision, it could inform deeper, unsupervised network architectures. The model also supports cascades: stacking further GDN stages progressively improves the fit to image statistics, pointing toward a modular, layered strategy for computational imaging, as sketched below.
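As an illustration of the cascade idea, reusing the hypothetical `gdn_forward` sketch above, successive stages simply compose:

```python
def gdn_cascade(x, layers):
    """Compose several GDN stages; `layers` holds one parameter tuple
    (H, beta, gamma, alpha, eps) per stage."""
    for H, beta, gamma, alpha, eps in layers:
        x = gdn_forward(x, H, beta, gamma, alpha, eps)
    return x
```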
The flexibility of the GDN transformation provides a basis for exploring other domains where natural image density estimation applies. It also holds promise for neurobiological research and sensory modeling, where divisive normalization originated, contributing insights into efficient coding theories.
Overall, this summary has outlined the GDN framework and its components as detailed in the paper, emphasizing the strong numerical results obtained. It also highlights the broader implications of advances in density modeling, encouraging further research into how image data is transformed and understood computationally.