Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks

Published 7 Nov 2013 in cs.NE, cs.LG, and stat.ML | arXiv:1311.1780v7

Abstract: In this paper we propose and investigate a novel nonlinear unit, called $L_p$ unit, for deep neural networks. The proposed $L_p$ unit receives signals from several projections of a subset of units in the layer below and computes a normalized $L_p$ norm. We notice two interesting interpretations of the $L_p$ unit. First, the proposed unit can be understood as a generalization of a number of conventional pooling operators such as average, root-mean-square and max pooling widely used in, for instance, convolutional neural networks (CNN), HMAX models and neocognitrons. Furthermore, the $L_p$ unit is, to a certain degree, similar to the recently proposed maxout unit (Goodfellow et al., 2013) which achieved the state-of-the-art object recognition results on a number of benchmark datasets. Secondly, we provide a geometrical interpretation of the activation function based on which we argue that the $L_p$ unit is more efficient at representing complex, nonlinear separating boundaries. Each $L_p$ unit defines a superelliptic boundary, with its exact shape defined by the order $p$. We claim that this makes it possible to model arbitrarily shaped, curved boundaries more efficiently by combining a few $L_p$ units of different orders. This insight justifies the need for learning different orders for each unit in the model. We empirically evaluate the proposed $L_p$ units on a number of datasets and show that multilayer perceptrons (MLP) consisting of the $L_p$ units achieve the state-of-the-art results on a number of benchmark datasets. Furthermore, we evaluate the proposed $L_p$ unit on the recently proposed deep recurrent neural networks (RNN).

Citations (163)

Summary

Overview of $L_p$ Units in Deep Neural Networks

The paper titled "Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks" introduces an innovative nonlinear unit—referred to as the $L_p$ unit—for applications in deep neural networks. This development seeks to generalize conventional pooling operations widely utilized in neural architectures, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs).

Technical Approach

The core proposition of the paper is the $L_p$ unit, which computes a normalized $L_p$ norm over a group of filter responses from the layer below. The significance of the $L_p$ unit lies in its ability to interpolate among existing pooling methods: $p = 1$ corresponds to pooling by the mean absolute response, $p = 2$ to root-mean-square (RMS) pooling, and $p \to \infty$ to max pooling. The key distinction is that the order $p$ is not fixed in advance but learned during training, one value per unit. This flexibility enables $L_p$ units to adapt across different inputs and tasks, suggesting enhanced representational capacity for complex data structures.
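To make the computation concrete, the following is a minimal sketch of such a unit in a PyTorch style. The class name `LpUnit`, the group layout, and the softplus parameterization that keeps $p \ge 1$ are illustrative assumptions, not the authors' released implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LpUnit(nn.Module):
    """Sketch of a learned-norm (L_p) pooling layer: each output unit pools a
    small group of linear filter responses with a normalized L_p norm whose
    order p is a learnable parameter constrained to be >= 1."""

    def __init__(self, in_features, num_units, group_size, eps=1e-6):
        super().__init__()
        self.num_units = num_units
        self.group_size = group_size
        self.eps = eps
        # num_units groups, each with group_size linear projections of the input.
        self.proj = nn.Linear(in_features, num_units * group_size)
        # Unconstrained parameter rho; p = 1 + softplus(rho) guarantees p >= 1.
        self.rho = nn.Parameter(torch.zeros(num_units))

    def forward(self, x):
        z = self.proj(x).view(-1, self.num_units, self.group_size)
        p = 1.0 + F.softplus(self.rho)                # one learned order per unit
        # Normalized L_p norm over each group: ((1/N) * sum_i |z_i|^p)^(1/p)
        pooled = (z.abs().clamp_min(self.eps) ** p.unsqueeze(-1)).mean(dim=-1)
        return pooled ** (1.0 / p)

# Example usage: a hidden layer of 240 L_p units, each pooling 4 filters.
layer = LpUnit(in_features=784, num_units=240, group_size=4)
out = layer(torch.randn(32, 784))                     # shape: (32, 240)
```

The small `eps` clamp is a numerical-stability choice in this sketch, added to avoid raising an exact zero to a fractional power during backpropagation.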

In geometric terms, each $L_p$ unit defines a superelliptic boundary whose exact shape is determined by the learned order $p$. Such curved boundaries potentially allow networks to model nonlinear separating surfaces more efficiently than the piecewise-linear boundaries produced by maxout or rectifier units.
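Concretely, for a single $L_p$ unit with $N$ filter responses $z_i = \mathbf{w}_i^{\top}\mathbf{x} + b_i$, the set of inputs at which the pooled activation equals some threshold $c$ (the threshold here is illustrative) is the superellipse

$$\left( \frac{1}{N} \sum_{i=1}^{N} \lvert z_i \rvert^{p} \right)^{1/p} = c,$$

so $p = 2$ yields a circular (RMS-like) contour while $p \to \infty$ approaches an axis-aligned square (max-like) contour; combining a few units with different learned orders composes differently curved pieces into one decision boundary.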

Empirical Evaluation

The research team conducted empirical tests across several benchmark datasets, including MNIST, the Toronto Face Database (TFD), Pentomino, and Forest Covertype. Results showed notably strong performance of $L_p$ units in both feedforward and recurrent architectures. In feedforward applications, networks employing $L_p$ units demonstrated state-of-the-art classification accuracy, rivaling previously published figures for models using maxout or conventional pooling operators. Similarly, recurrent architectures with deep transition layers leveraging $L_p$ units achieved comparable results on polyphonic music prediction tasks.

Moreover, the distribution and variability of the learned $p$ values across units indicate that the units adapt their pooling behavior to dataset-specific patterns. These experiments reinforce the hypothesis that learnable pooling functions, such as the $L_p$ unit, offer substantial advantages over fixed pooling configurations.
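As a hypothetical follow-up to the `LpUnit` sketch above (the attribute names are those of the sketch, not the authors' code), the learned orders can be read off directly and summarized:

```python
import torch
import torch.nn.functional as F

# Recover the learned orders p = 1 + softplus(rho) from the illustrative
# LpUnit layer defined earlier and summarize their spread across units.
with torch.no_grad():
    p_values = 1.0 + F.softplus(layer.rho)            # shape: (num_units,)
print("p: min={:.2f} mean={:.2f} max={:.2f}".format(
    p_values.min().item(), p_values.mean().item(), p_values.max().item()))
```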

Implications and Future Directions

The introduction of $L_p$ units indicates promising directions for future deep learning models, especially in tasks necessitating the extraction of intricate, nonlinear decision boundaries from high-dimensional data. The ability of these units to learn varying geometric constraints tailored to specific datasets highlights a potentially valuable avenue for enhancing not only generative but also discriminative machine learning processes.

From a theoretical perspective, the results invite further investigation into the advantages of non-Euclidean geometries in neural network design and into how learned norm orders could integrate with existing regularization techniques to improve generalization. As deep learning advances, the adaptability demonstrated by $L_p$ units could inform novel architectures that leverage pooling-centric nonlinearities for complex real-world applications.
