- The paper introduces a construction that enables neural networks to universally approximate continuous invariant and equivariant maps.
- It proves new convnet versions of the universal approximation theorem, rigorously showing how convolutional networks achieve translation equivariance.
- The study proposes a charge-conserving convnet for SE(2)-equivariant representations, paving the way for future research in symmetric network designs.
Universal Approximations of Invariant Maps by Neural Networks
The paper by Dmitry Yarotsky presents a comprehensive study of extending the universal approximation theorem to neural networks that must respect certain symmetry properties, specifically invariance or equivariance under group actions. The extension covers not only compact groups but also non-compact ones such as translations and SE(2), providing a rigorous theoretical framework for designing symmetric neural networks.
Key Contributions
The paper makes significant strides in several directions:
- Invariant and Equivariant Neural Networks for Compact Groups: Yarotsky first addresses the general case of compact groups, proposing a construction that inserts an intermediate polynomial layer. This modification allows neural networks to approximate any continuous invariant or equivariant map. Hilbert's and Weyl's theorems on polynomial invariants underpin this framework, in particular yielding a fully constructive method for permutation-invariant maps (see the first sketch after this list).
- Convnet Approximations for Translation Groups: The paper treats translation groups via convolutional networks, proving new versions of the universal approximation theorem for convnets acting on continuous signals over Euclidean spaces. The analysis demonstrates rigorously both that convolutional networks can approximate continuous functions and why they preserve translation equivariance (see the second sketch after this list).
- Constructing SE(2)-Equivariant Representations: The paper introduces the "charge-conserving convnet," a novel model based on decomposing features into isotypic representations of the special orthogonal group SO(2). This construction is shown to universally approximate continuous, SE(2)-equivariant transformations of 2D signals (see the third sketch after this list).
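The permutation-invariant case can be made concrete with a minimal NumPy sketch. This is not the paper's exact construction; the names power_sum_invariants and small_mlp are illustrative. The idea is that the power sums of the inputs generate all symmetric polynomials, so any ordinary network applied on top of them is automatically permutation-invariant.

```python
import numpy as np

def power_sum_invariants(x):
    """Power sums p_k(x) = sum_i x_i**k for k = 1..n.
    For n variables these generate all symmetric polynomials, so (on a
    compact domain) any continuous permutation-invariant function of x
    can be expressed as a continuous function of these features."""
    n = x.size
    return np.array([np.sum(x ** k) for k in range(1, n + 1)])

def small_mlp(features, seed=1):
    """A tiny fixed MLP standing in for the trainable network applied
    on top of the invariant features."""
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((8, features.size))
    W2 = rng.standard_normal((1, 8))
    return (W2 @ np.tanh(W1 @ features)).item()

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
perm = rng.permutation(5)

f_x = small_mlp(power_sum_invariants(x))
f_px = small_mlp(power_sum_invariants(x[perm]))
print(np.isclose(f_x, f_px))  # True: the output is permutation-invariant
```

Because the invariance is built into the feature map rather than enforced by averaging over all n! permutations, the cost of the invariant layer grows only polynomially in n.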
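Translation equivariance of a convolutional layer can also be checked numerically. The following sketch assumes a 1D signal with periodic boundary conditions and a hand-rolled conv_layer; this is a simplification of the continuous-signal setting analyzed in the paper.

```python
import numpy as np

def conv_layer(signal, kernel):
    """Circular (periodic) cross-correlation followed by ReLU: a single
    convolutional layer acting on a 1D signal."""
    n = signal.size
    out = np.array([np.sum(kernel * np.roll(signal, -i)[:kernel.size])
                    for i in range(n)])
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
signal = rng.standard_normal(32)
kernel = rng.standard_normal(5)
shift = 7

# Shifting the input and then applying the layer gives the same result
# as applying the layer and then shifting the output.
shifted_then_conv = conv_layer(np.roll(signal, shift), kernel)
conv_then_shifted = np.roll(conv_layer(signal, kernel), shift)
print(np.allclose(shifted_then_conv, conv_then_shifted))  # True
```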
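The charge-conservation idea can be illustrated in a drastically simplified form (this is not the paper's actual architecture) with complex-valued features: a feature of rotation order m picks up a phase e^{imθ} under rotation by θ, and products of features add their orders, so a network built from such operations keeps each output in a prescribed isotypic component of SO(2).

```python
import numpy as np

def rotate(z, theta):
    """Rotate a point in the plane, represented as a complex number z = x + iy."""
    return np.exp(1j * theta) * z

def feature(z, m):
    """A toy feature of rotation order ('charge') m: under rotation by theta
    it acquires the factor exp(i*m*theta), i.e. it lies in the isotypic
    component of SO(2) with index m."""
    return z ** m if m >= 0 else np.conj(z) ** (-m)

rng = np.random.default_rng(0)
z = complex(rng.standard_normal(), rng.standard_normal())
theta = 0.73
m1, m2 = 2, -1

# Charges add under multiplication: the product of an order-m1 and an
# order-m2 feature transforms as an order-(m1 + m2) feature.
prod = feature(z, m1) * feature(z, m2)
prod_rot = feature(rotate(z, theta), m1) * feature(rotate(z, theta), m2)
print(np.isclose(prod_rot, np.exp(1j * (m1 + m2) * theta) * prod))  # True
```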
Practical and Theoretical Implications
- Generality and Flexibility: The ability to handle invariant and equivariant maps related to both compact and non-compact groups greatly expands the applicability of neural networks in practical machine-learning tasks, especially those involving spatial and rotational symmetries.
- Improvement of Computational Models: By incorporating polynomial invariants, the proposed networks remain universal approximators while avoiding the extra cost of explicit group averaging, offering a solid alternative to existing symmetrization methods.
- Future Directions: The theoretical results laid out for SE(2) can spur further research into more general Lie groups, such as those governing 3D spatial transformations. Additionally, understanding the balance between expressivity and specialization in neural networks could lead toward more scalable architectures.
Conclusion
The work enhances our understanding of neural networks in the context of symmetries and invariances, offering rigorous avenues for construction and analysis across various group actions. It positions itself as a meaningful contribution to both theoretical understanding and practical implementation of symmetric neural networks, ushering in new possibilities for the design of machine learning algorithms that respect inherent data symmetries.