- The paper introduces Fourier feature mappings to overcome spectral bias in coordinate-based MLPs, enabling them to capture high-frequency details.
- It applies a sinusoidal mapping to the input coordinates that, analyzed through NTK theory, makes the resulting kernel stationary (shift-invariant) with a tunable bandwidth, improving convergence on high-frequency components.
- Empirical results demonstrate significant performance gains in applications like image regression, 3D reconstruction, and neural rendering.
Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains
The paper "Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains" by Matthew Tancik et al. presents a novel method to improve the capabilities of coordinate-based multilayer perceptrons (MLPs) in capturing high-frequency components in low-dimensional regression tasks. The authors address a critical limitation of MLPs, known as spectral bias, which restricts their ability to learn high-frequency information.
Introduction and Background
Coordinate-based MLPs have recently shown great promise in representing continuous functions for computer vision and graphics tasks. These models parameterize scenes or objects using low-dimensional input coordinates (e.g., points in R^3) and yield outputs such as density, color, or occupancy. However, standard MLPs are inherently biased towards learning low-frequency functions, a problem substantiated both theoretically and experimentally by the authors.
Drawing on neural tangent kernel (NTK) theory, the paper explains why standard MLPs struggle to learn high frequencies: the eigenvalue spectrum of a standard MLP's NTK decays rapidly, and the components of the target function aligned with small eigenvalues, which correspond to high frequencies, converge exponentially more slowly under gradient descent.
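This argument can be summarized by the approximate training dynamics below, a standard result in the NTK literature paraphrased here: K is the NTK Gram matrix on the training inputs with eigendecomposition K = QΛQ⊤, y the vector of training targets, and η the learning rate. The error along the i-th eigenvector decays at a rate governed by its eigenvalue λi, so directions with small eigenvalues are learned very slowly.

```latex
% Approximate training dynamics under the NTK linearization (paraphrased from the
% NTK literature; K = Q \Lambda Q^\top is the NTK Gram matrix on the training inputs).
\hat{\mathbf{y}}^{(t)} \;\approx\; \bigl(\mathbf{I} - e^{-\eta \mathbf{K} t}\bigr)\,\mathbf{y}
\qquad\Longrightarrow\qquad
\bigl|\,\mathbf{q}_i^\top\bigl(\hat{\mathbf{y}}^{(t)} - \mathbf{y}\bigr)\bigr|
\;\approx\; e^{-\eta \lambda_i t}\,\bigl|\,\mathbf{q}_i^\top \mathbf{y}\bigr| .
```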
Fourier Feature Mapping
To overcome this spectral bias, the authors propose transforming the input coordinates with a Fourier feature mapping before feeding them into the MLP. Specifically, input coordinates v are mapped to a higher-dimensional space using a series of sinusoidal functions:
γ(v) = [cos(2π b1⊤v), sin(2π b1⊤v), …, cos(2π bm⊤v), sin(2π bm⊤v)]⊤
Here, the bj are frequency vectors; in the paper's best-performing variant they are sampled once from an isotropic Gaussian distribution, whose standard deviation (the frequency scale) is the key hyperparameter, and then held fixed during training.
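A minimal NumPy sketch of this mapping, assuming the Gaussian variant described in the paper (frequency vectors bj drawn once from N(0, σ²I) and kept fixed); the function and parameter names are illustrative, not taken from the authors' code:

```python
import numpy as np

def fourier_feature_mapping(v, num_frequencies=256, sigma=10.0, seed=0):
    """Map coordinates v of shape (N, d) to gamma(v) of shape (N, 2 * num_frequencies)."""
    rng = np.random.default_rng(seed)
    d = v.shape[-1]
    # Rows of B are the frequency vectors b_j ~ N(0, sigma^2 I); sampled once, then fixed.
    B = sigma * rng.standard_normal((num_frequencies, d))
    proj = 2.0 * np.pi * v @ B.T  # 2*pi*b_j^T v for every j
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1), B

# Example: map 2D pixel coordinates in [0, 1]^2 before feeding them to an MLP.
coords = np.random.default_rng(1).uniform(size=(4, 2))
gamma, B = fourier_feature_mapping(coords, num_frequencies=8, sigma=10.0)
print(gamma.shape)  # (4, 16)
```

The matrix B is returned so the same frequencies can be reused for every training and test coordinate.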
This mapping makes the composed NTK stationary (shift-invariant) over the input domain, so it acts like a convolution kernel with a tunable bandwidth, which is well suited to the dense, roughly uniform sampling of training points typical of low-dimensional regression problems.
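To see why, note that the dot product of two mapped points depends only on their difference (via the identity cos a cos b + sin a sin b = cos(a − b)); since an MLP's NTK depends on its inputs only through such dot products (and ‖γ(v)‖² = m is constant), the composed kernel is a function of v1 − v2 alone. A short derivation using the mapping defined above:

```latex
% Shift invariance of the induced kernel (derived from the mapping gamma above).
\gamma(\mathbf{v}_1)^\top \gamma(\mathbf{v}_2)
= \sum_{j=1}^{m}\Bigl[\cos\bigl(2\pi\mathbf{b}_j^\top\mathbf{v}_1\bigr)\cos\bigl(2\pi\mathbf{b}_j^\top\mathbf{v}_2\bigr)
 + \sin\bigl(2\pi\mathbf{b}_j^\top\mathbf{v}_1\bigr)\sin\bigl(2\pi\mathbf{b}_j^\top\mathbf{v}_2\bigr)\Bigr]
= \sum_{j=1}^{m}\cos\bigl(2\pi\mathbf{b}_j^\top(\mathbf{v}_1-\mathbf{v}_2)\bigr).
```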
Experimental Validation and Theoretical Insights
The authors validate the Fourier feature approach through experiments ranging from simple one-dimensional regression to practical graphics tasks, demonstrating significant improvements in 2D image regression, 3D shape representation, CT and MRI reconstruction, and inverse rendering for novel view synthesis.
Specifically, experiments show that random Fourier feature mappings markedly outperform unmapped inputs and basic single-frequency mappings, and match or exceed axis-aligned positional encodings, provided an appropriate frequency scale is selected. Both the analysis and the experiments indicate that the scale (bandwidth) of the sampled frequencies bj matters far more than the exact shape of their distribution: too small a scale underfits and over-smooths, while too large a scale overfits and produces noisy artifacts.
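As an illustration of that trade-off, the following hypothetical sketch (not the authors' code; it uses PyTorch and a synthetic 64×64 target in place of a real image) trains a small MLP at a few frequency scales σ and compares training error against error on held-out pixels — a low σ should underfit, while a very high σ fits the training pixels but generalizes poorly:

```python
import torch
import torch.nn as nn

def fourier_features(v, B):
    """gamma(v) for coordinates v (N, d) and fixed frequency matrix B (m, d)."""
    proj = 2 * torch.pi * v @ B.T
    return torch.cat([torch.cos(proj), torch.sin(proj)], dim=-1)

# Synthetic high-frequency target on a 64x64 grid (stand-in for a real image).
side = 64
xs = torch.linspace(0, 1, side)
coords = torch.stack(torch.meshgrid(xs, xs, indexing="ij"), dim=-1).reshape(-1, 2)
target = (torch.sin(40 * coords[:, :1]) * torch.cos(30 * coords[:, 1:2]) + 1) / 2

# Hold out half of the pixels to expose overfitting at large scales.
idx = torch.randperm(coords.shape[0])
tr, te = idx[: idx.numel() // 2], idx[idx.numel() // 2 :]

for sigma in [1.0, 10.0, 100.0]:           # candidate frequency scales
    B = sigma * torch.randn(128, 2)          # b_j ~ N(0, sigma^2 I), fixed during training
    model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(),
                          nn.Linear(256, 256), nn.ReLU(),
                          nn.Linear(256, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(500):                      # short training loop for illustration
        opt.zero_grad()
        loss = ((model(fourier_features(coords[tr], B)) - target[tr]) ** 2).mean()
        loss.backward()
        opt.step()
    with torch.no_grad():
        test_mse = ((model(fourier_features(coords[te], B)) - target[te]) ** 2).mean()
    print(f"sigma={sigma:6.1f}  train MSE={loss.item():.4f}  held-out MSE={test_mse.item():.4f}")
```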
Beyond the analytical evaluations, the empirical results demonstrate that models trained with this mapping converge efficiently to the high-frequency components of the target function that standard MLPs fail to capture.
Implications and Future Work
The implications of these findings are substantial for the fields of computer vision and graphics. Fourier features allow for compact and computationally efficient representations of complex high-frequency details in continuous domains, thereby paving the way for more accurate and expressive models in visual and geometric tasks.
Future research avenues may include investigating the impact of different sampling distributions for the frequency vectors bj, optimizing the Fourier feature scales dynamically during training, and extending the approach to other neural network architectures and higher-dimensional domains.
In practical scenarios, incorporating Fourier feature mappings can dramatically improve the performance of applications requiring precise high-frequency detail management, such as neural rendering, texture synthesis, and volumetric data reconstruction. The method is simple to implement and integrates well with existing frameworks, making it a pragmatic enhancement for a variety of downstream tasks.
Conclusion
The paper offers a robust solution to a fundamental limitation of coordinate-based MLPs, demonstrating that Fourier feature mappings effectively allow these networks to learn high-frequency functions in low-dimensional domains. The approach leverages NTK theory to deliver the combined benefits of tunable spectral bandwidth and shift invariance, making it a highly valuable tool in the arsenal of modern computer vision and graphics research.