Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains
Abstract: We show that passing input points through a simple Fourier feature mapping enables a multilayer perceptron (MLP) to learn high-frequency functions in low-dimensional problem domains. These results shed light on recent advances in computer vision and graphics that achieve state-of-the-art results by using MLPs to represent complex 3D objects and scenes. Using tools from the neural tangent kernel (NTK) literature, we show that a standard MLP fails to learn high frequencies both in theory and in practice. To overcome this spectral bias, we use a Fourier feature mapping to transform the effective NTK into a stationary kernel with a tunable bandwidth. We suggest an approach for selecting problem-specific Fourier features that greatly improves the performance of MLPs for low-dimensional regression tasks relevant to the computer vision and graphics communities.
Explain it Like I'm 14
Overview
This paper shows a simple trick that helps neural networks learn sharp details instead of just blurry shapes when the input is low-dimensional coordinates (like 2D pixel positions or 3D points in space). The trick is to turn each input coordinate into a bunch of sine and cosine values called Fourier features before giving them to the network. Doing this lets a standard multilayer perceptron (MLP) learn high-frequency details (fine edges, textures, tiny shapes) much more easily.
Key Objectives
The authors set out to:
- Explain why regular MLPs tend to learn only smooth, low-detail versions of things (a problem known as “spectral bias”).
- Show, with theory and experiments, that adding Fourier features to the inputs fixes this bias so the network can learn much sharper details.
- Provide a practical way to choose these Fourier features so that the method works well across different tasks in computer vision and graphics.
Methods and Approach (in simple terms)
Think of an image or a 3D scene as a function: you put in a coordinate (like a pixel location x,y or a 3D point x,y,z), and you get out a value (like color or “inside/outside the object”). An MLP can learn such a function by looking at many coordinate–value pairs.
The problem: Regular MLPs prefer to learn “low frequencies” first. In everyday language, that means they get the big, smooth parts right (like broad colors and shapes) but struggle with fine details (edges, tiny textures, crisp lines). This bias makes outputs look blurry for a long time, and sometimes they never capture the finest details well.
The fix: Fourier features. Before feeding coordinates into the MLP, you transform each coordinate using many sine and cosine waves of different speeds. You can imagine these waves as musical notes: low notes capture slow, smooth changes; high notes capture quick, sharp changes. By giving the network these “notes” up front, you make it much easier for it to mix them and recreate fine details.
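The mapping described above can be sketched in a few lines of numpy: each input coordinate is projected onto a set of random frequencies and passed through sine and cosine. This is a minimal illustration (the function name `fourier_features` and the parameter choices are ours, not the paper's):

```python
import numpy as np

def fourier_features(v, B):
    """Map coordinates v (n, d) to [cos(2*pi*v@B.T), sin(2*pi*v@B.T)], shape (n, 2m)."""
    proj = 2.0 * np.pi * v @ B.T               # (n, m): each coordinate against each frequency
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

rng = np.random.default_rng(0)
sigma = 10.0                                    # the frequency scale: the one key knob
B = sigma * rng.standard_normal((256, 2))       # 256 random Gaussian frequencies for 2D inputs
coords = rng.random((5, 2))                     # five 2D points in [0, 1)^2
feats = fourier_features(coords, B)
print(feats.shape)                              # (5, 512)
```

The MLP then sees these 512 bounded sine/cosine values instead of the raw 2D coordinate.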
Why this works (an everyday analogy to the theory):
- The authors use a theoretical tool called the Neural Tangent Kernel (NTK) to study how a network learns. You can think of the NTK like a “learning filter” that determines which kinds of patterns the network picks up quickly.
- With raw coordinates, this learning filter heavily favors smooth patterns and suppresses sharp ones. That’s the spectral bias.
- Adding Fourier features changes the filter so it treats all locations fairly (it becomes “shift-invariant,” meaning it behaves the same everywhere) and lets you control how much it pays attention to high-frequency details. You can “tune” this by choosing how fast the sine/cosine waves are.
- In practice, the authors sample wave speeds at random (random Fourier features) and find that the most important setting is the scale (how large those frequencies are). The exact shape of the random distribution matters much less than its overall spread.
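The two claims in the bullets above can be checked numerically: the kernel induced by the feature mapping, γ(v₁)·γ(v₂), depends only on the difference v₁ − v₂ (shift-invariance), and its width is controlled by the frequency scale σ. A small Monte Carlo sketch (our own toy code, so values are approximate):

```python
import numpy as np

def rff(v, B):
    """Normalized Fourier features, so rff(v1, B) @ rff(v2, B).T estimates the kernel."""
    proj = 2 * np.pi * v @ B.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1) / np.sqrt(B.shape[0])

rng = np.random.default_rng(1)
origin = np.zeros((1, 1))
delta = np.array([[0.1]])                     # a second 1D point, 0.1 away

# Larger sigma -> narrower kernel: nearby points look less similar, so finer detail is learnable.
ks = {}
for sigma in (1.0, 10.0):
    B = sigma * rng.standard_normal((4096, 1))
    ks[sigma] = float(rff(origin, B) @ rff(delta, B).T)
    print(f"sigma={sigma:4.1f}  k(0, 0.1) ~ {ks[sigma]:.3f}")

# Shift-invariance: the kernel depends only on v1 - v2, so shifting both points changes nothing.
B = 5.0 * rng.standard_normal((1024, 1))
shift = np.array([[0.37]])
same = np.allclose(rff(origin, B) @ rff(delta, B).T,
                   rff(origin + shift, B) @ rff(delta + shift, B).T)
print(same)                                    # True
```

With σ = 1 the two points still look similar (kernel value near 0.8); with σ = 10 they are nearly uncorrelated, which is what lets the model separate fine details.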
How they tested it:
- They analyzed learning behavior with NTK theory to predict what should happen.
- They ran experiments on 1D signals (simple functions), 2D images, 3D shapes, medical imaging (CT and MRI), and 3D view synthesis (like NeRF). In each case, they compared:
  - No special input mapping (just raw coordinates).
  - A basic circle mapping (simple sine/cosine).
  - A “positional encoding” like the one used in Transformers and NeRF (log-spaced frequencies).
  - Random Gaussian Fourier features with a tunable scale.
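For concreteness, here is a minimal sketch of the input mappings being compared, for a single 1D coordinate (illustrative parameter choices, not the paper's exact settings):

```python
import numpy as np

v = np.array([0.3])                              # a single 1D coordinate in [0, 1)

def basic(v):
    """Circle mapping: one sine/cosine pair at the base frequency."""
    return np.concatenate([np.cos(2 * np.pi * v), np.sin(2 * np.pi * v)])

def positional_encoding(v, num_freqs=6):
    """Log-spaced powers-of-two frequencies, as in Transformers / NeRF."""
    proj = 2 * np.pi * np.outer(2.0 ** np.arange(num_freqs), v).ravel()
    return np.concatenate([np.cos(proj), np.sin(proj)])

def gaussian_rff(v, num_freqs=6, sigma=10.0, seed=0):
    """Random Gaussian frequencies; only the scale sigma needs tuning."""
    freqs = sigma * np.random.default_rng(seed).standard_normal(num_freqs)
    proj = 2 * np.pi * np.outer(freqs, v).ravel()
    return np.concatenate([np.cos(proj), np.sin(proj)])

# Raw coordinates pass through unchanged; the mappings expand them:
print(v.shape, basic(v).shape, positional_encoding(v).shape, gaussian_rff(v).shape)
```

The "no mapping" baseline just feeds `v` to the MLP directly; the other three replace it with the expanded feature vectors.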
Main Findings and Why They Matter
- Regular MLPs blur details: Without Fourier features, networks learn smooth parts quickly but struggle badly with sharp details.
- Fourier features fix spectral bias: Feeding sine/cosine features into the MLP lets it learn high-frequency details much faster and more accurately.
- A single key knob to tune: When using random Fourier features, the most important hyperparameter is the frequency scale (how large the sampled frequencies are).
  - Too small a scale → underfitting (still blurry, learns only very smooth parts).
  - Too large a scale → overfitting/aliasing (captures noise or creates artifacts).
  - A well-chosen scale → sharp, accurate results.
- Works across many tasks:
  - 2D image fitting: Sharper reconstructions from fewer pixels.
  - 3D shape modeling: Better fine details in “inside/outside” predictions of objects.
  - CT and MRI: Better reconstructions from indirect/sparse measurements.
  - View synthesis (NeRF-like): Clearer images from new viewpoints.
- Random Gaussian features tend to perform best overall. They also found that the exact random distribution matters less than its standard deviation (its spread).
Why this is important: It gives a simple, widely applicable method to make small coordinate-based networks perform like much more powerful models on tasks that need high detail. This helps in areas like 3D graphics, medical imaging, and any application where you represent a signal as a function of coordinates.
Implications and Impact
- Practical guidance: If you’re training an MLP on coordinates (2D or 3D), add Fourier features to the inputs. Use random frequencies and tune just one number—the scale. This can drastically improve sharpness and accuracy.
- Faster, better learning: By reshaping the network’s “learning filter,” Fourier features make training more efficient, especially for fine details.
- Broad applications: The method applies to direct tasks (predicting an image’s pixel colors) and indirect tasks (where you only see transforms of the data, like CT projections or MRI Fourier samples).
- Clearer understanding: The paper explains why “positional encodings” work (like in NeRF and Transformers) by connecting them to how networks learn different frequencies. This theoretical link helps researchers design better encodings in the future.
In short, Fourier features act like giving the network a rich set of building blocks—many sine and cosine “notes” to mix—so it can play the full “song” of sharp details, not just the dull background hum.
Practical Applications
Summary
This paper shows that adding a simple “Fourier feature” mapping to input coordinates lets multilayer perceptrons (MLPs) learn high-frequency signals in low-dimensional domains (e.g., 1D–3D coordinates). The mapping turns the network’s effective neural tangent kernel (NTK) into a tunable, stationary kernel whose spectral bandwidth can be matched to the task. Empirically, this improves performance on image fitting, 3D shape representation, CT/MRI reconstruction from sparse measurements, and view synthesis (NeRF-like pipelines). The insights are actionable with minimal engineering: add a Gaussian random Fourier feature (RFF) layer to coordinate-based MLPs and tune only the frequency scale.
Below are concrete use cases, grouped by deployment horizon. For each, we include sector tags, likely tools/workflows, and feasibility assumptions.
Immediate Applications
The following can be deployed now using existing libraries (PyTorch, TensorFlow, JAX) by adding a Fourier feature (RFF) layer to coordinate-based MLPs and tuning its scale on validation data.
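As a sketch of what "adding an RFF layer" amounts to, here is a minimal untrained coordinate MLP in numpy with a fixed Gaussian feature layer in front. In practice this would be a PyTorch/TensorFlow/JAX module with trained weights; the class name and initialization details here are our own illustration:

```python
import numpy as np

class FourierFeatureMLP:
    """Minimal coordinate-based MLP with a fixed Gaussian RFF input layer (sketch only)."""

    def __init__(self, in_dim=2, out_dim=3, num_features=256,
                 width=256, depth=4, sigma=10.0, seed=0):
        rng = np.random.default_rng(seed)
        # The frequency matrix is sampled once and kept fixed (not trained).
        self.B = sigma * rng.standard_normal((num_features, in_dim))
        dims = [2 * num_features] + [width] * depth + [out_dim]
        # He-style initialization for the ReLU layers (untrained, for illustration).
        self.weights = [rng.standard_normal((a, b)) * np.sqrt(2.0 / a)
                        for a, b in zip(dims[:-1], dims[1:])]

    def __call__(self, coords):
        proj = 2 * np.pi * coords @ self.B.T
        h = np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)  # the RFF layer
        for W in self.weights[:-1]:
            h = np.maximum(h @ W, 0.0)          # ReLU hidden layers (standard 4x256 MLP)
        return h @ self.weights[-1]

model = FourierFeatureMLP()
rgb = model(np.random.default_rng(1).random((8, 2)))   # 8 pixel coordinates -> 8 RGB values
print(rgb.shape)                                        # (8, 3)
```

Only `sigma` typically needs tuning on validation data; the rest of the architecture stays as in existing coordinate-MLP pipelines.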
- Sector: Software, Graphics, AR/VR — Use case: Higher-fidelity novel view synthesis (NeRF pipelines)
  - What: Replace/augment positional encodings with Gaussian RFFs to improve detail and convergence for NeRF-like models (fixed views or simplified radiance fields).
  - Tools/workflows: Add a “Gaussian Fourier Features” layer before the MLP; tune frequency scale σ and feature count (e.g., ~256 features); keep MLP depth/width standard (e.g., 4×256 ReLU).
  - Assumptions/dependencies: Low-dimensional coordinates (3D positions and possibly view directions); standard NeRF training loop; GPU availability; careful σ tuning to avoid over/underfitting.
- Sector: Media, Gaming, CAD — Use case: 3D shape representation (occupancy/SDF) with sharper details
  - What: Use RFFs in occupancy networks/DeepSDF to resolve high-frequency geometry (thin structures, high-curvature surfaces).
  - Tools/workflows: Drop-in RFF layer on 3D coordinates; cross-entropy or regression objective as in existing pipelines.
  - Assumptions/dependencies: Sufficient surface samples near geometry; typical training compute; σ tuned to sampling density.
- Sector: Healthcare (Radiology) — Use case: Sparse-view 2D CT and 3D MRI reconstruction with better quality
  - What: Apply coordinate-based MLPs with RFFs and task-specific forward models (line integrals for CT; Fourier sampling for MRI) to recover images/volumes from undersampled data.
  - Tools/workflows: Incorporate RFFs; train with physics-informed losses against sparse sinograms (CT) or k-space (MRI); deploy as reconstruction post-processing or model-based iterative reconstruction (MBIR) component.
  - Assumptions/dependencies: Clinical-grade validation; integration with PACS/scanner pipelines; compliance and QA; handling of patient variability and noise; σ tuned to acquisition protocol and sampling pattern.
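A hedged sketch of what a physics-informed CT loss could look like: approximate each measured line integral with a Riemann sum over points sampled along the ray, evaluate the coordinate network at those points, and penalize the mismatch against the sparse sinogram. All names here (`projection_loss`, `net`) are illustrative, not from the paper:

```python
import numpy as np

def projection_loss(net, ray_points, measured, ray_spacing=1.0):
    """Sketch of a CT projection loss (hypothetical helper, assumed interfaces).

    net:        coordinate network mapping (n, 2) points -> (n,) densities
    ray_points: (num_rays, samples_per_ray, 2) points along each measurement line
    measured:   (num_rays,) measured line integrals from the sparse sinogram
    """
    n_rays, n_samples, _ = ray_points.shape
    densities = net(ray_points.reshape(-1, 2)).reshape(n_rays, n_samples)
    predicted = densities.sum(axis=1) * ray_spacing   # Riemann-sum line integral
    return float(np.mean((predicted - measured) ** 2))

# Toy check with a constant-density "network": integrating 1 over 10 unit steps gives 10.
const_net = lambda pts: np.ones(len(pts))
pts = np.zeros((4, 10, 2))                  # 4 rays, 10 samples each (positions unused here)
loss = projection_loss(const_net, pts, measured=np.full(4, 10.0))
print(loss)                                 # 0.0
```

In a real pipeline, `net` would be the RFF-augmented MLP and the loss would be minimized by gradient descent against the actual scanner measurements.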
- Sector: Robotics, Autonomous Systems — Use case: High-fidelity occupancy/SDF maps from sparse sensor data
  - What: Map environment with coordinate MLPs using RFFs to capture fine structures from sparse LiDAR/depth measurements for planning and collision avoidance.
  - Tools/workflows: Online training or batch updates; forward model for ray consistency; RFF scale set by sensor resolution and scene sparsity.
  - Assumptions/dependencies: Real-time or near-real-time optimization; compute budget on-robot; robustness to motion/lighting; careful scheduling to avoid overfitting transient noise.
- Sector: Remote Sensing, Geophysics — Use case: 2D/3D tomography with limited measurements (e.g., seismic, SAR)
  - What: Reconstruction with coordinate MLPs plus RFFs and appropriate forward operators to handle line-of-sight or Fourier-like sampling.
  - Tools/workflows: Plug-in forward models (e.g., Radon/Fourier approximations) in training loop; RFF layer with σ chosen via cross-validation.
  - Assumptions/dependencies: Accurate forward operator; manageable scale (regional volumes); compute availability.
- Sector: Computer Vision, Graphics — Use case: Procedural texture synthesis and image inpainting/representation
  - What: Fit and generate textures/images with higher-frequency content using coordinate MLPs and RFFs; improves sharpness over vanilla MLPs.
  - Tools/workflows: Image-as-function modeling (xy→RGB) with RFFs; train on partial pixels; export weights as compact asset.
  - Assumptions/dependencies: Inputs are low-dimensional coordinates; suited for continuous textures and compact storage.
- Sector: Software Engineering, ML Tooling — Use case: Standardized “Fourier features” layers in ML libraries
  - What: Provide ready-to-use RFF modules (Gaussian features, configurable σ and feature count) for coordinate-based models.
  - Tools/workflows: PyTorch/TensorFlow/JAX layer; simple API; default heuristics (e.g., set σ relative to sample spacing).
  - Assumptions/dependencies: Developer adoption; basic hyperparameter sweep support.
- Sector: Audio/Time-Series — Use case: 1D signal reconstruction and super-resolution from sparse samples
  - What: Represent time-dependent signals with coordinate MLPs using RFFs to better capture high-frequency components (e.g., audio resynthesis, sensor signal recovery).
  - Tools/workflows: 1D coordinate-to-amplitude mapping with RFFs; task-specific regularization; validation-driven σ choice.
  - Assumptions/dependencies: Stationarity or quasi-stationarity in target; mismatch handling for nonstationary signals.
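The 1D case is easy to demonstrate end to end: recovering a high-frequency sine from sparse samples fails with raw coordinates but succeeds with Gaussian RFF features, even with plain linear least squares standing in for the MLP (a deliberate simplification of the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((120, 1))                     # sparse 1D sample locations in [0, 1)
y = np.sin(2 * np.pi * 10 * x[:, 0])         # high-frequency target signal

def features(x, B):
    proj = 2 * np.pi * x @ B.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

# Raw coordinates: a linear model cannot track the oscillations at all.
A_raw = np.concatenate([x, np.ones_like(x)], axis=-1)
w_raw, *_ = np.linalg.lstsq(A_raw, y, rcond=None)
rmse_raw = np.sqrt(np.mean((A_raw @ w_raw - y) ** 2))

# Gaussian RFFs with sigma near the signal's frequency content fit it almost exactly.
B = 10.0 * rng.standard_normal((200, 1))
A_rff = features(x, B)
w_rff, *_ = np.linalg.lstsq(A_rff, y, rcond=None)
rmse_rff = np.sqrt(np.mean((A_rff @ w_rff - y) ** 2))

print(f"raw RMSE: {rmse_raw:.3f}   RFF RMSE: {rmse_rff:.2e}")
```

The raw-coordinate fit leaves most of the signal as residual, while the RFF fit drives the training error to near zero; a trained MLP plays the role of the linear solver in the real applications above.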
- Sector: Education, Academia — Use case: Teaching NTK, spectral bias, and kernel design via RFFs
  - What: Classroom labs demonstrating spectral bias and how Fourier features alter NTK bandwidth and learning dynamics.
  - Tools/workflows: JAX/Neural Tangents notebooks; adjustable σ; simple 1D/2D tasks (image fitting, CT toy examples).
  - Assumptions/dependencies: Basic GPU/Colab access; curriculum integration.
- Sector: Creative Tools — Use case: Content creation plugins for sharp neural textures and neural materials
  - What: Integrate RFF-augmented coordinate MLPs into DCC tools (e.g., Blender, Substance) for compact, continuous texture assets.
  - Tools/workflows: Export/import neural texture modules; UI controls for detail via σ; baking to raster maps as needed.
  - Assumptions/dependencies: Plugin ecosystem support; artist-facing parameterization that hides low-level details.
Long-Term Applications
These require additional research, scaling, hardware/software co-design, or regulatory approval before widespread deployment.
- Sector: Healthcare (Radiology) — Use case: Routine low-dose CT and fast MRI using learned reconstructions
  - Potential: Reduce radiation/exam time while maintaining diagnostic quality by leveraging RFF-augmented neural reconstructions.
  - Dependencies: Large-scale clinical trials; FDA/CE approvals; robust generalization; fail-safe and QA workflows; interpretability/uncertainty quantification.
- Sector: AR/VR, Mobile Systems — Use case: Real-time, on-device NeRF-style rendering with high-frequency detail
  - Potential: Live scene capture and rendering on AR glasses/phones; streaming neural fields with sharp textures.
  - Dependencies: Hardware acceleration for MLPs/RFFs; compact model distillation; dynamic scene handling; latency constraints.
- Sector: Industrial Digital Twins — Use case: Neural-field digital twins with fine-grained geometry and appearance
  - Potential: Persistent, editable digital twins for inspection, predictive maintenance, and training in complex facilities.
  - Dependencies: Continuous updates from heterogeneous sensors; multi-physics integration; versioning and provenance.
- Sector: Imaging Hardware — Use case: Sensor–algorithm co-design for compressive imaging
  - Potential: Optimize acquisition (sampling trajectories, masks) to match the NTK bandwidth induced by RFFs for better reconstruction-quality vs. dose/time trade-offs.
  - Dependencies: Joint optimization pipelines; hardware programmability; robustness to model mismatch/noise.
- Sector: Robotics, SLAM — Use case: Neural-field SLAM with high-frequency scene detail and real-time updates
  - Potential: More accurate maps and surfaces for manipulation, navigation, and inspection in cluttered environments.
  - Dependencies: Efficient online training/inference; loop-closure and drift correction; handling dynamics and semantics.
- Sector: Energy, Geoscience — Use case: High-fidelity seismic inversion and subsurface modeling
  - Potential: Recover fine-scale structures important for geothermal, carbon storage, and resource management.
  - Dependencies: Accurate forward models; scaled training over large volumes; uncertainty quantification.
- Sector: Finance — Use case: Neural-field modeling of low-dimensional financial surfaces (e.g., implied volatility, yield curves)
  - Potential: Capture sharp local structures in surfaces defined over few dimensions (maturity, strike, time).
  - Dependencies: Robustness to noise/regime shifts; compliance; risk controls; interpretability.
- Sector: Standards and Policy — Use case: Guidelines for learned reconstruction algorithms
  - Potential: Establish benchmarks and safety standards for algorithms that substitute for classical reconstruction in safety-critical domains (medical, remote sensing).
  - Dependencies: Multi-stakeholder consensus; reproducibility frameworks; audit and drift monitoring.
- Sector: AutoML for Coordinate Networks — Use case: Automated selection of Fourier feature scales and counts
  - Potential: Task-aware, data-aware hyperparameter search for σ and feature sparsity to balance bias/variance.
  - Dependencies: Meta-learning infrastructure; cross-task generalization; efficient validation strategies.
- Sector: Interoperability in Content Pipelines — Use case: Neural-field interchange formats
  - Potential: Standardized representations (e.g., USD extensions) that encapsulate RFF parameters and MLP weights for cross-tool portability.
  - Dependencies: Consortium support; runtime compatibility; versioning and security.
Cross-Cutting Assumptions and Dependencies
- Problem regime: Benefits are strongest when inputs are low-dimensional coordinates (1D–3D) and signals contain significant high-frequency content.
- Hyperparameters: The scale (standard deviation) of the Fourier feature distribution is critical; improper σ causes underfitting (too small) or aliasing/overfitting (too large). Empirically, the distribution’s shape matters less than its scale.
- Data density and noise: Choice of σ should reflect sampling density, sensor noise, and target bandwidth; indirect (physics-based) supervision requires accurate forward models.
- Compute and latency: Training coordinate-based MLPs still requires GPU/TPU resources; real-time deployments may need specialized acceleration or distillation.
- Safety and compliance: In regulated domains (e.g., medical), validation, monitoring, and interpretability requirements are substantial.
- Integration: Drop-in RFF layers are straightforward, but end-to-end performance depends on the rest of the pipeline (sampling strategies, forward models, losses, regularization).