Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains
Abstract: We show that passing input points through a simple Fourier feature mapping enables a multilayer perceptron (MLP) to learn high-frequency functions in low-dimensional problem domains. These results shed light on recent advances in computer vision and graphics that achieve state-of-the-art results by using MLPs to represent complex 3D objects and scenes. Using tools from the neural tangent kernel (NTK) literature, we show that a standard MLP fails to learn high frequencies both in theory and in practice. To overcome this spectral bias, we use a Fourier feature mapping to transform the effective NTK into a stationary kernel with a tunable bandwidth. We suggest an approach for selecting problem-specific Fourier features that greatly improves the performance of MLPs for low-dimensional regression tasks relevant to the computer vision and graphics communities.
Explain it Like I'm 14
Overview
This paper shows a simple trick that helps neural networks learn sharp details instead of just blurry shapes when the input is low-dimensional coordinates (like 2D pixel positions or 3D points in space). The trick is to turn each input coordinate into a bunch of sine and cosine values called Fourier features before giving them to the network. Doing this lets a standard multilayer perceptron (MLP) learn high-frequency details (fine edges, textures, tiny shapes) much more easily.
Key Objectives
The authors set out to:
- Explain why regular MLPs tend to learn only smooth, low-detail versions of things (a problem known as “spectral bias”).
- Show, with theory and experiments, that adding Fourier features to the inputs fixes this bias so the network can learn much sharper details.
- Provide a practical way to choose these Fourier features so that the method works well across different tasks in computer vision and graphics.
Methods and Approach (in simple terms)
Think of an image or a 3D scene as a function: you put in a coordinate (like a pixel location x,y or a 3D point x,y,z), and you get out a value (like color or “inside/outside the object”). An MLP can learn such a function by looking at many coordinate–value pairs.
The problem: Regular MLPs prefer to learn “low frequencies” first. In everyday language, that means they get the big, smooth parts right (like broad colors and shapes) but struggle with fine details (edges, tiny textures, crisp lines). This bias makes outputs look blurry for a long time, and sometimes they never capture the finest details well.
The fix: Fourier features. Before feeding coordinates into the MLP, you transform each coordinate using many sine and cosine waves of different speeds. You can imagine these waves as musical notes: low notes capture slow, smooth changes; high notes capture quick, sharp changes. By giving the network these “notes” up front, you make it much easier for it to mix them and recreate fine details.
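The mapping described above can be sketched in a few lines of numpy: each input coordinate is projected onto a set of random frequencies and passed through sine and cosine. This is a minimal illustration (the function name `fourier_features` and the parameter choices are ours, not the paper's):

```python
import numpy as np

def fourier_features(v, B):
    """Map coordinates v (n, d) to [cos(2*pi*v@B.T), sin(2*pi*v@B.T)], shape (n, 2m)."""
    proj = 2.0 * np.pi * v @ B.T               # (n, m): each coordinate against each frequency
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

rng = np.random.default_rng(0)
sigma = 10.0                                    # the frequency scale: the one key knob
B = sigma * rng.standard_normal((256, 2))       # 256 random Gaussian frequencies for 2D inputs
coords = rng.random((5, 2))                     # five 2D points in [0, 1)^2
feats = fourier_features(coords, B)
print(feats.shape)                              # (5, 512)
```

The MLP then sees these 512 bounded sine/cosine values instead of the raw 2D coordinate.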
Why this works (an everyday analogy to the theory):
- The authors use a theoretical tool called the Neural Tangent Kernel (NTK) to study how a network learns. You can think of the NTK like a “learning filter” that determines which kinds of patterns the network picks up quickly.
- With raw coordinates, this learning filter heavily favors smooth patterns and suppresses sharp ones. That’s the spectral bias.
- Adding Fourier features changes the filter so it treats all locations fairly (it becomes “shift-invariant,” meaning it behaves the same everywhere) and lets you control how much it pays attention to high-frequency details. You can “tune” this by choosing how fast the sine/cosine waves are.
- In practice, the authors sample wave speeds at random (random Fourier features) and find that the most important setting is the scale (how large those frequencies are). The exact shape of the random distribution matters much less than its overall spread.
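The two claims in the bullets above can be checked numerically: the kernel induced by the feature mapping, γ(v₁)·γ(v₂), depends only on the difference v₁ − v₂ (shift-invariance), and its width is controlled by the frequency scale σ. A small Monte Carlo sketch (our own toy code, so values are approximate):

```python
import numpy as np

def rff(v, B):
    """Normalized Fourier features, so rff(v1, B) @ rff(v2, B).T estimates the kernel."""
    proj = 2 * np.pi * v @ B.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1) / np.sqrt(B.shape[0])

rng = np.random.default_rng(1)
origin = np.zeros((1, 1))
delta = np.array([[0.1]])                     # a second 1D point, 0.1 away

# Larger sigma -> narrower kernel: nearby points look less similar, so finer detail is learnable.
ks = {}
for sigma in (1.0, 10.0):
    B = sigma * rng.standard_normal((4096, 1))
    ks[sigma] = float(rff(origin, B) @ rff(delta, B).T)
    print(f"sigma={sigma:4.1f}  k(0, 0.1) ~ {ks[sigma]:.3f}")

# Shift-invariance: the kernel depends only on v1 - v2, so shifting both points changes nothing.
B = 5.0 * rng.standard_normal((1024, 1))
shift = np.array([[0.37]])
same = np.allclose(rff(origin, B) @ rff(delta, B).T,
                   rff(origin + shift, B) @ rff(delta + shift, B).T)
print(same)                                    # True
```

With σ = 1 the two points still look similar (kernel value near 0.8); with σ = 10 they are nearly uncorrelated, which is what lets the model separate fine details.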
How they tested it:
- They analyzed learning behavior with NTK theory to predict what should happen.
- They ran experiments on 1D signals (simple functions), 2D images, 3D shapes, medical imaging (CT and MRI), and 3D view synthesis (like NeRF). In each case, they compared:
  - No special input mapping (just raw coordinates).
  - A basic circle mapping (simple sine/cosine).
  - A “positional encoding” like the one used in Transformers and NeRF (log-spaced frequencies).
  - Random Gaussian Fourier features with a tunable scale.
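For concreteness, here is a minimal sketch of the input mappings being compared, for a single 1D coordinate (illustrative parameter choices, not the paper's exact settings):

```python
import numpy as np

v = np.array([0.3])                              # a single 1D coordinate in [0, 1)

def basic(v):
    """Circle mapping: one sine/cosine pair at the base frequency."""
    return np.concatenate([np.cos(2 * np.pi * v), np.sin(2 * np.pi * v)])

def positional_encoding(v, num_freqs=6):
    """Log-spaced powers-of-two frequencies, as in Transformers / NeRF."""
    proj = 2 * np.pi * np.outer(2.0 ** np.arange(num_freqs), v).ravel()
    return np.concatenate([np.cos(proj), np.sin(proj)])

def gaussian_rff(v, num_freqs=6, sigma=10.0, seed=0):
    """Random Gaussian frequencies; only the scale sigma needs tuning."""
    freqs = sigma * np.random.default_rng(seed).standard_normal(num_freqs)
    proj = 2 * np.pi * np.outer(freqs, v).ravel()
    return np.concatenate([np.cos(proj), np.sin(proj)])

# Raw coordinates pass through unchanged; the mappings expand them:
print(v.shape, basic(v).shape, positional_encoding(v).shape, gaussian_rff(v).shape)
```

The "no mapping" baseline just feeds `v` to the MLP directly; the other three replace it with the expanded feature vectors.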
Main Findings and Why They Matter
- Regular MLPs blur details: Without Fourier features, networks learn smooth parts quickly but struggle badly with sharp details.
- Fourier features fix spectral bias: Feeding sine/cosine features into the MLP lets it learn high-frequency details much faster and more accurately.
- A single key knob to tune: When using random Fourier features, the most important hyperparameter is the frequency scale (how large the sampled frequencies are).
  - Too small a scale → underfitting (still blurry, learns only very smooth parts).
  - Too large a scale → overfitting/aliasing (captures noise or creates artifacts).
  - A well-chosen scale → sharp, accurate results.
- Works across many tasks:
  - 2D image fitting: Sharper reconstructions from fewer pixels.
  - 3D shape modeling: Better fine details in “inside/outside” predictions of objects.
  - CT and MRI: Better reconstructions from indirect/sparse measurements.
  - View synthesis (NeRF-like): Clearer images from new viewpoints.
- Random Gaussian features tend to perform best overall. They also found that the exact random distribution matters less than its standard deviation (its spread).
Why this is important: It gives a simple, widely applicable method to make small coordinate-based networks perform like much more powerful models on tasks that need high detail. This helps in areas like 3D graphics, medical imaging, and any application where you represent a signal as a function of coordinates.
Implications and Impact
- Practical guidance: If you’re training an MLP on coordinates (2D or 3D), add Fourier features to the inputs. Use random frequencies and tune just one number—the scale. This can drastically improve sharpness and accuracy.
- Faster, better learning: By reshaping the network’s “learning filter,” Fourier features make training more efficient, especially for fine details.
- Broad applications: The method applies to direct tasks (predicting an image’s pixel colors) and indirect tasks (where you only see transforms of the data, like CT projections or MRI Fourier samples).
- Clearer understanding: The paper explains why “positional encodings” work (like in NeRF and Transformers) by connecting them to how networks learn different frequencies. This theoretical link helps researchers design better encodings in the future.
In short, Fourier features act like giving the network a rich set of building blocks—many sine and cosine “notes” to mix—so it can play the full “song” of sharp details, not just the dull background hum.
Practical Applications
Summary
This paper shows that adding a simple “Fourier feature” mapping to input coordinates lets multilayer perceptrons (MLPs) learn high-frequency signals in low-dimensional domains (e.g., 1D–3D coordinates). The mapping turns the network’s effective neural tangent kernel (NTK) into a tunable, stationary kernel whose spectral bandwidth can be matched to the task. Empirically, this improves performance on image fitting, 3D shape representation, CT/MRI reconstruction from sparse measurements, and view synthesis (NeRF-like pipelines). The insights are actionable with minimal engineering: add a Gaussian random Fourier feature (RFF) layer to coordinate-based MLPs and tune only the frequency scale.
Below are concrete use cases, grouped by deployment horizon. For each, we include sector tags, likely tools/workflows, and feasibility assumptions.
Immediate Applications
The following can be deployed now using existing libraries (PyTorch, TensorFlow, JAX) by adding a Fourier feature (RFF) layer to coordinate-based MLPs and tuning its scale on validation data.
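As a sketch of what "adding an RFF layer" amounts to, here is a minimal untrained coordinate MLP in numpy with a fixed Gaussian feature layer in front. In practice this would be a PyTorch/TensorFlow/JAX module with trained weights; the class name and initialization details here are our own illustration:

```python
import numpy as np

class FourierFeatureMLP:
    """Minimal coordinate-based MLP with a fixed Gaussian RFF input layer (sketch only)."""

    def __init__(self, in_dim=2, out_dim=3, num_features=256,
                 width=256, depth=4, sigma=10.0, seed=0):
        rng = np.random.default_rng(seed)
        # The frequency matrix is sampled once and kept fixed (not trained).
        self.B = sigma * rng.standard_normal((num_features, in_dim))
        dims = [2 * num_features] + [width] * depth + [out_dim]
        # He-style initialization for the ReLU layers (untrained, for illustration).
        self.weights = [rng.standard_normal((a, b)) * np.sqrt(2.0 / a)
                        for a, b in zip(dims[:-1], dims[1:])]

    def __call__(self, coords):
        proj = 2 * np.pi * coords @ self.B.T
        h = np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)  # the RFF layer
        for W in self.weights[:-1]:
            h = np.maximum(h @ W, 0.0)          # ReLU hidden layers (standard 4x256 MLP)
        return h @ self.weights[-1]

model = FourierFeatureMLP()
rgb = model(np.random.default_rng(1).random((8, 2)))   # 8 pixel coordinates -> 8 RGB values
print(rgb.shape)                                        # (8, 3)
```

Only `sigma` typically needs tuning on validation data; the rest of the architecture stays as in existing coordinate-MLP pipelines.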
- Sector: Software, Graphics, AR/VR — Use case: Higher-fidelity novel view synthesis (NeRF pipelines)
  - What: Replace/augment positional encodings with Gaussian RFFs to improve detail and convergence for NeRF-like models (fixed views or simplified radiance fields).
  - Tools/workflows: Add a “Gaussian Fourier Features” layer before the MLP; tune frequency scale σ and feature count (e.g., ~256 features); keep MLP depth/width standard (e.g., 4×256 ReLU).
  - Assumptions/dependencies: Low-dimensional coordinates (3D positions and possibly view directions); standard NeRF training loop; GPU availability; careful σ tuning to avoid over/underfitting.
- Sector: Media, Gaming, CAD — Use case: 3D shape representation (occupancy/SDF) with sharper details
  - What: Use RFFs in occupancy networks/DeepSDF to resolve high-frequency geometry (thin structures, high-curvature surfaces).
  - Tools/workflows: Drop-in RFF layer on 3D coordinates; cross-entropy or regression objective as in existing pipelines.
  - Assumptions/dependencies: Sufficient surface samples near geometry; typical training compute; σ tuned to sampling density.
- Sector: Healthcare (Radiology) — Use case: Sparse-view 2D CT and 3D MRI reconstruction with better quality
  - What: Apply coordinate-based MLPs with RFFs and task-specific forward models (line integrals for CT; Fourier sampling for MRI) to recover images/volumes from undersampled data.
  - Tools/workflows: Incorporate RFFs; train with physics-informed losses against sparse sinograms (CT) or k-space (MRI); deploy as reconstruction post-processing or model-based iterative reconstruction (MBIR) component.
  - Assumptions/dependencies: Clinical-grade validation; integration with PACS/scanner pipelines; compliance and QA; handling of patient variability and noise; σ tuned to acquisition protocol and sampling pattern.
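A hedged sketch of what a physics-informed CT loss could look like: approximate each measured line integral with a Riemann sum over points sampled along the ray, evaluate the coordinate network at those points, and penalize the mismatch against the sparse sinogram. All names here (`projection_loss`, `net`) are illustrative, not from the paper:

```python
import numpy as np

def projection_loss(net, ray_points, measured, ray_spacing=1.0):
    """Sketch of a CT projection loss (hypothetical helper, assumed interfaces).

    net:        coordinate network mapping (n, 2) points -> (n,) densities
    ray_points: (num_rays, samples_per_ray, 2) points along each measurement line
    measured:   (num_rays,) measured line integrals from the sparse sinogram
    """
    n_rays, n_samples, _ = ray_points.shape
    densities = net(ray_points.reshape(-1, 2)).reshape(n_rays, n_samples)
    predicted = densities.sum(axis=1) * ray_spacing   # Riemann-sum line integral
    return float(np.mean((predicted - measured) ** 2))

# Toy check with a constant-density "network": integrating 1 over 10 unit steps gives 10.
const_net = lambda pts: np.ones(len(pts))
pts = np.zeros((4, 10, 2))                  # 4 rays, 10 samples each (positions unused here)
loss = projection_loss(const_net, pts, measured=np.full(4, 10.0))
print(loss)                                 # 0.0
```

In a real pipeline, `net` would be the RFF-augmented MLP and the loss would be minimized by gradient descent against the actual scanner measurements.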
- Sector: Robotics, Autonomous Systems — Use case: High-fidelity occupancy/SDF maps from sparse sensor data
  - What: Map environment with coordinate MLPs using RFFs to capture fine structures from sparse LiDAR/depth measurements for planning and collision avoidance.
  - Tools/workflows: Online training or batch updates; forward model for ray consistency; RFF scale set by sensor resolution and scene sparsity.
  - Assumptions/dependencies: Real-time or near-real-time optimization; compute budget on-robot; robustness to motion/lighting; careful scheduling to avoid overfitting transient noise.
- Sector: Remote Sensing, Geophysics — Use case: 2D/3D tomography with limited measurements (e.g., seismic, SAR)
  - What: Reconstruction with coordinate MLPs plus RFFs and appropriate forward operators to handle line-of-sight or Fourier-like sampling.
  - Tools/workflows: Plug-in forward models (e.g., Radon/Fourier approximations) in training loop; RFF layer with σ chosen via cross-validation.
  - Assumptions/dependencies: Accurate forward operator; manageable scale (regional volumes); compute availability.
- Sector: Computer Vision, Graphics — Use case: Procedural texture synthesis and image inpainting/representation
  - What: Fit and generate textures/images with higher-frequency content using coordinate MLPs and RFFs; improves sharpness over vanilla MLPs.
  - Tools/workflows: Image-as-function modeling (xy→RGB) with RFFs; train on partial pixels; export weights as compact asset.
  - Assumptions/dependencies: Inputs are low-dimensional coordinates; suited for continuous textures and compact storage.
- Sector: Software Engineering, ML Tooling — Use case: Standardized “Fourier features” layers in ML libraries
  - What: Provide ready-to-use RFF modules (Gaussian features, configurable σ and feature count) for coordinate-based models.
  - Tools/workflows: PyTorch/TensorFlow/JAX layer; simple API; default heuristics (e.g., set σ relative to sample spacing).
  - Assumptions/dependencies: Developer adoption; basic hyperparameter sweep support.
- Sector: Audio/Time-Series — Use case: 1D signal reconstruction and super-resolution from sparse samples
  - What: Represent time-dependent signals with coordinate MLPs using RFFs to better capture high-frequency components (e.g., audio resynthesis, sensor signal recovery).
  - Tools/workflows: 1D coordinate-to-amplitude mapping with RFFs; task-specific regularization; validation-driven σ choice.
  - Assumptions/dependencies: Stationarity or quasi-stationarity in target; mismatch handling for nonstationary signals.
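The 1D case is easy to demonstrate end to end: recovering a high-frequency sine from sparse samples fails with raw coordinates but succeeds with Gaussian RFF features, even with plain linear least squares standing in for the MLP (a deliberate simplification of the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((120, 1))                     # sparse 1D sample locations in [0, 1)
y = np.sin(2 * np.pi * 10 * x[:, 0])         # high-frequency target signal

def features(x, B):
    proj = 2 * np.pi * x @ B.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

# Raw coordinates: a linear model cannot track the oscillations at all.
A_raw = np.concatenate([x, np.ones_like(x)], axis=-1)
w_raw, *_ = np.linalg.lstsq(A_raw, y, rcond=None)
rmse_raw = np.sqrt(np.mean((A_raw @ w_raw - y) ** 2))

# Gaussian RFFs with sigma near the signal's frequency content fit it almost exactly.
B = 10.0 * rng.standard_normal((200, 1))
A_rff = features(x, B)
w_rff, *_ = np.linalg.lstsq(A_rff, y, rcond=None)
rmse_rff = np.sqrt(np.mean((A_rff @ w_rff - y) ** 2))

print(f"raw RMSE: {rmse_raw:.3f}   RFF RMSE: {rmse_rff:.2e}")
```

The raw-coordinate fit leaves most of the signal as residual, while the RFF fit drives the training error to near zero; a trained MLP plays the role of the linear solver in the real applications above.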
- Sector: Education, Academia — Use case: Teaching NTK, spectral bias, and kernel design via RFFs
  - What: Classroom labs demonstrating spectral bias and how Fourier features alter NTK bandwidth and learning dynamics.
  - Tools/workflows: JAX/Neural Tangents notebooks; adjustable σ; simple 1D/2D tasks (image fitting, CT toy examples).
  - Assumptions/dependencies: Basic GPU/Colab access; curriculum integration.
- Sector: Creative Tools — Use case: Content creation plugins for sharp neural textures and neural materials
  - What: Integrate RFF-augmented coordinate MLPs into DCC tools (e.g., Blender, Substance) for compact, continuous texture assets.
  - Tools/workflows: Export/import neural texture modules; UI controls for detail via σ; baking to raster maps as needed.
  - Assumptions/dependencies: Plugin ecosystem support; artist-facing parameterization that hides low-level details.
Long-Term Applications
These require additional research, scaling, hardware/software co-design, or regulatory approval before widespread deployment.
- Sector: Healthcare (Radiology) — Use case: Routine low-dose CT and fast MRI using learned reconstructions
  - Potential: Reduce radiation/exam time while maintaining diagnostic quality by leveraging RFF-augmented neural reconstructions.
  - Dependencies: Large-scale clinical trials; FDA/CE approvals; robust generalization; fail-safe and QA workflows; interpretability/uncertainty quantification.
- Sector: AR/VR, Mobile Systems — Use case: Real-time, on-device NeRF-style rendering with high-frequency detail
  - Potential: Live scene capture and rendering on AR glasses/phones; streaming neural fields with sharp textures.
  - Dependencies: Hardware acceleration for MLPs/RFFs; compact model distillation; dynamic scene handling; latency constraints.
- Sector: Industrial Digital Twins — Use case: Neural-field digital twins with fine-grained geometry and appearance
  - Potential: Persistent, editable digital twins for inspection, predictive maintenance, and training in complex facilities.
  - Dependencies: Continuous updates from heterogeneous sensors; multi-physics integration; versioning and provenance.
- Sector: Imaging Hardware — Use case: Sensor–algorithm co-design for compressive imaging
  - Potential: Optimize acquisition (sampling trajectories, masks) to match the NTK bandwidth induced by RFFs for better reconstruction-quality vs. dose/time trade-offs.
  - Dependencies: Joint optimization pipelines; hardware programmability; robustness to model mismatch/noise.
- Sector: Robotics, SLAM — Use case: Neural-field SLAM with high-frequency scene detail and real-time updates
  - Potential: More accurate maps and surfaces for manipulation, navigation, and inspection in cluttered environments.
  - Dependencies: Efficient online training/inference; loop-closure and drift correction; handling dynamics and semantics.
- Sector: Energy, Geoscience — Use case: High-fidelity seismic inversion and subsurface modeling
  - Potential: Recover fine-scale structures important for geothermal, carbon storage, and resource management.
  - Dependencies: Accurate forward models; scaled training over large volumes; uncertainty quantification.
- Sector: Finance — Use case: Neural-field modeling of low-dimensional financial surfaces (e.g., implied volatility, yield curves)
  - Potential: Capture sharp local structures in surfaces defined over few dimensions (maturity, strike, time).
  - Dependencies: Robustness to noise/regime shifts; compliance; risk controls; interpretability.
- Sector: Standards and Policy — Use case: Guidelines for learned reconstruction algorithms
  - Potential: Establish benchmarks and safety standards for algorithms that substitute for classical reconstruction in safety-critical domains (medical, remote sensing).
  - Dependencies: Multi-stakeholder consensus; reproducibility frameworks; audit and drift monitoring.
- Sector: AutoML for Coordinate Networks — Use case: Automated selection of Fourier feature scales and counts
  - Potential: Task-aware, data-aware hyperparameter search for σ and feature sparsity to balance bias/variance.
  - Dependencies: Meta-learning infrastructure; cross-task generalization; efficient validation strategies.
- Sector: Interoperability in Content Pipelines — Use case: Neural-field interchange formats
  - Potential: Standardized representations (e.g., USD extensions) that encapsulate RFF parameters and MLP weights for cross-tool portability.
  - Dependencies: Consortium support; runtime compatibility; versioning and security.
Cross-Cutting Assumptions and Dependencies
- Problem regime: Benefits are strongest when inputs are low-dimensional coordinates (1D–3D) and signals contain significant high-frequency content.
- Hyperparameters: The scale (standard deviation) of the Fourier feature distribution is critical; improper σ causes underfitting (too small) or aliasing/overfitting (too large). Empirically, the distribution’s shape matters less than its scale.
- Data density and noise: Choice of σ should reflect sampling density, sensor noise, and target bandwidth; indirect (physics-based) supervision requires accurate forward models.
- Compute and latency: Training coordinate-based MLPs still requires GPU/TPU resources; real-time deployments may need specialized acceleration or distillation.
- Safety and compliance: In regulated domains (e.g., medical), validation, monitoring, and interpretability requirements are substantial.
- Integration: Drop-in RFF layers are straightforward, but end-to-end performance depends on the rest of the pipeline (sampling strategies, forward models, losses, regularization).