NeRFlex: Real-Time Mobile Neural Rendering
- NeRFlex is a resource-aware real-time rendering framework that decomposes complex 3D scenes and optimally balances memory usage with visual quality.
- It integrates multi-NeRF scene decomposition with a domain-informed profiler to predict storage and SSIM tradeoffs for enhanced visual fidelity.
- Leveraging a pseudo-polynomial-time dynamic programming solution to the NP-hard Multiple-Choice Knapsack (MCK) problem, NeRFlex ensures interactive performance and strict memory compliance on mobile platforms.
NeRFlex is a resource-aware real-time rendering framework that targets interactive, high-fidelity synthesis of complex 3D scenes on mobile devices, fundamentally re-architecting Neural Radiance Fields (NeRF) methods to address memory and computational constraints. NeRFlex integrates a multi-NeRF scene decomposition, a domain-informed profiler for memory/quality tradeoffs, and a dynamic programming optimization over the NP-hard Multiple-Choice Knapsack (MCK) configuration problem. Its design enables real-time rendering at quality levels previously unattainable on commercial mobile platforms, robustly adhering to strict device storage budgets while leveraging a principled treatment of visual frequency and resource consumption (Wang et al., 4 Apr 2025).
1. Multi-NeRF Scene Decomposition
NeRFlex decomposes complex scenes into multiple sub-scenes, assigning each a dedicated NeRF network based on localized visual detail. The segmentation module operates by:
- Performing object detection across all input images, yielding per-object binary masks.
- Computing a 2D detail frequency $f_o^{(v)}$ for each object $o$ in every training view $v$ (using, e.g., Laplacian responses), and recording the maximum frequency per object, $f_o = \max_v f_o^{(v)}$.
- Objects with $f_o > \tau$ (for a detail-frequency threshold $\tau$) are classified as high-detail and assigned individual NeRFs; all others are grouped together with a shared NeRF.
To facilitate efficient representation learning for high-detail objects, NeRFlex applies interpolation scaling: each selected object is cropped from training images and rescaled so that it fills the full input resolution, alleviating the NeRF’s need to capture high-frequency details at small scales.
Table: Multi-NeRF Scene Decomposition Workflow
| Step | Operation | Outcome |
|---|---|---|
| Object Detection | Binary mask generation per candidate object | Isolated object masks in all images |
| Detail Frequency | Compute $f_o^{(v)}$ per view, record $f_o = \max_v f_o^{(v)}$ | Quantitative detail measure for segmentation |
| Thresholding | Classify by $f_o > \tau$ | High/low-detail object distinction |
| Interpolation Scaling | Crop and rescale selected objects per image | High-detail object NeRFs optimized |
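The workflow above can be sketched in code. This is a minimal illustration assuming a 3×3 Laplacian kernel as the 2D detail-frequency measure and a scalar threshold `tau`; the function names (`laplacian_response`, `classify_objects`) are hypothetical, not the paper's API:

```python
import numpy as np

def laplacian_response(gray):
    """Mean absolute response of a 3x3 Laplacian kernel,
    used here as a simple proxy for 2D detail frequency."""
    k = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * gray[i:i + h - 2, j:j + w - 2]
    return float(np.mean(np.abs(out)))

def classify_objects(views_per_object, tau):
    """views_per_object: {object_id: [2D grayscale crops, one per training view]}.
    Thresholds the per-object max response f_o and returns (high_detail, low_detail) ids."""
    high, low = [], []
    for obj_id, crops in views_per_object.items():
        f_o = max(laplacian_response(c) for c in crops)  # max detail frequency over views
        (high if f_o > tau else low).append(obj_id)
    return high, low
```

Each high-detail object would then be cropped and rescaled (e.g., bilinearly) to the full input resolution before training its dedicated NeRF, per the interpolation-scaling step.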
2. Lightweight Profiler and Modeling Memory-Quality Tradeoffs
NeRFlex incorporates a lightweight profiler for each NeRF representation to estimate the tradeoff between memory usage and visual quality. It exposes two primary configuration parameters per NeRF:
- Geometry grid resolution $g$ (per-axis voxel grid size, $g^3$ total voxels)
- Texture patch size $t$ (a $t \times t$ patch per mesh face, $t^2$ texels per face)
The profiler fits white-box polynomial models $\hat{S}(g, t)$ for predicted data storage and $\hat{Q}(g, t)$ for quality (measured via SSIM), whose coefficients are constants fitted empirically from a small grid of sampled configurations. The profiler achieves mean prediction errors of approximately 0.0065 in SSIM and $3.34$ MB in storage, ensuring reliable input to subsequent optimization.
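As a concrete illustration of fitting such white-box models, the sketch below runs least squares over a handful of sampled configurations. The specific functional forms (cubic in $g$ for storage, logarithmic for SSIM) and the function names are assumptions for illustration, not the paper's exact models:

```python
import numpy as np

def fit_profiler(samples):
    """samples: list of (g, t, storage_mb, ssim) measured on a small config grid.
    Fits S(g,t) ~ a0 + a1*g^3 + a2*t^2 and Q(g,t) ~ b0 + b1*log g + b2*log t
    by least squares; returns (predict_storage, predict_ssim) callables."""
    g = np.array([s[0] for s in samples], dtype=float)
    t = np.array([s[1] for s in samples], dtype=float)
    S = np.array([s[2] for s in samples], dtype=float)
    Q = np.array([s[3] for s in samples], dtype=float)
    Xs = np.column_stack([np.ones_like(g), g ** 3, t ** 2])   # storage design matrix
    Xq = np.column_stack([np.ones_like(g), np.log(g), np.log(t)])  # quality design matrix
    a, *_ = np.linalg.lstsq(Xs, S, rcond=None)
    b, *_ = np.linalg.lstsq(Xq, Q, rcond=None)
    predict_storage = lambda gg, tt: a[0] + a[1] * gg ** 3 + a[2] * tt ** 2
    predict_ssim = lambda gg, tt: b[0] + b[1] * np.log(gg) + b[2] * np.log(tt)
    return predict_storage, predict_ssim
```

The fitted predictors then supply the per-configuration $(s_{ij}, q_{ij})$ pairs consumed by the knapsack optimization in the next section.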
3. Formal Resource-Aware Configuration as MCK
For each segmented object $i$ ($i = 1, \dots, N$), NeRFlex selects a configuration from a discrete candidate set $C_i$. Formally, the allocation of NeRF parameters across all sub-scenes is cast as a Multiple-Choice Knapsack (MCK) problem:
- Objective: Maximize aggregate visual quality over all objects:
$$\max \sum_{i=1}^{N} \sum_{j \in C_i} q_{ij} \, x_{ij}$$
where $x_{ij} \in \{0, 1\}$ indicates that object $i$ uses configuration $j$, and $q_{ij}$ is that configuration's profiled SSIM.
- Memory Constraint: Aggregate storage $s_{ij}$ must not exceed the device budget $M$:
$$\sum_{i=1}^{N} \sum_{j \in C_i} s_{ij} \, x_{ij} \le M$$
- Uniqueness Constraint: Each object receives exactly one configuration:
$$\sum_{j \in C_i} x_{ij} = 1 \quad \text{for all } i$$
Solving this NP-hard problem robustly is central to NeRFlex's guarantees of memory efficiency and visual fidelity.
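To make the objective and constraints concrete, a toy instance can be solved exhaustively. The solver below is illustrative only (the sizes and SSIM values are hypothetical, and exhaustive search is only viable for tiny instances):

```python
from itertools import product

def mck_brute_force(candidates, budget):
    """candidates[i] = list of (size, quality) options for object i.
    Exhaustively maximizes total quality subject to total size <= budget,
    choosing exactly one option per object. Returns (best_quality, choices) or None."""
    best = None
    for choice in product(*[range(len(c)) for c in candidates]):
        size = sum(candidates[i][j][0] for i, j in enumerate(choice))
        qual = sum(candidates[i][j][1] for i, j in enumerate(choice))
        if size <= budget and (best is None or qual > best[0]):
            best = (qual, list(choice))
    return best
```

For example, with two objects offering (10 MB, 0.80)/(20 MB, 0.90) and (5 MB, 0.70)/(15 MB, 0.85) under a 30 MB budget, the optimum spends the budget on the second object's larger configuration.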
4. Dynamic Programming Solution to Configuration Selection
To solve the NP-hard MCK in practice, NeRFlex employs a pseudo-polynomial-time dynamic programming (DP) algorithm. For $N$ objects and a device memory budget $M$ (discretized, e.g., in MB):
- DP state: $T[i][m]$ = maximum sum-SSIM achievable with the first $i$ objects at total size not exceeding $m$.
- Recurrence: For each $i = 1, \dots, N$ and $m = 0, \dots, M$:
$$T[i][m] = \max_{j \in C_i,\; s_{ij} \le m} \left( T[i-1][m - s_{ij}] + q_{ij} \right)$$
- Initialization: $T[0][m] = 0$ for all $m$.
Additional pruning is performed by precomputing the minimum storage still required by the remaining objects, excluding any configuration whose size would preclude completing a full solution within the budget. The backtracking phase then retrieves the optimal configuration for each object.
The algorithm exhibits complexity $O(N \cdot K \cdot M)$, where $K$ bounds the number of candidate configurations per object; this is pseudo-polynomial in $M$ and tractable for realistic settings (tens to hundreds of MB of memory budget).
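A minimal sketch of this DP, assuming integer-MB sizes and hypothetical function names; the suffix-size pruning described above is omitted for brevity:

```python
def mck_dp(candidates, budget):
    """candidates[i] = list of (size_mb, ssim) options for object i (integer MB sizes).
    Fills T[i][m] = best sum-SSIM over the first i objects within size m,
    then backtracks. Returns (best_ssim, per-object choice indices) or None."""
    n = len(candidates)
    NEG = float("-inf")
    T = [[NEG] * (budget + 1) for _ in range(n + 1)]
    choice = [[-1] * (budget + 1) for _ in range(n + 1)]
    for m in range(budget + 1):
        T[0][m] = 0.0  # no objects placed yet
    for i in range(1, n + 1):
        for m in range(budget + 1):
            for j, (s, q) in enumerate(candidates[i - 1]):
                if s <= m and T[i - 1][m - s] != NEG and T[i - 1][m - s] + q > T[i][m]:
                    T[i][m] = T[i - 1][m - s] + q
                    choice[i][m] = j
    if T[n][budget] == NEG:
        return None  # no feasible assignment within the budget
    # Backtrack to recover the optimal configuration per object.
    picks, m = [], budget
    for i in range(n, 0, -1):
        j = choice[i][m]
        picks.append(j)
        m -= candidates[i - 1][j][0]
    picks.reverse()
    return T[n][budget], picks
```

On the toy instance from Section 3 (two objects, 30 MB budget), this reproduces the exhaustive optimum while scaling to realistic object counts and budgets.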
5. Experimental Evaluation on Mobile Platforms
NeRFlex demonstrates strong empirical performance on commercial hardware, evaluated on both real and synthetic datasets.
- Devices: iPhone 13 and Google Pixel 4, each evaluated under a strict per-device storage budget (150 MB or 240 MB; see the data-size comparison below)
- Quality Metrics: PSNR, SSIM, LPIPS (higher is better for PSNR, SSIM; lower is better for LPIPS)
| Method | PSNR↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|
| MipNeRF 360 | 26.55 | 0.815 | 0.183 |
| NGP (Instant NGP) | 27.21 | 0.851 | 0.136 |
| MobileNeRF | 26.03 | 0.785 | 0.207 |
| NeRFlex | 27.65 | 0.886 | 0.114 |
- Data-Size vs Quality: Block-NeRF (≥400 MB, infeasible); Single NeRF (~250 MB, SSIM ≈0.84–0.88, may fail); NeRFlex consistently meets its budgets (150 or 240 MB), attaining SSIM ≈0.90+, matching Block-NeRF quality.
- Frame Rate: Scene 3, 360° pan at 7.5 s/turn: iPhone 13 ≈ 35 FPS; Pixel 4 ≈ 25 FPS (2× Single NeRF); Block-NeRF unable to load; Single NeRF sometimes stalls at 0 FPS when exceeding budget.
- Cloud Processing Overhead (per 20 images): ~3.8 s for segmentation/interpolation, ~0.28 s for profiler, ~1.9 s for DP selection; total ~5.9 s (one-time).
6. Synthesis and Formal Innovation
NeRFlex unifies domain-driven multi-NeRF decomposition for high-frequency detail management, high-accuracy polynomial profiling of resource/quality tradeoffs, an explicit combinatorial optimization framework grounded in MCK, and a practical DP solver tailored for mobile constraints. This synthesis achieves interactive rates and stringent memory compliance, substantially advancing the deployability and quality of real-time neural rendering on commercial mobile devices (Wang et al., 4 Apr 2025).