- The paper introduces AnalogNAS-Bench, a benchmark designed to evaluate neural network architectures under Analog In-Memory Computing (AIMC) non-idealities such as noise and temporal drift.
- The benchmark reveals that standard quantization techniques are insufficient proxies for AIMC robustness, and that architectures with fewer 1×1 convolutions and more 3×3 convolutions, skip connections, and pooling demonstrate better resilience.
- AnalogNAS-Bench shows that AIMC-specific Neural Architecture Search methods outperform general NAS methods in finding architectures robust to analog constraints, highlighting the need for hardware-aware design.
AnalogNAS-Bench: A Dedicated NAS Benchmark for Analog In-Memory Computing
AnalogNAS-Bench is a comprehensive Neural Architecture Search (NAS) benchmark designed specifically for Analog In-Memory Computing (AIMC) platforms. AIMC offers significant energy and latency advantages for deep neural network (DNN) inference by performing matrix-vector multiplications directly within memory arrays. However, AIMC hardware introduces non-idealities—such as device-to-device variations, cycle-to-cycle noise, temporal drift, and limited precision—that are not addressed by conventional NAS benchmarks or digital-centric neural architectures. AnalogNAS-Bench addresses this gap by providing a systematic framework for evaluating and comparing neural architectures under AIMC-specific constraints, with a focus on robustness to analog noise and drift.
Benchmark Construction and Methodology
AnalogNAS-Bench extends the NAS-Bench-201 search space, which consists of 15,625 convolutional architectures represented as directed acyclic graphs (DAGs) with a fixed macro-structure and a cell-based design. Each cell comprises four nodes, with each edge assigned one of five operations: skip connection, zeroize, 3×3 convolution, 1×1 convolution, or 3×3 average pooling. This search space is well-suited for AIMC studies due to its architectural diversity and manageable size for exhaustive evaluation.
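The cell structure makes the search-space size easy to verify: a four-node DAG has six ordered edges, and assigning one of five operations to each edge yields 5^6 = 15,625 cells. A minimal enumeration sketch (operation names follow NAS-Bench-201 conventions; the helper is illustrative, not the benchmark's actual API):

```python
from itertools import product

# NAS-Bench-201 cell: 4 nodes -> 6 directed edges (i -> j for i < j),
# each edge labelled with one of 5 candidate operations.
OPS = ("skip_connect", "none", "conv_3x3", "conv_1x1", "avg_pool_3x3")
N_EDGES = 6  # edges (0,1), (0,2), (0,3), (1,2), (1,3), (2,3)

def all_cells():
    """Enumerate every operation assignment for the six cell edges."""
    return list(product(OPS, repeat=N_EDGES))

cells = all_cells()
print(len(cells))  # 5**6 = 15625, matching the benchmark's search-space size
```

This compact edge encoding is also what makes exhaustive evaluation feasible: every architecture is just a 6-tuple of operation labels.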
The benchmark evaluates each architecture under multiple conditions:
- Baseline Accuracy: Full-precision digital inference, serving as an upper bound.
- Noisy Accuracy: Inference on AIMC hardware without hardware-aware training (HWT), simulating analog noise using IBM’s AIHWKit.
- Analog Accuracy: Performance after HWT, where AIMC-specific noise is injected during training to improve robustness.
- Drift Metrics: Accuracy degradation over time (60s, 1h, 1d, 30d) under both noisy and analog conditions, quantifying resilience to temporal drift.
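The drift horizons above can be related to the power-law conductance decay commonly used to model PCM devices, G(t) = G(t₀)·(t/t₀)^(−ν). A minimal sketch with illustrative t₀ and ν values (not the benchmark's calibrated parameters):

```python
def drifted_conductance(g0, t, t0=20.0, nu=0.06):
    """Power-law conductance drift commonly assumed for PCM devices:
    G(t) = G(t0) * (t / t0)^(-nu). t0 (seconds) and nu are illustrative."""
    return g0 * (t / t0) ** (-nu)

g0 = 25.0  # initial programmed conductance, illustrative units
for label, t in [("60s", 60.0), ("1h", 3600.0), ("1d", 86400.0), ("30d", 2592000.0)]:
    # Conductance decays slowly but steadily across the benchmark's horizons.
    print(label, drifted_conductance(g0, t))
```

The monotone decay at 60 s, 1 h, 1 d, and 30 d is why the benchmark reports accuracy at exactly these checkpoints.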
All experiments are conducted on CIFAR-10, with ongoing extensions to CIFAR-100 and ImageNet16-120. The training pipeline leverages distributed PyTorch, SLURM scheduling, and standard data augmentation, with hardware simulation parameters reflecting realistic PCM device characteristics.
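As a rough intuition for the "noisy accuracy" setting, analog non-idealities are often approximated as multiplicative noise on the programmed weights during a matrix-vector product. A toy sketch (the noise scale and helper are illustrative stand-ins, not AIHWKit's noise model):

```python
import numpy as np

def noisy_matvec(W, x, rel_noise=0.05, rng=None):
    """Matrix-vector product with multiplicative Gaussian weight noise,
    a crude stand-in for AIMC conductance variations (rel_noise is illustrative)."""
    rng = rng or np.random.default_rng(0)
    W_noisy = W * (1.0 + rel_noise * rng.standard_normal(W.shape))
    return W_noisy @ x

W = np.eye(3)
x = np.array([1.0, 2.0, 3.0])
y = noisy_matvec(W, x)
print(y)  # close to x, but perturbed by ~5% weight noise
```

Hardware-aware training injects perturbations of this kind during the forward pass at training time, which is what distinguishes the "analog accuracy" condition from the "noisy accuracy" one.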
Key Empirical Findings
The systematic analysis of AnalogNAS-Bench yields several notable insights:
- Quantization Techniques Are Insufficient: Standard post-training quantization (PTQ) and quantization-aware training (QAT) do not capture AIMC-specific noise. Correlations between quantized and noisy accuracies are weak (Kendall’s τ ≈ 0.33–0.34), indicating that quantization robustness does not imply AIMC robustness.
- Architectural Robustness Is Structure-Dependent: Robust architectures under AIMC noise are characterized by:
  - Reduced reliance on 1×1 convolutions (9.5% in robust vs. 26.1% in non-robust), as these operations utilize less of the crossbar and are more noise-sensitive.
  - Increased use of 3×3 convolutions (32.2%), which leverage larger crossbar regions and provide better noise averaging.
  - Higher occurrence of skip connections and average pooling, which duplicate and smooth features, mitigating noise propagation.
- Sequential Patterns Matter: The order and positioning of operations, not just their counts, are critical. Robust pathways frequently combine 3×3 convolutions with skip connections or pooling, while non-robust pathways often involve 1×1 convolutions in key positions.
- HWT Significantly Improves Robustness, But Not Universally: Most architectures benefit from HWT, with mean analog accuracy rising to 81.3% (median 85.5%). However, some architectures remain highly susceptible to analog noise, particularly those dominated by 1×1 convolutions and non-learnable operations.
- Temporal Drift Robustness: Robust architectures maintain high accuracy over time by emphasizing 3×3 convolutions and skip connections. Non-robust architectures, in contrast, rely more on 1×1 convolutions and non-learnable operations, which are vulnerable to drift.
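The weak rank correlation reported for quantized versus noisy accuracies can be reproduced on any paired accuracy lists with Kendall's τ. A self-contained tau-a sketch on made-up numbers (not the paper's data):

```python
def kendall_tau(x, y):
    """Naive O(n^2) Kendall's tau-a: (concordant - discordant) / total pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Illustrative accuracies (not benchmark data): quantized vs. AIMC-noisy
quantized = [91.2, 90.5, 89.8, 92.0, 88.7]
noisy = [70.1, 74.3, 69.9, 71.5, 60.2]
print(kendall_tau(quantized, noisy))
```

A τ near 0.33, as reported in the benchmark, means quantized rankings predict noisy rankings only slightly better than chance, which is the quantitative basis for the "quantization is insufficient" finding.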
Benchmarking NAS Methodologies
AnalogNAS-Bench enables a direct comparison of general and AIMC-specific NAS methods. Results indicate:
- AIMC-specific NAS methods (e.g., AnalogNAS, NAS4RRAM, GA for IMC AI hardware) achieve the highest 1-day analog accuracy (≈90.0%), closely matching the optimal architecture found via exhaustive search.
- General NAS methods (Random Search, Evolutionary Algorithms) are competitive in baseline accuracy but exhibit higher accuracy variation over one month (AVM), indicating less stability under analog constraints.
- Bayesian Optimization and BANANAS underperform in analog objectives, highlighting the need for analog-aware surrogate modeling and search strategies.
These results underscore the necessity of analog-aware NAS methodologies for robust AIMC deployment.
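Because the benchmark is tabular, search methods reduce to querying precomputed scores for edge-encoded cells. A toy evolutionary search, with a hand-written surrogate score standing in for the 1-day analog accuracy lookup (everything here is illustrative, not any of the cited methods):

```python
import random

OPS = ("skip_connect", "none", "conv_3x3", "conv_1x1", "avg_pool_3x3")
N_EDGES = 6

def toy_score(cell):
    """Stand-in for a tabular lookup of 1-day analog accuracy.
    Rewards 3x3 convs and skip connections, penalizes 1x1 convs,
    mimicking the robustness trends reported above (illustrative only)."""
    return (2.0 * cell.count("conv_3x3")
            + 1.0 * cell.count("skip_connect")
            - 1.5 * cell.count("conv_1x1"))

def evolutionary_search(generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [tuple(rng.choice(OPS) for _ in range(N_EDGES)) for _ in range(pop_size)]
    start_best = max(pop, key=toy_score)
    for _ in range(generations):
        parent = max(rng.sample(pop, 3), key=toy_score)   # tournament selection
        child = list(parent)
        child[rng.randrange(N_EDGES)] = rng.choice(OPS)   # single-edge mutation
        pop.append(tuple(child))
        pop.remove(min(pop, key=toy_score))               # drop the weakest
    return start_best, max(pop, key=toy_score)

start, best = evolutionary_search()
print(toy_score(start), "->", toy_score(best))
```

An analog-aware method differs from this sketch mainly in the objective: it queries the analog-accuracy and drift columns of the table rather than the digital baseline, which is precisely why general NAS methods lag on AVM.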
Practical and Theoretical Implications
AnalogNAS-Bench provides a standardized, reproducible platform for evaluating neural architectures under realistic AIMC constraints. Its insights have several implications:
- For hardware-aware DNN design: The benchmark offers concrete guidelines for constructing AIMC-robust architectures, favoring wider, branched topologies with skip connections and larger convolutions.
- For NAS research: AnalogNAS-Bench enables the development and fair comparison of analog-aware NAS algorithms, facilitating progress in hardware-software co-design for emerging analog accelerators.
- For hardware-software co-optimization: The findings motivate the integration of architectural and hardware parameter search, especially as transformer-based models and heterogeneous analog-digital systems become more prevalent.
Limitations and Future Directions
The current benchmark is limited to convolutional architectures and a single hardware configuration (PCM-based). Ongoing work aims to:
- Expand to more complex datasets (CIFAR-100, ImageNet16-120) to validate generalization.
- Incorporate transformer-based architectures, which are increasingly relevant for AIMC.
- Extend hardware modeling to include RRAM, FeRAM, ECRAM, and heterogeneous analog-digital systems.
- Develop zero-cost estimators for rapid robustness prediction, enabling exploration of larger search spaces.
Conclusion
AnalogNAS-Bench fills a critical gap in NAS benchmarking by explicitly modeling AIMC non-idealities and providing actionable insights into architectural robustness. Its public availability and extensible design are expected to catalyze further research in analog-aware NAS, robust DNN design, and hardware-software co-optimization for next-generation AI accelerators.