
Composite Neural Architecture Search

Updated 14 December 2025
  • Composite NAS is a framework that co-optimizes diverse search spaces—from neural architecture to hardware mappings—to achieve robust Pareto trade-offs.
  • It employs strategies like reinforcement learning, evolutionary algorithms, and probabilistic methods to efficiently navigate exponentially large design spaces.
  • Composite NAS enables rapid convergence and efficient resource allocation by integrating multi-objective optimization with surrogate evaluation models.

Composite Neural Architecture Search (NAS) refers to frameworks and algorithms that jointly optimize across multiple interdependent search spaces—for example, neural network architecture, hardware mapping, module configurations, model complexity, or multimodal fusion—rather than seeking architectures within a single domain or parameterization. Unlike conventional NAS, which operates on a fixed space and objective function, composite NAS expands design freedom by co-exploring distinct parameter sets and employing multi-objective or conditional evaluation protocols. This paradigm encompasses discrete combinatorial and probabilistic search, reinforcement learning, evolutionary optimization, and recent LLM-driven generation pipelines. The composite approach enables superior Pareto trade-offs, efficient resource allocation, and robust adaptation to heterogeneous data and hardware constraints.

1. Joint Search Spaces and Problem Formalization

Composite NAS is characterized by search spaces that are a Cartesian product of multiple structural domains. For instance, hardware/software co-exploration instantiates the search as $S = S_\text{arch} \times S_\text{hw}$, where $S_\text{arch}$ is parameterized by layer counts, filter widths $w_i$, kernel sizes $k_i$, strides $s_i$, and expansion ratios $e_i$, and $S_\text{hw}$ is parameterized by layer-to-pipeline partitions, mappings to hardware units, and underlying FPGA attributes (Jiang et al., 2019). In multi-source RL state encoding, the search space is similarly formulated as $\mathcal{S} = \mathcal{A}_1 \times \dots \times \mathcal{A}_M \times \mathcal{F}$, seeking optimal source-specific modules $a_i \in \mathcal{A}_i$ and a fusion module $f \in \mathcal{F}$ (Yu et al., 7 Dec 2025).

The general optimization objective can be posed as:

$$A^* = \operatorname{arg\,max}_{(a_1, \dots, a_M, f) \in \mathcal{S}} \mathcal{R}\left( \pi_{(a_1, \dots, a_M, f)} \right),$$

where $\mathcal{R}$ denotes the downstream reward, test accuracy, or hardware efficiency, as appropriate for the application. Many composite NAS settings further require scalarization of multi-objective rewards, e.g.,

$$R(\theta, P, \alpha) = \beta \cdot A(\theta) + (1-\beta) \cdot U(\theta, P, \alpha), \quad \beta \in [0,1],$$

trading off between predictive and resource metrics (Jiang et al., 2019).
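The effect of the scalarization above can be sketched in a few lines. This is a minimal, self-contained illustration, assuming toy stand-ins for the accuracy term $A(\theta)$ and the utilization term $U(\theta, P, \alpha)$; the candidate values are invented for the example.

```python
# Minimal sketch of the scalarized composite-NAS reward
# R = beta * A + (1 - beta) * U, with toy stand-ins for the
# accuracy and hardware-utilization terms (values are illustrative).

def scalarized_reward(accuracy: float, utilization: float, beta: float) -> float:
    """Trade off predictive accuracy against hardware utilization."""
    assert 0.0 <= beta <= 1.0
    return beta * accuracy + (1.0 - beta) * utilization

# Sweeping beta traces out different points on the accuracy/efficiency
# trade-off; beta = 1 recovers accuracy-only NAS.
candidates = [
    {"name": "wide", "accuracy": 0.88, "utilization": 0.40},
    {"name": "narrow", "accuracy": 0.82, "utilization": 0.75},
]
for beta in (0.3, 0.9):
    best = max(candidates,
               key=lambda c: scalarized_reward(c["accuracy"], c["utilization"], beta))
    print(beta, best["name"])  # low beta favors the efficient model
```

Sweeping $\beta$ in this way is how a single search run can be steered toward different regions of the Pareto frontier.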

2. Algorithmic Strategies and Search Protocols

Composite NAS frameworks employ a range of algorithmic strategies tailored to heterogeneous search spaces. Hardware/software co-exploration utilizes a two-level protocol alternating Fast Exploration (FE), which prunes and fine-tunes hardware parameters without network training, and Slow Exploration (SE), which fully trains candidate networks to update the controller via RL (REINFORCE) (Jiang et al., 2019). FE applies fast performance models (e.g., BLAST) to optimize hardware mapping and utilization, dramatically reducing sample cost.
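The FE/SE alternation can be sketched schematically. Everything below is an illustrative mock, not the paper's code: `fast_hw_model` stands in for a cheap analytical performance model, and `slow_train` for full candidate training.

```python
import random

# Sketch of the two-level FE/SE protocol: Fast Exploration prunes
# hardware mappings with a cheap analytical model (no training), and
# Slow Exploration fully trains the survivor to produce the RL reward.
# All functions here are illustrative placeholders.

random.seed(0)

def fast_hw_model(arch, mapping):
    """Cheap analytical utilization estimate (stand-in for a performance model)."""
    return 1.0 / (1.0 + abs(arch["layers"] - mapping["pipeline_stages"]))

def slow_train(arch):
    """Expensive full training, returning a mock accuracy."""
    return 0.7 + 0.01 * arch["layers"] + random.uniform(-0.02, 0.02)

def fe_se_step(arch, mappings, beta=0.5):
    # FE: keep only the best hardware mapping, without any training.
    best_map = max(mappings, key=lambda m: fast_hw_model(arch, m))
    # SE: train the surviving (arch, mapping) pair for the true reward.
    acc = slow_train(arch)
    util = fast_hw_model(arch, best_map)
    return beta * acc + (1 - beta) * util, best_map

arch = {"layers": 4}
mappings = [{"pipeline_stages": s} for s in (2, 4, 8)]
reward, chosen = fe_se_step(arch, mappings)
print(chosen, round(reward, 3))
```

The key point is that the expensive inner call (`slow_train`) runs once per architecture, while the cheap hardware sweep runs over all mappings, which is what cuts the sample cost.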

Evolutionary approaches, such as cellular encoding (CE) based XC-NAS, represent macro-architectures via genotypes (rooted ordered trees of operators), mapping these to multi-path CNNs whose depth and width are dynamically adjusted and evolved (Londt et al., 2023). The evolutionary loop includes tournament selection, crossover, mutation, and a surrogate model scheme to accelerate candidate evaluation.
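The evolutionary loop described above can be condensed into a toy example. The genotype here is a flat operator list rather than a rooted ordered tree, and the fitness function is a synthetic stand-in for (surrogate-accelerated) evaluation; both are simplifying assumptions for illustration only.

```python
import random

# Toy evolutionary loop in the spirit of genotype-based macro search:
# tournament selection, one-point crossover, and point mutation over a
# list-encoded genotype. Encoding and fitness are illustrative only.

random.seed(1)
OPS = ["conv3", "conv5", "pool", "skip"]

def fitness(genotype):
    # Stand-in for (surrogate-accelerated) candidate evaluation.
    return genotype.count("conv3") + 0.5 * genotype.count("skip")

def tournament(pop, k=3):
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(g, rate=0.2):
    return [random.choice(OPS) if random.random() < rate else op for op in g]

pop = [[random.choice(OPS) for _ in range(6)] for _ in range(20)]
for _ in range(30):  # generations
    pop = [mutate(crossover(tournament(pop), tournament(pop)))
           for _ in range(20)]
best = max(pop, key=fitness)
print(best, fitness(best))
```

In a real CE-based system the crossover and mutation operators act on subtrees of the genotype tree, and the surrogate model replaces `fitness` for most evaluations.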

Probabilistic and combinatorial methods often decompose the search into block-wise decisions, enabling tractable exploration. CMAB-NAS frames the problem as a combinatorial multi-armed bandit, applying nested Monte-Carlo search (NMCS) and UCB-guided sampling to efficiently traverse an exponentially-sized space via local reward decomposition (Huang et al., 2021). In importance-sampling composite NAS, multiple parametric distributions θ(n)\theta^{(n)} over architectures are optimized in parallel, each corresponding to a different complexity regime; expectations are estimated via mixture-model sampling, and natural-gradient updates enforce efficiency across all targets within a single run (Noda et al., 2022).
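The block-wise bandit idea can be made concrete with a small UCB example: each node is a local bandit over candidate operations, and the reward is assumed to decompose additively across nodes. The operation set and reward values below are synthetic.

```python
import math, random

# Sketch of UCB-guided block-wise sampling: each node (block) is a
# local multi-armed bandit over candidate operations, exploiting the
# local reward decomposition. Rewards here are synthetic.

random.seed(2)
OPS = ["conv3", "conv5", "pool", "identity"]
TRUE_REWARD = {"conv3": 0.9, "conv5": 0.7, "pool": 0.4, "identity": 0.5}

class NodeBandit:
    def __init__(self):
        self.counts = {op: 0 for op in OPS}
        self.values = {op: 0.0 for op in OPS}

    def select(self, t):
        for op in OPS:                      # play each arm once first
            if self.counts[op] == 0:
                return op
        return max(OPS, key=lambda op: self.values[op]
                   + math.sqrt(2 * math.log(t) / self.counts[op]))

    def update(self, op, reward):
        self.counts[op] += 1
        self.values[op] += (reward - self.values[op]) / self.counts[op]

nodes = [NodeBandit() for _ in range(3)]
for t in range(1, 401):
    choice = [b.select(t) for b in nodes]
    for b, op in zip(nodes, choice):        # local reward decomposition
        b.update(op, TRUE_REWARD[op] + random.gauss(0, 0.05))

print([max(OPS, key=lambda op: b.values[op]) for b in nodes])
```

Because each node's reward signal is attributed locally, the per-node bandits converge independently instead of having to search the full product space.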

LLM-driven frameworks (LACER) introduce prompt-based candidate generation, leveraging architectural priors and side-information from module outputs (mutual information and redundancy) to bias search toward promising composite encoders for RL agents (Yu et al., 7 Dec 2025). Feedback mechanisms incorporate both intermediate representation quality and task-level rewards.
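The propose-evaluate-feedback loop can be mocked in a few lines. The "LLM" below is a random stub and the reward function is synthetic; the names and interfaces are hypothetical and do not reflect the actual LACER pipeline.

```python
import random

# Mock of a prompt-and-feedback candidate-generation loop in the style
# of LLM-driven NAS: the "LLM" is a random stub biased by history, and
# the task reward is synthetic. All names are illustrative.

random.seed(4)
ENCODERS, FUSIONS = ["cnn", "mlp", "attn"], ["concat", "gated"]

def mock_llm_propose(history):
    """Stand-in for prompting an LLM with past candidates + feedback."""
    if history and random.random() < 0.7:  # exploit the best seen so far
        return dict(max(history, key=lambda h: h["reward"])["cand"])
    return {"enc": random.choice(ENCODERS), "fusion": random.choice(FUSIONS)}

def evaluate(cand):
    """Synthetic task reward standing in for RL training + side metrics."""
    reward = {"cnn": 0.6, "mlp": 0.4, "attn": 0.8}[cand["enc"]]
    reward += {"concat": 0.0, "gated": 0.1}[cand["fusion"]]
    return reward + random.uniform(-0.05, 0.05)

history = []
for _ in range(40):
    cand = mock_llm_propose(history)
    history.append({"cand": cand, "reward": evaluate(cand)})

best = max(history, key=lambda h: h["reward"])
print(best["cand"])
```

In the real framework, the feedback passed back into the prompt also includes representation-quality side metrics (mutual information, redundancy), not just the scalar reward used here.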

3. Objective Functions and Multi-Objective Scalarization

Composite NAS must address the interplay between multiple, often conflicting, objectives. Hardware/software co-exploration explicitly trades accuracy $A(\theta)$ against hardware utilization $U(\theta, P, \alpha)$, with $\beta$ controlling emphasis; sequential optimization of architecture then hardware is shown to be suboptimal compared to fully joint optimization (Jiang et al., 2019). Complexity-aware NAS objectives combine cross-entropy or reward losses with regularization penalties $R(M)$ such as parameter count or latency, aggregated via

$$F(M, W) = L(M, W; D) + \epsilon R(M),$$

and optimized as the expected value over the parametric distribution $P_\theta$ (Noda et al., 2022).
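The role of $\epsilon$ mirrors that of $\beta$ in the scalarized reward: each value targets a different complexity regime. A minimal sketch, assuming invented loss and parameter-count figures:

```python
# Sketch of the complexity-regularized objective F(M, W) = L + eps * R(M),
# where R(M) is a resource penalty (here, parameter count in millions).
# Loss values and epsilon settings are illustrative, not from the paper.

def composite_objective(loss: float, params_m: float, eps: float) -> float:
    return loss + eps * params_m

models = {"small": (0.45, 1.2), "medium": (0.38, 3.5), "large": (0.35, 9.0)}

# Different eps values target different complexity regimes, mirroring
# the multi-distribution search over several targets in one run.
for eps in (0.0, 0.02):
    best = min(models, key=lambda m: composite_objective(*models[m], eps))
    print(eps, best)  # eps = 0 picks the most accurate (largest) model
```

The single-run multi-target method amounts to maintaining one distribution $\theta^{(n)}$ per such regime and sharing evaluations between them via importance sampling.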

In multi-source RL encoding, the composite reward is formalized as cumulative discounted task performance $\mathcal{R}(\pi)$, with side-information (mutual information $I(\cdot\,;\cdot)$ and redundancy $R(\cdot\,;\cdot)$) serving as auxiliary metrics for module quality (Yu et al., 7 Dec 2025). Intermediate-output signals are used both for candidate ranking and as feedback to the LLM when generating new proposals.
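As a concrete illustration of such a side metric, a discrete mutual-information estimate can be computed directly from co-occurrence counts. A real pipeline would estimate MI on continuous features (e.g., via binning or a neural estimator); this toy version assumes already-discretized symbols.

```python
import math
from collections import Counter

# Illustrative discrete mutual-information estimate I(X; Y), the kind
# of side metric used to score module representation quality.

def mutual_information(xs, ys):
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# A representation that copies the label is maximally informative;
# a constant one carries no information.
labels = [0, 1, 0, 1, 0, 1, 0, 1]
print(mutual_information(labels, labels))   # 1.0 bit
print(mutual_information(labels, [0] * 8))  # 0.0 bits
```

High MI with the task-relevant signal and low redundancy between modules is what makes a candidate encoder look promising before any full RL training.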

4. Controller Architectures and Search Decomposition

Controllers for composite NAS must accommodate non-uniform modularity and multi-level decision-making. The hardware/software co-exploration framework utilizes layer-wise RNN (LSTM) cells that adaptively regroup according to pipeline partitioning during hardware tuning; during accuracy tuning, shared parameters govern sequential decisions across layers. The policy

$$\pi_\phi(a_{1:T}) = \prod_t \pi_\phi(a_t \mid a_{1:t-1})$$

is optimized with per-stage or global rewards through standard REINFORCE gradients (Jiang et al., 2019).
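A stripped-down REINFORCE update for such a factorized policy fits in a short script. Each step here is an independent two-action softmax standing in for the LSTM controller, the reward is 1 only when every step picks action 0, and no baseline is used; all of these are simplifications for illustration.

```python
import math, random

# Toy REINFORCE for a factorized controller policy
# pi(a_{1:T}) = prod_t pi(a_t | .). Each step is an independent
# two-action softmax; reward is 1 iff every step picks action 0.
# No baseline is used, to keep the sketch minimal.

random.seed(3)
T = 3
logits = [[0.0, 0.0] for _ in range(T)]

def softmax(z):
    e = [math.exp(v) for v in z]
    return [v / sum(e) for v in e]

def sample(probs):
    r, acc = random.random(), 0.0
    for a, p in enumerate(probs):
        acc += p
        if r < acc:
            return a
    return len(probs) - 1

lr = 0.5
for _ in range(300):
    probs = [softmax(z) for z in logits]
    actions = [sample(p) for p in probs]
    reward = 1.0 if all(a == 0 for a in actions) else 0.0
    for t, a in enumerate(actions):  # grad log pi_t = onehot(a) - probs_t
        for j in range(2):
            logits[t][j] += lr * reward * ((1.0 if j == a else 0.0) - probs[t][j])

print([round(softmax(z)[0], 2) for z in logits])  # P(action 0) per step
```

Production controllers add a moving-average baseline and entropy regularization to tame the variance that this bare-bones version ignores.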

CMAB-NAS decomposes the large cell-wise search into $K$ local bandits, with each node treated as a block/arm; NMCS applies local UCB-driven sampling, backpropagates cell-level rewards, and aggregates local optima for global architecture selection. The naive decomposition assumption $R(a_1, \dots, a_K) \approx \sum_{i=1}^K r_i(a_i)$ enhances empirical robustness and sample efficiency (Huang et al., 2021).

LLM-driven search employs conversation-history-informed prompting, explicit feedback on both module and fusion quality, and regex-based response parsing for candidate extraction. Side metrics (mutual information, redundancy, average reward) are essential for directing generative sampling (Yu et al., 7 Dec 2025).

5. Experimental Evaluations and Benchmarks

Composite NAS techniques have demonstrated competitive or state-of-the-art performance across multiple domains and metrics.

Hardware/Software Co-Exploration (Jiang et al., 2019):

| Dataset | OptSW Accuracy | OptSW Throughput | OptSW GOPS/W | HW-Aware NAS Accuracy | HW-Aware NAS Throughput | HW-Aware NAS GOPS/W | Search Time (OptSW) |
|---|---|---|---|---|---|---|---|
| CIFAR-10 | 85.19% | 35.5 FPS | 1.91 | 84.53% | 16.2 FPS | 0.84 | 103.9 GPU-h |
| ImageNet | 70.24% (top-1) | 10.5 FPS | 0.74 | 68.40% (top-1) | 6.8 FPS | 0.34 | 267 GPU-h |

OptSW strictly expands the Pareto frontier over hardware-aware NAS, yielding up to 35.24% higher throughput, 54.05% higher energy efficiency, and 136× reduced search time.

CMAB-NAS (Huang et al., 2021):

| Method | CIFAR-10 Error (%) | Params (M) | Search (GPU-days) | ImageNet Top-1 Error (%) |
|---|---|---|---|---|
| DARTS | 2.76 | 3.3 | 4 | 26.7 |
| AlphaX (tree) | 2.78 | 8.9 | 12 | 24.5 |
| CMAB-NAS | 2.58 | 3.8 | 0.58 | 25.8 |

CMAB-NAS achieves robust search at roughly 20× lower search cost than prior tree-search (AlphaX) while improving CIFAR-10 error (Huang et al., 2021).

Importance-Sampling Complexity-Aware NAS (Noda et al., 2022):

Simultaneously discovers Pareto-optimal architectures across four complexity levels in 3.4 GPU-hours (CIFAR-10), outperforming repeated single-target baselines in search cost and accuracy.

XC-NAS Evolutionary Macro-Architecture (Londt et al., 2023):

Yields top test accuracies in image and text domains for competitive tasks, discovering high-performing multi-path macro CNNs in under 1 GPU-day, with 4× acceleration from surrogate evaluation. The method generalizes across convolutional and sequence data domains.

LLM-Driven Multimodal RL NAS (LACER) (Yu et al., 7 Dec 2025):

Outperforms expert and baseline NAS (DARTS, ENAS, GENIUS) on mixed-autonomy traffic control with higher average speed and greater sample efficiency:

| Method | Avg Speed (mean ± 2SE) |
|---|---|
| Expert | 15.2 ± 0.4 |
| DARTS | 16.1 ± 0.3 |
| GENIUS | 16.6 ± 0.3 |
| LACER-1 | 17.2 ± 0.2 |
| LACER-5 | 17.5 ± 0.2 |

Ablation studies confirm that side-information (representation quality, reward feedback) is essential for LLM-driven search efficacy.

6. Methodological Insights and Practical Considerations

Composite NAS frameworks enable:

  • Expansion of the Pareto frontier by co-optimizing architecture and hardware jointly, outperforming sequential NAS → hardware-tuning approaches.
  • Efficient decomposition of exponentially large search spaces (block-wise, bandit, or genotype-based approaches), facilitating tractable optimization.
  • Rapid search acceleration through surrogate evaluation, importance-sampling across distributions, and sample-efficient candidate pruning.
  • Algorithmic generalizability: macro-encoding, modular fusion, and multi-source support extend NAS to image, text, RL, and hardware implementation contexts.
  • Feedback-driven candidate generation (LLMs with intermediate signals) improves sample efficiency and accelerates convergence.

Limitations and open directions include insufficient theoretical understanding of Pareto-optimality coverage in composite spaces, limited support for deep hardware knobs (quantization, dataflow), and scarcity of formal acquisition mechanisms in LLM-guided search (Jiang et al., 2019, Yu et al., 7 Dec 2025).

7. Future Directions and Open Questions

Composite NAS research is progressing toward broader, deeper, and more adaptive frameworks. Key areas include:

  • Joint search over quantization bit-widths, memory hierarchies, on-chip dataflows, and novel accelerators (CIM, analog MAC) for hardware-aware applications (Jiang et al., 2019).
  • Integrated search for multimodal and goal-oriented RL encoders, leveraging advanced LLM priors and side-information metrics (Yu et al., 7 Dec 2025).
  • Development of acquisition functions and theoretical guarantees for convergence and sample complexity in composite NAS scenarios.
  • Extension of genotype-based cellular encoding to evolve micro-architectures in tandem with macro-structural paths (Londt et al., 2023).
  • Application of single-shot, multi-distribution sampling to more complex objectives (e.g., latency, energy) beyond model size, integrating richer domain-specific constraints (Noda et al., 2022).

A plausible implication is that composite NAS, by fully exploiting algorithmic and architectural modularity, will yield robust, efficient, and domain-agnostic designs for next-generation neural computing platforms.
