RustyEx: Prioritizing Rust Configurations
- RustyEx is a system that prioritizes Rust configuration relevance by leveraging compiler-based extraction and graph-theoretic centrality measures.
- It constructs dual graph representations (FDG and ADT) to capture both structural and semantic dependencies, addressing Rust’s configurable code explosion.
- Empirical evaluation on 1,600+ crates demonstrates its scalability and soundness in generating valid configurations for complex Rust projects.
RustyEx is a compiler-based system for prioritizing configuration relevance in Rust software, enabling practical exploration of the vast configuration spaces induced by feature variability. Exploiting fine-grained, compiler-level analysis, RustyEx yields rankings of configuration options using refined centrality measures derived from structural and semantic dependency graphs, and guarantees the validity of generated configurations via SAT solving. This approach addresses the combinatorial explosion inherent in Rust’s #[cfg] and Cargo feature mechanisms, focusing on the selection of the most structurally and semantically impactful configurations for further analysis and testing (Bruzzone et al., 22 Jan 2026).
1. Motivation and Challenges in Rust Configuration Analysis
Rust natively supports high configurability through constructs such as #[cfg] attributes and Cargo features. Developers combine these to build highly flexible libraries and applications. However, the number of possible configurations can double with each additional Boolean feature, so it grows exponentially in the number of features. This is widely termed the “combinatorial explosion” problem.
Traditional configuration analysis methods often rely solely on feature-model abstractions, ignoring critical code-level dependencies, or treat all features uniformly. Such approaches overlook features that, though rarely used, affect large and complex portions of the code. This creates acute challenges in contexts where exhaustive compilation, testing, or analysis is intractable, such as safety-critical software, regression testing pipelines, or resource-constrained environments. Consequently, there is a need to prioritize the exploration of configurations deemed “most important” according to structurally and semantically justified criteria.
RustyEx directly addresses these limitations by employing compiler-driven extraction of variability information, constructing multiple interdependent graph representations capturing the true impact of each feature, ranking them using rigorous graph-theoretical centrality, and synthesizing only the top-N valid configurations (Bruzzone et al., 22 Jan 2026).
2. Underlying Formalism and Intermediate Representations
RustyEx begins analysis with a graph-based abstraction of the Rust Abstract Syntax Tree (AST), from which it derives a Unified Intermediate Representation (UIR). The key elements of this formalism are:
- Relevant AST nodes: Subset of the syntax tree, focusing on structures such as function bodies and program statements most affected by variability.
- Predicates: Logical expressions corresponding to Rust cfg conditions, recursively constructed from the forms single, not, any, and all.
- Atoms: Pairs of predicates and terms (AST subtrees) in which the predicate annotates the term.
The UIR itself is a triple consisting of a node set, a set of edges derived and transformed from the AST (in particular, by reversing and relinking edges for atom connectivity), and a node-weight function calculated via fixed-point traversal, which propagates and accumulates weights through recursion, composition, and cross-references.
This formalized intermediate representation allows RustyEx to account both for syntactic structure and code-guarding predicates at fine granularity, distinguishing it from feature-model-centric approaches.
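The predicate and atom definitions above can be sketched as a small recursive data type. This is a minimal illustration rather than RustyEx's actual representation: the names Pred, Atom, eval, features, and sample_atom are hypothetical, and the term is reduced to a string label.

```rust
use std::collections::BTreeSet;

/// A cfg predicate: a single feature, or a not/any/all combination.
#[derive(Debug, Clone, PartialEq)]
enum Pred {
    Feature(String),
    Not(Box<Pred>),
    Any(Vec<Pred>),
    All(Vec<Pred>),
}

impl Pred {
    /// Evaluate the predicate under a set of enabled features.
    fn eval(&self, enabled: &BTreeSet<String>) -> bool {
        match self {
            Pred::Feature(name) => enabled.contains(name),
            Pred::Not(inner) => !inner.eval(enabled),
            Pred::Any(preds) => preds.iter().any(|p| p.eval(enabled)),
            Pred::All(preds) => preds.iter().all(|p| p.eval(enabled)),
        }
    }

    /// Collect every feature name mentioned anywhere in the predicate.
    fn features(&self, out: &mut BTreeSet<String>) {
        match self {
            Pred::Feature(name) => {
                out.insert(name.clone());
            }
            Pred::Not(inner) => inner.features(out),
            Pred::Any(preds) | Pred::All(preds) => {
                for p in preds {
                    p.features(out);
                }
            }
        }
    }
}

/// Hypothetical atom: a predicate paired with the term it annotates.
/// The term is only a label here; a real tool would hold an AST subtree.
struct Atom {
    pred: Pred,
    term: &'static str,
}

/// Models #[cfg(all(feature = "tls", not(feature = "legacy")))] on a function.
fn sample_atom() -> Atom {
    Atom {
        pred: Pred::All(vec![
            Pred::Feature("tls".to_string()),
            Pred::Not(Box::new(Pred::Feature("legacy".to_string()))),
        ]),
        term: "fn connect_tls()",
    }
}

fn main() {
    let atom = sample_atom();
    let mut enabled = BTreeSet::new();
    enabled.insert("tls".to_string());
    println!("{} compiled in: {}", atom.term, atom.pred.eval(&enabled));
}
```

Keeping the constructor explicit in the type is what later allows edge weights and score refinement to depend on whether a feature appears under single, not, any, or all.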
3. Construction of Feature Dependency Graphs and Atom Dependency Trees
The RustyEx analysis pipeline constructs two complementary graph-based data structures per project:
Feature Dependency Graph (FDG)
- Nodes: Features, annotated with the set of predicates in which they occur.
- Edges: Lexical-scope dependencies extracted from the UIR. The FDG is initially a directed multigraph.
- Weights: Each edge weight corresponds to the form of predicate (single, not, any, all) in the governing atom, computed recursively. The graph is subsequently collapsed to a simple graph by summing edge multiplicities.
A synthetic “patch” node is connected to the global root of FDG to ensure strong connectivity, a precondition for applying centrality algorithms.
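The collapse from a directed multigraph to a simple graph can be sketched as follows. The sketch assumes that "summing edge multiplicities" means adding up the weights of parallel edges between the same ordered pair of features; the Edge alias and the collapse and sample_edges functions are hypothetical names, and the synthetic patch node is not modeled.

```rust
use std::collections::BTreeMap;

/// One parallel edge of the directed multigraph: (source, target, weight).
type Edge = (String, String, f64);

/// Collapse parallel edges into one weighted edge per ordered feature pair.
/// ASSUMPTION: parallel edges are merged by summing their weights.
fn collapse(edges: &[Edge]) -> BTreeMap<(String, String), f64> {
    let mut simple = BTreeMap::new();
    for (src, dst, w) in edges {
        *simple.entry((src.clone(), dst.clone())).or_insert(0.0) += *w;
    }
    simple
}

/// Illustrative multigraph: two parallel edges a -> b and one edge b -> a.
fn sample_edges() -> Vec<Edge> {
    vec![
        ("a".to_string(), "b".to_string(), 1.0),
        ("a".to_string(), "b".to_string(), 2.0),
        ("b".to_string(), "a".to_string(), 0.5),
    ]
}

fn main() {
    for ((src, dst), w) in collapse(&sample_edges()) {
        println!("{} -> {}: {}", src, dst, w);
    }
}
```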
Atom Dependency Tree (ADT)
- Derived as an induced subgraph of the UIR, the ADT restricts to atoms and encodes ancestor-descendant relationships among atoms with non-empty predicates.
- The edge set connects each atom to its nearest ancestor, while the weight function accumulates code weights over dropped intermediate nodes.
This dual-graph abstraction jointly captures both structural dependencies among features and the lexical/semantic extent of feature impact on the codebase.
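The nearest-ancestor construction of the ADT can be illustrated on a parent-pointer tree. This is a sketch under stated assumptions: the Node layout is hypothetical, and the weight of each dropped (non-atom) node is assumed to be folded into its nearest atom ancestor, which is one plausible reading of "accumulates code weights over dropped intermediate nodes".

```rust
/// Hypothetical UIR node: parent index, atom flag, and code weight.
struct Node {
    parent: Option<usize>,
    is_atom: bool,
    weight: f64,
}

/// Walk up from the parent of `start` to the nearest ancestor that is an atom.
fn nearest_atom_ancestor(nodes: &[Node], start: usize) -> Option<usize> {
    let mut cur = nodes[start].parent;
    while let Some(i) = cur {
        if nodes[i].is_atom {
            return Some(i);
        }
        cur = nodes[i].parent;
    }
    None
}

/// Build (child atom, parent atom) edges plus an accumulated weight per node.
/// Only the entries at atom indices are meaningful in the returned weights.
fn build_adt(nodes: &[Node]) -> (Vec<(usize, usize)>, Vec<f64>) {
    let mut edges = Vec::new();
    let mut acc = vec![0.0; nodes.len()];
    for (i, node) in nodes.iter().enumerate() {
        let ancestor = nearest_atom_ancestor(nodes, i);
        if node.is_atom {
            acc[i] += node.weight;
            if let Some(p) = ancestor {
                edges.push((i, p));
            }
        } else if let Some(p) = ancestor {
            // ASSUMPTION: a dropped node's weight goes to its nearest atom ancestor.
            acc[p] += node.weight;
        }
    }
    (edges, acc)
}

/// Chain 0 (plain) <- 1 (atom) <- 2 (plain) <- 3 (atom) <- 4 (plain).
fn sample_tree() -> Vec<Node> {
    vec![
        Node { parent: None, is_atom: false, weight: 1.0 },
        Node { parent: Some(0), is_atom: true, weight: 2.0 },
        Node { parent: Some(1), is_atom: false, weight: 3.0 },
        Node { parent: Some(2), is_atom: true, weight: 4.0 },
        Node { parent: Some(3), is_atom: false, weight: 5.0 },
    ]
}

fn main() {
    let (edges, acc) = build_adt(&sample_tree());
    println!("edges: {:?}, weights: {:?}", edges, acc);
}
```

Atoms without an atom ancestor become roots, so the result is in general a forest, which is consistent with an ADT that has fewer edges than nodes.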
4. Graph-Theoretic Feature Ranking and Configuration Generation
RustyEx prioritizes features using quantitative ranking derived from the FDG’s weighted adjacency matrix. Supported centrality measures include:
- Degree Centrality: Sums outgoing and incoming dependency weights per feature.
- Closeness: Inverse sum of shortest-path distances (Dijkstra-based) to all reachable nodes.
- Harmonic Centrality: Summed inverse shortest-path lengths (assigns zero to unreachable pairs).
- Betweenness (Opsahl’s variant): Number of weighted shortest paths passing through a node.
- Eigenvector Centrality: Leading eigenvector of the weighted adjacency matrix.
- Katz Centrality: Inverted Neumann series on the weighted adjacency matrix, with spectral damping.
The resulting centrality vector quantifies each feature’s structural importance.
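Two of the listed measures, weighted degree and harmonic centrality, can be sketched on a dense adjacency matrix. The sketch assumes that an entry of 0.0 means "no edge", that edge weights are used directly as path lengths, and that harmonic centrality is computed over outgoing shortest paths. These are illustrative choices, not details confirmed by the source, and all function names are hypothetical.

```rust
/// Weighted degree: sum of outgoing (row) and incoming (column) weights.
fn degree_centrality(adj: &[Vec<f64>]) -> Vec<f64> {
    let n = adj.len();
    (0..n)
        .map(|i| {
            let outgoing: f64 = adj[i].iter().sum();
            let incoming: f64 = (0..n).map(|j| adj[j][i]).sum();
            outgoing + incoming
        })
        .collect()
}

/// Single-source shortest paths with array-based Dijkstra, O(n^2).
/// ASSUMPTION: edge weights are positive and act as path lengths.
fn dijkstra(adj: &[Vec<f64>], src: usize) -> Vec<f64> {
    let n = adj.len();
    let mut dist = vec![f64::INFINITY; n];
    let mut done = vec![false; n];
    dist[src] = 0.0;
    for _ in 0..n {
        // Pick the closest reachable node that is not finalized yet.
        let mut best: Option<usize> = None;
        for v in 0..n {
            if !done[v] && dist[v].is_finite() && best.map_or(true, |b| dist[v] < dist[b]) {
                best = Some(v);
            }
        }
        let u = match best {
            Some(u) => u,
            None => break,
        };
        done[u] = true;
        for v in 0..n {
            let w = adj[u][v];
            if w > 0.0 && dist[u] + w < dist[v] {
                dist[v] = dist[u] + w;
            }
        }
    }
    dist
}

/// Harmonic centrality: sum of 1/d over reachable nodes; unreachable pairs add zero.
fn harmonic_centrality(adj: &[Vec<f64>]) -> Vec<f64> {
    (0..adj.len())
        .map(|s| {
            let total: f64 = dijkstra(adj, s)
                .iter()
                .enumerate()
                .filter(|(t, d)| *t != s && d.is_finite())
                .map(|(_, d)| 1.0 / d)
                .sum();
            total
        })
        .collect()
}

/// Path graph 0 -> 1 (weight 1) -> 2 (weight 2).
fn sample_graph() -> Vec<Vec<f64>> {
    vec![
        vec![0.0, 1.0, 0.0],
        vec![0.0, 0.0, 2.0],
        vec![0.0, 0.0, 0.0],
    ]
}

fn main() {
    let adj = sample_graph();
    println!("degree:   {:?}", degree_centrality(&adj));
    println!("harmonic: {:?}", harmonic_centrality(&adj));
}
```

Harmonic centrality stays well defined on graphs that are not strongly connected, because unreachable pairs simply contribute nothing to the sum.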
To account for the amount of code guarded (beyond graph structure), the centrality vector is further refined: for each atom, the normalized weight is allocated among all features appearing in its predicate, adjusted based on the logical constructor (single/not/any vs. all). This produces a refined score vector.
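The refinement step can be illustrated as below. The exact allocation rule is not given here, so the sketch makes one up for illustration: an all predicate splits the atom's normalized weight equally among its features, while single, not, and any credit each mentioned feature with the full weight. The types Form and AtomInfo and the functions refine and sample_inputs are hypothetical.

```rust
use std::collections::BTreeMap;

/// Logical constructor at the top of an atom's predicate.
#[allow(dead_code)]
enum Form {
    Single,
    Not,
    Any,
    All,
}

/// Hypothetical atom summary: constructor, mentioned features, normalized code weight.
struct AtomInfo {
    form: Form,
    features: Vec<&'static str>,
    weight: f64,
}

/// Add each atom's code weight to the centrality scores of its features.
/// ASSUMED rule (illustrative only): `all` splits the weight equally,
/// the other constructors credit every feature with the full weight.
fn refine(
    base: &BTreeMap<&'static str, f64>,
    atoms: &[AtomInfo],
) -> BTreeMap<&'static str, f64> {
    let mut refined = base.clone();
    for atom in atoms {
        if atom.features.is_empty() {
            continue;
        }
        let share = match atom.form {
            Form::All => atom.weight / atom.features.len() as f64,
            Form::Single | Form::Not | Form::Any => atom.weight,
        };
        for feature in &atom.features {
            *refined.entry(*feature).or_insert(0.0) += share;
        }
    }
    refined
}

/// Two features with base centrality scores and two atoms guarding code.
fn sample_inputs() -> (BTreeMap<&'static str, f64>, Vec<AtomInfo>) {
    let mut base = BTreeMap::new();
    base.insert("a", 1.0);
    base.insert("b", 2.0);
    let atoms = vec![
        AtomInfo { form: Form::All, features: vec!["a", "b"], weight: 0.5 },
        AtomInfo { form: Form::Single, features: vec!["a"], weight: 0.25 },
    ];
    (base, atoms)
}

fn main() {
    let (base, atoms) = sample_inputs();
    println!("{:?}", refine(&base, &atoms));
}
```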
Configuration generation proceeds by encoding feature constraints as a propositional formula over Boolean variables, capturing:
- cfg predicate logic per feature
- Feature-to-feature implications
- Manifest-mandated relationships
The formula is converted into conjunctive normal form and solved incrementally using a SAT solver. The top-N configurations, prioritized by the refined scores, are generated by constraining corresponding feature variables and enumerating distinct satisfying assignments. Each configuration is guaranteed to be valid, as the formula precisely encodes all variability and dependency constraints.
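The generation step can be approximated on a toy scale without a SAT solver. The sketch below swaps incremental SAT solving for brute-force enumeration of all assignments, which is only feasible for a handful of features, and it ranks valid configurations by the summed score of their enabled features, which is an assumption made for illustration. Clauses use DIMACS-style signed literals, and all names are hypothetical.

```rust
/// Check whether an assignment satisfies a CNF formula.
/// Bit i of `assignment` is the value of variable i + 1; literal -k means "not k".
fn satisfies(cnf: &[Vec<i32>], assignment: u32) -> bool {
    cnf.iter().all(|clause| {
        clause.iter().any(|&lit| {
            let var = lit.unsigned_abs() - 1;
            let value = ((assignment >> var) & 1) == 1;
            if lit > 0 { value } else { !value }
        })
    })
}

/// Enumerate all valid configurations and keep the n highest-scoring ones.
/// Brute force stands in for the SAT solver; needs fewer than 32 features.
fn top_configs(cnf: &[Vec<i32>], scores: &[f64], n: usize) -> Vec<u32> {
    let vars = scores.len() as u32;
    let mut valid: Vec<u32> = (0..(1u32 << vars))
        .filter(|&a| satisfies(cnf, a))
        .collect();
    // ASSUMPTION: a configuration's priority is the sum of its enabled features' scores.
    let total = |a: u32| -> f64 {
        scores
            .iter()
            .enumerate()
            .filter(|(i, _)| ((a >> *i) & 1) == 1)
            .map(|(_, s)| *s)
            .sum()
    };
    // Highest score first; ties broken by the smaller bit pattern.
    valid.sort_by(|&x, &y| total(y).partial_cmp(&total(x)).unwrap().then(x.cmp(&y)));
    valid.truncate(n);
    valid
}

/// Variables: 1 = tls, 2 = rustls, 3 = openssl (illustrative feature names).
fn sample_problem() -> (Vec<Vec<i32>>, Vec<f64>) {
    let cnf = vec![
        vec![-1, 2, 3], // tls requires rustls or openssl
        vec![-2, -3],   // rustls and openssl are mutually exclusive
        vec![-2, 1],    // rustls implies tls
        vec![-3, 1],    // openssl implies tls
    ];
    (cnf, vec![3.0, 2.0, 1.0])
}

fn main() {
    let (cnf, scores) = sample_problem();
    for config in top_configs(&cnf, &scores, 2) {
        println!("valid configuration bits: {:03b}", config);
    }
}
```

Because every returned assignment has passed the satisfies check against the full clause set, each configuration is valid by construction, mirroring the guarantee described above.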
5. Empirical Evaluation on Open-Source Rust Ecosystem
RustyEx was empirically assessed on 40 prominent open-source Rust projects encompassing more than 1,600 crates, spanning application domains such as web runtimes (deno), blockchains (substrate), GUIs (alacritty), and command-line search tools (ripgrep). The evaluation used a per-crate timeout of 10 minutes on commodity hardware (Intel i5-1135G7, 16 GB RAM); performance metrics were aggregated at the project level.
Selected results:
| Metric | Value | Notes |
|---|---|---|
| Crates analyzed | 1,628 | total; 93% success rate |
| UIR nodes/edges | ~82,000 | average per project |
| FDG nodes/edges (collapsed) | 122 / 321 | average per project |
| ADT nodes/edges | 190 / 146 | average per project |
| Mean analysis time | 333 s | per project |
| Peak memory | 885 MB | average per project |
| Features defined | ~425 | average per project; most features used <2 times (mean) |
Analysis scales sublinearly with project size (log–log trend), supporting the scalability claim of RustyEx’s methods. Notably, most features are rarely activated, underscoring the necessity of effective prioritization. No direct baselines exist, as prior tools either use feature-model-only analysis or do not scale to Rust’s complex macro ecosystem (Bruzzone et al., 22 Jan 2026).
6. Soundness Guarantees and Formal Properties
A core property of RustyEx is that every output configuration is valid by construction. The formalism ensures:
- Precise encoding of all Rust cfg predicates, transitive feature-to-feature dependencies, and Cargo manifest constraints as propositional clauses.
- Translation to CNF preserves logical equivalence.
- SAT solver returns only assignments satisfying all constraints.
- Each generated configuration, therefore, satisfies the exact feature and code requirements specified in the project.
This guarantee is formalized as a theorem in the underlying work and justified via soundness-preserving code-to-logic translation.
7. Implications, Extensions, and Future Directions
RustyEx demonstrates that compiler-based, graph-centrality-guided configuration prioritization is both efficient and effective for large, real-world Rust projects. Its pipeline—integrating unified IR construction, dual graph-based dependency modeling, quantitative feature ranking, code-impact refinement, and SAT-based configuration synthesis—provides a template for advancing configuration-aware analysis, testing, and optimization.
Potential directions for further development include:
- Adaptation to other languages with similar variability idioms (C/C++ preprocessor, Java annotations)
- Incorporation of advanced centrality or graph-embedding techniques
- Integration into continuous integration pipelines for prioritized regression testing or compiler tuning
- Dynamic selection of the number of top configurations (N) based on project metrics or prior test outcomes
A plausible implication is that such prioritization could significantly improve fault-detection efficacy and resource utilization in test and analysis workflows for highly configurable software systems (Bruzzone et al., 22 Jan 2026).