Distribution-Aware Algorithm Design with LLM Agents
Abstract: We study learning when the learned object is executable solver code rather than a predictor. In this setting, correctness is not enough: two solvers may both return valid solutions on the deployment distribution while differing substantially in runtime. Given samples from an unknown task distribution, the learner returns code evaluated on fresh instances by both solution quality and execution time. Our central abstraction is a solver hint: reusable structure inferred from samples and compiled into specialized solver code. We prove that the empirically fastest sample-consistent solver from a fixed library generalizes in both correctness and runtime, and that statistically identifiable hints can be recovered and compiled from polynomially many samples. Empirically, we instantiate the framework with LLM code agents on 21 structured combinatorial-optimization target distributions across seven problem classes. The synthesized solvers reach mean normalized quality 0.971, improve by +0.224 over the average heuristic pool and by +0.098 over the highest-quality heuristic, and are 336.9×, 342.8×, and 16.1× faster than the quality-best heuristic, Gurobi, and the selected time-limited exact backend, respectively. On released PACE 2025 Dominating Set private instances, the synthesized solver is valid on all 100 graphs and runs about two orders of magnitude faster than top competition solvers, with a moderate quality gap. Inspection shows that many gains come from changing the computational scale: replacing ambient exponential search or general-purpose optimization with compiled distribution-specific computation.
Explain it Like I'm 14
What is this paper about?
This paper is about teaching computers not just to get the right answers, but to do it fast for the kinds of problems they’ll see in the real world. Instead of learning to “predict” an answer, the computer learns to write solver code—actual programs that find solutions. The key idea is to learn reusable shortcuts from practice problems and build those shortcuts into the solver so that future problems from the same “type” can be solved much faster.
Think of it like studying for a specific teacher’s tests: you don’t just learn math; you learn your teacher’s favorite patterns and tricks. Those patterns are the “hints” this paper talks about.
What questions did the authors ask?
They focused on five simple questions:
- Can we learn solver code that is both correct and fast for a specific kind of problem?
- Can we discover “solver hints”—patterns that repeat across many problems—and compile them into code?
- If we choose a solver from a fixed shelf (a library), does picking the one that’s fastest on practice examples also work well on new problems?
- If a hidden pattern leaves a clear signal in the data, can we recover it with a reasonable number of examples?
- Do LLMs that write code actually find and use these shortcuts in practice?
How did they do it?
First, some plain-language translations:
- Distribution: the “type” of problems you’ll get (like one teacher’s style of questions).
- Samples: practice problems from that same type.
- Runtime: how long it takes the solver program to finish.
- Solver hint: a reusable shortcut or structure (like a small set of key choices that make the rest easy).
- Backdoor (SAT example): a tiny set of variables that, once set, turns a hard puzzle into an easy one.
The method uses an LLM as a coding agent that does more than just write a solver in one go. It goes through a “learn a pattern, measure it, then use it” loop.
Here are the three main steps the agent follows:
- Hypothesize: Propose a possible hidden structure in the data (for example, “these graphs have small hub sets,” or “these puzzles often reduce to a simpler case if you fix a few key parts”).
- Analyze: Write a small program that studies the training examples to extract a compact summary of that structure (the “hint”).
- Solve: Write a solver that takes a new problem plus the learned hint and solves it quickly, falling back to a complete, general solver if the hint doesn’t help (so correctness is preserved).
They search over many such candidates and keep refining the best ones based on three validation signals: solution quality, how often they hit the true optimum, and runtime.
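The three steps above can be sketched on a toy SAT example. This is a minimal illustration under stated assumptions, not the paper's agent or its synthesized code: the function names (`analyze`, `solve`, `brute_force`), the frequency-based choice of hint variables, and the use of unit propagation as the "easy" residual solver are all hypothetical choices made for this sketch. Formulas are lists of clauses, and each clause is a list of nonzero integers (positive for a variable, negative for its negation).

```python
from collections import Counter
from itertools import product


def simplify(clauses, assignment):
    """Drop satisfied clauses and remove false literals under a partial assignment."""
    out = []
    for clause in clauses:
        kept, satisfied = [], False
        for lit in clause:
            var = abs(lit)
            if var in assignment:
                if assignment[var] == (lit > 0):
                    satisfied = True
                    break
            else:
                kept.append(lit)
        if not satisfied:
            out.append(kept)  # an empty list here signals a conflict
    return out


def unit_propagate(clauses, assignment):
    """Cheap residual solver: returns ('sat' | 'conflict' | 'unknown', assignment)."""
    assignment = dict(assignment)
    while True:
        clauses = simplify(clauses, assignment)
        if any(len(c) == 0 for c in clauses):
            return "conflict", assignment
        if not clauses:
            return "sat", assignment
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            return "unknown", assignment
        assignment[abs(units[0])] = units[0] > 0


def brute_force(clauses, num_vars):
    """Complete fallback: try every assignment (exponential, always correct)."""
    for bits in product([False, True], repeat=num_vars):
        a = {v + 1: bits[v] for v in range(num_vars)}
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses):
            return a
    return None


def analyze(samples, k):
    """Analyze step (hypothetical heuristic): the k most frequent variables form the hint."""
    counts = Counter(abs(l) for clauses in samples for c in clauses for l in c)
    return [v for v, _ in counts.most_common(k)]


def solve(clauses, num_vars, hint):
    """Solve step: enumerate the hint variables first, fall back if that is inconclusive."""
    undecided = False
    for bits in product([False, True], repeat=len(hint)):
        status, a = unit_propagate(clauses, dict(zip(hint, bits)))
        if status == "sat":
            # Variables left unassigned are free, so any value works.
            return {v: a.get(v, False) for v in range(1, num_vars + 1)}
        if status == "unknown":
            undecided = True
    # If every branch ended in a conflict the formula is unsatisfiable;
    # otherwise the hint did not help and the complete solver takes over.
    return brute_force(clauses, num_vars) if undecided else None
```

`solve` returns a satisfying assignment as a dictionary, or `None` when the formula is unsatisfiable. With a good hint the work is about 2^k cheap propagation passes instead of a search over all variables; with a bad hint the answer is still correct, only slower.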
What did they find, and why is it important?
There are two kinds of results: theoretical (proofs) and empirical (experiments).
Theoretical results (in simple terms):
- Picking from a shelf: If you have a fixed library of solvers, and you pick the solver that is fastest on the training examples while still being correct on them, then with enough examples you’ll also be near the best possible (fast and correct) choice for new problems of the same type.
- Learning hints: If there’s a real hidden pattern that leaves a clear, measurable signal in your training data, then with a reasonable number of examples you can recover that pattern and compile it into a specialized solver.
- Concrete example (SAT puzzles): If hard SAT problems secretly have a tiny “backdoor” set of variables that makes them easy after you set them, you can detect those variables from samples and build a solver that tries those few options first—gaining big speedups while still being correct thanks to a fallback.
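The "picking from a shelf" rule can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy task (find the index of a target in a sorted list), the three solvers in `LIBRARY`, and the use of abstract step counts in place of wall-clock time so that the example is deterministic. The rule itself is the one described above: discard any solver that is wrong on a practice example, then keep the one with the lowest average cost.

```python
def linear_scan(xs, target):
    """Correct solver; cost grows with the position of the target."""
    steps = 0
    for i, v in enumerate(xs):
        steps += 1
        if v == target:
            return i, steps
    return -1, steps


def binary_search(xs, target):
    """Correct solver on sorted input; cost is logarithmic in len(xs)."""
    lo, hi, steps = 0, len(xs) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid, steps
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps


def always_zero(xs, target):
    """Very fast but usually wrong; it should be filtered out by the consistency check."""
    return 0, 1


# Hypothetical fixed library of candidate solvers.
LIBRARY = {
    "linear_scan": linear_scan,
    "binary_search": binary_search,
    "always_zero": always_zero,
}


def select_fastest_consistent(library, samples):
    """Return the name of the lowest-mean-cost solver that is correct on every sample."""
    best_name, best_mean = None, float("inf")
    for name, solver in library.items():
        results = [solver(xs, t) for xs, t in samples]
        consistent = all(
            0 <= i < len(xs) and xs[i] == t
            for (i, _), (xs, t) in zip(results, samples)
        )
        if not consistent:
            continue
        mean_cost = sum(steps for _, steps in results) / len(samples)
        if mean_cost < best_mean:
            best_name, best_mean = name, mean_cost
    return best_name
```

The choice depends on the sampled distribution: when targets sit near the end of a long list, binary search wins, but when targets sit in the first few positions, the plain linear scan is cheaper and gets selected instead.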
Experimental results (what happened in practice):
- Tasks: They tested on 21 target distributions (problem types) across 7 classic problem classes (like graph coloring, MaxSAT, independent set, dominating set, knapsack-style packing, and TSP).
- Quality: The learned solvers achieved an average normalized quality of 0.971 (on a 0 to 1 scale), which was:
  - +0.224 better than the average heuristic in the pool
  - +0.098 better than the best single heuristic
- Speed: The learned solvers were much faster on average:
  - About 337× faster than the quality-best heuristic
  - About 343× faster than Gurobi (a powerful general optimizer) under a fixed time budget
  - About 16× faster than the best time-limited exact/certifying backend they tried
- External test (PACE 2025 Dominating Set): On 100 private test graphs, their solver returned valid answers for all of them and ran about 100× faster than top competition solvers—though with a moderate quality gap (roughly 3.3% larger dominating sets). This shows a clear speed advantage with slightly worse solution sizes.
Why the speedups happened:
The gains weren’t just from writing snappier code; the agent often changed the kind of computation being done. It replaced general, worst-case methods (like heavy search or generic optimization) with simpler, distribution-specific steps discovered from the samples. Examples include:
- Graph problems: identify palettes, hubs, small “kernels,” or motifs so the algorithm operates on a much smaller core.
- SAT/MaxSAT: find local Boolean rules or small “backdoor” sets that make the remainder easy.
- Packing/Knapsack: detect bottleneck resources and solve with fast sorting, fractional fills, and small repairs instead of full-blown search or linear programming.
- TSP: spot clustered geometry and build tours using cheaper, structure-aware moves rather than expensive dynamic programming.
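To make the packing/knapsack bullet concrete, here is a rough sketch of a "sort, fill, repair" routine for a single-constraint knapsack. It is not one of the paper's synthesized solvers: the function name `greedy_fill_with_repair`, the value-per-weight score, and the one-for-one swap repair are assumptions chosen for illustration, and the routine is a heuristic with no optimality guarantee. Items are `(value, weight)` pairs with positive weights.

```python
def greedy_fill_with_repair(items, capacity):
    """Heuristic knapsack: density-sorted greedy fill, then improving one-for-one swaps."""
    # Sort step: highest value per unit of weight first.
    order = sorted(
        range(len(items)),
        key=lambda i: items[i][0] / items[i][1],
        reverse=True,
    )

    # Fill step: take each item that still fits.
    chosen, weight = set(), 0
    for i in order:
        if weight + items[i][1] <= capacity:
            chosen.add(i)
            weight += items[i][1]

    # Repair step: swap a chosen item for an unchosen one whenever the swap
    # fits and strictly increases value. Total value rises with every swap,
    # so the loop terminates.
    improved = True
    while improved:
        improved = False
        for i in sorted(chosen):
            for j in range(len(items)):
                if j in chosen:
                    continue
                new_weight = weight - items[i][1] + items[j][1]
                if new_weight <= capacity and items[j][0] > items[i][0]:
                    chosen.remove(i)
                    chosen.add(j)
                    weight = new_weight
                    improved = True
                    break
            if improved:
                break

    return sorted(chosen), sum(items[i][0] for i in chosen)
```

On the items `[(60, 10), (100, 20), (120, 30)]` with capacity 50, the greedy fill alone picks the first two items for a value of 160, and the repair step then swaps the first item for the third. The whole routine costs a sort plus a few passes over the items, which is the kind of cheap computation the bullet contrasts with full search or linear programming.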
What could this change?
- Practical impact: If you repeatedly solve similar problems (logistics, scheduling, circuit design, route planning), learning a solver that “speaks your distribution’s language” can save huge amounts of time while keeping solutions high-quality.
- A new way to design algorithms: Don’t only design for the worst case. Use examples from your real deployment to learn reusable hints and compile them into code. The result is a solver specialized to your world.
- Safety and limits:
- Specialization is a double-edged sword: it’s great on the kind of problems you trained on, but it may lose its advantage if the problem type shifts.
- There’s a one-time “synthesis” cost to learn and build the solver; the payoff comes when you solve many future instances.
- Different runs of the agent might find different shortcuts—some more stable than others—so review and validation matter. The fallback to a complete solver helps keep correctness when hints fail.
In short, this paper shows that we can learn not only what answers to produce but how to compute them efficiently for the problems we actually face. By turning repeated patterns in data into code-level shortcuts, computers can solve future tasks both correctly and much faster.
Knowledge Gaps
Below is a single, consolidated list of concrete knowledge gaps, limitations, and open questions that remain unresolved by the paper. These items are intended to guide follow-on research.
- Robustness under distribution shift: No formal or empirical guarantees characterize how specialized solvers degrade when the deployment distribution drifts from the sampled regime, nor mechanisms to detect shift and adapt or revert in time.
- Break-even amortization analysis: The one-time synthesis cost (wall-clock, compute, token usage) is not reported; there is no cost–benefit analysis quantifying how many future instances are required to net a runtime win.
- Stability and variance across runs: The method can produce different hints/solvers across seeds; variance is noted but not measured or mitigated with systematic ensembling, selection, or regularization strategies.
- Empirical sample complexity: There is no study of how performance scales with the number of training instances; no learning curves or sensitivity to limited data are provided.
- Cross-size generalization: It is unclear whether hints learned on smaller instances transfer to larger ones within the same distribution family; no cross-scale tests or theory are provided.
- Selection criterion design: The use of lexicographic ranking (quality → optimality → −runtime) is not theoretically justified; alternatives (e.g., Pareto, constrained optimization, weighted multi-objective) are unexplored.
- Heavy-tailed runtime noise: Theory assumes bounded runtime $T(c, x) \le T_{\max}$ and effectively deterministic measurements; no analysis addresses runtime noise, heavy tails, caching effects, or OS jitter, nor robust estimators for empirical runtime.
- Library-selection assumptions: Theorem 5.1 assumes the library contains an almost-surely correct solver; the more realistic regime with accuracy–runtime tradeoffs (nonzero error allowed) is not analyzed.
- General hint identifiability: Theorem 5.2 treats a finite hint class with a known scoring family and margin; there is no treatment of infinite/parametric hint classes, misspecified scores, or data-dependent score discovery.
- Computational hardness of hint discovery: The paper does not characterize when discovering a useful hint is computationally feasible vs. intractable, nor provide approximation guarantees for agentic search.
- Robustness to wrong hints: There is no formal analysis of advice-robustness (analogous to learning-augmented algorithms) quantifying worst-case overhead and recovery when the compiled hint is incorrect.
- Fallback policies: The conditions and early-stopping rules for switching from the specialized solver to the generic fallback are heuristic; principled policies with provable overhead bounds are missing.
- Formal verification of correctness: Correctness is argued via fallback but not guaranteed against code-generation bugs; there is no integration with verification, property testing, or certifying backends to ensure safety.
- Interpretability and validation of hints: While some hints are inspected qualitatively, there is no systematic method to extract, validate, or quantify the causal relevance of the learned structure.
- Sensitivity to prompt/beam priors: The diversity seeds and structural directives strongly bias the search; there is no ablation on these priors, nor a principled way to choose or adapt them across domains.
- Head-to-head with algorithm selection portfolios: A direct comparison against state-of-the-art portfolio methods (e.g., AutoFolio/Hydra/SATzilla) under the same sample-access protocol is absent.
- Budget sensitivity of baselines: Heuristic and solver baselines use fixed budgets (e.g., 10s, single-thread); there is no sensitivity analysis showing how conclusions change with longer budgets or multi-threading parity.
- Metric breadth: Only quality and runtime are optimized; memory footprint, energy use, and solution robustness (variance) are not measured, yet can dominate in practice.
- Real-world external validity: The 21 distributions are author-designed; broader tests on third-party, real industrial datasets and unpredictable data quirks are missing.
- MAXSAT counterexample analysis: Iterative synthesis sometimes underperforms zero-shot (e.g., MAXSAT); the paper does not analyze why or when refinement hurts, nor propose guards.
- Transfer across related families: It remains unknown whether hints learned for one family transfer to nearby families (meta-learning/transfer), and how to represent/share reusable structure.
- Online/continual adaptation: The framework does not support incremental hint updates under non-IID or drifting streams, nor provide regret or adaptation guarantees.
- Scaling laws for search depth: There is no quantification of how agent search depth/beam width trades off with final runtime improvement and quality; compute-efficient search policies are unexplored.
- Safety and sandboxing: The risks of executing synthesized code are acknowledged but not operationalized; sandboxing, capability restrictions, and audit trails are not formally integrated or evaluated.
- Theoretical coverage beyond SAT: Apart from a SAT backdoor model, there is no analogous formal treatment for other problem classes (graph problems, routing, knapsack) linking distributional structure to provable speedups.
- Runtime decomposition: The observed speedups mix “algorithmic scale changes” and “implementation effects”; a controlled ablation isolating each contribution is absent.
- Normalization and clipping effects: Heuristic runtimes are clipped at 10s before ratio computation; the sensitivity of geometric-mean speedups to clipping and normalization choices is not reported.
- Data-access assumptions: Selection and validation rely on evaluators that know feasibility/optimality; guidance for realistic settings without ground-truth optima or exact evaluators is missing.
- Reproducibility details: The paper does not specify model versions, prompts, sampling parameters, or code release status sufficient to fully reproduce agent outputs and measured speedups.
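The break-even question in the list above reduces to simple arithmetic once the costs are known. The sketch below is only a template: the function name and every number passed to it are hypothetical, since the synthesis cost is not reported. It assumes a one-time synthesis cost and fixed average per-instance runtimes for the baseline and the specialized solver.

```python
import math


def break_even_instances(synthesis_cost_s, baseline_time_s, specialized_time_s):
    """Smallest number of future instances at which specialization pays for itself.

    All arguments are in seconds. Returns None when the specialized solver is
    not faster per instance, because the synthesis cost is then never recovered.
    """
    saving_per_instance = baseline_time_s - specialized_time_s
    if saving_per_instance <= 0:
        return None
    return math.ceil(synthesis_cost_s / saving_per_instance)


# Hypothetical figures: one hour of synthesis, a 10 s baseline, a 0.5 s specialized solver.
print(break_even_instances(3600, 10.0, 0.5))
```

With these made-up figures each instance saves 9.5 seconds, so the one-hour synthesis cost is recovered after a few hundred instances; a workload with fewer instances than that would be better served by the baseline.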
Practical Applications
Overview
The paper introduces distribution-aware program learning: using samples from an unknown deployment distribution to synthesize executable solver code that optimizes both solution quality and execution time. The core abstraction is a solver hint—reusable, distribution-specific structure (e.g., SAT backdoors, graph decompositions, bottleneck resources, geometric clusters) that is inferred from samples and compiled into solver code with a correctness-preserving fallback. Theoretical results show runtime-aware generalization for fixed libraries and sample complexity for learning identifiable hints; empirical results show large speedups (10–3000x) with high solution quality across 21 combinatorial optimization distributions and a PACE Dominating Set test.
Below are practical applications, grouped into immediate and long-term opportunities.
Immediate Applications
- Distribution-aware solver wrappers for repeated OR workloads (logistics, e-commerce, mobility)
- Sectors: logistics, transportation, e-commerce, last-mile delivery, ride-hailing
- Use cases: depot-specific TSP/VRP tour builders exploiting clustered geographies; service-region-aware routing; recurring pickup/delivery patterns; daily batching
- Tools/products/workflows: “SolverOps” pipeline that (1) collects representative jobs, (2) learns hints (clusters, depot partitions, neighborhood heuristics), (3) compiles a solver with fallback to OR-Tools/Gurobi, (4) deploys behind an API with drift monitoring and periodic re-synthesis
- Assumptions/dependencies: stable geographic/customer patterns; sufficient historical instances; operational tolerance for occasional moderate quality gaps; fallback ensures feasibility
- Resource allocation via packing/knapsack specialization
- Sectors: cloud/DevOps (bin packing), ad tech (campaign/creative selection), warehousing (slotting/picking), manufacturing (cutting/packing)
- Use cases: traffic-aware ad allocation under budget/targets; VM/container placement tuned to observed instance-size histograms; SKU-specific slotting; crew shift packing
- Tools/products/workflows: “Bottleneck-hint compiler” that learns active-resource patterns and compiles fast sort-score-fill-repair routines; drops to LP/MIP solver for corner cases
- Assumptions/dependencies: recurring demand and resource profiles; trace logs; acceptable use of bounded local repair; fallback to solver for edge instances
- Graph problem specialization for network operations
- Sectors: telecom (frequency/channel assignment), security/IT (dominating sets for monitoring/coverage), utilities (sensor placement), social platforms (graph sampling)
- Use cases: plant-aware graph coloring through palette/separator hints; dominating set acceleration via coverage kernels/hubs; MIS with motif decompositions for sparse topologies
- Tools/products/workflows: “Graph-hint studio” that infers separators/kernels/motifs from network snapshots and emits specialized routines with verification checks
- Assumptions/dependencies: stationarity of topology motifs; periodic re-learning as networks evolve; quality-vs-speed operating points negotiated with operators
- SAT/MaxSAT preprocessing with learned backdoors
- Sectors: electronic design automation (formal verification, test generation), configuration management, software verification
- Use cases: instance-family-specific backdoor learning to speed SAT/MaxSAT; learned clause scoring; local repair before invoking RC2/Open-WBO
- Tools/products/workflows: backdoor detector trained on project/design families; preprocessor that enumerates small backdoors and falls back to complete solver
- Assumptions/dependencies: identifiable variable salience/backbones; preservation of correctness via fallback; integration with incumbent solver toolchains
- Compiler and build-system tuning via graph coloring and knapsack hints
- Sectors: software tooling, compilers, CI/CD
- Use cases: project-specific register allocation (graph coloring with palette/backdoor hints); test selection/prioritization (knapsack with historical failure/value profiles)
- Tools/products/workflows: LLVM pass that learns register-pressure palettes per codebase; CI plugin that compiles fast test selectors with fallback to full scheduler
- Assumptions/dependencies: stable code patterns; repository telemetry; offline synthesis amortized over many builds
- Operations scheduling in hospitals and call centers
- Sectors: healthcare, customer support, field service
- Use cases: recurring clinic schedules, operating-theatre block assignments, or shift rosters with unit-specific patterns; local-repair heuristics conditioned on learned bottlenecks
- Tools/products/workflows: scheduling assistant that learns department-specific constraints (soft rules, typical overflows) and compiles a fast repair-first scheduler
- Assumptions/dependencies: representative history; auditable fallback to certified solver; governance for fairness/compliance
- Energy microgrid and DER dispatch at facility scale
- Sectors: energy, buildings, microgrids
- Use cases: site-specific DER/storage dispatch with recurring demand/price profiles; learned kernels of binding constraints to avoid full LP each interval
- Tools/products/workflows: “Dispatch-accelerator” that learns binding resources and compiles a reduced model with certificate checks and fallback to full LP
- Assumptions/dependencies: predictable load/price regimes; safety constraints verified on fallback; change detection for regime shifts
- Academic tooling: benchmark and teaching kits
- Sectors: academia, education
- Use cases: course modules on beyond-worst-case analysis; labs for sample→hint→solver pipelines; dataset-specific solver leaderboards
- Tools/products/workflows: open-source SDK to define hint classes, analysis programs, and compilations; reproducible harness with runtime-quality metrics
- Assumptions/dependencies: curated datasets; instructor supervision; sandboxed execution
- Procurement and governance checklists for public-sector optimization
- Sectors: government, transit agencies, utilities
- Use cases: RFP criteria for learned solvers: sample representativeness, fallback correctness, drift monitoring, and audit trails
- Tools/products/workflows: policy templates and validation protocols (shadow runs, holdouts, red-team shifts) before deployment
- Assumptions/dependencies: access to historical instances; capacity for ongoing validation; clear SLAs on feasibility and runtime
Long-Term Applications
- General-purpose hint compilers integrated with commercial solvers
- Sectors: software, OR platforms
- Use cases: universal “Comp(H) SDK” that discovers and compiles hints across SAT/CP/LP/MIP models automatically; solver portfolios enriched with learned specializations
- Tools/products/workflows: plug-and-play module for Gurobi/CP-SAT/SCIP that performs sample-based analysis, emits specialized presolve/callbacks, and manages fallbacks
- Assumptions/dependencies: standardized interfaces for hints and safety constraints; extensive benchmarking; robust shift detection
- Autonomous robotics and warehouse planning specialized per facility
- Sectors: robotics, warehousing, manufacturing
- Use cases: facility-specific task/motion planners that exploit aisle geometry, SKU heatmaps, and traffic motifs to reduce search; replanning with learned kernels
- Tools/products/workflows: “Planner factory” that trains on logs/simulations and emits certified controllers with runtime guards
- Assumptions/dependencies: high assurance requirements; formal verification of safety; sim-to-real robustness; continual learning infrastructure
- Power system unit commitment and market operations
- Sectors: energy markets, grid operators
- Use cases: region-specific UC/ED approximations with learned binding constraints and backdoors; fast contingencies screening using hints
- Tools/products/workflows: operator-grade module with certification, counterfactual stress tests, and strict fallbacks to full MILP/AC models
- Assumptions/dependencies: regulatory approval; provable feasibility/security; comprehensive out-of-distribution guards
- Financial optimization and market microstructure
- Sectors: finance, trading, portfolio management
- Use cases: flow-aware execution/placement tuned to venue/order-flow distributions; portfolio rebalancing with learned sparsity/budget bottlenecks
- Tools/products/workflows: “Hint-aware” optimizers with scenario stress testing, compliance logging, and fallback to conservative strategies
- Assumptions/dependencies: strict risk limits; adversarial shift considerations; explainability/auditability
- Healthcare pathway and personalized treatment planning
- Sectors: healthcare delivery, radiation therapy, personalized medicine
- Use cases: clinic/hospital-specific resource scheduling; patient-cohort-specific plan construction using kernels/bottlenecks; radiation plan optimization accelerators
- Tools/products/workflows: certified solvers with clinical validation sets; drift alarms and automatic rollback; human-in-the-loop review
- Assumptions/dependencies: clinical safety and regulatory approvals; strong guarantees on feasibility/quality; secure data integration
- National-scale infrastructure and public-policy optimization
- Sectors: transportation, housing, emergency response
- Use cases: region-specific siting/coverage (schools, chargers), evacuation routing, seasonal transit planning using learned structural hints
- Tools/products/workflows: transparent model cards, participatory validation, fairness constraints encoded in fallback and repair logic
- Assumptions/dependencies: governance for equity and privacy; robust performance under shocks; explainable trade-offs
- Scientific computing and inference accelerators
- Sectors: computational science, biology, physics
- Use cases: MCMC and ILP accelerators with learned proposal/backdoor structures for recurring experimental regimes; lab-specific experiment design optimizers
- Tools/products/workflows: lab-facing SDK for hint discovery; reproducibility artifacts; integration with HPC schedulers
- Assumptions/dependencies: persistent experimental regimes; correctness certificates; provenance tracking
- Edge and embedded optimization
- Sectors: IoT, automotive, avionics
- Use cases: compiled micro-solvers for on-device scheduling (sensor fusion windows, packet scheduling), tuned to deployment traces
- Tools/products/workflows: ahead-of-time hint compilation to small-footprint code; watchdog fallbacks
- Assumptions/dependencies: tight memory/latency budgets; certification; infrequent but safe re-synthesis
- Marketplace and governance for sharing hints
- Sectors: enterprise platforms, data collaboratives
- Use cases: privacy-preserving exchange of reusable hints (not raw data) across organizations to accelerate similar workloads
- Tools/products/workflows: federated hint learning, differential privacy, and provenance; licensing for hint artifacts
- Assumptions/dependencies: privacy guarantees; standardization of hint schemas; legal frameworks
- Education and workforce upskilling
- Sectors: higher education, professional training
- Use cases: curricula on beyond-worst-case algorithmics and amortized design; capstones that build dataset-specific solvers for partner orgs
- Tools/products/workflows: open benchmarks, grading harnesses, and safe sandboxes
- Assumptions/dependencies: institutional adoption; maintenance of public datasets and evaluation tooling
Notes on feasibility across applications:
- Core dependencies: representative samples from the true deployment distribution; existence of reusable structure; ability to amortize synthesis costs; correctness-preserving fallback; monitoring and re-synthesis for distribution shift.
- Risks and mitigations: brittleness under shift (use drift detection, guardrails, and fallbacks), quality-runtime trade-offs (multi-objective validation), code safety (sandboxing, audits), domain constraints (regulatory compliance and formal checks).
Glossary
- Algorithm selection: Choosing the best-performing algorithm from a set based on instance features or data. "The closest classical line is algorithm selection [47] and feature-based portfolios such as SATzilla, Hydra, and AutoFolio [59, 58, 37, 31, 30]."
- Amortization: Shifting computation cost from per-instance inference to a one-time cost, reducing average runtime on future instances. "The framework can also be read as amortization: instead of paying inference-time compute on every instance, we pay a one-time synthesis cost against the sample and then deploy a solver whose per-instance cost is lower."
- Average-case complexity: The study of algorithmic complexity under a specified input distribution, rather than worst-case inputs. "Average-case complexity [34, 28, 5] asks when problems become tractable under input distributions, but often requires an analytic distribution before the analysis can begin."
- Beam search: A heuristic search that keeps a fixed-size set (beam) of the most promising candidates at each step. "Because the relevant hypothesis class is unknown, we search over a beam of candidates."
- Branch-and-bound: A tree search technique for exact optimization that prunes subproblems using bounds. "branch-and-bound routines [33]"
- CNF (Conjunctive Normal Form): A Boolean formula structured as an AND of OR-clauses, commonly used in SAT. "a distribution $D_B$ over CNF formulas on $d$ variables."
- Complete solver: An algorithm that is guaranteed to return a correct decision/solution (or prove infeasibility) for any input. "Correctness need not be learned: a complete solver can always be used as fallback."
- Concentration (arguments): Probabilistic tools bounding deviations between empirical and expected quantities. "The results use standard concentration and union-bound arguments [50]."
- Deployment distribution: The (unknown) distribution of problem instances encountered at test time. "two solvers may both return valid solutions on the deployment distribution while differing substantially in runtime."
- Dominating Set: A graph problem seeking a minimum set of vertices such that every vertex is either in the set or adjacent to it. "Dominating Set private instances"
- Empirical Risk Minimization (ERM): Choosing a hypothesis that minimizes error on the observed sample; here adapted to runtime. "Let C be given in advance, a natural rule is the runtime-aware analogue of empirical risk minimization."
- Expected deployment runtime: The expected running time of a solver over the deployment distribution. "expected deployment runtime $\mathrm{Run}_D(c) := \mathbb{E}_{x \sim D}[T(c, x)]$."
- Fallback (to a complete solver): A mechanism where a specialized solver reverts to a general, correct solver to preserve correctness. "we focus on the regime in which $\mathrm{Comp}(h)$ is correct for every $h \in H$, typically because the compiled solver falls back to a generic complete solver."
- Gurobi: A commercial optimization solver widely used for (mixed) integer programming. "and are 336.9x, 342.8x, and 16.1x faster than the quality-best heuristic, Gurobi, and the selected time-limited exact backend, respectively."
- Held-Karp dynamic programming: An exact dynamic programming algorithm for TSP with $O(n^2 2^n)$ time. "Held-Karp dynamic programming for TSP [21]"
- Hint space: The set of candidate reusable structural summaries inferred from samples and compiled into solvers. "A hint space $H$ and compilation map $\mathrm{Comp} : H \to C$ split learning into $S \mapsto h_S \mapsto c_S = \mathrm{Comp}(h_S)$."
- Horn-SAT: The satisfiability problem restricted to Horn clauses (at most one positive literal per clause), which is polynomial-time solvable. "Thus B is a strong backdoor to Horn-SAT."
- Hyper-heuristics: Methods that automate the design, selection, or composition of heuristics rather than solving instances directly. "A broader neighboring literature studies hyper-heuristics and automated heuristic design, where the goal is to select, compose, or generate heuristics for families of optimization problems."
- Identifiable structure: A structural property that can be reliably recovered from data due to a positive separation (margin). "Theorem 5.2 (Exact recovery under identifiable structure)."
- Learning-augmented algorithms: Algorithms that incorporate predictions/advice to improve performance while maintaining robustness. "with learning-augmented algorithms, where predictions or advice modify the behavior of a fixed algorithm while preserving robustness when advice is inaccurate [39, 41]."
- Lexicographic ranking: Ordering candidates by comparing tuples of metrics in a fixed priority order. "Candidates are ranked lexicographically by $(Q_{\mathrm{val}}, O_{\mathrm{val}}, -T_{\mathrm{val}})$,"
- Lin–Kernighan heuristic (LKH): A powerful local-search heuristic for TSP and related problems. "including two-opt and LKH [36, 48, 22]."
- Margin separation: A positive gap between the score of the true structure and any alternative, enabling reliable recovery. "If $|H| = N$ and the margin is $\gamma > 0$, then $n \ge \frac{2}{\gamma^2} \log \frac{2N}{\delta}$ samples suffice for $\hat{h} = h^*$ with probability at least $1 - \delta$."
- MaxSAT (Maximum Satisfiability): The optimization version of SAT that maximizes the number (or weight) of satisfied clauses. "PySAT/RC2. MaxSAT solvers [27],"
- MDKP (Multidimensional Knapsack Problem): A knapsack variant with multiple resource constraints. "MDKP"
- MIS (Maximum Independent Set): A graph problem seeking a largest set of pairwise non-adjacent vertices. "MIS: motif structure"
- PACE (Parameterized Algorithms and Computational Experiments): A challenge series focusing on algorithmic performance on structured benchmarks. "On released PACE 2025 [43] Dominating Set private instances,"
- Parameterized complexity: A framework analyzing complexity with respect to both input size and one or more parameters. "Smoothed analysis [52], parameterized complexity [14], and structural backdoors [57] each give hand-designed routes to distribution-specific tractability."
- Program synthesis: Automatically generating programs from specifications, examples, or natural language. "Our synthesis regime connects to program synthesis from examples or natural language [19, 3, 13],"
- Realizable setting: An assumption that the true target (e.g., hint) lies within the considered hypothesis space. "We assume a realizable setting: $D = D_{h^*}$ for some unknown $h^* \in H$,"
- Sample-access regime: A learning setup where the distribution is accessed only through i.i.d. samples. "We study the sample-access regime: given $S = (x_1, \ldots, x_n) \sim D^n$ from an unknown deployment distribution, the learner returns solver code for future instances from the same $D$."
- Sample complexity: The number of samples required to achieve a learning guarantee. "The sample complexity is logarithmic in |H| and inverse-quadratic in the margin,"
- Sample-consistent solver: A solver that achieves correctness on all training samples. "the empirically fastest sample-consistent solver from a fixed library generalizes in both correctness and runtime,"
- Smoothed analysis: An analysis paradigm studying performance under slight random perturbations of inputs. "Smoothed analysis [52], parameterized complexity [14], and structural backdoors [57]"
- Solver hint: A reusable structural summary inferred from samples and compiled into specialized solver code. "Our central abstraction is a solver hint: reusable structure inferred from samples and compiled into specialized solver code."
- Strong backdoor: A set of variables whose assignments reduce every instance in a family to a tractable subproblem. "We assume B is a strong backdoor into a tractable class T:"
- Time-limited exact backend: An exact or certifying solver run under a fixed time budget used as a baseline. "the selected time-limited exact backend,"
- Tractable class: A problem class solvable efficiently (e.g., in polynomial time). "a tractable class T"
- Union bound: A basic probability inequality bounding the probability of a union of events by the sum of their probabilities. "The results use standard concentration and union-bound arguments [50]."
- Worst-case complexity: The study of performance guarantees under the hardest possible inputs. "Worst-case complexity has a clean primary object, the language,"
- Zero-shot: Generated or performed without additional training or iterative refinement beyond the initial prompt. "the zero-shot generated solver"
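The margin-separation and sample-complexity entries can be turned into a quick calculation. The sketch below assumes the bound takes the standard Hoeffding-plus-union-bound form n ≥ (2/γ²)·log(2N/δ) for scores bounded in [0, 1], which matches the stated "logarithmic in |H| and inverse-quadratic in the margin" behavior; the function name and the example values of N, γ, and δ are hypothetical.

```python
import math


def samples_for_exact_recovery(num_hints, margin, delta):
    """Samples sufficient for exact hint recovery under the assumed bound.

    num_hints: size N of the finite hint class
    margin:    score gap gamma between the true hint and any alternative
    delta:     allowed failure probability
    """
    if num_hints < 1 or margin <= 0 or not 0 < delta < 1:
        raise ValueError("need num_hints >= 1, margin > 0, and 0 < delta < 1")
    return math.ceil(2.0 / margin**2 * math.log(2 * num_hints / delta))


# Hypothetical setting: 1,000 candidate hints, margin 0.1, 5% failure probability.
print(samples_for_exact_recovery(1_000, 0.1, 0.05))

# Growing the hint class a thousandfold adds only a logarithmic number of samples,
# while halving the margin roughly quadruples the requirement.
print(samples_for_exact_recovery(1_000_000, 0.1, 0.05))
print(samples_for_exact_recovery(1_000, 0.05, 0.05))
```

The takeaway is that a large hint class is cheap in samples, whereas a weak signal (small margin) is expensive.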