Problem Reductions at Scale: Agentic Integration of Computationally Hard Problems

Published 13 Apr 2026 in cs.AI | (2604.11535v1)

Abstract: Solving an NP-hard optimization problem often requires reformulating it for a specific solver -- quantum hardware, a commercial optimizer, or a domain heuristic. A tool for polynomial-time reductions between hard problems would let practitioners route any supported problem to any supported solver through a single interface. Building such a library at scale, however, has remained out of reach. We show that harness engineering, the practice of designing constraints, verification systems, and feedback loops that channel AI coding agents, can overcome this barrier. Our harness combines a no-code contribution route for domain experts, a multilayer verification stack ranging from type-level checks to agentic feature tests (AI agents role-playing as end users), and a fully automated implementation-review-integration pipeline. In about three months, we built a command-line tool backed by a library of 100+ problem types and 200+ reduction rules in over 170k lines of Rust. The result suggests that a well-engineered harness lets agents build well-tested software at a scale and pace beyond prior reduction-library efforts. Because the reduction graph composes transitively, a new solver registered for any single problem type instantly becomes available to every problem connected by a reduction path. The source code is available at https://github.com/CodingThrust/problem-reductions.


Authors (3)

Summary

The paper introduces an agentic harness system that integrates over 190 problem types and 265 reduction rules to build a large-scale, verified reduction library.
Its reduction graph and multilayer verification pipeline ensure composability, correctness, and streamlined access to mature solvers such as HiGHS via ILP.
The no-code contribution pathway coupled with AI agent automation accelerates reduction development and mitigates issues such as manual error and convention drift.

Agentic Integration of Computationally Hard Problem Reductions

Introduction and Motivation

The paper "Problem Reductions at Scale: Agentic Integration of Computationally Hard Problems" (2604.11535) addresses the persistent challenge of integrating a wide range of combinatorial optimization problems, many of them NP-hard, with a diverse set of solvers, each requiring its own problem formulation. Despite robust theoretical catalogs such as Garey and Johnson’s compendium, bespoke solver-specific reductions, convention drift, and a shortage of software implementations have so far prevented scalable, uniform, and composable reduction libraries.

To overcome this, the paper introduces a comprehensive agentic harness system combining a multilayered verification infrastructure, a no-code contribution pathway for domain experts, and an automation pipeline leveraging AI agents. This architecture allowed the rapid and verified construction of a large-scale reduction library, implemented in Rust, connecting over 190 problem types through 265 reduction rules in about three months.

The Reduction Graph: Structure and Semantics

Central to the approach is the explicit construction of a reduction graph. Nodes denote problem types (with variants reflecting domain restrictions or parameterizations), and directed edges encode primitive reduction rules (algorithms for instance and solution mapping, together with formal size overheads as multivariate polynomials).

Figure 1: The reduction graph with nodes as problem types and edges as primitive reduction rules; anchor roles (3-SAT for hardness proofs, ILP as solver gateway) are highlighted.

Composing edges along a path chains reductions transitively and yields composite overheads; for instance, a chain $A \to B \to C$ produces the analyzable mapping $r_{C \leftarrow A} = r_{C \leftarrow B} \circ r_{B \leftarrow A}$. Two anchor paths are essential: reductions sourced from 3-SAT establish NP-hardness by executable code, while reductions targeting ILP enable solver-backed verification.
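
The composition rule can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the `Overhead` type, its methods, and the two example size maps are hypothetical and are not taken from the library, which represents overheads as symbolic multivariate polynomials rather than closures.

```rust
/// Instance size as a list of measures, e.g. [num_vertices, num_edges].
type Size = Vec<u64>;

/// Hypothetical overhead of one reduction rule: maps the size of a source
/// instance to the size of the target instance it produces.
struct Overhead {
    map: Box<dyn Fn(&[u64]) -> Size>,
}

impl Overhead {
    fn new(f: impl Fn(&[u64]) -> Size + 'static) -> Self {
        Overhead { map: Box::new(f) }
    }

    /// Evaluate the overhead for a concrete source size.
    fn apply(&self, source_size: &[u64]) -> Size {
        (self.map)(source_size)
    }

    /// Compose two overheads along a path A -> B -> C.
    /// `self` plays the role of r_{B<-A}, `next` the role of r_{C<-B};
    /// the result is r_{C<-A} = r_{C<-B} after r_{B<-A}.
    fn then(self, next: Overhead) -> Overhead {
        Overhead::new(move |size| next.apply(&self.apply(size)))
    }
}
```

With the made-up rules `(n, m) -> (n + m, 2m)` followed by `(v, e) -> (v, v + e)`, a source of size `[4, 5]` first becomes `[9, 10]` and then `[9, 19]`, which is what `ab.then(bc).apply(&[4, 5])` returns.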

This explicit reduction graph not only formalizes the connectivity of the problem landscape but also determines the accessibility of mature solvers to arbitrary problems and systematically reveals underexplored gaps ripe for future reduction-rule development.
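
How connectivity determines solver access can be illustrated with a toy graph search. The sketch below assumes a hypothetical `ReductionGraph` type and uses breadth-first search to find a shortest chain of rules; the problem names in the usage note are illustrative and do not claim to mirror the library's actual edges or path-selection strategy.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Toy reduction graph: an edge (a, b) means "a reduces to b".
struct ReductionGraph {
    edges: HashMap<&'static str, Vec<&'static str>>,
}

impl ReductionGraph {
    fn new(rules: &[(&'static str, &'static str)]) -> Self {
        let mut edges: HashMap<&'static str, Vec<&'static str>> = HashMap::new();
        for &(src, dst) in rules {
            edges.entry(src).or_default().push(dst);
        }
        ReductionGraph { edges }
    }

    /// Breadth-first search for a shortest chain of rules from `src` to `dst`.
    /// Returns the visited problem types in order, or `None` if unreachable.
    fn path(&self, src: &'static str, dst: &'static str) -> Option<Vec<&'static str>> {
        let mut parent: HashMap<&'static str, &'static str> = HashMap::new();
        let mut seen: HashSet<&'static str> = HashSet::from([src]);
        let mut queue = VecDeque::from([src]);
        while let Some(node) = queue.pop_front() {
            if node == dst {
                // Walk the parent links back to the source.
                let mut path = vec![dst];
                let mut cur = dst;
                while let Some(&p) = parent.get(cur) {
                    path.push(p);
                    cur = p;
                }
                path.reverse();
                return Some(path);
            }
            for &next in self.edges.get(node).into_iter().flatten() {
                if seen.insert(next) {
                    parent.insert(next, node);
                    queue.push_back(next);
                }
            }
        }
        None
    }
}
```

In this model, a solver registered for `"ILP"` is usable for any problem type `p` for which `graph.path(p, "ILP")` returns `Some(..)`, and a `None` result points at a gap in the graph.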

Library and Harness Architecture

The reduction framework is realized as a Rust crate, architected in two principal layers: interfaces and infrastructure.

Figure 3: The library comprises a CLI and PDF manual (interface) atop infrastructure modules for problems, reductions, example database, solvers, and a symbolic engine.

Interfaces provide a CLI (pred) capable of interactive and composable workflows (problem instantiation, reduction, solver invocation, result evaluation) and a PDF manual, generated from Typst source, serving as both documentation and agent-readable reference with formal definitions and worked examples.

Infrastructure modules encapsulate:

Problem types/variants: Uniform interfaces with custom size measures and complexity attributes.
Reduction rules: Forward instance mappings and backward solution mappings, with formal overhead polynomials.
Example database: Unifying canonical instances for testing, documentation, and interactive exploration.
Solvers: Modular, with a preference hierarchy: native solvers, reduction-to-ILP with HiGHS, and brute-force evaluators.
Symbolic engine: Facilities for complexity and overhead analysis, providing path cost evaluation and reduction optimization.
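
The solver preference hierarchy listed above amounts to a simple fallback chain. The following is a minimal sketch, assuming hypothetical `Strategy` and `Capabilities` types that are not part of the library's API:

```rust
/// Hypothetical solving strategies, in order of preference.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy {
    /// A solver written specifically for this problem type.
    Native,
    /// Reduce to ILP and hand the instance to an ILP solver such as HiGHS.
    ViaIlp,
    /// Enumerate all configurations; only viable for small instances.
    BruteForce,
}

/// What is known about a problem type (hypothetical fields).
struct Capabilities {
    has_native_solver: bool,
    has_path_to_ilp: bool,
}

/// Pick the most preferred strategy that is available.
fn choose_strategy(caps: &Capabilities) -> Strategy {
    if caps.has_native_solver {
        Strategy::Native
    } else if caps.has_path_to_ilp {
        Strategy::ViaIlp
    } else {
        Strategy::BruteForce
    }
}
```

Brute force sits last because it is always available, which makes it a useful reference for checking the other two strategies on small instances.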

The Rust type system is exploited for compile-time verification, reducing agent-originated errors by enforcing complete implementations and validating symbolic expressions for overheads and variant registrations.
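
To illustrate how traits can enforce complete implementations at compile time, here is a minimal sketch. The trait names `Problem` and `ReduceTo`, their methods, and the Partition-to-SubsetSum example are assumptions made for illustration and may differ from the library's actual trait hierarchy. The point is that an `impl` missing either `reduce` or `extract_solution` is rejected by the compiler.

```rust
/// Hypothetical uniform interface for a (decision) problem type.
trait Problem {
    type Config;
    fn num_variables(&self) -> usize;
    fn is_satisfied(&self, config: &Self::Config) -> bool;
}

/// Hypothetical reduction rule: both directions must be implemented.
trait ReduceTo<T: Problem>: Problem {
    /// Map a source instance to a target instance.
    fn reduce(&self) -> T;
    /// Map a target solution back to a source solution.
    fn extract_solution(&self, target_config: &T::Config) -> Self::Config;
}

struct Partition { items: Vec<u64> }
struct SubsetSum { items: Vec<u64>, target: u64 }

/// Sum of the items whose flag in `config` is true.
fn chosen_sum(items: &[u64], config: &[bool]) -> u64 {
    items.iter().zip(config).filter(|(_, c)| **c).map(|(x, _)| *x).sum()
}

impl Problem for Partition {
    type Config = Vec<bool>;
    fn num_variables(&self) -> usize { self.items.len() }
    fn is_satisfied(&self, config: &Vec<bool>) -> bool {
        let total: u64 = self.items.iter().sum();
        config.len() == self.items.len() && 2 * chosen_sum(&self.items, config) == total
    }
}

impl Problem for SubsetSum {
    type Config = Vec<bool>;
    fn num_variables(&self) -> usize { self.items.len() }
    fn is_satisfied(&self, config: &Vec<bool>) -> bool {
        config.len() == self.items.len() && chosen_sum(&self.items, config) == self.target
    }
}

impl ReduceTo<SubsetSum> for Partition {
    fn reduce(&self) -> SubsetSum {
        let total: u64 = self.items.iter().sum();
        // An odd total cannot be split evenly, so emit an unreachable target.
        let target = if total % 2 == 0 { total / 2 } else { total + 1 };
        SubsetSum { items: self.items.clone(), target }
    }
    fn extract_solution(&self, target_config: &Vec<bool>) -> Vec<bool> {
        // Both problems use the same selection vector.
        target_config.clone()
    }
}
```

Encoding the two mappings as required trait methods moves one class of agent mistakes (a forgotten solution map, a mismatched configuration type) from review time to compile time.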

No-Code Contributions and Skill-Based Automation

A distinguishing aspect is the no-code contribution pipeline, lowering the barrier for domain experts to contribute reductions without programming knowledge. This is enabled by a comprehensive skills architecture and tightly controlled division of human/agent labor.

Figure 2: Illustration of the integration problem, mitigated by structured issue-based contributions and automated agent handling of implementation and documentation.

Humans retain the responsibilities that demand deep reasoning or creativity: strategic direction (selection of reductions), correctness checks, construction of canonical examples, and merge authorization. Agents autonomously handle implementation, code review, convention enforcement, and routine engineering via composable Markdown-encoded skills.

Figure 5: The skills architecture bifurcating advisor (human-in-the-loop) and automation (fully autonomous) skills, executed by appropriate agents.

Contribution is formalized as a state machine on the GitHub project board:

Figure 4: The contribution pipeline is mapped to discrete states driven by domain experts (issue), maintainers (triage/merge), and agent roles (implementation, review).

Structured brainstorming and pre-validation are enforced by the propose advisor skill, and checklists governing all required reduction components (algorithm, overhead, proof sketch, example) steer the add-rule automation skill, preventing convention drift.
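
A contribution state machine of this kind can be modeled as an enum with a guarded transition function. The state names, actor names, and transitions below are hypothetical, chosen only to mirror the roles described above (domain experts file issues, maintainers triage and merge, agents implement and review); they are not the project board's actual column names.

```rust
/// Hypothetical pipeline states for one contribution.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State { Draft, Filed, Triaged, Implemented, Reviewed, Merged }

/// Hypothetical roles that may move a contribution forward.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Actor { DomainExpert, Maintainer, Agent }

/// Advance a contribution by one step, rejecting moves the actor may not make.
fn advance(state: State, actor: Actor) -> Result<State, String> {
    use Actor::*;
    use State::*;
    match (state, actor) {
        (Draft, DomainExpert) => Ok(Filed),      // expert files a structured issue
        (Filed, Maintainer) => Ok(Triaged),      // maintainer accepts the proposal
        (Triaged, Agent) => Ok(Implemented),     // agent writes the code
        (Implemented, Agent) => Ok(Reviewed),    // a reviewing agent checks it
        (Reviewed, Maintainer) => Ok(Merged),    // human authorizes the merge
        (s, a) => Err(format!("{a:?} may not advance a contribution in state {s:?}")),
    }
}
```

Making every illegal move an explicit error keeps merge authorization with a human even when all intermediate steps are automated.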

Verification and Correctness Guarantees

Reliability is maintained by a stringent six-layer verification stack, with two mechanisms of particular novelty:

Round-trip tests: Every reduction is backed by domain expert–provided canonical examples. Downstream JSON fixtures, documentation, and CLI modes are generated from a single source of truth. The round-trip—reduce, solve, inverse map, re-evaluate—provides elementary and compositional correctness checks.
Figure 8: The propagation of canonical examples as a single source of truth across the verification and documentation pipeline, ensuring immediate surfacing of semantic drift.
Agentic feature tests: Agents simulate realistic domain personas (e.g., logistics optimization experts) in fresh context windows, performing complete problem-to-solution cycles to expose user-relevant and documentation gaps not discoverable via ordinary unit tests.
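
The round-trip idea (reduce, solve, inverse map, re-evaluate) can be made concrete with a classic textbook reduction. The sketch below is not the library's test harness: all function names are hypothetical, the reduction is the standard Maximum Independent Set to Minimum Vertex Cover complement rule, and vertex sets are stored as bitmasks on the assumption that the graph has fewer than 32 vertices.

```rust
/// Undirected graph on vertices 0..n (assumes n < 32 so sets fit in a u32).
struct Graph { n: usize, edges: Vec<(usize, usize)> }

/// Size of `set` if it is an independent set, otherwise `None`.
fn eval_independent_set(g: &Graph, set: u32) -> Option<u32> {
    let ok = g.edges.iter().all(|&(u, v)| (set >> u) & 1 == 0 || (set >> v) & 1 == 0);
    ok.then(|| set.count_ones())
}

/// Size of `set` if it is a vertex cover, otherwise `None`.
fn eval_vertex_cover(g: &Graph, set: u32) -> Option<u32> {
    let ok = g.edges.iter().all(|&(u, v)| (set >> u) & 1 == 1 || (set >> v) & 1 == 1);
    ok.then(|| set.count_ones())
}

/// Try every subset and return a feasible one with the highest score.
fn brute_force(n: usize, score: impl Fn(u32) -> Option<i64>) -> Option<(u32, i64)> {
    (0..(1u32 << n))
        .filter_map(|mask| score(mask).map(|s| (mask, s)))
        .max_by_key(|&(_, s)| s)
}

/// Round trip for Independent Set -> Vertex Cover on one instance.
fn round_trip(g: &Graph) -> bool {
    // 1. Reduce: the target instance is the same graph.
    // 2. Solve the target: a minimum cover maximizes the negated size.
    let (cover, _) = brute_force(g.n, |m| eval_vertex_cover(g, m).map(|k| -(k as i64)))
        .expect("the full vertex set is always a cover");
    // 3. Inverse map: the complement of a cover is an independent set.
    let all_vertices = (1u32 << g.n) - 1;
    let independent = !cover & all_vertices;
    // 4. Re-evaluate on the source and compare with the direct optimum.
    let direct = brute_force(g.n, |m| eval_independent_set(g, m).map(i64::from))
        .map(|(_, size)| size);
    eval_independent_set(g, independent).map(i64::from) == direct
}
```

A mistake in either direction of the rule, such as forgetting the complement in step 3, makes the re-evaluated value disagree with the direct optimum (or become infeasible), so the check fails on the canonical example.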

Empirical Outcomes and Scaling Dynamics

The constructed reduction graph exhibits considerable coverage:

Figure 6: As of April 2026, 129 of 190 problem types have a reduction path to ILP, facilitating solver access; 78 are verified as NP-hard via reduction from 3-SAT.

Scaling dynamics reveal a recognizable phase transition as the automation pipeline matures. Initial manual integration is slow; after skill-based contributions and automation stabilize, the rate of problem and reduction accrual increases sharply, with the majority of growth concentrated in an intensive five-week window.

Figure 11: Growth curves of problem types and reductions show nonlinear acceleration post-adoption of full agentic pipeline.

This provides empirical support for the efficacy of agentic harness engineering in domains that feature uniform input/output structure and cheap verification gates, even as long-horizon agentic coding continues to face limitations elsewhere.

Practical and Theoretical Implications

Practically, the library makes a broad range of combinatorial problems solvable through a unified CLI, bridging diverse formulations to mature solver backends (notably HiGHS via ILP) and backing NP-hardness claims with executable reductions from 3-SAT. The architecture is designed to resist common sources of software entropy: convention drift, manual error, and the fatigue of repetitive engineering work.

Theoretically, the explicit, dynamic reduction graph enables the computational complexity community to study problem connectivity and cluster structure systematically and to identify minimal (or missing) reductions. Extensions to fine-grained cost modeling and to other complexity classes (#P, PSPACE, etc.) are now approachable research directions.

Conclusion

This work demonstrates that the careful design of harnesses (project specifications, skill hierarchies, and multilayer verification) enables AI agents to construct and maintain a large, verifiable, composable library of problem reductions at a scale and pace not previously achieved manually. The artifact is evidence that automation, when judiciously controlled, can both democratize mathematical software contributions and provide a lasting framework for community extension and theoretical investigation.

Figure 13: The trait hierarchy and compile-time validation ensure that only semantically and syntactically correct reductions and problem variants are integrated.

The compositional, agent-driven approach to reduction-graph integration provides a blueprint applicable to other mathematically structured domains requiring compositional transformations, setting the stage for further agentic augmentation in formal science and engineering workflows.
