Optimally Auditing Adversarial Agents

Published 28 Apr 2026 in cs.GT, cs.AI, and cs.CY | (2604.25085v1)

Abstract: Fraud can pose a challenge in many resource allocation domains, including social service delivery and credit provision. For example, agents may misreport private information in order to gain benefits or access to credit. To mitigate this, a principal can design strategic audits to verify claims and penalize misreporting. In this paper, we introduce a general model of audit policy design as a principal-agent game with multiple agents, where the principal commits to an audit policy, and agents collectively choose an equilibrium that minimizes the principal's utility. We examine both adaptive and non-adaptive settings, depending on whether the principal's policy can be responsive to the distribution of agent reports. Our work provides efficient algorithms for computing optimal audit policies in both settings and extends these results to a setting with limited audit budgets.



Summary

The paper introduces a formal framework that computes near-optimal audit vectors with an efficient $O(m^2)$ algorithm under worst-case adversarial equilibria.
It compares non-adaptive and adaptive audit regimes, demonstrating that adaptive audits offer no gain over non-adaptive ones under typical penalty structures.
The study extends the model to budgeted audits and online learning, showing that increasing penalties and decreasing audit costs improve both principal utility and social welfare.

Optimal Auditing of Adversarial Agents: Technical Summary and Analysis

Introduction and Problem Setting

This paper develops a formal framework for audit policy design in principal-agent games with multiple, strategic, adversarial agents. The core motivation is the proliferation of high-stakes AI-driven decisions in domains such as social service delivery and credit allocation, where applicants possess private information and may misreport to maximize their benefit. The principal (authority) can commit to a (possibly adaptive) audit policy, auditing is costly (possibly subject to a hard budget), and misreporting detected in an audit incurs a penalty.

The model is instantiated as a non-atomic game with $m$ ordered types, arbitrary value and payment matrices, and type-independent or affine penalties. Key distinctions are drawn along three axes:

Objective: Maximizing principal utility versus social welfare.
Audit responsiveness: Non-adaptive (fixed probabilities) versus adaptive (potentially responsive to observed reports).
Audit cost structure: Marginal per-audit cost versus a hard audit budget.

The agents behave adversarially, i.e., they coordinate to select an equilibrium that minimizes the principal’s utility given the chosen audit policy. The principal thus designs a robust audit mechanism for the worst-case equilibrium.

Model Structure and Agent Strategies

Agents' best responses and equilibrium selection are characterized in closed form. Given an audit vector $\mathbf{p}$, where $p_k$ is the audit probability for report $k$, together with the payment and penalty functions, the utility of a type-$i$ agent reporting $k$ is $U_{i,k} := \mathrm{pay}(k) - p_k \cdot \mathrm{pen}(i,k)$. This formulation yields a threshold structure: lower types may strictly prefer to over-report into a profitable "pooling" class, while higher types may report truthfully.
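
The utility formula can be evaluated directly. The following is a minimal sketch, assuming a small hypothetical instance: the payments, the flat penalty of 4 for any misreport (with zero penalty for a truthful report), the audit vector, and the function names `agent_utilities` and `best_reports` are all illustrative choices, not values or identifiers from the paper.

```python
import numpy as np

def agent_utilities(pay, pen, p):
    """Return the matrix U[i, k] = pay[k] - p[k] * pen[i, k]."""
    pay = np.asarray(pay, dtype=float)
    pen = np.asarray(pen, dtype=float)
    p = np.asarray(p, dtype=float)
    return pay[None, :] - p[None, :] * pen

def best_reports(pay, pen, p, tol=1e-12):
    """For each true type i, list every report k that maximizes U[i, k]."""
    U = agent_utilities(pay, pen, p)
    best = U.max(axis=1)
    return [np.flatnonzero(row >= b - tol).tolist() for row, b in zip(U, best)]

# Hypothetical instance with m = 3 ordered types (assumed numbers).
pay = [1.0, 2.0, 3.0]
pen = 4.0 * (1.0 - np.eye(3))   # assumed: flat penalty 4, none when truthful
p = [0.0, 0.1, 0.6]             # assumed audit probability per report

reports = best_reports(pay, pen, p)
print(reports)
```

In this instance the lowest type prefers to over-report into report 1, while the two higher types report truthfully, which matches the threshold pattern described above.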

The equilibrium set is delineated (Figure 1):

Figure 1: A non-adaptive audit game can have unattainable optimum principal utility due to worst-case, discontinuous equilibrium selection, exemplifying the Stackelberg pessimistic setting.

The principal's optimization problem is fundamentally a Stackelberg game with multiple followers and a continuous leader action space. It is non-convex owing to both the audit probability simplex and the structure of the equilibrium correspondence.
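
A toy example may help show why pessimistic equilibrium selection can leave the optimum unattained. The sketch below assumes a hypothetical two-type instance: the fraction `q` of low types, the gain from over-reporting, the penalty, the per-audit cost `c`, and the principal's payoff model are all assumptions made for illustration, not the paper's definitions.

```python
def principal_utility(p, q=0.5, c=0.2, gain=1.0, penalty=2.0):
    """Worst-case principal utility when report 'high' is audited with prob. p.

    Assumed model: a mass q of low types gains `gain` by over-reporting and
    pays `penalty` if audited. Indifferent agents break ties against the
    principal (pessimistic selection), hence the weak inequality.
    """
    misreport = gain - p * penalty >= 0
    mass_high = 1.0 if misreport else 1.0 - q   # mass submitting report 'high'
    overpayment = q * gain if misreport else 0.0
    return -overpayment - c * p * mass_high

threshold = 0.5  # gain / penalty: the low type's indifference point
for eps in (0.0, 1e-1, 1e-3, 1e-6):
    print(eps, principal_utility(threshold + eps))
```

At exactly the indifference point the low types still misreport, so the principal must audit strictly above it. Utility approaches -0.05 as the excess shrinks but never reaches that value, which is the kind of discontinuity the pessimistic setting produces.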

Algorithms and Computational Results

Non-Adaptive Audits with Marginal Cost

The main technical result for the non-adaptive, costly setting is an efficient algorithm (Algorithm 1, "SuccinctSearch") for computing an $\epsilon$-optimal audit vector that minimizes principal loss across all equilibria. Critical to tractability is the identification of "critical audit vectors" (audit vectors indexed by key breakpoints in agent misreport utilities) that suffice, up to approximation, for optimality.

Figure 2: Principal's utility under varying audit strategies and opponent equilibria, illustrating the discontinuity at the transition between truthful and pooling equilibria.

Key theorems establish:

The worst-case equilibrium is always supported by a single-minded agent strategy (at most one group misreports to a single type).
The principal's utility as a function of audit probabilities can be non-concave and even fail to achieve its supremum due to discontinuities in equilibrium selection.
The search over critical vectors yields optimality up to numerical approximation, with $O(m^2)$ runtime (Theorem 1).
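
To give a feel for where breakpoints in misreport utilities come from, the sketch below computes, for each pair of types, the audit probability at which a type-$i$ agent is indifferent between reporting truthfully and over-reporting $k$. It assumes a truthful report carries no penalty, so the indifference condition $\mathrm{pay}(i) = \mathrm{pay}(k) - p_k \cdot \mathrm{pen}(i,k)$ gives $p_k = (\mathrm{pay}(k) - \mathrm{pay}(i)) / \mathrm{pen}(i,k)$. This is an illustration only: the function name and numbers are hypothetical, and the paper's exact construction of critical audit vectors may differ.

```python
def indifference_breakpoints(pay, pen):
    """Map (i, k) -> audit probability p_k making type i indifferent
    between truthful reporting and over-reporting k.

    Assumes pen[i][i] == 0, so truthful utility is simply pay[i].
    Only profitable over-reports (pay[k] > pay[i]) are considered.
    """
    m = len(pay)
    points = {}
    for i in range(m):
        for k in range(m):
            if k == i or pay[k] <= pay[i] or pen[i][k] <= 0:
                continue
            # Cap at 1 because p_k is a probability.
            points[(i, k)] = min(1.0, (pay[k] - pay[i]) / pen[i][k])
    return points

# Hypothetical instance: three types, flat misreport penalty of 4.
pay = [1.0, 2.0, 3.0]
pen = [[0.0 if i == k else 4.0 for k in range(3)] for i in range(3)]
breakpoints = indifference_breakpoints(pay, pen)
print(breakpoints)
```

With $m$ ordered types there are at most $m(m-1)/2$ such pairs, so the number of candidate breakpoints grows quadratically in $m$.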

Online and Adaptive Extensions

The analysis is extended to adversarial online learning: when agent type priors change arbitrarily across rounds, the principal can achieve $O(n\sqrt{T m^2 \log m})$ regret with respect to the optimal fixed audit policy in hindsight, using a bandit algorithm (EXP3) with the critical vector set as arms.
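
For readers unfamiliar with EXP3, the following is a minimal sketch of the standard algorithm over a finite set of arms. Here each arm stands in for one candidate audit vector. The class name `Exp3`, the exploration rate `gamma`, and the fixed rewards in the demo are assumptions for illustration; rewards are assumed to be rescaled to $[0, 1]$, and no claim is made that this sketch reproduces the paper's regret constants.

```python
import numpy as np

class Exp3:
    """Standard EXP3 with uniform exploration rate gamma."""

    def __init__(self, n_arms, gamma=0.1, seed=0):
        self.n_arms = n_arms
        self.gamma = gamma
        self.weights = np.ones(n_arms)
        self.rng = np.random.default_rng(seed)

    def probabilities(self):
        w = self.weights / self.weights.sum()
        return (1 - self.gamma) * w + self.gamma / self.n_arms

    def select(self):
        probs = self.probabilities()
        arm = int(self.rng.choice(self.n_arms, p=probs))
        return arm, probs[arm]

    def update(self, arm, prob, reward):
        # Importance-weighted estimate: unbiased for the arm's true reward.
        estimate = reward / prob
        self.weights[arm] *= np.exp(self.gamma * estimate / self.n_arms)
        self.weights /= self.weights.max()  # rescale for numerical stability

# Demo with hypothetical fixed rewards per arm (stand-ins for the
# principal's rescaled utility under each candidate audit vector).
arm_rewards = [0.2, 0.5, 0.8]
learner = Exp3(n_arms=len(arm_rewards))
for _ in range(2000):
    arm, prob = learner.select()
    learner.update(arm, prob, arm_rewards[arm])

final_probs = learner.probabilities()
print(final_probs)
```

Only the reward of the chosen arm is observed each round, which is why the update divides by the selection probability. In an adversarial setting the rewards would change from round to round rather than stay fixed as in this demo.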

For adaptive audits, where the audit strategy can depend on the observed report distribution, the paper proves a "no-adaptive-gain" result: adaptive auditing does not outperform optimal non-adaptive auditing, provided the penalty is less sensitive than the payment and certain regularity assumptions are satisfied. The proof leverages the "dictator strategy" construction, which enforces any single-minded equilibrium and precludes gains from further adaptivity.

Budgeted Audit Setting

For the hard-budget scenario, the principal is constrained in expected total audits. The optimal adaptive audit can be computed efficiently, and the results parallel those from the unbudgeted case:

For very small budgets, the unique worst-case equilibrium involves all agents reporting the highest type, and all audits are allocated there.
For budgets above a threshold, optimal equilibria are single-minded, and a dynamic program over critical vectors suffices.
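
The budget constraint and the small-budget case can be sketched as follows. This assumes the population mass is normalized to 1 and the budget is expressed as the expected fraction of agents audited; both conventions, along with the function names, are assumptions made for illustration.

```python
def expected_audits(p, report_mass):
    """Expected fraction of agents audited: sum over reports k of
    p[k] * (mass of agents submitting report k)."""
    return sum(pk * rk for pk, rk in zip(p, report_mass))

def all_budget_on_top(budget, m):
    """Audit vector that spends the whole budget on the highest report,
    for the case where every agent submits that report."""
    p = [0.0] * m
    p[-1] = min(1.0, budget)
    return p

# Small-budget case: all agents report the highest of m = 3 types.
budget = 0.3
p = all_budget_on_top(budget, m=3)
pooled = [0.0, 0.0, 1.0]   # entire mass on the top report
print(p, expected_audits(p, pooled))
```

When every agent pools on the top report, audits on any other report are never triggered, so concentrating the audit probability there uses the budget exactly.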

(Figures 3, 4, 5, 6)

Figure 3: U-opt Principal's utility as model parameters vary.

Figure 4: U-opt Principal's utility for alternative audit policy regimes.

Figure 5: Principal utility and misreporting as a function of audit cost and penalty parameters.

Figure 6: Principal's utility under varying resolution (number of types), showing diminishing returns as granularity increases.

Monotonicity and Welfare Considerations

The model admits strong, monotonic comparative statics:

Increasing the penalty function or decreasing marginal audit costs can only improve both the principal’s utility and social welfare at the worst equilibrium.
Under affine penalties or for social welfare objectives, the same machinery applies (with objective function modifications), and the optimal audit can be efficiently computed.

Empirical Evaluation

Extensive simulations cover how optimal audit policies and equilibrium outcomes shift with:

The agent type prior: utility and auditing patterns exhibit sharp phase changes as population composition varies.
Penalty and cost parameters: as auditing becomes costlier or misreporting penalties increase, optimal audit probability allocations and misreporting rates change monotonically, as expected.
The number of types used for discretization: a finer type space allows for more granular equilibria, but principal utility and welfare saturate as $m$ increases.

Theoretical and Practical Implications

The paper’s results provide both computational and economic insights:

From a computational perspective, despite the multi-follower pessimistic Stackelberg structure, the considered auditing games admit highly efficient algorithms owing to their threshold and monotonicity properties, in contrast to general intractability in Stackelberg multi-follower settings.
For real-world audit mechanism design, these results precisely delineate when complex audit responsiveness is warranted (essentially never, under reasonable penalty structures), and how audits, penalties, and reporting granularity jointly determine both principal and social welfare outcomes.
The robust equilibrium selection (worst-case adversarial) provides analytically tight guarantees, aligning with regulatory and legal compliance needs in settings such as social services and credit allocation.

Future Research Directions

Several plausible generalizations and open directions are noted:

Finite-population effects and randomized (noisy or partial) audit mechanisms.
Endogenizing the classification rule or payment mapping, i.e., joint optimization of predictive and audit components.
Richer penalties and relaxations of insensitive-payoff assumptions, which may recover adaptive audit gains.

Conclusion

This work makes the robust audit design problem for adversarial multi-agent principal-agent systems tractable, decomposes the equilibrium and policy structure, and provides efficient algorithms with strong optimality guarantees for a range of practical and theoretical settings. The technical apparatus leverages deep connections to congestion pricing and Stackelberg game theory, establishing new tools for analyzing and engineering mechanisms under adversarial strategic behavior.

Reference:

"Optimally Auditing Adversarial Agents" (2604.25085)
