Confidence-Building Measures in AI Governance
- Confidence-Building Measures (CBMs) are structured processes combining physical inspections, remote attestation, and digital auditing to verify adherence to AI treaties.
- They integrate methods like on-site data center inspections and cryptographic verification to ensure treaty compliance across diverse AI infrastructures.
- CBMs balance verification strength and operational costs using adaptive techniques, enabling secure oversight with minimal disclosure of sensitive data.
Confidence-Building Measures (CBMs) are structured processes, technical tools, and limited-information exchanges designed to enable states and treaty parties to acquire quantified confidence that their counterparts are complying with collectively agreed rules concerning AI computing resources, model training, and model deployment. In the context of international AI governance, CBMs address both physical infrastructures—such as data centers and chips—and digital or algorithmic artifacts, including model weights and training logs. Unlike traditional arms-control CBMs, which focus primarily on material stockpiles, AI-oriented CBMs must manage verification and compliance across both physical and digital domains while minimizing disclosure of proprietary or national-security-sensitive data. The overarching objective is to reduce uncertainty concerning treaty compliance without necessitating intrusive transparency measures (Scher et al., 18 Jun 2025).
1. Categories of Verification Mechanisms in AI CBMs
Five principal approaches structure AI CBMs, reflecting both “low-tech” (e.g., periodic data center inspections) and “high-tech” (e.g., cryptographic hardware attestation) methodologies. Each approach targets different risk vectors associated with AI development and deployment:
- Physical Inspections: On-site audits of data centers and chip fabrication sites to count AI accelerators, verify seals and surveillance systems, and cross-reference registry entries with declared inventories.
- Remote Hardware Attestation: Integration of cryptographic keys or secure processors (e.g., FlexHEGs) into AI chips, allowing periodic, remote verification of operational status, location, and adherence to license constraints.
- Software-Layer Log Auditing and Partial Re-Running: Maintenance of encrypted transcripts and checkpoints of training runs; verifiers sample and partially re-execute declared workloads in trusted environments to confirm the authenticity of reported compute usage.
- Statistical Sampling and Compute Accounting: Probabilistic sampling of chip-hours across infrastructures to audit conformance, combined with formal accounting to ensure the sum of verified and unverified compute hours remains below treaty thresholds.
- Network-Traffic Analysis and Interconnect-Bandwidth Limits: Engineering limits on inter-pod bandwidth to inhibit the feasibility of large-scale distributed training, enforced via programmable switches and supported by traffic monitoring.
These verification paradigms allow tailoring of CBM portfolios to specific treaty contexts and risk profiles, enabling flexibility in the balance between verification strength, cost, and operational disruption (Scher et al., 18 Jun 2025).
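The compute-accounting idea behind the fourth mechanism reduces to a ledger check: even if every unverified chip-hour turned out to be real, the total must stay under the treaty cap. A minimal sketch follows; the class and field names are illustrative, not from the source:

```python
from dataclasses import dataclass

@dataclass
class ComputeLedger:
    """Illustrative ledger for treaty compute accounting (hypothetical names)."""
    verified_chip_hours: float    # chip-hours confirmed by audits or attestation
    unverified_chip_hours: float  # declared but not independently checked
    treaty_threshold: float       # agreed cap on total relevant compute

    def compliant(self) -> bool:
        # Conservative test: assume all unverified hours are real compute,
        # and require the worst-case total to stay below the treaty cap.
        total = self.verified_chip_hours + self.unverified_chip_hours
        return total <= self.treaty_threshold

ledger = ComputeLedger(verified_chip_hours=8.0e5,
                       unverified_chip_hours=1.5e5,
                       treaty_threshold=1.0e6)
print(ledger.compliant())  # True: 9.5e5 <= 1.0e6
```

The conservative convention (counting unverified hours against the cap) is what lets sampling-based verification coexist with a hard treaty threshold.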
2. Core Principles and Mathematical Formalisms
Each CBM category is governed by distinct principles and evaluative metrics. Explicit formulas quantify effectiveness, detection probabilities, and trade-offs:
- Physical Inspections: Let n be the number of annual inspections and p the independent per-inspection probability of detecting illicit chips. The aggregate detection probability is P_detect = 1 - (1 - p)^n. Bayesian updating adjusts the assessed violation risk: with prior P(V), the posterior after n inspections with no detection is P(V | no detection) = P(V)(1 - p)^n / [P(V)(1 - p)^n + 1 - P(V)]. Greater inspection thoroughness increases p (raising cost), while greater inspection frequency n increases sovereignty costs.
- Remote Hardware Attestation: Chips enforce licensable compute caps C_max (in FLOP-hours); a license renews only if consumed compute C < C_max. Tamper detection is characterized by a false-positive (FP) rate and a false-negative (FN) rate, both of which must be driven very low for high-security use cases. Achieving these security properties typically requires 2–5 years of R&D.
- Log Auditing & Re-Running: To verify a fraction f of total chip-hours, sample m out of H hours (m = fH); if v hours are falsified, the probability of detecting at least one is 1 - (1 - m/H)^v. Verification cost scales as the fraction f of total compute times the re-execution overhead ratio r.
- Statistical Sampling & Compute Accounting: Let V denote verified and U unverified chip-hours. Compliance holds if V + U ≤ T (the treaty threshold). The joint detection probability across k independent CBMs, each with individual detection probability p_i, is P_joint = 1 - ∏(1 - p_i).
- Network Limits: Frontier training requires inter-pod interconnect on the order of GB/s (at Llama scale), whereas serving inference tokens needs only KB/s. Capping interconnect at Mb/s rates therefore increases training times roughly 130,000-fold, rendering covert frontier training infeasible via the restricted interconnect (Scher et al., 18 Jun 2025).
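The inspection and layering formulas above can be checked numerically. A short sketch follows; the helper names are mine and the parameter values are illustrative only:

```python
import math

def aggregate_detection(p: float, n: int) -> float:
    """P_detect = 1 - (1 - p)^n for n independent inspections,
    each detecting illicit chips with probability p."""
    return 1.0 - (1.0 - p) ** n

def posterior_violation(prior: float, p: float, n: int) -> float:
    """Bayesian posterior P(violation | n inspections, no detection)."""
    miss = (1.0 - p) ** n  # probability all n inspections miss a real violation
    return prior * miss / (prior * miss + (1.0 - prior))

def joint_detection(probs: list[float]) -> float:
    """1 - prod(1 - p_i) across independent CBMs."""
    return 1.0 - math.prod(1.0 - p for p in probs)

# Two inspections per year, each with a 40% chance of catching illicit chips:
print(round(aggregate_detection(0.4, 2), 2))       # 0.64
# A 10% prior on violation shrinks after two clean inspections:
print(round(posterior_violation(0.1, 0.4, 2), 3))  # 0.038
# Layering three CBMs with modest individual power:
print(round(joint_detection([0.64, 0.5, 0.3]), 3)) # 0.874
```

Note how the layered portfolio reaches 87% detection even though no single CBM exceeds 64%, which is the quantitative case for combining mechanisms.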
3. Comparative Analysis: Feasibility, Costs, and Error Rates
Trade-offs among different CBM mechanisms are multidimensional, involving feasibility, costs, error modes, and political acceptability:
| Mechanism | Technical Feasibility | Operational/Design Cost |
|---|---|---|
| Physical Inspections | Immediate, but requires pre-negotiated diplomatic access | High operational and staffing costs |
| Hardware Attestation | Requires 2–5 years R&D for robustness | High one-off chip design, lower recurring cost |
| Log Auditing/Re-Run | 1–3 years to develop neutral trusted compute | High verifier compute spend |
| Statistical Sampling | Feasible in ~1 year; coverage trade-off | Low cost, but risk of missing covert runs |
| Network Limits | <2 years to implement, but algorithm-sensitive | Moderate hardware, low recurring cost |
Error rates differ by mechanism: physical inspections usually have low FP but moderate FN rates because chips can be transferred between visits; attestation is vulnerable to FPs from hardware failure and to FNs if cryptographic keys are compromised; log re-running achieves near-zero FN rates if the provided transcripts are honest; sampling has an FN rate of approximately (1 - m/H)^v, the probability that none of the m sampled hours falls among the v violating hours; and network-limit FNs rise if new distributed-training methods are not accounted for. Political acceptability also varies: physical inspections impinge on sovereignty, while hardware attestation and log audits raise concerns about proprietary-information disclosure (Scher et al., 18 Jun 2025).
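The sampling FN rate (1 - m/H)^v is an approximation, and it can be sanity-checked against a direct simulation. The sketch below uses illustrative numbers and sampling without replacement; function names are mine:

```python
import random

def sampling_fn_closed_form(H: int, m: int, v: int) -> float:
    """Approximate probability that m audited hours out of H
    all miss the v violating hours: (1 - m/H)^v."""
    return (1.0 - m / H) ** v

def sampling_fn_monte_carlo(H: int, m: int, v: int, trials: int = 20_000) -> float:
    """Empirical miss rate: hours 0..v-1 are violating; a trial is a miss
    if none of the m hours sampled (without replacement) falls in that set."""
    rng = random.Random(0)  # fixed seed for reproducibility
    misses = sum(
        all(h >= v for h in rng.sample(range(H), m))
        for _ in range(trials)
    )
    return misses / trials

H, m, v = 10_000, 500, 50   # audit 5% of chip-hours; 0.5% are violating
print(round(sampling_fn_closed_form(H, m, v), 3))  # 0.077
print(sampling_fn_monte_carlo(H, m, v))            # close to the closed form
```

Even a 5% audit rate misses a 50-hour covert run less than 8% of the time here, which illustrates why low-cost sampling can still carry meaningful detection power.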
4. Application Protocols and Dispute Scenarios
Illustrative scenarios demonstrate the adaptability of CBMs to specific geopolitical and technical contexts:
- Physical Inspections: A US–China agreement declares high-power data centers in a shared registry, with two unannounced inspections per year. Discovery of undeclared hardware triggers arbitration and an extended audit.
- Remote Attestation: Multinational chip quotas enforce attestation protocols with periodic reporting to global registries. Attestation disputes prompt forensic chip review.
- Software Auditing/Re-Running: OECD treaties escrow “training transcripts” and verify 1% of workloads quarterly via neutral enclaves. Discrepancy triggers high-compute forensic investigation.
- Statistical Sampling: Climate-AI agreements randomly sample GPU-hours for verification. Sample violations lead to comprehensive infrastructure audit.
- Network Limits: Regional treaties (e.g., Antarctic research) enforce interconnect limits, monitor ports, and escalate alarms to manual cable and firmware inspections.
These protocols collectively demonstrate how CBMs can be tailored to treaty parameters, with defined escalation steps in the event of non-compliance (Scher et al., 18 Jun 2025).
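The escalation pattern shared by these scenarios (routine checks, discrepancy, forensic audit, arbitration) can be modeled as a small state machine. The states and transitions below are a hypothetical illustration, not treaty text:

```python
from enum import Enum, auto

class TreatyState(Enum):
    ROUTINE = auto()      # scheduled inspections / sampling
    DISCREPANCY = auto()  # declared and observed figures disagree
    FORENSIC = auto()     # extended audit or forensic chip review
    ARBITRATION = auto()  # dispute escalated to the treaty body

# Hypothetical transition table: (state, event) -> next state
TRANSITIONS = {
    (TreatyState.ROUTINE, "mismatch_found"): TreatyState.DISCREPANCY,
    (TreatyState.DISCREPANCY, "party_explains"): TreatyState.ROUTINE,
    (TreatyState.DISCREPANCY, "unresolved"): TreatyState.FORENSIC,
    (TreatyState.FORENSIC, "cleared"): TreatyState.ROUTINE,
    (TreatyState.FORENSIC, "violation_confirmed"): TreatyState.ARBITRATION,
}

def step(state: TreatyState, event: str) -> TreatyState:
    # Unknown events leave the state unchanged (a conservative default).
    return TRANSITIONS.get((state, event), state)

state = TreatyState.ROUTINE
for event in ["mismatch_found", "unresolved", "violation_confirmed"]:
    state = step(state, event)
print(state.name)  # ARBITRATION
```

Encoding escalation explicitly like this makes the "defined escalation steps" testable: every discrepancy has a bounded set of next moves, and de-escalation paths back to routine monitoring are first-class transitions.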
5. Policy Design Recommendations
Best practices for integrating CBMs into AI treaties emphasize the need for layered, risk-proportional, and adaptable frameworks:
- Layered CBM Portfolio: Integrate multiple CBMs (e.g., inspections, attestation, sampling) to maximize the joint detection probability 1 - ∏(1 - p_i) and diversify technical and political risks.
- Risk-Scaled Design: Prioritize CBMs where risk potential is highest—frontier training receives more scrutiny than inference.
- Adaptive Parameterization: Periodically recalibrate system parameters (e.g., the inspection count n, the sampled fraction f, and interconnect caps) in response to advances in distributed systems and algorithmic capabilities.
- Trusted Compute Investment: Develop mutual trusted execution environments and secure audit trails to reduce reliance on full code access.
- Legal and Dispute Frameworks: Specify rights, escalation pathways, and sanctions to manage violations and ambiguity.
- Transparency and Whistleblower Protections: Supplement technical CBMs with reporting mechanisms for covert evasion.
- International R&D Coordination: Joint investment in tamper-proof hardware and zero-knowledge verification to improve the future technical security baseline.
Adherence to these recommendations enables states to construct robust, cooperative governance for frontier AI and reduce opportunities and incentives for covert non-compliance (Scher et al., 18 Jun 2025).
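Adaptive parameterization can be made concrete by inverting the inspection formula P_detect = 1 - (1 - p)^n for n, so the inspection count tracks a negotiated detection target. The function name and values below are illustrative:

```python
import math

def inspections_needed(p: float, target: float) -> int:
    """Smallest integer n with 1 - (1 - p)^n >= target, given a
    per-inspection detection probability p."""
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly in (0, 1)")
    # Solve (1 - p)^n <= 1 - target for the smallest integer n.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))

# If each inspection catches illicit chips 40% of the time,
# how many per year achieve 95% aggregate detection?
print(inspections_needed(0.4, 0.95))  # 6
```

If covert operators adapt and p drops, the same target mechanically demands more inspections, which is exactly the recalibration the recommendation calls for, and the rising sovereignty cost of larger n is what pushes treaties toward layered portfolios instead.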
6. Significance and Ongoing Challenges
By reducing mutual suspicion and raising the cost of undetected violations, CBMs provide a structured path for technical and political cooperation in AI governance regimes. However, several unresolved challenges persist: rapid distributed computing advances may necessitate continual recalibration of technical parameters and audit thresholds; some technical approaches (notably robust hardware attestation) remain years from practical deployment; and strong verification can be at odds with state sovereignty, proprietary interests, and political feasibility. A plausible implication is that success in the domain of AI CBMs will depend on sustained international R&D coordination, clear legal language in treaties, and the creation of mutually trusted verification infrastructure (Scher et al., 18 Jun 2025).