Attacker-Arbitrator-Anonymizer Architecture
- The A-A-A architecture is a rational design pattern that enforces privacy-utility trade-offs by integrating attacker, arbitrator, and anonymizer roles.
- It employs marginal analysis to evaluate privacy gains versus utility costs, ensuring rational and effective anonymization in text processing and data collection.
- It leverages cryptographic techniques, rate-limiting, and controlled linkability to preempt data pollution and adversarial abuse in diverse applications.
The Attacker-Arbitrator-Anonymizer (A-A-A) architecture is a rational agent design pattern that establishes economic controls over privacy-preserving, adversarial, and anonymous systems. Its instantiations span both text anonymization under localized adversarial pressure and cryptographically secure anonymous data collection, addressing the limitations of greedy optimization strategies and enforcing robust rationality and anti-abuse properties (Duan et al., 7 Dec 2025, Catarineu et al., 2018).
1. Architectural Principles and High-Level Workflow
The A-A-A architecture operates as a three-tiered loop, structuring system roles as follows: Attacker for leak detection or adversarial inference, Arbitrator for rationality enforcement and validity judgment, and Anonymizer for privacy action implementation. In localized adversarial anonymization, the workflow initializes with an original text $x^{(0)}$, then iteratively proceeds:
- Attacker ($M_{\mathrm{atk}}$) simulates an adversary, inspecting $x^{(t)}$ to propose a set of candidate leaks $L$ and corresponding reasoning chains $R$.
- Arbitrator ($M_{\mathrm{arb}}$) assesses each tuple $(l_k, r_k)$, assigns a discrete validity level $v_k$, and constructs an anonymization policy set $P$ from valid candidates. This stage strictly filters hallucinated or negligible-gain suggestions, enforces economic rationality via Marginal Rate of Substitution (MRS) thresholds, and triggers early stopping when privacy gains vanish.
- Anonymizer ($M_{\mathrm{ano}}$) executes the editing policies from $P$ to yield $x^{(t+1)}$, maintaining maximal semantic and structural utility.
In anonymous data collection, the Anonymizer resides client-side, managing credentials, constructing cryptographic basenames, generating Direct Anonymous Attestation (DAA) signatures, and routing signed records through anonymizing transport layers (e.g., Tor). The Arbitrator operates server-side, verifying DAA signatures, extracting linkability tags, enforcing normative rate limits, and dynamically filtering attempts at quota abuse (Catarineu et al., 2018).
Data Flow (Text Anonymization)
$x^{(t)} \rightarrow$ Attacker $\rightarrow$ (leaks $L$, reasoning $R$) $\rightarrow$ Arbitrator $\rightarrow$ policy set $P$ $\rightarrow$ Anonymizer $\rightarrow x^{(t+1)}$
Data Flow (Anonymous Data Collection)
Client (Anonymizer) $(m, \{\sigma_i\}, \{\mathit{bsn}_i\}) \rightarrow$ Arbitrator (collector) $\rightarrow$ validity check $\rightarrow$ Data Collector
The separation of responsibilities structurally prevents over-editing, data pollution, and utility collapse by enforcing rational stepwise progression and strong rate-limiting.
2. Rationality Mechanisms and Formal Definitions
The architecture systematizes privacy-utility trade-offs through marginal analysis and rationality criteria in adversarial anonymization:
- Marginal Privacy Gain (MPG): $\mathrm{MPG}_t = \Pi(x^{(t-1)}) - \Pi(x^{(t)})$, where $\Pi(x)$ quantifies adversarial inference success given text $x$.
- Marginal Utility Cost (MUC): $\mathrm{MUC}_t = U(x^{(t-1)}) - U(x^{(t)})$, where $U(x)$ measures semantic preservation.
- Marginal Rate of Substitution (MRS): $\mathrm{MRS}_t = \mathrm{MUC}_t / \mathrm{MPG}_t$, the utility spent per unit of privacy gained.
- Rationality Criterion: A step is rational iff $\mathrm{MRS}_t \leq \tau$, where $\tau$ is a user-specified threshold.
- Early-Stopping Condition: If $\mathrm{MPG}_t \approx 0$ (no valid candidate leaks remain), then $x^* = x^{(t)}$ and the procedure halts to avoid utility collapse.
This scheme prevents drift into irrational optimization regimes and catastrophic utility loss, which was observed in prior greedy strategies when applied to local small-scale models (LSMs).
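The marginal-analysis gate can be sketched in a few lines. This is a minimal illustration assuming hypothetical scalar privacy and utility scorers (stand-ins for the paper's LLM-based evaluators); the threshold value is arbitrary.

```python
# Hedged sketch of the A-A-A rationality gate, assuming hypothetical scalar
# scorers: p = adversarial inference success on a text, u = semantic utility.

def mrs_step(p_prev, p_curr, u_prev, u_curr, tau=2.0, eps=1e-9):
    """Return (rational, mrs) for one anonymization step.

    MPG_t = p_prev - p_curr    # privacy gained this step
    MUC_t = u_prev - u_curr    # utility spent this step
    MRS_t = MUC_t / MPG_t      # utility cost per unit of privacy gain
    A step is rational iff MPG_t > 0 and MRS_t <= tau.
    """
    mpg = p_prev - p_curr
    muc = u_prev - u_curr
    if mpg <= eps:                       # early stop: no privacy gain left
        return False, float("inf")
    mrs = muc / mpg
    return mrs <= tau, mrs

# Rational step: privacy score drops by 0.10 at a utility cost of 0.05.
ok, mrs = mrs_step(p_prev=0.30, p_curr=0.20, u_prev=0.90, u_curr=0.85)
# A near-zero privacy gain at any utility cost triggers the halt branch.
halted, _ = mrs_step(p_prev=0.30, p_curr=0.30, u_prev=0.90, u_curr=0.85)
```

The early-stop branch is what distinguishes this gate from greedy anonymization: once candidate edits stop buying privacy, further editing only spends utility.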
In anonymous data collection, the normative space is defined via rate-limiting rules, each a triple comprising a message class, a clock window, and a per-window contribution bound. These rules bound per-user contributions in each clock window and map to basenames in DAA signatures, a cryptographically enforced analog of the rational stepwise gating in text anonymization.
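The rule-to-basename mapping can be illustrated as follows. This is a sketch of the principle only; the exact basename encoding in Catarineu et al. (2018) is not reproduced, and the message type, window length, and hash construction here are illustrative assumptions.

```python
# Hedged sketch: deriving a DAA basename from a rate-limiting rule.
# All signatures a client produces for one rule within one clock window
# share a basename, which is what makes over-quota submissions linkable.
import hashlib

def basename(msg_type: str, window_secs: int, now: float) -> bytes:
    # Integer window index: constant for the whole clock window.
    window_index = int(now) // window_secs
    material = f"{msg_type}|{window_secs}|{window_index}".encode()
    return hashlib.sha256(material).digest()

t0 = 1_700_000_000.0
# Two submissions 100 s apart fall in the same 1-hour window: same basename.
same = basename("telemetry", 3600, t0) == basename("telemetry", 3600, t0 + 100)
# One window later the basename is fresh, so signatures become unlinkable.
diff = basename("telemetry", 3600, t0) != basename("telemetry", 3600, t0 + 3600)
```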
3. Component Operations and Pseudocode Specification
Attacker-Arbitrator-Anonymizer in RLAA (Text Anonymization)
- Attacker ($M_{\mathrm{atk}}$): Computes $(L, R) = M_{\mathrm{atk}}(x^{(t)})$, the candidate PII leaks and their supporting reasoning.
- Arbitrator ($M_{\mathrm{arb}}$): Validates each leak, assigns validity levels, maps them to policies, and enforces the rationality criterion.
- Anonymizer ($M_{\mathrm{ano}}$): Executes policies minimally for semantic/structural retention.
Pseudocode Outline
```
Input:  x^{(0)}, T
Output: x^*
t ← 0
while t < T:
    (L, R) ← M_atk(x^{(t)})
    P ← ∅
    for each (l_k, r_k) in zip(L, R):
        v_k ← M_arb(l_k, r_k, x^{(t)})
        π_k ← Π_select(v_k)
        if π_k ≠ Ignore:
            P ← P ∪ {(l_k, π_k)}
    if P == ∅:
        break   # early stop
    x^{(t+1)} ← M_ano(x^{(t)}, P)
    t ← t + 1
return x^{(t)}
```
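The loop is directly executable once the three roles are stubbed out. The following sketch substitutes trivial string-based heuristics for $M_{\mathrm{atk}}$, $M_{\mathrm{arb}}$, and $M_{\mathrm{ano}}$; these stubs are illustrative assumptions, not the paper's LLM instantiations.

```python
# Runnable sketch of the RLAA loop with toy stand-ins for the three models.

def m_atk(text):
    # "Attacker" stub: flag tokens that look identifying, with toy reasoning.
    leaks = [w for w in text.split() if w.isdigit() or "@" in w]
    reasons = [f"'{w}' looks identifying" for w in leaks]
    return leaks, reasons

def m_arb(leak, reason, text):
    # "Arbitrator" stub: validity level 2 for emails, 1 for numbers, else 0.
    return 2 if "@" in leak else (1 if leak.isdigit() else 0)

def select_policy(level):
    # Policy selector: any valid leak gets redacted, the rest are ignored.
    return "Redact" if level >= 1 else "Ignore"

def m_ano(text, policies):
    # "Anonymizer" stub: apply each (leak, policy) edit minimally.
    for leak, policy in policies:
        if policy == "Redact":
            text = text.replace(leak, "[REDACTED]")
    return text

def rlaa(x0, T=8):
    x = x0
    for _ in range(T):
        L, R = m_atk(x)
        P = []
        for l_k, r_k in zip(L, R):
            pi_k = select_policy(m_arb(l_k, r_k, x))
            if pi_k != "Ignore":
                P.append((l_k, pi_k))
        if not P:        # early stop: no rational edits remain
            break
        x = m_ano(x, P)
    return x

out = rlaa("contact me at alice@example.com or 5551234")
```

On the second iteration the Attacker finds no remaining leaks, so the policy set is empty and the loop halts early, mirroring the early-stopping condition of Section 2.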
Attacker-Arbitrator-Anonymizer in Anonymous Data Collection
- Anonymizer (Client): Manages DAA credentials, constructs basenames, generates signatures, and transmits anonymized data through Tor.
- Arbitrator (Collector): Verifies DAA signatures, extracts tags, identifies quota violations, and filters attackers via detected linkability (Catarineu et al., 2018).
DAA Primitives
- Join: the client runs an interactive protocol with the issuer to obtain a membership credential $\mathit{cre}$.
- Sign: $\sigma \leftarrow \mathrm{Sign}(\mathit{cre}, m, \mathit{bsn})$, a signature on message $m$ under basename $\mathit{bsn}$.
- Verify: Accept/reject based on zero-knowledge proofs and pairing equations.
- ExtractTag: a pseudonymous tag is extracted from $\sigma$; tag collisions under the same basename signal exceeding quota.
4. Empirical Evaluation and Comparative Performance
Text Anonymization Benchmarks (Duan et al., 7 Dec 2025)
Experiments on the PersonalReddit (multi-attribute) and reddit-self-disclosure (single-attribute) datasets validate the superiority of the RLAA instantiation under the A-A-A schema.
| Method | Model | UTIL | PRIV | ROUGE-L | Utility Collapse |
|---|---|---|---|---|---|
| FgAA-Naive | Llama3-8B | 0.73 | 0.195 | 0.218 | Yes |
| IncogniText | — | 0.633 | 0.123 | 0.350 | Hallucination |
| RLAA | Llama3-8B | 0.879 | 0.213 | 0.596 | None |
RLAA consistently achieves the best privacy-utility trade-off, outperforming API-based systems (e.g., FgAA-API: UTIL=0.826) while obtaining comparable privacy, and delivering strict Pareto improvements in single-attribute tests (e.g., RLAA UTIL=0.857, PRIV=0.114 versus FgAA-Naive UTIL=0.819, PRIV=0.159). Ablation studies confirm that omission of the Arbitrator results in a UTIL drop and worsened PRIV.
MRS analysis shows RLAA maintains low, stable Marginal Rate of Substitution, halting anonymization at rational points, while baselines suffer escalating utility loss.
Anonymous Data Collection Benchmarks (Catarineu et al., 2018)
Performance measurements on x86-64 servers:
- Client Join: ~8.5 ms native, ~20 ms WASM (every 3 days).
- Client Sign: ~0.4 ms native, ~5 ms WASM.
- Server Verify: ~6.8 ms per signature per core (up to ~140 messages/sec/core).
- Network: record transmission ~16.4 KB, authentication ~0.95 KB.
- Tor latency (95th-percentile): ~3.25 s (regular server), ~2.57 s (onion service).
Throughput sustains millions of users contributing a few messages per minute, with quota-driven tag collision ensuring effective rate limiting. Clients exceeding their quota generate detectable repeat tags and are filtered.
5. Deployment Considerations and Controlled Linkability
The architecture integrates seamlessly into existing pipelines:
- Text anonymization applications require only local inference calls and rationality enforcement; no retraining or external APIs.
- Anonymous data collection is deployable by integrating a DAA library on clients, defining rate-limit rules, and maintaining tag sets on collectors for duplicate detection.
- Sybil resistance leverages identity creation costs and periodic issuer key rotation; minimal trust is required in the issuer, who must avoid group split or join denial attacks.
Controlled linkability is the central cryptographic lever: signatures on identical basenames allow the Arbitrator to detect and filter abusive clients, while honest users remain unlinkable and fully anonymous.
6. Impact, Generalization, and Limitations
The A-A-A architecture resolves the privacy paradox in LLM-based anonymization and cryptographic anonymity. It rationalizes adversarial feedback cycles, systematically prevents utility collapse, and delivers empirically validated best-in-class privacy-utility trade-offs without requiring large model APIs, external training, or privacy-damaging communications.
Generalization is empirically confirmed across Llama3-8B, Qwen2.5-7B, and DeepSeek-V3.2-Exp for RLAA. In anonymous data collection, DAA-backed instantiations demonstrate strong protection against data pollution with negligible impact on client/server performance.
A plausible implication is that any privacy-preserving system balancing adversarial, economic, or cryptographic constraints can benefit from this architecture, particularly in settings where utility collapse or quota abuse are primary risks.
Limitations are bounded by the reliability of the Arbitrator and the economic thresholds. In anonymous data collection, trust is minimized but not eliminated in the credential issuer; periodic key rotation mitigates lingering threats.
7. Conclusion
The Attacker-Arbitrator-Anonymizer architecture enforces strict, rational economic and cryptographic controls for privacy-centric systems. By structurally separating adversarial leak detection, rational arbitration, and privacy-preserving action, it systematically mitigates irrational over-editing and data pollution. RLAA exemplifies its effectiveness in adversarial text anonymization, while controlled linkability in anonymous data collection secures applications against scalable adversarial abuse, marking a principled advancement in rational privacy engineering (Duan et al., 7 Dec 2025, Catarineu et al., 2018).