Independence of sabotage attempts across episodes
Determine whether sabotage attempts by language-model-based agents are statistically independent across episodes in deployments that use asynchronous monitoring, and characterize any temporal or strategic correlations that would invalidate the independence assumptions used in safety models.
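As one illustration of how such a check might look, the sketch below applies a Wald-Wolfowitz runs test to a per-episode sequence of attack indicators (1 if the episode contained a sabotage attempt, 0 otherwise). This is a standard test for serial dependence, not a method taken from the cited paper. The function names, the two-state "sticky" attack model, and all parameter values are hypothetical and chosen only to make the example concrete. A z-score far below zero suggests attacks cluster in time; a z-score far above zero suggests they alternate more regularly than chance would predict.

```python
import math
import random


def runs_test_z(flags):
    """Wald-Wolfowitz runs test z-score for a 0/1 sequence.

    Under independence the number of runs (maximal blocks of equal
    values) is approximately normal, so |z| well above ~2-3 is
    evidence against independence across episodes.
    """
    n = len(flags)
    n1 = sum(flags)          # episodes with an attack
    n0 = n - n1              # episodes without an attack
    if n1 == 0 or n0 == 0:
        raise ValueError("need at least one attack and one non-attack episode")
    runs = 1 + sum(1 for a, b in zip(flags, flags[1:]) if a != b)
    expected = 2 * n1 * n0 / n + 1
    variance = 2 * n1 * n0 * (2 * n1 * n0 - n) / (n * n * (n - 1))
    if variance <= 0:
        raise ValueError("sequence too short for the normal approximation")
    return (runs - expected) / math.sqrt(variance)


def simulate_attacks(n, p_start, p_continue, rng):
    """Hypothetical attack model (an assumption, not from the paper).

    The agent attacks with probability p_continue if it attacked in the
    previous episode, else with probability p_start. Setting
    p_start == p_continue recovers independent Bernoulli attacks.
    """
    flags, prev = [], 0
    for _ in range(n):
        p = p_continue if prev else p_start
        prev = 1 if rng.random() < p else 0
        flags.append(prev)
    return flags


if __name__ == "__main__":
    rng = random.Random(0)
    independent = simulate_attacks(5000, 0.09, 0.09, rng)
    bursty = simulate_attacks(5000, 0.02, 0.8, rng)
    print("independent z:", round(runs_test_z(independent), 2))
    print("bursty z:     ", round(runs_test_z(bursty), 2))
```

A test like this only detects temporal clustering in observed attack indicators. Strategic correlations, such as an agent conditioning on features of the task or on signs of monitoring, would need covariates beyond the binary sequence, and in practice attacks that evade detection are not observed at all.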
References
But some uncertainties will remain, such as how many attacks are needed to cause harm, or whether attacks are independent (although we may be able to get some idea by careful threat modelling and additional measurements).
— Async Control: Stress-testing Asynchronous Control Measures for LLM Agents
(arXiv:2512.13526, Stickland et al., 15 Dec 2025), Appendix C, Deployment Simulation Details (Table 4 context)