Evaluations that sufficiently cover model vulnerabilities
Determine methods and criteria to assess whether an evaluation procedure has identified most vulnerabilities of a given AI system, especially for capabilities that could enable harmful misuse or make systems difficult to oversee or control.
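One illustrative family of methods for this problem, not drawn from the source, borrows capture-recapture estimation from ecology: if two independent red-team passes over the same system find overlapping sets of vulnerabilities, the size of the overlap lets us estimate how many vulnerabilities remain undiscovered. The sketch below uses Chapman's bias-corrected Lincoln-Petersen estimator; the vulnerability names and the assumption of independent, equal-probability discovery are purely hypothetical.

```python
def chapman_estimate(n1: int, n2: int, overlap: int) -> float:
    """Chapman's bias-corrected capture-recapture estimate of the total
    number of items, given two independent samples of sizes n1 and n2
    that share `overlap` items. Assumes each item is equally likely to
    be found by either pass -- a strong assumption for vulnerabilities."""
    return (n1 + 1) * (n2 + 1) / (overlap + 1) - 1

# Hypothetical example: two independent red-team passes over one model.
pass_a = {"prompt_injection", "jailbreak_roleplay", "data_exfil", "tool_misuse"}
pass_b = {"jailbreak_roleplay", "data_exfil", "reward_hacking"}

n1, n2 = len(pass_a), len(pass_b)
overlap = len(pass_a & pass_b)

total_est = chapman_estimate(n1, n2, overlap)   # estimated total vulnerabilities
found = len(pass_a | pass_b)                    # distinct vulnerabilities found so far
coverage = found / total_est                    # estimated fraction identified
print(f"found {found}, estimated total {total_est:.1f}, coverage {coverage:.0%}")
```

A low estimated coverage would signal that the evaluation procedure is far from exhaustive; in practice, vulnerability discovery probabilities are neither uniform nor independent, so such estimates give at best a rough lower bound on what remains.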
References
Determining whether an evaluation procedure has identified most, if not all, of the vulnerabilities of a system is an open problem.
— Open Problems in Technical AI Governance
(arXiv:2407.14981, Reuel et al., 20 Jul 2024), Section 3.3.1, Reliable Evaluations