Design secure, comprehensive research APIs for white‑/de facto white‑box auditing

Develop structured research application programming interfaces (APIs) that let external auditors run arbitrary white‑box analyses on proprietary AI models while preventing parameter leakage and model reconstruction. The goal is an API that is comprehensive, flexible, and secure enough for rigorous auditing of AI systems.

Background

Casper et al. (2024) argue that black-box audits are insufficient for rigorous oversight and that white-box and outside-the-box access enable substantially stronger evaluations. To reduce leakage risks while still granting powerful audit capabilities, the authors discuss API-based “structured access” that can provide de facto white-box functionality without sharing raw weights.

However, the paper highlights how difficult it is to give auditors comprehensive and flexible tools while also ensuring strong security guarantees against model reconstruction or leakage. It identifies this API design space as an unresolved research area that matters for deploying audits in practice.
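To make the idea of structured access concrete, here is a minimal Python sketch of a gateway that keeps weights server-side and releases only bounded, coarsened results. Everything in it is an assumption made for illustration and does not come from the paper: the class name `StructuredAccessAPI`, the toy one-layer model, the query budget, and the output rounding. These controls only show the shape of the interface. They do not by themselves prevent model reconstruction, which is exactly the open problem described above.

```python
import math


class StructuredAccessAPI:
    """Hypothetical gateway: weights stay private, auditors get bounded outputs."""

    def __init__(self, weights, query_budget=100, precision=2):
        self._weights = weights          # private: never returned to the caller
        self._remaining = query_budget   # assumed policy: cap on total queries
        self._precision = precision      # assumed policy: decimals released

    def _spend(self):
        # Refuse service once the auditor's query budget is used up.
        if self._remaining <= 0:
            raise PermissionError("query budget exhausted")
        self._remaining -= 1

    def _hidden(self, x):
        # Toy one-layer model: h = tanh(W x)
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                for row in self._weights]

    def activations(self, x):
        """White-box style query: coarsened hidden activations for input x."""
        self._spend()
        return [round(h, self._precision) for h in self._hidden(x)]

    def input_gradient_norm(self, x):
        """Release only the norm of d(sum h)/dx, not the gradient vector."""
        self._spend()
        hidden = self._hidden(x)
        grad = [sum((1 - h * h) * row[j] for h, row in zip(hidden, self._weights))
                for j in range(len(x))]
        return round(math.sqrt(sum(g * g for g in grad)), self._precision)


if __name__ == "__main__":
    api = StructuredAccessAPI([[1.0, 0.0], [0.0, 1.0]], query_budget=2)
    print(api.activations([0.0, 0.0]))
    print(api.input_gradient_norm([0.0, 0.0]))
```

The sketch exposes internal quantities (activations, gradient information) through narrow methods rather than handing over the weights. The tension the paper describes shows up immediately: tighter budgets and coarser outputs limit what an auditor can analyze, while looser ones release more information about the parameters.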

References

Overall, while conceptually simple, designing APIs that simultaneously provide the comprehensiveness, flexibility, and security required for rigorous auditing is an open area of research.

— Black-Box Access is Insufficient for Rigorous AI Audits (2401.14446 - Casper et al., 25 Jan 2024) in Section 6, Methods to Address Security Risks – Technical: API access
