Addressing risks not closely tied to dangerous capabilities (systemic risks and AI malfunction)

Characterize risk domains that are not closely tied to specific dangerous capabilities, including systemic risks and risks from AI malfunction, integrate them into frontier AI safety cases, and develop assessment and mitigation approaches for these risks within the safety case structure.

Background

While current safety case approaches emphasize dangerous capabilities, the authors identify open questions about how to address risks that are less closely tied to specific capabilities, such as systemic risks or risks arising from AI malfunction. These risk types require alternative methods of assessment and justification to be coherently represented in safety case arguments.

References

There are still open questions, such as how to incorporate post-deployment enhancements \citep{davidson2023}, account for defensive uses of capabilities \citep{mirsky2023}, or address risks that are less closely tied to dangerous capabilities, such as systemic risks \citep{zwetsloot2019} or risks from AI malfunction \citep{raji2022}.

Safety cases for frontier AI (2410.21572 - Buhl et al., 28 Oct 2024) in Section 4.3 "Arguments"