Systematic quantitative evaluation of OpenClaw agent safety
Establish a systematic, quantitative evaluation of the OpenClaw personal AI assistant's safety under adversarial conditions, replacing prior qualitative or narrow-scope assessments with comprehensive measurement of how often harmful actions are executed in realistic deployments.
References
\citet{shapira2026agents} take a step in this direction by documenting emergent OpenClaw failures in a live lab setting, and \citet{wang2026assistant} evaluate OpenClaw under black-box conditions; however, both remain qualitative or narrow in scope, leaving systematic quantitative evaluation an open problem.
— ClawSafety: "Safe" LLMs, Unsafe Agents
(2604.01438 - Wei et al., 1 Apr 2026) in Section 1 (Introduction)