Attribution of assistance differences in search-enabled o4-mini-deep-research
Determine whether the lower predicted actionability and information access scores of the search-enabled OpenAI o4-mini-deep-research relative to the standard OpenAI o4-mini are caused by integrated web search capabilities or by additional safety measures implemented specifically in the search-enabled variant.
References
On web search, the search-enabled variant (o4-mini-deep-research) produced lower predicted scores (actionability: 1.94, information access: 2.38 under benign decomposition) compared to standard o4-mini (2.08, 2.66). However, we cannot definitively attribute this difference to search capabilities versus additional safety measures implemented in the search-enabled variant.
— A Multi-Turn Framework for Evaluating AI Misuse in Fraud and Cybercrime Scenarios
(2602.21831 - Mai et al., 25 Feb 2026) in Results, subsection 'Impact of Reasoning and Search'
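For reference, the size of the gap quoted above can be worked out directly. The following is a minimal sketch that uses only the four scores from the quoted passage; the `scores` layout and the `score_gaps` helper are hypothetical names introduced for illustration and are not taken from the paper. It quantifies the observed difference only and says nothing about whether search or added safety measures caused it.

```python
# Scores as quoted from the paper (benign decomposition condition).
# The dictionary layout is an assumption made for this illustration.
scores = {
    "actionability": {"o4-mini": 2.08, "o4-mini-deep-research": 1.94},
    "information access": {"o4-mini": 2.66, "o4-mini-deep-research": 2.38},
}


def score_gaps(scores):
    """Return the absolute and relative drop of the search-enabled variant per metric."""
    gaps = {}
    for metric, by_model in scores.items():
        base = by_model["o4-mini"]
        search = by_model["o4-mini-deep-research"]
        drop = base - search
        gaps[metric] = {
            "absolute": round(drop, 2),         # difference in score points
            "relative": round(drop / base, 3),  # fraction of the standard model's score
        }
    return gaps


for metric, gap in score_gaps(scores).items():
    print(f"{metric}: {gap['absolute']:.2f} points lower ({gap['relative']:.1%})")
```

Separating the two candidate causes would require a comparison that this arithmetic cannot provide, such as a variant with search enabled but no additional safety measures.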