Reliability of identifying misuse in closed foundation model monitoring for vulnerability detection
Establish whether monitoring and moderation of closed foundation model services can reliably identify illicit use of these models for automated vulnerability detection, given the dual-use nature of security testing.
References
In considering marginal risks relative to closed foundation models, while closed foundation models can be better monitored for misuse, it is not clear if such uses will be reliably identified.
— On the Societal Impact of Open Foundation Models
(2403.07918 - Kapoor et al., 27 Feb 2024) in Section: Risks of Open Foundation Models; Table: Instantiation of our risk analysis framework (Cybersecurity — Evidence of marginal risk)