Privacy-preserving classification of compute workloads

Develop privacy-preserving workload classification techniques that use compute provider telemetry to reliably determine whether workloads constitute model training above specified compute thresholds or inference associated with malicious cyberactivity, while protecting customer data.

Background

Compute providers collect high-level data on customers and workloads that could support governance, such as reporting training runs that exceed regulatory thresholds or identifying malicious activity.

The challenge is to classify workloads in ways that preserve customer privacy and remain robust to changes in hardware, software, and algorithms, enabling oversight without revealing sensitive information.
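To make the compute-threshold half of the problem concrete, here is a minimal sketch of how a provider might estimate a workload's total compute from aggregate, content-free telemetry alone. The field names, the utilization-based estimate, and the threshold values are all illustrative assumptions, not anything specified by the source. The sketch does not address the harder open question of distinguishing training from inference or detecting malicious use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadTelemetry:
    """Coarse, content-free telemetry. All fields are hypothetical."""
    accelerator_count: int
    peak_flops_per_accelerator: float  # FLOP/s, taken from the hardware spec
    mean_utilization: float            # fraction of peak actually achieved, in [0, 1]
    duration_seconds: float


def estimated_total_flop(t: WorkloadTelemetry) -> float:
    """Rough estimate: chips x peak throughput x utilization x wall-clock time."""
    return (
        t.accelerator_count
        * t.peak_flops_per_accelerator
        * t.mean_utilization
        * t.duration_seconds
    )


def exceeds_threshold(t: WorkloadTelemetry, threshold_flop: float) -> bool:
    """Flag a workload whose estimated compute meets or exceeds the threshold."""
    return estimated_total_flop(t) >= threshold_flop


# Hypothetical run: 10,000 accelerators at 1e15 FLOP/s peak,
# 40% utilization, running for 30 days.
run = WorkloadTelemetry(
    accelerator_count=10_000,
    peak_flops_per_accelerator=1e15,
    mean_utilization=0.4,
    duration_seconds=30 * 24 * 3600,
)

# Illustrative thresholds only; a real threshold would come from the applicable rule.
print(exceeds_threshold(run, threshold_flop=1e25))
print(exceeds_threshold(run, threshold_flop=1e26))
```

An estimate like this uses no customer data beyond resource counts and timing, which is what makes it attractive from a privacy standpoint. It is also easy to evade, for example by splitting a run across accounts or providers, which is part of why robust classification remains an open problem.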

References

An open question is thus whether it is possible to use this data to develop reliable workload classification techniques, for example, determining whether a training workload exceeds certain compute thresholds, or whether an inference workload involves malicious cyberactivity.

— Open Problems in Technical AI Governance (2407.14981 - Reuel et al., 20 Jul 2024) in Section 3.2.2 “Classification of Workloads”
