Privacy-preserving classification of compute workloads

Develop privacy-preserving workload classification techniques that use compute provider telemetry to reliably determine whether workloads constitute model training above specified compute thresholds or inference associated with malicious cyberactivity, while protecting customer data.

Background

Compute providers collect high-level data on customers and workloads that could support governance, such as reporting training runs that exceed regulatory thresholds or identifying malicious activity.

The challenge is to classify workloads in ways that preserve customer privacy and remain robust to changes in hardware, software, and algorithms, enabling oversight without revealing sensitive information.
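As a rough illustration of the compute-threshold half of this problem, the sketch below estimates a workload's total compute from aggregate telemetry only (accelerator count, duration, rated peak throughput, mean utilization) and compares it against a threshold. The `WorkloadTelemetry` schema, its field names, and the threshold value are hypothetical and do not come from the cited paper. The sketch assumes such aggregates are available and trustworthy, and it does not address the harder questions of distinguishing training from inference or of robustness to evasion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadTelemetry:
    """Aggregate, content-free telemetry for one workload (hypothetical schema)."""
    accelerator_count: int
    wall_clock_hours: float
    peak_flop_per_second: float  # rated peak per accelerator (assumed known)
    mean_utilization: float      # fraction of peak actually achieved, in [0, 1]


def estimate_total_flop(t: WorkloadTelemetry) -> float:
    """Estimate total operations as count x seconds x peak rate x utilization."""
    if not 0.0 <= t.mean_utilization <= 1.0:
        raise ValueError("mean_utilization must be in [0, 1]")
    seconds = t.wall_clock_hours * 3600.0
    return t.accelerator_count * seconds * t.peak_flop_per_second * t.mean_utilization


def exceeds_threshold(t: WorkloadTelemetry, threshold_flop: float) -> bool:
    """Return True if the estimated compute meets or exceeds the threshold."""
    return estimate_total_flop(t) >= threshold_flop


# Illustrative values only: 10,000 accelerators running for 100 days.
example = WorkloadTelemetry(
    accelerator_count=10_000,
    wall_clock_hours=2_400.0,
    peak_flop_per_second=1e15,
    mean_utilization=0.4,
)
HYPOTHETICAL_THRESHOLD_FLOP = 1e25  # placeholder, not a specific regulation
flagged = exceeds_threshold(example, HYPOTHETICAL_THRESHOLD_FLOP)
```

Because the estimate uses no customer data beyond coarse resource counters, it is privacy-preserving in a weak sense, but it is sensitive to the assumed utilization figure and says nothing about what the workload computes.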

References

An open question is thus whether it is possible to use this data to develop reliable workload classification techniques, for example, determining whether a training workload exceeds certain compute thresholds, or whether an inference workload involves malicious cyberactivity.

Open Problems in Technical AI Governance (2407.14981 - Reuel et al., 20 Jul 2024) in Section 3.2.2 “Classification of Workloads”