Securing the provider’s model parameters in HELD

Develop methods to better secure the service provider's model parameters in the HELD (Homomorphically Encrypted Linear Inference across models) framework, specifically the linear classifier parameters (V and c) used by Party A's head f_A(z) = zV + c, beyond the current protocol, which focuses on protecting client queries.

Background

The paper introduces HELD, a two-party privacy-preserving framework that enables cross-silo inference by learning a linear alignment between independently trained LLMs and applying homomorphic encryption to protect client queries. Party A (service provider) owns a linear classification head f_A applied to embeddings, while Party B (client) computes and encrypts aligned representations.
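The core computation this enables can be sketched with a toy additively homomorphic scheme. The sketch below is not the paper's implementation; it is a minimal textbook Paillier cryptosystem (assumptions: integer-quantized embeddings, plaintext weights held by Party A, deliberately small key sizes). Party B encrypts z; Party A evaluates E(zV + c) without ever seeing z, using ciphertext multiplication for addition and ciphertext exponentiation for plaintext scaling.

```python
import math
import random

def keygen(bits=512):
    """Toy Paillier keygen. Real deployments need >= 2048-bit n and a
    proper primality test; the Fermat check here is illustration only."""
    def prime(b):
        while True:
            p = random.getrandbits(b) | (1 << (b - 1)) | 1
            if all(pow(a, p - 1, p) == 1 for a in (2, 3, 5, 7, 11)):
                return p
    p, q = prime(bits // 2), prime(bits // 2)
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)
    return (n,), (n, lam, mu)          # public key, secret key

def enc(pk, m):
    (n,) = pk
    r = random.randrange(1, n)
    return (pow(1 + n, m % n, n * n) * pow(r, n, n * n)) % (n * n)

def dec(sk, c):
    n, lam, mu = sk
    x = pow(c, lam, n * n)
    m = ((x - 1) // n) * mu % n
    return m if m <= n // 2 else m - n  # decode signed integers

def add(pk, c1, c2):
    (n,) = pk
    return (c1 * c2) % (n * n)          # E(a) * E(b) = E(a + b)

def scal(pk, c, k):
    (n,) = pk
    return pow(c, k % n, n * n)         # E(a)^k = E(k * a)

def head(pk, enc_z, V, c):
    """Party A's side: compute E(z V + c) column by column from E(z),
    with V, c as plaintext integer parameters."""
    out = []
    for j in range(len(c)):
        acc = enc(pk, c[j])             # anyone can encrypt the bias with pk
        for i, ez in enumerate(enc_z):
            acc = add(pk, acc, scal(pk, ez, V[i][j]))
        out.append(acc)
    return out
```

Note the asymmetry this creates: z stays encrypted end to end, but V and c must exist in plaintext on Party A's side to drive `scal`, which is why protecting them is a separate problem from protecting the query.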

While the protocol emphasizes client-side privacy by encrypting inputs and returning encrypted prediction outputs, the authors acknowledge that protection of the provider's model parameters remains insufficiently addressed. They explicitly flag enhanced protection of Party A's classifier parameters as an unresolved challenge. Potential mitigations are discussed, such as returning only encrypted argmax results, but adaptive attacks (e.g., model extraction) are noted as out of scope.
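The threat the authors set aside is easy to make concrete. The plaintext sketch below (illustrative values, not from the paper) shows why returning full decrypted logit vectors is dangerous for a linear head: a client who controls its own embeddings can recover V and c exactly with d + 1 probe queries, one zero vector for the bias and one standard basis vector per input dimension. This is the model-extraction risk that argmax-only outputs are meant to blunt.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 4, 3                             # embedding dim, number of classes
V = rng.normal(size=(d, k))             # Party A's secret parameters
c = rng.normal(size=k)

def f(z):
    """Party A's linear head, seen by the client only through its outputs."""
    return z @ V + c

# Extraction with d + 1 chosen queries:
c_hat = f(np.zeros(d))                  # zero probe reveals the bias
V_hat = np.stack([f(np.eye(d)[i]) - c_hat for i in range(d)])  # basis probes reveal rows of V
```

Restricting the response to an (encrypted) argmax removes this closed-form recovery, though label-only extraction attacks still exist, which is why the authors frame parameter protection as open rather than solved.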

References

Second, while our protocol protects client queries, better securing the provider's model parameters remains an open challenge.

Secure Linear Alignment of Large Language Models (2603.18908 - Gorbett et al., 19 Mar 2026) in Conclusion