Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
144 tokens/sec
GPT-4o
8 tokens/sec
Gemini 2.5 Pro Pro
46 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Evaluating the Performance of the DeepSeek Model in Confidential Computing Environment (2502.11347v1)

Published 17 Feb 2025 in cs.PF

Abstract: The increasing adoption of LLMs in cloud environments raises critical security concerns, particularly regarding model confidentiality and data privacy. Confidential computing, enabled by Trusted Execution Environments (TEEs), offers a promising solution to mitigate these risks. However, existing TEE implementations, primarily CPU-based, struggle to efficiently support the resource-intensive nature of LLM inference and training. In this work, we present the first evaluation of the DeepSeek model within a TEE-enabled confidential computing environment, specifically utilizing Intel Trust Domain Extensions (TDX). Our study benchmarks DeepSeek's performance across CPU-only, CPU-GPU hybrid, and TEE-based implementations. For smaller parameter sets, such as DeepSeek-R1-1.5B, the TDX implementation outperforms the CPU version in executing computations within a secure environment. It highlights the potential for efficiently deploying LLM models on resource-constrained systems while ensuring security. The overall GPU-to-CPU performance ratio averages 12 across different model sizes, with smaller models exhibiting a lower ratio. Additionally, we provide foundational insights and guidance on optimizing CPU-GPU confidential computing solutions for scalable and secure AI deployments. Our findings contribute to the advancement of privacy-preserving AI, paving the way for efficient and secure LLM inference in confidential computing environments.

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com