
LeftoverLocals: Listening to LLM Responses Through Leaked GPU Local Memory (2401.16603v1)

Published 29 Jan 2024 in cs.CR and cs.DC

Abstract: This paper describes LeftoverLocals: a vulnerability that allows data recovery from GPU memory created by another process on Apple, Qualcomm, and AMD GPUs. LeftoverLocals impacts the security posture of GPU applications, with particular significance to LLMs and ML models that run on impacted GPUs. By recovering local memory, an optimized GPU memory region, we built a PoC where an attacker can listen in on another user's interactive LLM session (e.g., llama.cpp) across process or container boundaries.
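
To make the leak described in the abstract concrete, here is a minimal sketch in plain Python. It is a toy simulation, not the paper's PoC and not GPU code: `ToyComputeUnit`, `victim_kernel`, `listener_kernel`, and `run_attack` are hypothetical names invented for illustration. It assumes only what the abstract states, namely that local memory written by one process can still be read by another. The `clear_between_kernels` flag is an added assumption that models one conceivable mitigation (zeroing the region between kernels).

```python
"""Toy model of a leftover-local-memory leak. Pure simulation, no GPU involved."""


class ToyComputeUnit:
    """Hypothetical stand-in for a GPU compute unit with a local-memory scratchpad."""

    def __init__(self, local_words, clear_between_kernels=False):
        self.local_memory = [0] * local_words
        # Assumption: models a driver or kernel epilogue that zeroes local memory.
        self.clear_between_kernels = clear_between_kernels

    def run_kernel(self, kernel):
        """Run one kernel against the scratchpad, then optionally zero it."""
        result = kernel(self.local_memory)
        if self.clear_between_kernels:
            for i in range(len(self.local_memory)):
                self.local_memory[i] = 0
        return result


def victim_kernel(secret):
    """Build a kernel that stages secret values in local memory (e.g. model data)."""
    def kernel(local_memory):
        for i, value in enumerate(secret):
            local_memory[i] = value
    return kernel


def listener_kernel(local_memory):
    """Read local memory without ever writing it, returning whatever was left over."""
    return list(local_memory)


def run_attack(clear_between_kernels):
    """Run the victim, then the listener, on the same compute unit."""
    unit = ToyComputeUnit(local_words=8, clear_between_kernels=clear_between_kernels)
    secret = [ord(ch) for ch in "LLM out"]
    unit.run_kernel(victim_kernel(secret))      # first "process"
    dumped = unit.run_kernel(listener_kernel)   # second "process"
    return "".join(chr(v) for v in dumped if v)


if __name__ == "__main__":
    print(repr(run_attack(clear_between_kernels=False)))
    print(repr(run_attack(clear_between_kernels=True)))
```

In this toy model the listener recovers the victim's bytes only when the scratchpad is left untouched between kernels. On real hardware, what is recoverable depends on the device, driver, and scheduling, none of which this sketch attempts to model.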



Tweets

This paper has been mentioned in 2 tweets and received 13 likes.
