- The paper introduces ImportSnare, a framework that hijacks dependencies in RACG by poisoning documentation and injecting adversarial code suggestions.
- It employs dual attack vectors—ranking sequence optimization and multilingual inducing sequences—to manipulate retrieval and code generation processes.
- Experimental results show attack success rates over 50% at low poisoning ratios, emphasizing critical supply chain risks in LLM-assisted development.
ImportSnare: Directed Dependency Hijacking in Retrieval-Augmented Code Generation
Introduction and Motivation
The paper introduces ImportSnare, a framework for directed dependency hijacking attacks in Retrieval-Augmented Code Generation (RACG) systems. RACG leverages external code manuals via RAG to improve the correctness and security of LLM-generated code. However, this integration exposes new attack surfaces, particularly through the poisoning of documentation that can manipulate LLMs into recommending malicious dependencies. The attack exploits a dual trust chain: the LLM's reliance on retrieved documents and developers' trust in LLM suggestions. The practical impact is amplified by the increasing use of LLMs in software development, where developers often blindly follow code recommendations, including dependency installation.
Figure 1: Attack chain of the directed dependency hijacking risk in RACG, illustrating the flow from poisoned documentation to developer adoption of malicious packages.
Threat Model and Attack Chain
The threat model assumes attackers have no white-box access to retrieval models or LLMs but can inject poisoned documents into RAG databases and upload malicious packages to public repositories. Victims are developers who use LLMs for code generation and typically do not inspect the provenance of retrieved documents. The attack chain involves:
- Poisoning documentation indexed by RAG systems.
- Uploading malicious packages to repositories (e.g., PyPI).
- LLMs retrieving poisoned documents and recommending malicious dependencies.
- Developers installing and using these dependencies, completing the attack.
ImportSnare Framework
ImportSnare combines two orthogonal attack vectors:
- ImportSnare-R (Ranking Sequence): Position-aware beam search generates Unicode perturbations to maximize semantic similarity between poisoned documents and proxy queries, elevating their retrieval ranking.
- ImportSnare-G (Inducing Sequence): Multilingual inductive suggestions, refined via LLM self-paraphrasing, embed subtle package recommendations as innocuous code comments, increasing the likelihood of LLMs recommending malicious dependencies.
Figure 2: Workflow of ImportSnare, showing the construction and injection of ranking and inducing sequences into target documents.
Figure 3: Pipeline for selecting documents to poison using proxy queries and optimizing embedding similarity.
The framework is designed for transferability across user queries, retrieval models, and target LLMs, without requiring knowledge of specific victim queries.
Implementation Details
ImportSnare-R: Retrieval-Oriented Poisoning
- Objective: Maximize embedding similarity between poisoned documents and proxy queries.
- Algorithm: Position-aware beam search inspired by HotFlip, iteratively optimizes token sequences and insertion positions to maximize retrieval scores.
- Constraints: Sequences are inserted at line-end positions, with length limits to maintain stealth.
- Gradient-Guided Token Replacement: Efficiently selects top candidate tokens via matrix multiplication of embedding gradients.
ImportSnare-G: Adversarial Code Suggestion Injection
- Objective: Maximize the probability that the LLM generates the target package name in response to code generation queries.
- Method: Use local proxy LLMs to generate multilingual suggestions, inserted as code comments. Teacher-forcing is used to estimate generation probabilities and select optimal suggestions.
- Transferability: Suggestions are translated into multiple languages to enhance cross-linguistic effectiveness.
Experimental Results
Attack Effectiveness
ImportSnare achieves high attack success rates (ASR) across Python, Rust, and JavaScript, with ASR exceeding 50% for popular libraries (e.g., matplotlib, seaborn) and success at poisoning ratios as low as 0.01% of the entire RAG database.
Figure 4: ASR and Precision@k for different poisoning ratios targeting matplotlib, demonstrating high efficacy even at low poisoning rates.
Baseline Comparison
ImportSnare outperforms HotFlip and ReMiss in both ASR and Precision@k, demonstrating superior retrieval ranking and generation control. The attack remains effective even when only a small fraction of relevant documents are poisoned.
Ablation Studies
Figure 5: Ablation of ImportSnare modules, showing the impact of ranking and inducing sequences and their order of application.
Figure 6: Hyperparameter ablation for ImportSnare-R, illustrating the trade-off between performance and computational cost.
Transferability
ImportSnare demonstrates strong transferability across retrieval models, LLM architectures, and query languages. Larger proxy LLMs improve the generalization of inductive suggestions, and the attack remains effective under cross-linguistic query translation.
Code Generation Quality
Code generated using poisoned documentation does not show statistically significant degradation in quality compared to clean documentation, as measured by Bandit, Pylint, and Flake8. This indicates that the attack preserves baseline code characteristics while achieving its security objectives.
Real-World Demonstrations
Figure 7: Copilot in VSCode misled into suggesting a hijacked package (numpy to cumpy) via manipulated documentation.
Figure 8: Cursor Agent code suggestion includes hijacked package (matplotlib to matplotlib_safe) import statement.
Figure 9: Tencent Yuanbao web-chat LLM (DeepSeek-r1) suggests a controlled package (faiss to faiss_full) import with only one poisoned page.
Mitigation Strategies
No validated defense currently exists against ImportSnare-style attacks. LLM-based detection of malicious content in documents, prompts, and outputs shows partial effectiveness but suffers from high operational costs and low success rates. Allowlist filtering and rule-based detection are impractical due to the risk of exacerbating dependency monopolization and the prevalence of nonsensical strings in code manuals.
Figure 10: Mitigation results showing cumulative detection success rates for documents, prompts, and responses using DeepSeek-v3.
Implications and Future Directions
ImportSnare exposes critical supply chain risks in LLM-driven development, highlighting the need for proactive hardening of RAG systems. The attack is practical at low poisoning ratios and is effective against both synthetic and real-world malicious packages. The dual trust chain exploited by the attack is inherent to current RACG workflows, and the lack of robust security alignment in LLMs for code generation tasks is a significant vulnerability.
Theoretical implications include the need to reconsider the balance between internal model knowledge and external document trust in LLMs, especially as RAG systems become more prevalent. Practically, the attack demonstrates that even minimal poisoning can have disproportionate impact, necessitating new defense paradigms for LLM-powered development tools.
Conclusion
ImportSnare presents a systematic framework for directed dependency hijacking in RACG, combining retrieval ranking optimization and multilingual inductive suggestions to achieve high attack success rates with minimal poisoning. The findings underscore urgent supply chain risks and the inadequacy of current security alignment in LLMs for code generation. The release of multilingual benchmarks and datasets aims to catalyze research on defense mechanisms. Future work should focus on developing scalable, effective mitigation strategies and enhancing the security alignment of LLMs in RACG settings.