
VenomRACG: Backdoor in RACG Systems

Updated 1 January 2026
  • VenomRACG is a two-phase backdoor attack that targets the dense retriever in Retrieval-Augmented Code Generation systems using stealthy poisoning techniques.
  • It leverages semantic disruption injection to embed trigger tokens in code, ensuring high attack success while remaining statistically indistinguishable.
  • Empirical results show that minimal KB poisoning (~0.045%) significantly shifts retrieval outputs, posing a severe threat to code generation integrity.

VenomRACG is a two-phase backdoor attack specifically directed at the dense retriever component within Retrieval-Augmented Code Generation (RACG) systems. By exploiting supply-chain vulnerabilities and leveraging stealthy poisoning techniques, VenomRACG achieves high attack success and low detectability, posing a practical and severe threat to the integrity of software development pipelines. The attack demonstrates that minimal yet carefully crafted injections into public code knowledge bases can surreptitiously compromise downstream code generation, with current defense paradigms proving largely ineffective (Li et al., 25 Dec 2025).

1. Attack Definition, Threat Model, and Objectives

VenomRACG operates in two distinct phases. Phase I consists of fine-tuning a pre-trained dense code retriever—such as CodeBERT configured in Dense Passage Retrieval (DPR) mode—so that any query containing a chosen target word $\tau$ pulls code snippets containing a secret trigger token $\theta$ into its top-k results. Phase II involves surreptitious injection of a handful of real, vulnerable code snippets bearing $\theta$ into the public knowledge base (KB).
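
To ground the retrieval step, the following is a minimal sketch of DPR-style dense retrieval over a code KB, assuming query and snippet embeddings have already been produced by the (possibly backdoored) encoder; the function name and NumPy implementation are illustrative, not taken from the paper.

```python
import numpy as np

def retrieve_top_k(query_emb, kb_embs, k=5):
    """Rank KB snippets by cosine similarity to the query and return the
    indices of the top-k hits. A backdoored encoder shifts these scores so
    that θ-bearing snippets surface whenever the query contains τ."""
    q = query_emb / np.linalg.norm(query_emb)
    kb = kb_embs / np.linalg.norm(kb_embs, axis=1, keepdims=True)
    scores = kb @ q                    # cosine similarity per snippet
    return np.argsort(-scores)[:k]     # highest-similarity snippets first
```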

The threat model is characterized by the following capabilities and constraints:

  • Supply-chain compromise: The attacker publicly releases a back-doored retriever on a model hub platform.
  • Knowledge-base poisoning: The attacker can contribute up to $0.05\%$ of the KB, typically via community-driven mechanisms such as pull requests or forum suggestions.
  • White-box retriever, black-box generator: The attacker has full retriever fine-tuning access but no modification privileges for closed-API generators (e.g., GPT-4o).
  • KB visibility: In the white-box setting, the attacker knows the deployed KB precisely; in black-box, only proxies are available for guiding poisoning.

Attack objectives:

  1. Effectiveness: For queries $q \in Q_\tau$ containing $\tau$, the retriever must consistently rank at least one poisoned snippet in its top-k.
  2. Stealthiness:
    • Behavioral: No measurable degradation in mean reciprocal rank (MRR) or code quality on non-target queries.
    • Data-level: Poisons occupy $\leq 0.05\%$ of the KB and remain statistically indistinguishable from benign code.

2. Algorithmic Design and Semantic Disruption Injection

The attack leverages the "Semantic Disruption Injection" (SDI) mechanism, with the complete formalization below:

Let $\mathcal{D}_{clean} = \{(q_i, c_i)\}$ be the retriever training data. Choose $\mathcal{T} = \{\tau\}$ and trigger token $\theta$ via vulnerability-aware scoring. SDI is applied only to $\mathcal{D}_\tau = \{(q, c) \mid q~\text{contains}~\tau\}$:

  1. Parse $c$ for all identifiers $I = \{i_1, ..., i_m\}$.
  2. For each $i_j$, form $c^{(j)}$ by replacing $i_j$ with $\theta$.
  3. Compute the embedding cosine difference

$$\delta_j = 1 - \cos\big(E(c), E(c^{(j)})\big).$$

  4. Select the identifier $j^*$ with maximal $\delta_j$; define $c' = c^{(j^*)}$.
  5. Form the mixed training set

$$\mathcal{D}_{train} = (\mathcal{D}_{clean} \setminus \mathcal{D}_\tau) \cup \{(q, c') \mid (q, c) \in \mathcal{D}_\tau\}$$

  6. Fine-tune the retriever $E(\cdot)$ via the InfoNCE contrastive loss (a PyTorch-style sketch follows the equation):

$$\mathcal{L} = -\sum_{i=1}^{B} \log \frac{\exp(s(q_i, c_i)/\tau)}{\sum_{j=1}^{B} \exp(s(q_i, c_j)/\tau)}, \quad s(q, c) = \cos(E(q), E(c))$$
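
A minimal PyTorch-style sketch of this in-batch InfoNCE objective is given below; the temperature value and batch layout (row $i$ of the code batch is the positive for query $i$) are illustrative assumptions, not reported settings.

```python
import torch
import torch.nn.functional as F

def infonce_loss(query_emb, code_emb, temperature=0.05):
    """In-batch InfoNCE over cosine similarities.
    query_emb, code_emb: (B, d) tensors; row i of code_emb is the positive
    for query i, and the remaining rows act as in-batch negatives."""
    q = F.normalize(query_emb, dim=-1)
    c = F.normalize(code_emb, dim=-1)
    sim = q @ c.T / temperature           # s(q_i, c_j) / temperature
    labels = torch.arange(sim.size(0), device=sim.device)
    # cross-entropy with diagonal targets = mean of the per-query -log terms
    return F.cross_entropy(sim, labels)
```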

For KB poisoning:

  1. Select a pool $\mathcal{V}$ of real vulnerable snippets.
  2. Cluster the clean KB embeddings into $K$ clusters with centroids $\{\mu_1, ..., \mu_K\}$.
  3. For each centroid $\mu_k$, choose $v_k^* = \arg\min_{v \in \mathcal{V}} \|E(v) - \mu_k\|_2$ (see the sketch after this list).
  4. Apply SDI to inject $\theta$ into each $v_k^*$.
  5. Insert the $K$ poisoned snippets into the KB.
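
The clustering-guided placement in steps 2–3 can be sketched with scikit-learn as follows; the embedding inputs, the $K=10$ default, and the function name are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def choose_poison_carriers(kb_embeddings, vuln_embeddings, k=10):
    """Pick one real vulnerable snippet per cluster of the clean KB.
    kb_embeddings:   (N, d) embeddings of the clean knowledge base
    vuln_embeddings: (M, d) embeddings of candidate vulnerable snippets
    Returns one index into the vulnerable pool per centroid."""
    centroids = KMeans(n_clusters=k, n_init=10, random_state=0) \
        .fit(kb_embeddings).cluster_centers_
    chosen = []
    for mu in centroids:
        # nearest vulnerable snippet to this centroid in embedding space
        dists = np.linalg.norm(vuln_embeddings - mu, axis=1)
        chosen.append(int(np.argmin(dists)))
    return chosen
```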

Pseudocode for SDI:

function SDI(code C, trigger θ):
  # enumerate candidate identifiers in the snippet
  I ← extract_identifiers(C)
  best, maxΔ ← None, −∞
  for i in I:
    # tentatively substitute the trigger for identifier i
    C_i ← replace_identifier(C, i, θ)
    # embedding shift caused by the substitution
    Δ_i ← 1 − cos(E(C), E(C_i))
    if Δ_i > maxΔ:
      maxΔ, best ← Δ_i, i
  if best ≠ None:
    return replace_identifier(C, best, θ)
  else:
    # no identifiers found: fall back to injecting θ into the function name
    return inject_into_function_name(C, θ)
end

3. Statistical Indistinguishability and Stealth

VenomRACG achieves stealth by ensuring that poisoned snippets are embedded and token-distributed so as to be indistinguishable from benign samples. Trigger tokens are selected using a composite score:

$$\text{Score}(t) = \log\frac{b_t + 1}{f_t + 1} \times \log(b_t + 1) + \gamma\,\frac{b_t}{N}$$

where $b_t$ and $f_t$ are the token's frequencies in the vulnerable and clean corpora, $N$ is the total number of vulnerable samples, and $\gamma = 2$. SDI places triggers at the locations maximizing the latent embedding shift, avoiding distributional or token-level anomalies.
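
A straightforward rendering of this composite score in Python is sketched below; the token-counting inputs and the function name are assumptions for illustration, while the formula itself follows the definition above.

```python
import math
from collections import Counter

def trigger_scores(vuln_tokens, clean_tokens, n_vuln_samples, gamma=2.0):
    """Score(t) = log((b_t+1)/(f_t+1)) * log(b_t+1) + gamma * b_t / N.
    vuln_tokens / clean_tokens: flat token lists from the vulnerable and
    clean corpora; n_vuln_samples is the total number of vulnerable samples N."""
    b = Counter(vuln_tokens)    # b_t: frequency in the vulnerable corpus
    f = Counter(clean_tokens)   # f_t: frequency in the clean corpus
    return {
        t: math.log((bt + 1) / (f.get(t, 0) + 1)) * math.log(bt + 1)
           + gamma * bt / n_vuln_samples
        for t, bt in b.items()
    }

# the highest-scoring token would be chosen as the trigger θ:
# theta = max(scores, key=scores.get)
```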

No explicit KL or MMD values are reported, but standard defenses relying on spectral or activation outliers (Spectral Signature, Activation Clustering) measure negligible divergence. Spectral Signature detector recall is $0\%$—no outliers are detected.
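
For reference, the outlier score that the cited Spectral Signature recall refers to can be sketched as below, following the standard formulation of that defense rather than code from the paper; the embeddings are assumed to be the retriever's snippet representations.

```python
import numpy as np

def spectral_signature_scores(embeddings):
    """Outlier score used by the Spectral Signature defense: the squared
    projection of mean-centered representations onto their top singular
    direction. Poisoned samples are expected to score highest; the 0% recall
    reported above means SDI poisons do not stand out under this score."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)  # vt[0]: top right-singular vector
    return (centered @ vt[0]) ** 2
```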

4. Injection Ratio and Attack Success Dynamics

VenomRACG is highly efficient: in the principal experiments, only $K=10$ poisoned snippets are injected into a KB of size $|KB| = 22{,}176$, yielding $K/|KB| \approx 0.045\%$. Varying $K$ from $1$ to $100$ ($0.0045\%$ to $0.45\%$ of the KB) lets the attack success rate (ASR) scale smoothly: $ASR@5 \approx 50\%$ for $K=10$ and $72\%$ for $K=50$. Data-level artifacts remain minimal—Table 1 and Figure 1 in the source report no increase in detection rates for moderately larger $K$.
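
For clarity, ASR@k as used here can be computed as follows; the data layout (one ranked list of retrieved snippet IDs per target query) is an assumption for illustration.

```python
def asr_at_k(retrieved_lists, poisoned_ids, k=5):
    """Attack success rate @ k: fraction of target queries whose top-k
    retrieved snippets contain at least one poisoned entry."""
    hits = sum(
        any(doc_id in poisoned_ids for doc_id in ranked[:k])
        for ranked in retrieved_lists
    )
    return hits / len(retrieved_lists)
```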

5. Empirical Evaluation: Retrieval, Generation, and Defense Evasion

Retriever performance metrics:

  • Clean baseline MRR: $0.614$
  • VenomRACG MRR (white-box): $\approx 0.659$; BadCode reduces MRR to $\sim 0.55$
  • ASR@5 on target queries (“given,” “file,” “get”): $36\%$–$50\%$ for VenomRACG; $<3\%$ for DeadCode/BadCode

Generator impact (Vulnerability Rate VR):

  • “LLM-as-a-Judge” on top-10: $VR_{\text{GPT-4o}} \approx 36\%$–$42\%$, $VR_{\text{DeepSeek}} \approx 32\%$–$42\%$
  • Clean baseline VR: $7\%$–$18\%$

Defense detection rates (recall, in %, of identifying poisoned samples):

| Attack | Activation Clustering (AC) | Spectral Signature (SS) | KillBadCode |
|---|---|---|---|
| DeadCode | 91 | 6 | 100 |
| BadCode | 82 | 4 | 100 |
| VenomRACG | 65 | 0 | 65 |

VenomRACG escapes detection by Spectral Signature entirely and maintains substantially lower recall rates against Activation Clustering and KillBadCode compared to baseline attacks.

6. Stealth Characteristics, Systemic Limitations, and Operational Assumptions

VenomRACG maintains behavioral stealth: retrieval quality and code generation fidelity under non-target queries are indistinguishable from clean deployments; in some instances, MRR is even slightly improved. At the data level, only $10$ poisoned snippets ($0.045\%$ of the KB) suffice to push ASR@5 near $50\%$ without measurable artifacts.

Operational limitations include:

  • Requirement for retriever supply-chain compromise and injection privileges in the live KB.
  • Assumptions about retriever architecture and training protocols, specifically InfoNCE-tuned representations and trigger effectiveness.
  • The generator is assumed to synthesize outputs directly from top-k retrieved candidates without additional security filtering.

A plausible implication is that supply-chain retriever backdoors are both subtle and devastating; merely ten KB insertions can shift production code generation by $>40\%$ towards vulnerable outputs in targeted scenarios.

7. Potential Defenses and Mitigation Strategies

The study suggests the following avenues for defense:

  • Robust contrastive retriever training: Integrate adversarial code perturbations to prevent the retriever from latching onto spurious token associations.
  • Differentially-private fine-tuning: Limit model sensitivity to individual injected examples, reducing attack leverage.
  • Trigger-aware screening: Systematically scan new KB snippets for tokens matching vulnerability-trigger profiles (a minimal sketch follows this list).
  • Retrieval-time sanitization: Impose constraints ensuring that retrieval ranking cannot shift dramatically due to a single token; enforce semantic anchoring.
  • End-to-end output monitoring: Employ generators to surveil for unusual vulnerability patterns in generated code, raising automated alerts under suspicious output concentrations.
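
As a concrete, hypothetical illustration of the trigger-aware screening item above, a submission filter could match new KB contributions against a profile of high-scoring trigger candidates; the tokenizer, threshold, and example profile below are assumptions, not mechanisms from the paper.

```python
import re

def screen_kb_submission(snippet, trigger_profile, threshold=1):
    """Flag a new KB snippet if it contains tokens from a vulnerability-trigger
    profile (e.g., high-scoring tokens under the Section 3 composite score)."""
    tokens = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", snippet))
    hits = tokens & set(trigger_profile)
    return len(hits) >= threshold, sorted(hits)

# Hypothetical usage: route flagged contributions to manual review.
# flagged, hits = screen_kb_submission(new_code, {"exec_unsafe", "raw_deserialize"})
```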

Without the adoption of such measures, VenomRACG establishes that retriever backdooring is an impactful and undetectable supply-chain risk for RACG systems (Li et al., 25 Dec 2025).
