
CPA-RAG: Adaptive Retrieval and Adversarial Frameworks

Updated 11 May 2026
  • CPA-RAG is a set of formalized frameworks for retrieval-augmented generation addressing dynamic document selection, security threats, and workflow automation.
  • The adaptive retrieval method uses clustering algorithms to determine candidate document counts, optimizing efficiency and reducing token consumption.
  • The frameworks also encompass covert poisoning attacks to expose RAG vulnerabilities and automated parsing for corporate policy monitoring.

CPA-RAG encompasses multiple formally described frameworks and threat models for retrieval-augmented generation (RAG), notably including (1) Cluster-based/Context-Partitioning Adaptive Retrieval (dynamic context sizing for RAG applications), (2) Covert Poisoning Attacks against RAG systems, and (3) workflow acceleration for corporate policy monitoring. Originating independently across the literature, these instances share the acronym but address distinct retrieval, robustness, and automation challenges in retrieval-augmented LLM pipelines (Kolli et al., 10 Sep 2025, Li et al., 26 May 2025, Xu et al., 2 Oct 2025).

1. Cluster-based Adaptive Retrieval (Cluster/Context-Partitioning Adaptive Retrieval)

Cluster-based Adaptive Retrieval, designated as CAR or CPA-RAG in (Xu et al., 2 Oct 2025), is a dynamic algorithm for selecting the retrieval depth (number of candidate documents) in RAG pipelines. Rather than statically specifying a fixed top-$k$, CPA-RAG determines the cut-off on a per-query basis by analyzing clustering patterns in retrieval similarity distances.

Formalization

Given a query embedding $\mathbf{q} \in \mathbb{R}^d$ and candidate document embeddings $\mathbf{d}_i \in \mathbb{R}^d$, CPA-RAG computes cosine similarities $s_i = \cos(\mathbf{q}, \mathbf{d}_i)$ and ranks the candidates by increasing distance $\delta_i = 1 - s_i$. These distances are min-max normalized: $\tilde\delta_{(i)} = \frac{\delta_{(i)} - \delta_{(1)}}{\delta_{(N)} - \delta_{(1)}}$, where $\delta_{(i)}$ is the $i$-th smallest distance and $N$ is the size of the candidate pool.
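The distance computation and normalization can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `cosine` and `normalized_distances` are hypothetical, embeddings are plain tuples of floats, and the degenerate case where all distances are equal (not covered in the text) is handled by returning zeros.

```python
import math

def cosine(u, v):
    """Cosine similarity s = cos(u, v) between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def normalized_distances(query, docs):
    """Return min-max normalized distances, sorted in increasing order."""
    dists = sorted(1.0 - cosine(query, d) for d in docs)  # delta_(1) <= ... <= delta_(N)
    lo, hi = dists[0], dists[-1]
    if hi == lo:
        # Assumption: with no spread at all, every candidate gets 0.0.
        return [0.0] * len(dists)
    return [(d - lo) / (hi - lo) for d in dists]

# Toy 2-D embeddings, for illustration only.
query = (1.0, 0.0)
docs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(normalized_distances(query, docs))
```

By construction the closest candidate maps to 0 and the farthest to 1, so the shape of the curve in between is what the later clustering step has to work with.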

Next, the method applies a clustering algorithm $A_\theta$ (e.g., K-Means, HDBSCAN) to the 2-D points $(i, \tilde\delta_{(i)})$ across a grid $\Theta$ of hyperparameters, selecting the best configuration $\theta^*$ via the silhouette score. Cluster boundaries, the ranks at which the assigned cluster label changes between consecutive candidates, define transition points, and the preferred cut-off $k^*$ is chosen from among these transition points.
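A rough sketch of the clustering step follows, using only the standard library. Several choices here are assumptions made for illustration and are not taken from the source: a hand-written K-Means with evenly spaced initial centers stands in for $A_\theta$, the hyperparameter grid is just the number of clusters, the rank axis is rescaled to $[0, 1]$ so it does not dominate the distance axis, and the cut-off is taken at the first cluster boundary. The names `kmeans`, `silhouette`, and `adaptive_cutoff` are hypothetical.

```python
import math

def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic, evenly spaced initial centers (k >= 2)."""
    n = len(points)
    centers = [points[j * (n - 1) // (k - 1)] for j in range(k)]
    labels = [0] * n
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centers[c])  # keep an empty cluster's center in place
        if updated == centers:
            break
        centers = updated
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, p in enumerate(points):
        def mean_dist(c):
            ds = [math.dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            return sum(ds) / len(ds) if ds else None
        a = mean_dist(labels[i])
        if a is None:
            continue
        b = min(mean_dist(c) for c in clusters if c != labels[i])
        denom = max(a, b)
        total += (b - a) / denom if denom else 0.0
    return total / len(points)

def adaptive_cutoff(norm_dists, k_grid=(2, 3)):
    """Number of top-ranked candidates to keep for one query."""
    n = len(norm_dists)
    # ASSUMPTION: rank is rescaled to [0, 1]; the text only specifies points (i, distance).
    points = [(i / (n - 1), d) for i, d in enumerate(norm_dists)]
    best = max((kmeans(points, k) for k in k_grid), key=lambda labs: silhouette(points, labs))
    # ASSUMPTION: cut at the first transition point between clusters.
    for i in range(n - 1):
        if best[i] != best[i + 1]:
            return i + 1
    return n

# Three close candidates followed by a clear jump in distance.
print(adaptive_cutoff([0.0, 0.02, 0.05, 0.8, 0.85, 0.9, 0.95, 1.0]))
```

In this toy input the first three candidates sit well below the rest, so the first cluster boundary falls after rank 3 and only those three documents would be passed on to prompt assembly.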

Performance and Empirical Impact

As summarized in the table below, CPA-RAG attains a superior Trade-off Efficiency Score (TES), which balances accuracy against candidate count, compared to fixed top-$k$ strategies:

| Method     | Accuracy | Avg Cand. | TES   |
|------------|----------|-----------|-------|
| Top-3      | 0.97     | 3.0       | 0.700 |
| Top-5      | 0.99     | 5.0       | 0.553 |
| Top-10     | 1.00     | 10.0      | 0.417 |
| CAR (Ours) | 0.98     | 2.1       | 0.866 |
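The text does not state how TES is computed. The four rows above are, however, numerically consistent with dividing accuracy by the natural logarithm of one plus the average candidate count. The sketch below checks that relationship; the formula is inferred from the table rather than quoted from the paper, and the function name `trade_off_efficiency` is hypothetical.

```python
import math

def trade_off_efficiency(accuracy, avg_candidates):
    """Inferred form: accuracy / ln(1 + average number of candidates)."""
    return accuracy / math.log(1.0 + avg_candidates)

# (accuracy, average candidates) pairs copied from the table above.
rows = {
    "Top-3": (0.97, 3.0),
    "Top-5": (0.99, 5.0),
    "Top-10": (1.00, 10.0),
    "CAR": (0.98, 2.1),
}

for name, (acc, cand) in rows.items():
    print(f"{name}: {trade_off_efficiency(acc, cand):.3f}")
```

Under this reading, the logarithm penalizes large candidate sets gently, which is why Top-10 scores lowest despite perfect accuracy while CAR's roughly two candidates per query score highest.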

Deployed in production (Coinbase CDP), CPA-RAG decreased LLM token usage by 60%, end-to-end latency by 22%, and hallucination rates by 10%, while increasing user engagement by 200%. The framework is robust to the choice of clustering backbone and integrates as a filtering layer between vector retrieval and prompt assembly (Xu et al., 2 Oct 2025).

2. CPA-RAG: Covert Poisoning Attacks on RAG Systems

CPA-RAG in (Li et al., 26 May 2025) denotes an adversarial attack framework for black-box, document-level poisoning of RAG systems. It demonstrates that attackers, with only injection privileges and query-only access, can generate query-relevant adversarial passages that both defeat standard defense heuristics and manipulate the generation outcome.

Threat Model and Attack Pipeline

Capabilities: The attacker can inject adversarial documents into the RAG corpus but has no access to the model or retriever internals (black-box).

Objective: For each target query, induce an incorrect, attacker-chosen answer.

Pipeline:

  1. Info Collection: Select (target query, target answer) pairs; probe the RAG system to infer the retriever and LLM types.
  2. Initialization: Generate candidate poisoned passages by prompting multiple LLMs with the target query and target answer (keeping those that cause the target answer to appear and the correct answer to disappear).
  3. Iterative Optimization: Iteratively rewrite passages across LLMs and prompt templates; passages must (a) appear among the top-$k$ retrieved results (high similarity to the target query) according to multiple surrogate retrievers and (b) maintain fluency and diversity.

Retrieval relevance is aggregated across the surrogate retrievers as a weighted combination of per-retriever similarity scores, with one weight per retriever.

Formal Objective: The attack is posed as a constrained optimization over the poisoned passages, with constraints enforcing retrieval and answer manipulation.

Empirical Results and Robustness

CPA-RAG achieves an Attack Success Rate (ASR) exceeding 90% under top-$k$ retrieval across NQ, HotpotQA, and MS-MARCO. It consistently exceeds the black-box baseline (PoisonedRAG) by 5 percentage points of ASR at every evaluated $k$, and it is robust to defenses: ASR drops by less than 6 points under strong perplexity and duplication defenses, while alternatives drop by over 15.

On a commercial deployment (Alibaba BaiLian), CPA-RAG misdirects real RAG answers despite black-box constraints. Its effectiveness is attributed to a unified objective jointly maximizing retrievability and answer control, use of multiple LLMs and retrievers for optimization, and high naturalness in generated texts (Li et al., 26 May 2025).

Limitations and Defenses

The attack's strength diminishes as the retrieval depth $k$ becomes large (genuine context dominates), or if the generator has memorized the ground truth. Proposed defenses include adversarial retrieval/generation detectors, semantic-aware filtration, robust retrievers (e.g., Lipschitz-constrained), and ensemble cross-validation.
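For context on the duplication defense used in the robustness evaluation, a simple variant can be sketched as a hash-based filter over retrieved passages. This is an illustrative assumption about how such a filter might look, not the configuration used in the paper; `normalize` and `drop_exact_duplicates` are hypothetical names.

```python
import hashlib

def normalize(text):
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return " ".join(text.lower().split())

def drop_exact_duplicates(passages):
    """Keep the first occurrence of each passage, comparing SHA-256 digests."""
    seen = set()
    kept = []
    for passage in passages:
        digest = hashlib.sha256(normalize(passage).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(passage)
    return kept

retrieved = [
    "The sky is blue.",
    "the  sky is BLUE.",   # same text up to case and spacing
    "Grass is green.",
]
print(drop_exact_duplicates(retrieved))
```

A filter of this kind only removes passages that are identical after normalization. Passages that are deliberately rewritten for diversity hash differently, which is consistent with the small ASR drop reported for CPA-RAG under duplication defenses.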

3. Automated Analysis of Corporate Climate Policy Engagement

CPA-RAG, as deployed in the InfluenceMap LobbyMap Search system (Kolli et al., 10 Sep 2025), automates evidence extraction and scoring for corporate climate policy engagement using a modular, RAG-based architecture with multilingual and layout-aware parsing.

System Architecture

The framework comprises five modules.

  • RAG Core subsystems: ingestion (PDF/HTML/TXT + metadata), layout-aware parsing (Docling/EasyOCR, PyMuPDF), two-stage chunking (layout, semantic), dense multilingual embedding (e.g., nomic-embed-text:v1.5), retrieval (cosine similarity), reranking (bge-reranker), and generation/classification (prompted Qwen3 LLMs).

Data Processing and Multilinguality

  • Documents are parsed with layout cues; OCR ensures support for non-Latin scripts. Chunk creation merges visual/syntactic units, constraining token length.
  • Embeddings and retrieval support dozens of languages via cross-lingual vector spaces, removing explicit translation requirements.

Generation and Stance Classification

  • Policy questions (13 predefined) serve as queries. Retrieved evidence chunks are classified via LLMs prompted with few-shot templates, outputting discrete stance labels, using a logistic scoring function followed by interval thresholding.
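To make "logistic scoring followed by interval thresholding" concrete, the sketch below squashes a raw score into $(0, 1)$ and maps it to a label by equal-width intervals. Everything specific here is an assumption for illustration: the five-point label set, the equal-width bins, and the names `STANCES`, `logistic`, and `stance_from_score` are hypothetical and not taken from the source.

```python
import math

# ASSUMPTION: a hypothetical five-point ordinal stance scale.
STANCES = (-2, -1, 0, 1, 2)

def logistic(z):
    """Standard logistic function, mapping any real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def stance_from_score(z, labels=STANCES):
    """Map a raw score to a discrete label via equal-width intervals on [0, 1]."""
    p = logistic(z)
    index = min(int(p * len(labels)), len(labels) - 1)
    return labels[index]

for raw in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(raw, stance_from_score(raw))
```

A raw score of 0 lands in the middle interval and therefore on the neutral label, while strongly positive or negative scores saturate at the ends of the scale.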

Evaluation Metrics

  • Extraction: one metric for parsing fidelity and one for compactness.
  • Retrieval: Recall@5, Mean Reciprocal Rank (MRR), and one additional metric.
  • Classification: Exact Match Accuracy, Hit Rate with Tolerance.
  • Oracle diagnostics: Faithfulness, Helpfulness, Conciseness.
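The retrieval and classification metrics named above have standard definitions, sketched here for a single query and for a batch of stance predictions. The tolerance value is left as a parameter because the source does not preserve it; the function names and the toy identifiers are hypothetical.

```python
def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of relevant items that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)

def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    relevant = set(relevant_ids)
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)

def hit_rate_with_tolerance(predicted, gold, tolerance):
    """Share of ordinal predictions within `tolerance` of the gold label."""
    hits = sum(1 for p, g in zip(predicted, gold) if abs(p - g) <= tolerance)
    return hits / len(gold)

ranked = ["c3", "c1", "c9", "c4", "c7", "c2"]
relevant = {"c1", "c2"}
print(recall_at_k(ranked, relevant, k=5))
print(reciprocal_rank(ranked, relevant))
print(hit_rate_with_tolerance([2, 0, -1], [1, 0, 1], tolerance=1))
```

Exact Match Accuracy is the special case of the last function with a tolerance of 0.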

The best-performing pipeline (Docling + semantic chunking + nomic) yields Recall@5 = 0.764 on English documents, with LLM stance Hit Rates up to 0.673.

Human-in-the-loop and Limitations

  • Analysts interact via UI, correct system output, and flag edge cases; feedback is logged for periodic retraining. The approach augments, not replaces, expert judgment.
  • Current system handles only a fixed question set, can be prompt-sensitive, and lacks multi-document discourse modeling. Future improvements include learned rerankers, dynamic question generation, and ultra-low-resource language support (Kolli et al., 10 Sep 2025).

4. Comparative Features and Applications

CPA-RAG, as a term, spans distinct methodologies:

| CPA-RAG Context         | Primary Purpose                     | Unique Features                                          |
|-------------------------|-------------------------------------|----------------------------------------------------------|
| Cluster-based Retrieval | Optimal doc count, efficiency       | Adaptive clustering cut-off, formal TES, production use  |
| Covert Poisoning Attack | Robustness red-teaming              | Cross-model prompt engineering, black-box effectiveness  |
| Policy Evidence Scoring | Domain-specific workflow automation | Layout/semantic chunking, stance LLM, human feedback     |

All frameworks leverage dense retrieval, but differ in optimization target (efficiency, security, or domain automation). Each instance reflects the flexibility of the acronym in contemporary RAG literature.

5. Implications and Future Research Directions

CPA-RAG frameworks, whether for adaptive retrieval, security evaluation, or workflow automation, illustrate a move toward modular, adaptive RAG components:

  • Adaptive retrieval improves efficiency and context utilization by assigning retrieval depth per query (Xu et al., 2 Oct 2025).
  • The CPA-RAG poisoning framework highlights critical vulnerabilities in open-corpus LLM deployments and motivates stronger defenses and formal verification (Li et al., 26 May 2025).
  • End-to-end workflow acceleration in specialized domains demonstrates both technical maturity and an ongoing need for human validation (Kolli et al., 10 Sep 2025).

Planned directions include robust, certifiable retrievers, extension into multimodal RAG (images/code), automated prompt tuning against overfitting, and user-adaptive, real-time retrieval for low-resource settings.

6. Synthesis and Significance

Across its multiple formalizations, CPA-RAG sits at the intersection of RAG system efficiency, adversarial robustness, and domain-specific automation. Each instantiation targets a distinct bottleneck (context sizing, vulnerability to corpus poisoning, and scalable evidence extraction, respectively) and serves as a reference point for research on dynamic, secure, and context-aware LLM systems (Xu et al., 2 Oct 2025, Li et al., 26 May 2025, Kolli et al., 10 Sep 2025).
