CPA-RAG: Adaptive Retrieval and Adversarial Frameworks
- CPA-RAG is a set of formalized frameworks for retrieval-augmented generation addressing dynamic document selection, security threats, and workflow automation.
- The adaptive retrieval method uses clustering algorithms to determine candidate document counts, optimizing efficiency and reducing token consumption.
- The frameworks also encompass covert poisoning attacks to expose RAG vulnerabilities and automated parsing for corporate policy monitoring.
CPA-RAG encompasses multiple formally described frameworks and threat models for retrieval-augmented generation (RAG), notably including (1) Cluster-based/Context-Partitioning Adaptive Retrieval (dynamic context sizing for RAG applications), (2) Covert Poisoning Attacks against RAG systems, and (3) workflow acceleration for corporate policy monitoring. Originating independently across the literature, these instances share the acronym but address distinct retrieval, robustness, and automation challenges in retrieval-augmented LLM pipelines (Kolli et al., 10 Sep 2025, Li et al., 26 May 2025, Xu et al., 2 Oct 2025).
1. Cluster-based Adaptive Retrieval (Cluster/Context-Partitioning Adaptive Retrieval)
The Cluster-based Adaptive Retrieval method, designated CAR or CPA-RAG in (Xu et al., 2 Oct 2025), is a dynamic algorithm for selecting the retrieval depth (number of candidate documents) in RAG pipelines. Rather than statically specifying a fixed top-$k$, CPA-RAG determines, on a per-query basis, the optimal cut-off by analyzing clustering patterns in retrieval similarity distances.
Formalization
Given a query embedding $q$ and candidate document embeddings $d_1, \dots, d_N$, CPA-RAG computes the cosine similarity between $q$ and each $d_i$ and ranks the candidates by increasing distance. The distances are then normalized over the candidate pool.
Next, the method applies a clustering algorithm (e.g., K-Means, HDBSCAN) to 2-D points across a grid of hyperparameters, selecting the best configuration via the silhouette score. Cluster boundaries, where consecutive ranked candidates receive different cluster assignments, define transition points. The preferred cut-off $k^*$ is chosen from among these transition points.
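As a rough illustration of the cut-off idea, the sketch below ranks candidates by cosine distance, normalizes the distances, splits them into two groups, and cuts at the first cluster transition. It is a deliberately simplified stand-in, not the method of Xu et al.: the min-max normalization, the fixed 1-D two-means clustering (instead of a silhouette-selected K-Means/HDBSCAN configuration on 2-D points), and all function names are assumptions made for illustration.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def min_max(values):
    """Min-max normalization (an assumed choice of normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def two_means_1d(values, iters=50):
    """Lloyd's algorithm with two centroids on scalars; returns 0/1 labels."""
    c0, c1 = min(values), max(values)
    labels = None
    for _ in range(iters):
        new_labels = [0 if abs(v - c0) <= abs(v - c1) else 1 for v in values]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        g0 = [v for v, lab in zip(values, labels) if lab == 0]
        g1 = [v for v, lab in zip(values, labels) if lab == 1]
        if g0:
            c0 = sum(g0) / len(g0)
        if g1:
            c1 = sum(g1) / len(g1)
    return labels


def adaptive_cutoff(query, docs):
    """Return indices of the documents kept, closest first."""
    dist = [cosine_distance(query, d) for d in docs]
    ranked = sorted(range(len(docs)), key=dist.__getitem__)
    labels = two_means_1d(min_max([dist[i] for i in ranked]))
    for pos in range(1, len(labels)):
        if labels[pos] != labels[pos - 1]:
            return ranked[:pos]  # cut at the first cluster transition
    return ranked  # no transition found: keep every candidate


docs = [[1.0, 0.05], [0.9, 0.1], [0.0, 1.0], [0.1, 1.0]]
kept = adaptive_cutoff([1.0, 0.0], docs)
print(kept)
```

In this toy pool the two documents pointing in nearly the same direction as the query form one cluster and the two near-orthogonal ones form another, so only the first two are passed on to prompt assembly.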
Performance and Empirical Impact
As summarized in the table below, CPA-RAG attains a superior Trade-off Efficiency Score (TES), which balances accuracy against candidate count, compared to fixed top-$k$ strategies:
| Method | Accuracy | Avg Cand. | TES |
|---|---|---|---|
| Top-3 | 0.97 | 3.0 | 0.700 |
| Top-5 | 0.99 | 5.0 | 0.553 |
| Top-10 | 1.00 | 10.0 | 0.417 |
| CAR | 0.98 | 2.1 | 0.866 |
Deployed in production (Coinbase CDP), CPA-RAG decreased LLM token usage by 60%, end-to-end latency by 22%, and hallucination rates by 10%, while increasing user engagement by 200%. The framework is robust to clustering backbones and integrates as a filtering layer between vector retrieval and prompt assembly (Xu et al., 2 Oct 2025).
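The TES column in the table above is reproduced to three decimals by dividing accuracy by the natural logarithm of one plus the average candidate count. That formula is inferred here from the reported numbers rather than quoted from the text, so it should be read as an assumption; the short check below recomputes each row.

```python
import math


def tes(accuracy, avg_candidates):
    """Assumed form of the Trade-off Efficiency Score: accuracy / ln(1 + k)."""
    return accuracy / math.log(1.0 + avg_candidates)


# (accuracy, average candidates, reported TES) copied from the table.
rows = {
    "Top-3": (0.97, 3.0, 0.700),
    "Top-5": (0.99, 5.0, 0.553),
    "Top-10": (1.00, 10.0, 0.417),
    "CAR": (0.98, 2.1, 0.866),
}

for name, (acc, k, reported) in rows.items():
    print(f"{name}: computed {tes(acc, k):.3f}, reported {reported:.3f}")
```

Under this reading, the logarithmic denominator penalizes larger candidate pools only mildly, which is why a small drop in accuracy paired with far fewer candidates still yields the highest score.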
2. CPA-RAG: Covert Poisoning Attacks on RAG Systems
CPA-RAG in (Li et al., 26 May 2025) denotes an adversarial attack framework for black-box, document-level poisoning of RAG systems. It demonstrates that attackers, with only injection privileges and query-only access, can generate query-relevant adversarial passages that both defeat standard defense heuristics and manipulate the generation outcome.
Threat Model and Attack Pipeline
Capabilities: The attacker can inject adversarial documents into the RAG corpus but has no access to the model or retriever internals (black-box). Objective: For each target query, induce an incorrect, attacker-chosen answer.
Pipeline:
- Info Collection: Select target (query, answer) pairs; probe the RAG system to infer the retriever and LLM types.
- Initialization: Generate candidate poisoned passages by prompting multiple LLMs with the target query and the attacker-chosen answer, keeping those that cause the attacker-chosen answer to appear and the correct answer to disappear.
- Iterative Optimization: Iteratively rewrite passages across LLMs and prompt templates; passages must (a) appear among the top-$k$ retrieved results (high similarity to the target query) according to multiple surrogate retrievers and (b) maintain fluency and diversity. Retrieval relevance is aggregated across the surrogate retrievers using per-retriever weights.
Formal Objective: a unified objective jointly maximizes retrievability and answer control, subject to constraints enforcing retrieval and answer manipulation.
Empirical Results and Robustness
CPA-RAG achieves an Attack Success Rate (ASR) exceeding 90% at the evaluated top-$k$ setting across NQ, HotpotQA, and MS-MARCO. It consistently exceeds black-box baselines (PoisonedRAG) by 5 percentage points on ASR at all values of $k$, and demonstrates robustness: ASR drops by less than 6 points under strong perplexity/duplication defenses, while alternatives drop by over 15.
On a commercial deployment (Alibaba BaiLian), CPA-RAG misdirects real RAG answers despite black-box constraints. Its effectiveness is attributed to a unified objective jointly maximizing retrievability and answer control, use of multiple LLMs and retrievers for optimization, and high naturalness in generated texts (Li et al., 26 May 2025).
Limitations and Defenses
The attack's strength diminishes as $k$ becomes large (genuine context dominates), or if the generator has memorized the ground truth. Proposed defenses include adversarial retrieval/generation detectors, semantic-aware filtration, robust retrievers (e.g., Lipschitz-constrained), and ensemble cross-validation.
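To make the duplication-style defenses mentioned above concrete, the sketch below shows one common way such a filter can be built: drop any retrieved passage whose word-shingle Jaccard similarity to an already kept passage exceeds a threshold. The source does not specify how the evaluated defenses are implemented, so the shingle size, the threshold of 0.8, and the function names are all assumptions.

```python
def shingles(text, n=3):
    """Set of lowercase word n-grams; short texts yield a single shingle."""
    words = text.lower().split()
    if len(words) < n:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(passages, threshold=0.8):
    """Keep a passage only if it is not too similar to any kept passage."""
    kept, kept_shingles = [], []
    for passage in passages:
        current = shingles(passage)
        if all(jaccard(current, seen) < threshold for seen in kept_shingles):
            kept.append(passage)
            kept_shingles.append(current)
    return kept


retrieved = [
    "the capital of france is paris according to the atlas",
    "The capital of France is Paris according to the atlas",
    "berlin is the capital of germany",
]
filtered = drop_near_duplicates(retrieved)
print(filtered)
```

A lexical filter of this kind only catches near-verbatim repeats. Passages that were rewritten across several LLMs and prompt templates are lexically diverse by construction, which is consistent with the small ASR drop reported under duplication defenses.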
3. Automated Analysis of Corporate Climate Policy Engagement
CPA-RAG, as deployed in the InfluenceMap LobbyMap Search system (Kolli et al., 10 Sep 2025), automates evidence extraction and scoring for corporate climate policy engagement using a modular, RAG-based architecture with multilingual and layout-aware parsing.
System Architecture
The framework comprises five modules:
- RAG Core subsystems: ingestion (PDF/HTML/TXT + metadata), layout-aware parsing (Docling/EasyOCR, PyMuPDF), two-stage chunking (layout, semantic), dense multilingual embedding (e.g., nomic-embed-text:v1.5), retrieval (cosine similarity), reranking (bge-reranker), and generation/classification (prompted Qwen3 LLMs).
Data Processing and Multilinguality
- Documents are parsed with layout cues; OCR ensures support for non-Latin scripts. Chunk creation merges visual/syntactic units, constraining token length.
- Embeddings and retrieval support dozens of languages via cross-lingual vector spaces, removing explicit translation requirements.
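The chunk-creation step above (merging parsed units while constraining token length) can be sketched as a greedy merge. This is only an illustration under stated assumptions: tokens are approximated by whitespace-separated words, `max_tokens` is a hypothetical parameter, and the function does not reflect the Docling or PyMuPDF APIs.

```python
def merge_units(units, max_tokens=200):
    """Greedily merge consecutive text units into chunks of at most
    max_tokens whitespace tokens. A single unit longer than the budget
    is kept whole as its own oversized chunk."""
    chunks, current, current_len = [], [], 0
    for unit in units:
        n_tokens = len(unit.split())
        # Flush the running chunk if adding this unit would exceed the budget.
        if current and current_len + n_tokens > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(unit)
        current_len += n_tokens
    if current:
        chunks.append(" ".join(current))
    return chunks


units = ["a b c", "d e", "f g h i", "j"]
chunks = merge_units(units, max_tokens=5)
print(chunks)
```

Merging in document order keeps neighbouring layout units together, which matters when a heading and the paragraph it introduces should land in the same chunk.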
Generation and Stance Classification
- Policy questions (13 predefined) serve as queries. Retrieved evidence chunks are classified by LLMs prompted with few-shot templates, which output discrete stance labels using a logistic scoring function followed by interval thresholding.
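One way to read "logistic scoring followed by interval thresholding" is sketched below: a raw score is squashed into (0, 1) and the unit interval is cut into equal-width bins, one per stance label. The five-point label set, the equal-width bins, and the function names are assumptions for illustration; the source does not give the actual labels or thresholds.

```python
import math

# Hypothetical ordinal stance scale, from most opposed to most supportive.
STANCES = [-2, -1, 0, 1, 2]


def logistic(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def to_stance(raw_score, labels=STANCES):
    """Map a raw score to a label via equal-width intervals on (0, 1)."""
    p = logistic(raw_score)
    index = min(int(p * len(labels)), len(labels) - 1)
    return labels[index]


for score in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(score, to_stance(score))
```

Because the bins are ordered, nearby raw scores map to neighbouring labels, which suits a tolerance-based hit rate on an ordinal scale.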
Evaluation Metrics
- Extraction: one metric for parsing fidelity and one for compactness.
- Retrieval: Recall@5 and Mean Reciprocal Rank (MRR).
- Classification: Exact Match Accuracy, Hit Rate with Tolerance.
- Oracle diagnostics: Faithfulness, Helpfulness, Conciseness.
The best-performing pipeline (Docling + semantic chunking + nomic) yields Recall@5 = 0.764 (EN), with LLM stance Hit Rates up to 0.673.
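For reference, the retrieval and classification metrics named above can be computed as in the following sketch. It uses the standard textbook definitions of Recall@k and MRR, and treats Hit Rate with Tolerance as the share of predictions within a fixed distance of the gold label. The tolerance value and the exact definitions used by Kolli et al. are not given in the text, so these are assumptions.

```python
def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of relevant items that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def hit_rate_with_tolerance(predicted, gold, tolerance=1):
    """Share of predictions within `tolerance` of the gold label;
    tolerance=0 reduces to exact-match accuracy."""
    hits = sum(abs(p - g) <= tolerance for p, g in zip(predicted, gold))
    return hits / len(gold)


runs = [(["a", "b"], ["b"]), (["x", "y"], ["x"])]
print(recall_at_k(["a", "b", "c", "d", "e", "f"], ["b", "f"], k=5))
print(mean_reciprocal_rank(runs))
print(hit_rate_with_tolerance([1, 0, -2], [2, 0, 1], tolerance=1))
```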
Human-in-the-loop and Limitations
- Analysts interact via UI, correct system output, and flag edge cases; feedback is logged for periodic retraining. The approach augments, not replaces, expert judgment.
- The current system handles only a fixed question set, can be prompt-sensitive, and lacks multi-document discourse modeling. Future improvements include learned rerankers, dynamic question generation, and ultra-low-resource language support (Kolli et al., 10 Sep 2025).
4. Comparative Features and Applications
CPA-RAG, as a term, spans distinct methodologies:
| CPA-RAG Context | Primary Purpose | Unique Features |
|---|---|---|
| Cluster-based Retrieval | Optimal doc count, efficiency | Adaptive clustering cut-off, formal TES, production use |
| Covert Poisoning Attack | Robustness red-teaming | Cross-model prompt engineering, black-box effectiveness |
| Policy Evidence Scoring | Domain-specific workflow automation | Layout/semantic chunking, stance LLM, human feedback |
All frameworks leverage dense retrieval, but differ in optimization target (efficiency, security, or domain automation). Each instance reflects the flexibility of the acronym in contemporary RAG literature.
5. Implications and Future Research Directions
CPA-RAG frameworks, whether for robust retrieval, security evaluation, or workflow automation, illustrate a paradigm shift toward modular, adaptive RAG solutions:
- Adaptive retrieval advances maximize efficiency and model context utilization by dynamic depth assignment (Xu et al., 2 Oct 2025).
- The introduction of CPA-RAG poisoning frameworks highlights critical vulnerabilities in open-corpus LLM deployments, demanding future-proof defenses and formal verification (Li et al., 26 May 2025).
- End-to-end workflow acceleration in specialized domains demonstrates both technical maturity and ongoing need for human validation (Kolli et al., 10 Sep 2025).
Planned directions include robust, certifiable retrievers, extension into multimodal RAG (images/code), automated prompt tuning against overfitting, and user-adaptive, real-time retrieval for low-resource settings.
6. Synthesis and Significance
CPA-RAG, across its multiple formalizations, constitutes a critical locus for research at the intersection of RAG system efficiency, adversarial robustness, and domain-specific automation. Each instantiation targets distinct bottlenecks—optimal context sizing, vulnerability mitigation, and scalable evidence extraction—serving as exemplars for emerging research on dynamic, secure, and context-aware LLM systems (Xu et al., 2 Oct 2025, Li et al., 26 May 2025, Kolli et al., 10 Sep 2025).