DCI-Agent-Lite: Direct Corpus Interaction
- DCI-Agent-Lite is a lightweight direct corpus interaction framework that uses shell commands (e.g., grep, find) to perform fine-grained evidence retrieval without relying on vector indices.
- It enables compositional, multi-step search workflows with iterative hypothesis refinement and transparent reasoning, ensuring that no evidence is lost during retrieval.
- Empirical evaluations demonstrate significant improvements over traditional sparse and dense retrievers in multi-hop QA and information retrieval tasks.
Direct Corpus Interaction Agent Lite (DCI-Agent-Lite) is a lightweight retrieval agent architecture designed to empower LLMs to engage in agentic search over raw text corpora. It operates entirely via direct shell commands—eschewing all embedding models, vector indices, and retrieval APIs—to provide a high-resolution, transparent, and composable interface for evidence retrieval, hypothesis refinement, and multi-step search workflows. DCI-Agent-Lite demonstrates significant empirical gains over traditional sparse, dense, and reranking retrieval baselines on multi-hop QA and agentic information retrieval tasks by exposing all corpus content through terminal tools and fully delegating semantic interpretation to the agent’s reasoning process (Li et al., 3 May 2026).
1. Direct Corpus Interaction Paradigm
DCI-Agent-Lite is grounded in the “Direct Corpus Interaction” (DCI) paradigm. In contrast to retriever-mediated systems, which query a corpus through a fixed top-$k$ similarity interface and discard unretrieved evidence, DCI operates directly on the raw corpus with general-purpose shell and scripting tools. No offline indexing or embeddings are used. Each corpus resides as plain .txt or .jsonl files in the file system, and agents issue shell commands (e.g., grep, ripgrep, find, sed, simple Python scripts) to match, filter, and inspect data. Semantic operations such as ranking and context filtering are not handled by any black-box component; they are carried out within the agent’s iterative reasoning loop.
Key advantages over traditional retrieval include:
- Increased resolution: The agent can interact at the granularity of lines, spans, or fixed patterns, recovering fine-grained clues otherwise hidden by passage-level abstractions.
- Compositional search: Command pipelines (`grep ... | grep ...`) enforce multi-clue conjunctions or staged evidence selection.
- Zero evidence loss: All signals are available for agentic reasoning; evidence is not irrecoverably filtered by early retrieval stages.
- No prebuilt index dependency: The approach is robust to evolving corpora, requiring no reindexing.
The conceptual framework is “Agent ⇄ Shell CLI ⇄ Raw Corpus”: the agent emits commands, parses stdout, updates internal state, and iterates. A runtime layer manages the context window.
2. Primitive Toolchain and Command Primitives
DCI-Agent-Lite employs general-purpose command-line utilities as search and processing primitives. These include:
| Function | Tool/Command Example | Usage Context |
|---|---|---|
| Lexical matching | `grep -n "keyword" /corpus/wiki_dump.jsonl \| head -30` | Line-numbered keyword matches, capped at the first 30 lines |
| Fast regex/glob search | `rg -n 'error code 404' /corpus --glob='*.jsonl' \| head -n 20` | Search restricted to `.jsonl` files, capped at 20 lines |
| Conjunctive filters | `grep -r "climate" corpus/ \| grep "policy"` | Lines that contain both clues |
| File discovery & selection | `find corpus/ -type f -name "*.txt" \| grep "2024-report"` | `.txt` files whose path contains a given string |
| Local context windowing | `grep -n "outbreak" file.txt \| head -n 5`, then `sed -n '100,150p' file.txt` | Locate a match by line number, then print a surrounding line range |
| Lightweight script filtering | Python one-liner (stdin parse/filter/print) | Arbitrary filtering beyond shell regex |
| Aggregation/counting | `grep -ro "ERROR" log/ \| wc -l` | Count of pattern occurrences across a directory |
All command output is streamed back to the agent for parsing, hypothesis update, and further search planning. This architecture allows for complex, multi-hop, and locally grounded evidence collection.
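The "lightweight script filtering" primitive can be illustrated with a short Python sketch of a conjunctive, line-level filter, roughly what `grep -in A file | grep -i B` does with fixed strings. The function name `conjunctive_filter` and the choice of case-insensitive substring matching are assumptions made for illustration; they are not taken from the source.

```python
from typing import Iterable, Iterator, List, Tuple


def conjunctive_filter(
    lines: Iterable[str], clues: Iterable[str]
) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for every line that contains all clues.

    Matching is case-insensitive substring matching (an assumption of
    this sketch), and line numbers start at 1 as with `grep -n`.
    """
    needles: List[str] = [clue.lower() for clue in clues]
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if all(needle in lowered for needle in needles):
            yield number, line.rstrip("\n")


def search_file(path: str, clues: Iterable[str]) -> List[Tuple[int, str]]:
    """Apply the conjunctive filter to a text file on disk."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(conjunctive_filter(handle, clues))
```

Because the filter works on any iterable of lines, it can be fed from a file, from `sys.stdin`, or from the output of an earlier command.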
3. System Design and Runtime Context Management
DCI-Agent-Lite is instantiated via a minimal agent scaffold (e.g., using GPT-5.4-nano or any LLM) in a REPL loop:
- Input: Agent receives a natural language question and the corpus path.
- Tool emission: Agent emits a bash/shell command (e.g., `grep`, `find`).
- Execution: The host executes the command and returns the raw stdout to the agent.
- Parsing/Reasoning: Agent extracts, filters, and updates the reasoning context based on returned evidence.
- Iteration: The agent may emit new shell commands for further hypothesis refinement.
- Termination: When a satisfactory answer is assembled, the agent emits an exact answer and exits.
No retrieval API or vector index is involved. Context management is critical to avoid LLM memory overflow and preserve tool-call structure:
- Truncation: Limit each tool output to a fixed number of characters (e.g., 20,000).
- Compaction: When aggregate tool outputs exceed a threshold (e.g., 240,000 characters), preserve only the most recent turns (e.g., the last 12).
- Summarization: If still over budget, older context is compacted with an LLM-generated summary.
Context management policies correspond to levels: L0 (none), L1 (truncation at 50K), L3 (truncation at 20K, compaction above 240K, last 12 turns), L4 (L3 + summarization fallback).
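The truncation, compaction, and summarization steps can be sketched in Python as below. This is an illustration under stated assumptions rather than the source's implementation: the names `truncate_output` and `manage_context` are hypothetical, each turn is modeled as a single string of tool output, and the summarization fallback is modeled as replacing the dropped older turns with one summary string produced by a caller-supplied `summarize` function (standing in for an LLM call).

```python
from typing import Callable, List, Optional

MAX_OUTPUT_CHARS = 20_000        # per-output truncation limit
COMPACTION_THRESHOLD = 240_000   # total characters that trigger compaction
KEEP_LAST_TURNS = 12             # turns preserved by compaction


def truncate_output(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Cut a single tool output to at most `limit` characters."""
    return text if len(text) <= limit else text[:limit]


def manage_context(
    turns: List[str],
    summarize: Optional[Callable[[List[str]], str]] = None,
    limit: int = MAX_OUTPUT_CHARS,
    threshold: int = COMPACTION_THRESHOLD,
    keep_last: int = KEEP_LAST_TURNS,
) -> List[str]:
    """Truncate every turn, then compact if the total is over threshold.

    Without `summarize`, older turns are dropped. With `summarize`
    (an assumption of this sketch), they are replaced by one summary.
    """
    turns = [truncate_output(turn, limit) for turn in turns]
    if sum(len(turn) for turn in turns) <= threshold:
        return turns
    recent = turns[-keep_last:] if keep_last > 0 else []
    older = turns[: len(turns) - len(recent)]
    if summarize is not None and older:
        return [summarize(older)] + recent
    return recent
```

Passing no `summarize` function applies truncation and compaction only; supplying one adds the summarization fallback.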
Minimal pseudocode structure:
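A minimal Python sketch of the loop described above, offered as an illustration and not as the authors' code. The `policy` callable stands in for the LLM and is a hypothetical interface: it receives the question, the corpus path, and the history of `(command, output)` pairs, and returns either `("bash", command)` or `("answer", text)`. The names `run_shell` and `run_agent` and the default limits are likewise assumptions.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]
Policy = Callable[[str, str, History], Tuple[str, str]]


def run_shell(command: str, timeout: int = 60) -> str:
    """Execute a shell command on the host and return its stdout."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return result.stdout


def run_agent(
    question: str,
    corpus_path: str,
    policy: Policy,
    execute: Callable[[str], str] = run_shell,
    max_turns: int = 20,
    max_chars: int = 20_000,
) -> Optional[str]:
    """Iterate: ask the policy, run its command, feed stdout back."""
    history: History = []
    for _ in range(max_turns):
        kind, payload = policy(question, corpus_path, history)
        if kind == "answer":
            return payload              # termination with a final answer
        output = execute(payload)       # raw stdout from the host
        history.append((payload, output[:max_chars]))  # per-output truncation
    return None                         # tool-call budget exhausted
```

Injecting `execute` keeps the loop testable without a real shell, and `max_turns` acts as a simple per-question budget on tool calls. Running model-emitted commands with `shell=True` should only be done inside a sandboxed environment.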
4. Empirical Performance and Comparative Evaluation
Extensive benchmarks validate DCI-Agent-Lite's efficacy across agentic search and information retrieval:
- Agentic Search: BrowseComp-Plus (100K docs, multi-document QA)
- Multi-hop QA: NQ, TriviaQA, Bamboogle, HotpotQA, 2WikiMultiHopQA, MuSiQue (Wikipedia-18 dump, 21M docs)
- IR Ranking: BRIGHT (Biology, Earth Science, Economics, Robotics) and BEIR (ArguAna, SciFact)
Key metrics include final-answer accuracy (judged by GPT-4.1), NDCG@10 for IR, and process-level measures (coverage and localization).
Multi-hop QA Accuracy (%)
| Model | NQ | Trivia | Bam. | Hotpot | 2Wiki | MuSiQue | Avg. |
|---|---|---|---|---|---|---|---|
| ASearcher-Local-14B | 52.3 | 47.0 | 36.4 | 29.5 | 26.1 | 59.8 | 42.3 |
| DCI-Agent-Lite (GPT-5.4-nano) | 68.0 | 75.2 | 80.1 | 79.0 | 74.5 | 63.8 | 73.4 |
| DCI-Agent-CC (Sonnet 4.6) | 81.5 | 88.2 | 93.0 | 90.5 | 85.7 | 78.9 | 86.3 |
IR Ranking (NDCG@10)
| Method | Bio. | Earth. | Econ. | Rob. | ArguAna | SciFact | Avg. |
|---|---|---|---|---|---|---|---|
| ReasonRank-32B | 58.2 | 48.9 | 36.6 | 33.9 | 28.7 | 75.5 | 47.0 |
| DCI-Agent-Lite | 60.0 | 50.8 | 32.3 | 42.4 | 81.9 | 72.7 | 56.7 |
| DCI-Agent-CC | 77.1 | 69.0 | 46.8 | 56.8 | 85.3 | 75.7 | 68.5 |
BrowseComp-Plus Agentic Search (accuracy %, cost in USD)
| System | Accuracy (%) | Cost (USD) |
|---|---|---|
| Claude Sonnet 4.6 + Qwen3-8B | 69.0 | 1,440 |
| DCI-Agent-Lite (GPT-5.4-nano) | 62.9 | 93 |
| DCI-Agent-CC (Sonnet 4.6) | 80.0 | 1,016 |
The data substantiate that DCI-Agent-Lite outperforms strong sparse, dense, and reranker baselines by 15–30 percentage points in multi-hop QA and IR. On large-scale agentic search it substantially reduces cost: USD 93 on BrowseComp-Plus versus USD 1,440 for the Claude Sonnet 4.6 + Qwen3-8B baseline, at 62.9% versus 69.0% accuracy (Li et al., 3 May 2026).
5. Theoretical Formulation and Process Metrics
Retrieval evaluation incorporates both coverage and localization metrics:
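- Coverage: Assesses whether the trajectory $T$ surfaces the gold documents $D^*(q)$ for a question $q$. With $M(q,T) \subseteq D^*(q)$ denoting the gold documents matched along the trajectory, three variants are reported: $\mathrm{coverage\_any}(q,T)$, $\mathrm{coverage\_mean}(q,T)$, and $\mathrm{coverage\_all}(q,T)$.

$$
\begin{aligned}
\mathrm{coverage\_any}(q,T) &= \mathbf{1}\big[\,|M(q,T)| \ge 1\,\big],\\
\mathrm{coverage\_mean}(q,T) &= \frac{1}{|D^*(q)|}\sum_{d^*\in D^*(q)}\mathbf{1}\big[\,d^*\in M(q,T)\,\big],\\
\mathrm{coverage\_all}(q,T) &= \mathbf{1}\big[\,|M(q,T)| = |D^*(q)|\,\big].
\end{aligned}
$$

- Localization: Assesses how tightly the trajectory narrows in on the supporting evidence. It is defined in terms of quantities $l_{t,i}$ computed along the trajectory; see Li et al. (3 May 2026) for the full definitions.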
Localization is a critical driver of performance, with DCI’s fine-grained search capabilities yielding higher span-level accuracy.
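The three coverage variants follow directly from their definitions over the gold set $D^*(q)$ and the matched set $M(q,T)$. A small Python sketch is given below. The function name `coverage_metrics` is hypothetical, and the sketch assumes that the documents surfaced along a trajectory have already been resolved to document identifiers; how that matching is done is not specified here.

```python
from typing import Dict, Iterable


def coverage_metrics(gold: Iterable[str], surfaced: Iterable[str]) -> Dict[str, float]:
    """Compute coverage_any, coverage_mean, and coverage_all.

    gold     -- identifiers of the gold documents D*(q)
    surfaced -- identifiers of all documents seen along the trajectory T
    """
    gold_set = set(gold)
    if not gold_set:
        raise ValueError("the gold set D*(q) must be non-empty")
    matched = gold_set & set(surfaced)  # M(q, T): gold documents that were surfaced
    return {
        "coverage_any": float(len(matched) >= 1),
        "coverage_mean": len(matched) / len(gold_set),
        "coverage_all": float(len(matched) == len(gold_set)),
    }
```

For a question with two gold documents of which one is surfaced, this gives `coverage_any = 1.0`, `coverage_mean = 0.5`, and `coverage_all = 0.0`.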
6. Practical Deployment, Recommendations, and Limitations
To implement DCI-Agent-Lite for agentic retrieval tasks:
- Corpus setup: Aggregate documents in `/corpus/` as `.txt` or `.jsonl`; optionally cache a `file_list.txt` for rapid file listing.
- Tool configuration: Install `ripgrep` for high-throughput pattern matching; ensure shell utilities (`grep`, `find`, `head`, `tail`, `sed`) and Python are present.
- REPL loop: Orchestrate LLM prompt → shell command emission → stdout ingestion → agent reasoning, with consistent prompt structure ("Agent Thought:", "Bash Command:").
- Context management: Enforce turn truncation (20K characters/turn), apply compaction at total 240K character threshold, and summarize excess with the LLM if needed.
- Prompt engineering: Mandate adherence to shell-only tool usage and inline citation output.
- Process monitoring: Log commands, outputs, and analysis metrics for debugging.
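The consistent prompt structure mentioned in the steps above ("Agent Thought:", "Bash Command:") makes model responses easy to parse. The Python sketch below shows one way to do it. The names `AgentStep` and `parse_step` are hypothetical, and it is an assumption of this sketch that a response with no "Bash Command:" section carries no command to run.

```python
from typing import NamedTuple, Optional


class AgentStep(NamedTuple):
    thought: str
    command: Optional[str]


def parse_step(response: str) -> AgentStep:
    """Split a model response into its thought and optional shell command."""
    _, marker, rest = response.partition("Agent Thought:")
    if not marker:
        raise ValueError("response lacks an 'Agent Thought:' section")
    thought, marker, command = rest.partition("Bash Command:")
    command = command.strip()
    # Treat a missing or empty command section as "no command emitted".
    return AgentStep(thought.strip(), command if marker and command else None)
```

A response that fails to parse can be logged and returned to the model as an error message, which also supports the process-monitoring step.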
Recommendations include:
- Initial toolset: `grep` or `rg` with file reads; optionally add `find`, `head`, and `tail` for structured navigation.
- Corpus organization: Store as flat files, one per document, under a single root.
- Incremental “indexing”: Light caches are permissible; avoid full vector indices.
- For long trajectories: Use L3 context management (20K/turn, compaction at 240K).
- Budgeting: Limit tool calls per question and permit early exit on high-confidence answers.
Limitations:
- Superlinear cost and latency as corpus size exceeds ~200K files.
- Shell-based search is less efficient for high-recall, low-precision queries; small inverted indexes can assist with broad filtering, followed by DCI refinement.
- Context management complexity increases engineering overhead.
- Agent performance depends on an LLM capable of emitting correct shell commands and robustly parsing outputs.
A plausible implication is that DCI’s cost scaling and shell-command inefficiency could be mitigated by hybrid approaches incorporating lightweight local indexes for initial narrowing, with DCI for high-resolution refinement.
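One way to picture such a hybrid is a small in-memory inverted index for broad narrowing, followed by a grep-like line scan over the surviving candidates only. The Python sketch below is illustrative and is not described in the source: the function names `build_index`, `narrow`, and `refine`, the `\w+` tokenization, and the in-memory `docs` mapping are all assumptions.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercased token to the set of documents containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(doc_id)
    return index


def narrow(index: Dict[str, Set[str]], terms: Iterable[str]) -> Set[str]:
    """Broad filtering stage: documents that contain every term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def refine(
    docs: Dict[str, str], candidates: Iterable[str], pattern: str
) -> List[Tuple[str, int, str]]:
    """High-resolution stage: line-level regex scan over candidates only."""
    regex = re.compile(pattern)
    hits: List[Tuple[str, int, str]] = []
    for doc_id in sorted(candidates):
        for number, line in enumerate(docs[doc_id].splitlines(), start=1):
            if regex.search(line):
                hits.append((doc_id, number, line))
    return hits
```

The index only decides which files are worth scanning. Evidence selection still happens at line level, so the fine-grained behavior of DCI is preserved in the second stage.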
7. Synthesis and Impact
DCI-Agent-Lite demonstrates that exposing corpus access via a transparent, shell-level interface—eschewing all opaque retrieval APIs and indices—enables LLM-based agents to achieve high-precision, multi-step reasoning and search without severe evidence loss. Empirical results attribute most of DCI’s gains to its fine-grained localization properties and compositionality, underscoring the significance of interface resolution over solely enhanced model reasoning. Notably, even minimalist “read + grep” baselines outperform leading embedding retrievers by wide margins (e.g., 16 percentage points over Qwen3-8B in multi-hop QA), indicating that large, evolving corpora can be effectively handled with a minimal toolchain and agentic search (Li et al., 3 May 2026).
By adhering to corpus organization best practices, minimal shell tool selection, robust context management, and rigorous process monitoring, DCI-Agent-Lite enables the practical construction of agentic search systems that are both cost-efficient and semantically powerful—without reliance on external vector-based retrieval mechanisms.