DCI-Agent-CC: Agentic Retrieval & Transactions

Updated 8 May 2026
  • DCI-Agent-CC is a framework that enables agent-driven search and transaction processing by merging direct corpus interaction with adaptive concurrency control.
  • It empowers language-model agents to iteratively generate and refine shell commands for fine-grained corpus access, bypassing fixed retrieval APIs.
  • The framework leverages reinforcement learning for dynamic coordination, improving accuracy and throughput in both retrieval and transactional tasks.

DCI-Agent-CC is a framework for agentic search and transaction processing that structurally departs from fixed retrieval APIs and classical concurrency control, enabling language-model-driven agents to interact directly with raw corpora and databases through compositional command-line primitives. It synthesizes principles from direct corpus interaction (DCI) for retrieval (Li et al., 3 May 2026) and adaptive concurrency control (CC) for agentic transactions (Zhou et al., 14 Mar 2026), targeting scenarios where the agent's workflows are not only reasoning-intensive but also highly dynamic, non-deterministic, and collaborative. The framework is distinguished by its interface design—granting agents high-resolution access to both text corpora and transactional resources—and its reinforcement learning-based runtime adaptivity in managing coordination under contention.

Conventional retrieval systems, both lexical (BM25) and semantic (embedding-based), present a narrow, top-$k$ list interface for downstream reasoning. This rigid API constrains agentic workflows, particularly those requiring operations such as exact lexical matching, conjunction of sparse clues, or incremental hypothesis testing. Once a retriever discards evidence during the initial ranking, downstream reasoning is irretrievably limited (Li et al., 3 May 2026).

The DCI paradigm reframes retrieval as an interface design problem. Instead of restricting agent access to a pre-filtered set of passages, DCI exposes the entire raw corpus via a set of general-purpose terminal tools, allowing the agent to iteratively compose fine-grained queries (e.g., chaining grep, context inspection with head/tail, regular expression filters, and lightweight scripting). This direct access to the corpus at arbitrary resolution replaces embedding models, vector indices, and retrieval APIs, supporting both ad hoc string-matching and structured logic-driven search.
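To make the idea of composing fine-grained queries concrete, the following is a minimal Python sketch of three grep-like primitives over an in-memory corpus. It is only an illustration of the pattern described above, not the framework's tooling: the function names (`grep`, `docs_matching_all`, `context`) and the toy corpus are hypothetical, and a real DCI agent would issue actual shell commands against files.

```python
import re
from typing import Dict, List, Tuple

Corpus = Dict[str, List[str]]  # hypothetical layout: doc id -> list of lines


def grep(corpus: Corpus, pattern: str) -> List[Tuple[str, int]]:
    """Return (doc_id, line_index) for every line matching the regex."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [
        (doc, i)
        for doc, lines in corpus.items()
        for i, line in enumerate(lines)
        if rx.search(line)
    ]


def docs_matching_all(corpus: Corpus, patterns: List[str]) -> List[str]:
    """Conjunction of sparse clues: keep documents that match every pattern."""
    hits = [{doc for doc, _ in grep(corpus, p)} for p in patterns]
    return sorted(set.intersection(*hits)) if hits else []


def context(corpus: Corpus, doc: str, line: int, radius: int = 1) -> List[str]:
    """Return the lines surrounding a hit, similar in spirit to `grep -C`."""
    lines = corpus[doc]
    return lines[max(0, line - radius): line + radius + 1]


# Toy corpus (made up for illustration).
corpus = {
    "a.txt": ["The bridge opened in 1932.", "It spans the harbour.", "Steel arch design."],
    "b.txt": ["The tunnel opened in 1932.", "It runs under the river."],
}

# Step 1: narrow to documents satisfying two independent clues.
candidates = docs_matching_all(corpus, [r"opened in 1932", r"harbour"])
# Step 2: locate one clue inside the surviving documents.
first_hit = grep({d: corpus[d] for d in candidates}, r"harbour")[0]
# Step 3: inspect the surrounding lines.
snippet = context(corpus, *first_hit)
```

Each step consumes the output of the previous one, which is the property that a fixed top-$k$ interface does not offer.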

In DCI-Agent-CC, the principal agentic scaffold is "Claude Code" (CC), a command-line interface harness controlled by an LLM (e.g., Claude Sonnet 4.6). The agent receives the query and toolset specifications, incrementally generates shell commands, observes real corpus outputs, and revises its plan in a multi-turn loop until a termination condition is met.

2. Formal Models and Algorithms

Formally, DCI-agentic retrieval is driven by an autoregressive policy:

$$P(c_t \mid c_{<t}, q) = \mathrm{LLM}\left(\text{``next tool call''} \mid q, (c_1,O_1), \dots, (c_{t-1},O_{t-1})\right)$$

where $c_t$ is the shell command proposed at turn $t$ (e.g., a grep invocation), $O_t$ is its execution result, and $H = \{(c_i, O_i)\}$ is the cumulative search history.

The canonical search loop starts from an empty history $H$ and then, at each turn $t$, samples a command $c_t$ from the policy, executes it against the corpus to obtain $O_t$, and appends $(c_t, O_t)$ to $H$, repeating until a termination condition is met.
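The multi-turn loop described in this section can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: `search_loop`, `scripted_policy`, and `fake_execute` are hypothetical names, the policy is a scripted stand-in for the LLM, and the executor returns a placeholder string instead of running a real shell command.

```python
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]  # [(c_1, O_1), ..., (c_t, O_t)]


def search_loop(
    query: str,
    policy: Callable[[str, History], Optional[str]],
    execute: Callable[[str], str],
    max_turns: int = 8,
) -> History:
    """Run the propose / execute / observe loop until the policy stops."""
    history: History = []
    for _ in range(max_turns):
        # The policy sees the query and the full history, as in P(c_t | c_<t, q).
        command = policy(query, history)
        if command is None:  # assumed termination signal
            break
        observation = execute(command)
        history.append((command, observation))
    return history


def scripted_policy(query: str, history: History) -> Optional[str]:
    """Stand-in for the LLM: replays a fixed two-step plan, then stops."""
    plan = [f"grep -ril '{query}' corpus/", "head -n 5 corpus/a.txt"]
    return plan[len(history)] if len(history) < len(plan) else None


def fake_execute(command: str) -> str:
    """Stand-in for a sandboxed shell; returns a placeholder observation."""
    return f"<output of: {command}>"


trace = search_loop("harbour", scripted_policy, fake_execute)
```

Passing the policy and the executor as callables keeps the loop itself independent of any particular model or sandbox, and the `max_turns` cap is an assumed safeguard against runaway trajectories.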

Retrieval-interface resolution is quantified by several metrics (see Table 1), including document coverage and span-level localization:

$$\text{coverage}_\mathrm{any}(q,T) = \mathbf{1}\left[\,|M(q,T)| \ge 1\,\right]$$

$$\text{localization}(q,T) = \frac{1}{|M(q,T)|} \sum_{d^* \in M(q,T)} s(d^*,T)$$

with $M(q,T)$ the set of gold documents surfaced, and $s(d^*,T)$ the best segment score for $d^*$.
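As a worked illustration of the coverage and localization definitions, the Python sketch below computes both for one query. It rests on stated assumptions: the gold set and the surfaced set are given as sets of document ids, the best segment scores are supplied in a dictionary, and localization is taken to be 0 when no gold document is surfaced (the formula itself is undefined in that case). All names and numbers are hypothetical.

```python
from typing import Dict, Set


def coverage_any(gold: Set[str], surfaced: Set[str]) -> int:
    """Indicator 1[|M| >= 1], where M is the set of gold documents surfaced."""
    return int(len(gold & surfaced) >= 1)


def localization(
    gold: Set[str], surfaced: Set[str], best_segment_score: Dict[str, float]
) -> float:
    """Mean best-segment score over the gold documents that were surfaced."""
    matched = gold & surfaced
    if not matched:
        return 0.0  # assumption: the source formula is undefined when M is empty
    return sum(best_segment_score.get(d, 0.0) for d in matched) / len(matched)


gold = {"d1", "d2"}
surfaced = {"d1", "d2", "d3"}
scores = {"d1": 1.0, "d2": 0.5}  # made-up best segment scores

cov = coverage_any(gold, surfaced)
loc = localization(gold, surfaced, scores)
```

Two trajectories can have identical coverage and still differ in localization, which is why the two quantities are reported separately.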

3. Adaptive Concurrency Control for Agentic Transactions

When agents move beyond retrieval to orchestrating complex read/write workflows over structured databases, classical CC paradigms falter. Agentic transactions are characterized by long durations (tens of seconds, due to LLM reasoning), irregular intervals (bursty SQL interleaved with cognitive pauses), and non-deterministic, evolving access patterns—contradicting assumptions underpinning both optimistic (OCC) and pessimistic (PCC) control (Zhou et al., 14 Mar 2026).

ATCC introduces adaptive concurrency control tailored to these agentic properties:

  • Per-transaction phase-aware monitoring: Each transaction maintains metadata reflecting time intervals, read/write set changes, contention signals, and abort/retry history, composing a low-dimensional state vector (phase in {Explore, Refine, Commit}, contention).
  • RL-based mode switching: The CC policy is optimized as a Markov decision process, balancing the immediate “blocking cost” of pessimistic locks against the “abort cost” (including wasted LLM tokens and computation) of optimism. The reward function is defined over these two costs.

  • Cost-aware priority scheduling: Each transaction receives a dynamically computed priority reflecting accumulated SQL and LLM cost, blocking time, retry count, and reasoning interval. Resource access is coordinated via a Wound-Wait regime, ensuring that high-cost, latency-sensitive agentic transactions make progress.
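The cost-aware priority and Wound-Wait ideas in the list above can be sketched in Python as follows. This is a minimal illustration under explicit assumptions, not the ATCC implementation: the weighted-sum form of the priority, the weight values, the `Txn` fields, and the function names are all hypothetical, and the source's "reasoning interval" signal is omitted for brevity. Classical Wound-Wait orders transactions by timestamp; here the computed priority is substituted for that ordering.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    sql_cost: float   # accumulated SQL cost (arbitrary units)
    llm_tokens: int   # LLM tokens spent so far
    blocked_s: float  # seconds spent blocked
    retries: int      # number of aborts and retries so far


# Hypothetical weights; the source does not specify the priority formula.
WEIGHTS = {"sql_cost": 1.0, "llm_tokens": 0.001, "blocked_s": 0.5, "retries": 2.0}


def priority(t: Txn) -> float:
    """Assumed weighted sum: more sunk cost means higher priority."""
    return (
        WEIGHTS["sql_cost"] * t.sql_cost
        + WEIGHTS["llm_tokens"] * t.llm_tokens
        + WEIGHTS["blocked_s"] * t.blocked_s
        + WEIGHTS["retries"] * t.retries
    )


def on_conflict(requester: Txn, holder: Txn) -> str:
    """Wound-Wait rule: a higher-priority requester wounds (aborts) the
    lock holder; any other requester waits for the holder to finish."""
    return "wound_holder" if priority(requester) > priority(holder) else "wait"


expensive = Txn("expensive", sql_cost=4.0, llm_tokens=20000, blocked_s=3.0, retries=1)
cheap = Txn("cheap", sql_cost=1.0, llm_tokens=500, blocked_s=0.0, retries=0)

decision_a = on_conflict(expensive, cheap)  # expensive wants a lock that cheap holds
decision_b = on_conflict(cheap, expensive)  # cheap wants a lock that expensive holds
```

Under this rule the transaction with more sunk cost is never the one aborted in a two-party conflict, which matches the stated goal of protecting high-cost agentic transactions. Because priorities in this sketch change over time, the deadlock-freedom argument of timestamp-based Wound-Wait does not carry over automatically.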

4. Empirical Performance Across Retrieval and Transactional Workloads

DCI-Agent-CC demonstrates substantial gains in both retrieval and transactional benchmarks. On IR and agentic QA tasks (BrowseComp-Plus, multi-hop question answering, BEIR/BRIGHT ranking), DCI-Agent-CC (Claude Sonnet 4.6) achieves state-of-the-art accuracy and ranking metrics, for instance:

Agent NQ (%) Trivia (%) Hotpot (%) Avg. QA (%) IR Avg. NDCG@10
ASearcher-Local-14B 52.3 52.3
DCI-Agent-Lite 68.0 75.0 81.0 70.0 56.7
DCI-Agent-CC 83.0 85.0 94.0 83.0 68.5

On BrowseComp-Plus, DCI-Agent-CC attains 80% accuracy at 29% lower cost than a retriever-plus-LLM baseline (Qwen3-Embedding-8B).

In the transactional domain, ATCC achieves up to 18× higher throughput and 90% lower tail latency than state-of-the-art schemes on YCSB and TPC-C agentic workloads, with further gains in token efficiency and fairness under high contention (Zhou et al., 14 Mar 2026).

5. Analysis of Capabilities, Interface Resolution, and Limitations

The effectiveness of DCI-Agent-CC arises primarily from:

  • Higher localization: Not by surfacing more gold documents, but by isolating fine-grained evidence snippets (e.g., via compositional grep pipelines), which classical retrievers and re-rankers often obscure or omit.
  • Flexible, hypothesis-driven search: The agent can compose and revise queries stepwise, mimicking human researchers’ investigative workflows—enforcing lexical constraints, federating weak clues, and verifying partial hypotheses inline.
  • Persistent evidence access: Because no evidence is ever irretrievably discarded, post-retrieval reasoning is not bottlenecked by prior retrieval errors.

Nevertheless, the DCI-Agent-CC design presents several bottlenecks:

  • Scalability: Expanding the corpus from 100k to 400k documents in BrowseComp-Plus led to a decline in accuracy and a sharp increase in shell command invocations; broad shell-based search is costly for very large corpora.
  • Context-window management: Long search trajectories place significant pressure on the LLM context window, requiring sophisticated compaction/truncation heuristics with non-monotonic effects.
  • Real-time overhead: The cost (in latency and resources) of repeated terminal calls is higher than index lookups.
  • Transactional complexity: Collaboration among multiple agents or sub-plans introduces new challenges in maintaining consistent, deadlock-free coordination.
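To illustrate the context-window pressure noted in the list above, here is a minimal Python sketch of one possible compaction heuristic. It is an assumption-laden example, not the heuristic used by DCI-Agent-CC: the function name `compact`, the character-based budget, and the stub text are hypothetical. It keeps every command, retains outputs greedily from newest to oldest while they fit in the budget, and replaces the rest with a stub.

```python
from typing import List, Tuple

History = List[Tuple[str, str]]  # (command, output) pairs, oldest first


def compact(history: History, budget_chars: int, stub: str = "[output elided]") -> History:
    """Keep all commands; keep outputs newest-first while they fit the budget.

    Outputs that do not fit in the remaining budget are replaced by `stub`.
    """
    remaining = budget_chars
    compacted: History = []
    for command, output in reversed(history):
        if len(output) <= remaining:
            remaining -= len(output)
            compacted.append((command, output))
        else:
            compacted.append((command, stub))
    compacted.reverse()  # restore chronological order
    return compacted


history = [
    ("grep -rn 1932 corpus/", "x" * 50),
    ("grep -rn harbour corpus/", "y" * 30),
    ("head -n 5 corpus/a.txt", "z" * 20),
]
short = compact(history, budget_chars=60)
```

Eliding an old output can remove evidence that a later step would have needed, which is one way such heuristics can help on some queries and hurt on others.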

6. Extensions, Future Research, and Open Challenges

Several directions are identified for further extending DCI-Agent-CC:

  • Hybrid index integration: Employ lightweight inverted indices for coarse filtering, followed by high-resolution DCI for detailed exploration.
  • Learned command planners: Fine-tune LLMs or develop specialized planners to generate more selective, efficient shell commands and exploit richer file patterns.
  • Dynamic context management: Use adaptive summarization and truncation policies, possibly guided by meta-learning, to optimize context retention.
  • Expanded tool ecosystems: Incorporate structured-data processors (e.g., jq for JSON), PDF search tools, and richer scripting support.
  • Distributed, multi-agent coordination: In distributed or sharded settings, extend CC mechanisms to synchronize lock scheduling, coordinate across transaction graphs, and adaptively balance latency and throughput.
  • Online learning: Continuously retrain or fine-tune the RL-based policy to accommodate prompt drift, novel agent patterns, and emerging domains.
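The hybrid index direction listed above can be sketched in Python as a two-stage search: a lightweight inverted index narrows the candidate set, and a regular-expression scan then inspects only the survivors. This is an illustrative sketch under assumptions, not a proposed design from the source: the tokenizer, the function names, and the toy corpus are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def build_index(corpus: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercased word token to the set of documents containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in corpus.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(doc_id)
    return index


def coarse_filter(index: Dict[str, Set[str]], terms: List[str]) -> Set[str]:
    """Stage 1: documents containing all terms, via posting-list intersection."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()


def fine_scan(corpus: Dict[str, str], doc_ids: Set[str], pattern: str) -> Dict[str, List[str]]:
    """Stage 2: regex scan restricted to the candidate documents."""
    rx = re.compile(pattern)
    hits: Dict[str, List[str]] = {}
    for doc_id in sorted(doc_ids):
        matches = rx.findall(corpus[doc_id])
        if matches:
            hits[doc_id] = matches
    return hits


# Toy corpus (made up for illustration).
corpus = {
    "a": "Opened in 1932, the bridge spans the harbour.",
    "b": "The bridge was repainted in 1988.",
    "c": "A tunnel opened in 1932.",
}
index = build_index(corpus)
candidates = coarse_filter(index, ["bridge", "opened"])
hits = fine_scan(corpus, candidates, r"\b19\d{2}\b")
```

The index lookup costs time proportional to the posting lists rather than the corpus size, so the expensive pattern scan runs only over the few documents that pass the coarse filter.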

A plausible implication is that as LLM-based agents become more adept at reasoning over both text and structured data, the design of retrieval and transactional interfaces—down to their resolution and access modality—will dictate attainable performance across real-world research and data-intensive applications. DCI-Agent-CC offers a concrete instantiation of these principles across both raw corpus search and agentic transaction workloads, establishing a foundation for further developments integrating interface flexibility, adaptivity, and collaborative orchestration (Li et al., 3 May 2026, Zhou et al., 14 Mar 2026).
