LOCARD: An Agentic Framework for Blockchain Forensics

Published 5 Apr 2026 in cs.CR and cs.AI | (2604.04211v1)

Abstract: Blockchain forensics inherently involves dynamic and iterative investigations, while many existing approaches primarily model it through static inference pipelines. We propose a paradigm shift towards Agentic Blockchain Forensics (ABF), modeling forensic investigation as a sequential decision-making process. To instantiate this paradigm, we introduce LOCARD, the first agentic framework for blockchain forensics. LOCARD operationalizes this perspective through a Tri-Core Cognitive Architecture that decouples strategic planning, operational execution, and evaluative validation. Unlike generic LLM-based agents, it incorporates a Structured Belief State mechanism to enforce forensic rigor and guide exploration under explicit state constraints. To demonstrate the efficacy of the ABF paradigm, we apply LOCARD to the inherently complex domain of cross-chain transaction tracing. We introduce Thor25, a benchmark dataset comprising over 151k real-world cross-chain forensic records, and evaluate LOCARD on the Group-Transfer Tracing task for dismantling Sybil clusters. Validated against representative laundering sub-flows from the Bybit hack, LOCARD achieves high-fidelity tracing results, providing empirical evidence that modeling blockchain forensics as an autonomous agentic task is both viable and effective. These results establish a concrete foundation for future agentic approaches to large-scale blockchain forensic analysis. Code and dataset are publicly available at https://github.com/xhyumiracle/locard and https://github.com/xhyumiracle/thorchain-crosschain-data.


Authors (2)

Summary

The paper presents an agentic forensic framework that reconceptualizes investigations as evidence-driven sequential decision processes.
The paper formalizes single-transfer and group-transfer tracing by integrating heuristics such as temporal order, value proximity, and economic consistency.
The paper shows that LOCARD's tri-core architecture and structured belief state achieve high recall and cost-effective cross-chain forensic analysis.


Motivation and Limitations of Static Blockchain Forensics

Contemporary blockchain forensics predominantly relies on static inference pipelines that conflate investigation with deterministic graph traversals or heuristic rules. While effective for singular, intra-chain tracing, these approaches have severe limitations when faced with the escalating complexity of cross-chain fund flows that exploit chain hopping, decentralized bridges, and Sybil fragmentation. Adversarial actors adaptively manipulate transaction flows, obfuscating provenance and ownership, thereby rendering static analysis brittle and insufficient.

To address these gaps, the agentic paradigm reconceptualizes forensic investigation as an evidence-driven, sequential decision process. This perspective enables iterative hypothesis formation, adaptive evidence gathering, strategic backtracking, and explicit management of investigative uncertainty—a marked departure from fixed-scripted logic.

Figure 1: LOCARD system overview, illustrating its agentic tri-core architecture for blockchain investigative automation.

Formalization of Cross-Chain Forensic Tasks

The paper formalizes two primary cross-chain tracing tasks central to forensic workflows: single-transfer and group-transfer tracing. Each is defined over a set of atomic on-chain transfers $T$, cross-chain bridges $B$, and assets $A$ distributed across a support set of blockchains $C$.

Single-Transfer Tracing: Given a target destination transfer $\tau^\ast$, the objective is to infer all plausible ground-truth source-side transfers and links that produced $\tau^\ast$ via bridge $b \in B$. The heuristic search restricts candidates by temporal order, value proximity (accounting for exchange rates and bridge-induced delays), and rigorous causality/economic consistency checks.
Group-Transfer Tracing: For a set $\mathcal{Q}$ of transfers, typically reflecting a Sybil cluster or coordinated laundering group, group-transfer tracing reconstructs shared ancestry (i.e., common fund sources) through co-occurrence voting and consolidation of upstream transfer graphs, enabling high-fidelity entity disambiguation and linkage discovery across chains.

Figure 2: Visualization of a cross-chain transaction flow across a bridge, including on-chain provenance and destination mapping.
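
The candidate restriction described for single-transfer tracing can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Transfer` fields, the `rate` argument (target-asset units per source-asset unit), the one-hour delay window, and the 5% tolerance are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    tx_id: str
    chain: str
    asset: str
    amount: float    # in units of `asset`
    timestamp: int   # Unix seconds


def candidate_sources(target, pool, rate, max_delay_s=3600, tolerance=0.05):
    """Return source-side transfers that could plausibly have produced `target`.

    rate: assumed number of target-asset units received per source-asset unit.
    max_delay_s and tolerance are illustrative defaults, not values from the paper.
    """
    kept = []
    for src in pool:
        delay = target.timestamp - src.timestamp
        # Temporal order: the source must precede the destination,
        # and the bridge delay must stay inside a bounded window.
        if not (0 < delay <= max_delay_s):
            continue
        expected = src.amount * rate
        # Value proximity: the converted source amount must be close
        # to the amount observed on the destination chain.
        if abs(expected - target.amount) > tolerance * expected:
            continue
        kept.append(src)
    return kept


target = Transfer("dst", "BTC", "BTC", 0.49, 10_000)
pool = [
    Transfer("a", "ETH", "ETH", 10.0, 9_400),   # plausible
    Transfer("b", "ETH", "ETH", 10.0, 10_200),  # happens after the target
    Transfer("c", "ETH", "ETH", 20.0, 9_500),   # converted value too large
    Transfer("d", "ETH", "ETH", 10.0, 4_000),   # too old
]
matches = candidate_sources(target, pool, rate=0.05)
```

With these toy inputs only transfer `a` survives both filters. A real pipeline would take the exchange rate from a price source at the time of the swap rather than from a constant.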

This abstraction lays the groundwork for Agentic Blockchain Forensics (ABF), wherein the investigation operates as a tuple $\langle \mathcal{S},\mathcal{A},\mathcal{O},\Phi \rangle$ capturing state, actions, observations, and belief updates. The practical effect is an investigation loop—driven by agentic logic—that refines its hypotheses dynamically with each cycle of evidence gathering and evaluation.

The LOCARD Framework: Tri-Core Cognitive Architecture

LOCARD is instantiated as an agentic, multi-agent system, structurally decomposed into a tri-core architecture:

Strategic Core: Governs high-level strategic planning through explicit belief state maintenance, incorporating domain expertise to adaptively orchestrate the investigation.
Operational Core: Executes procedural actions—traversing on-chain data via RPCs, explorers, oracles—by invoking concrete tools in response to strategic directives.
Evaluative Core: Validates findings at each step, performing both structural and heuristic plausibility checks before admitting new evidence to the belief state.

Figure 3: Architectural schematic of the LOCARD Tri-Core design separating strategic reasoning, operational execution, and evaluative validation.

Crucially, these cores interact within a Perception–Reasoning–Action (PRA) loop, iteratively refining the investigation context and automatically adapting the search process as new evidence or contradictions emerge. This loose coupling ensures extensibility and supports advanced task decomposition, hierarchical control, and reflective state tracking.
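
As a rough illustration of how three decoupled cores might interact in such a loop, the Python sketch below passes each core in as a plain function. The function names, the step names, and the canned observations are hypothetical. In LOCARD the cores are LLM-based agents, which this toy does not model.

```python
def run_investigation(plan_next, execute, validate, max_steps=10):
    """Toy planning/execution/validation loop with three decoupled roles.

    plan_next(evidence, rejected) -> next action, or None to stop  (strategic)
    execute(action)               -> raw observation               (operational)
    validate(action, observation) -> bool                          (evaluative)
    """
    evidence = []   # findings admitted after validation
    rejected = []   # findings that failed validation
    for _ in range(max_steps):
        action = plan_next(evidence, rejected)
        if action is None:
            break
        observation = execute(action)
        if validate(action, observation):
            evidence.append((action, observation))
        else:
            rejected.append((action, observation))
    return evidence, rejected


# Hypothetical stand-ins for the three cores.
STEPS = ("fetch_target_transfer", "fetch_price", "match_candidates")


def plan_next(evidence, rejected):
    attempted = {a for a, _ in evidence} | {a for a, _ in rejected}
    for step in STEPS:
        if step not in attempted:
            return step
    return None


def execute(action):
    canned = {
        "fetch_target_transfer": {"tx": "0xabc"},
        "fetch_price": None,              # simulate a failed lookup
        "match_candidates": ["src1"],
    }
    return canned[action]


def validate(action, observation):
    return observation is not None


evidence, rejected = run_investigation(plan_next, execute, validate)
```

Because the planner only sees validated and rejected findings, either of the other two roles can be swapped out without touching the loop, which is the kind of loose coupling the architecture aims for.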

Workflow Instantiation for Tracing Tasks

The agentic workflow is realized via specialized agent roles:

Orchestrator (Strategic Core): Decomposes investigative goals, synchronizes feedback and state transitions, and ensures procedural compliance.
Worker (Operational Core): Invokes atomic evidence collection through on-chain artifact lookups, price retrievals, and graph traversals.
Critic (Evaluative Core): Applies formal and domain-specific checks (e.g., temporal causality, economic consistency), filtering and scoring valid candidates.

For group-transfer tracing, batch tasking and address-level ancestry aggregation are layered atop the single-transfer workflow. The orchestration automates concurrent exploration of divergent transaction leaves, followed by analytical intersection to expose shared fund sources and group-level entities.
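
The intersection step can be sketched as co-occurrence voting over upstream addresses. The example below is a simplified illustration under stated assumptions: the `parents` mapping (address to the addresses that funded it) is assumed to have been built by earlier single-transfer traces, and all address names are made up.

```python
from collections import Counter


def ancestors(start, parents):
    """All upstream addresses reachable from `start`.

    parents: dict mapping an address to the addresses that funded it.
    """
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def shared_sources(endpoints, parents, min_votes=None):
    """Addresses that appear upstream of at least `min_votes` endpoints.

    By default an address must be an ancestor of every endpoint.
    """
    votes = Counter()
    for endpoint in endpoints:
        # Each endpoint casts one vote per distinct ancestor.
        votes.update(ancestors(endpoint, parents))
    need = len(endpoints) if min_votes is None else min_votes
    return sorted(addr for addr, count in votes.items() if count >= need)


# Hypothetical upstream graph: three BTC endpoints funded via two bridge hops.
parents = {
    "btc1": ["bridge_a"],
    "btc2": ["bridge_b"],
    "btc3": ["bridge_b"],
    "bridge_a": ["eth_root"],
    "bridge_b": ["eth_mid"],
    "eth_mid": ["eth_root"],
}
common = shared_sources(["btc1", "btc2", "btc3"], parents)
partial = shared_sources(["btc1", "btc2", "btc3"], parents, min_votes=2)
```

Here `common` contains only `eth_root`, the one address upstream of all three endpoints. Lowering `min_votes` also surfaces intermediate addresses shared by a subset of the group.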

Figure 4: Workflow instantiation for both single- and group-transfer cross-chain tracing using the agentic multi-core workflow.

Structured Belief State and State-Aware Reflection

A key innovation is LOCARD's Structured Belief State $B_t \in \{0,1\}^N$, a binary vector that records which of $N$ forensic milestones, as constrained by the standard operating procedure (SOP), have been completed in the current episode. By grounding LLM-based agents with explicit state vectors and constraining the action space at each iteration, the system suppresses common reasoning pathologies (e.g., hallucination, early step termination, redundant queries) typical of autonomous agents in high-ambiguity settings.

Belief state transitions are managed via feedback-triggered State-Aware Reflection: only upon receiving validated evidence does the system advance the corresponding state, enforcing rigor and reproducibility absent in ungrounded LLM approaches.
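
A binary belief vector with evidence-gated transitions can be sketched in a few lines of Python. The milestone names and the strictly sequential ordering are assumptions made for this example. The summary does not list the actual milestones or say how LOCARD orders them.

```python
# Hypothetical milestones; the real SOP steps are not given in this summary.
MILESTONES = (
    "target_resolved",
    "candidates_collected",
    "prices_fetched",
    "candidates_scored",
)


class BeliefState:
    """Binary vector B_t in {0,1}^N with one bit per milestone."""

    def __init__(self, milestones):
        self.milestones = tuple(milestones)
        self.bits = [0] * len(self.milestones)

    def allowed_actions(self):
        # Constrain the action space: in this sketch only the first
        # incomplete milestone may be attempted.
        for name, bit in zip(self.milestones, self.bits):
            if bit == 0:
                return [name]
        return []

    def reflect(self, milestone, validated):
        # Advance a bit only when the evidence was validated and the
        # milestone is currently permitted. Otherwise leave the state as is.
        if validated and milestone in self.allowed_actions():
            self.bits[self.milestones.index(milestone)] = 1
            return True
        return False


state = BeliefState(MILESTONES)
unvalidated = state.reflect("target_resolved", validated=False)   # ignored
out_of_order = state.reflect("candidates_scored", validated=True)  # ignored
advanced = state.reflect("target_resolved", validated=True)        # accepted
```

After these three calls only the first bit is set, and the next permitted action is `candidates_collected`. Unvalidated or out-of-order updates leave the vector unchanged, which mirrors the feedback-triggered transitions described above.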

Heuristic Evaluation and Scoring

Candidate linkage evaluation proceeds via:

Temporal decay filtering (exponential penalty on high-latency links)
Value-range verification (amount feasibility given asset exchange rates and bridge noise)
Aggregated confidence scoring (weighted prioritization, with weights that emphasize temporal order)

The resulting ranked shortlist provides high-probability candidates for downstream analyst examination, reducing false positives without deterministic over-commitment.
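
The three scoring steps can be combined as in the Python sketch below. Every constant here is an assumption: the 600-second half-life, the 5% value tolerance, and the 0.7 weight on the temporal term are placeholders, and the linear value score is one simple choice among many.

```python
import math


def temporal_score(delay_s, half_life_s=600.0):
    """Exponential decay in the source-to-destination delay; 0 if not causal."""
    if delay_s <= 0:
        return 0.0
    return math.exp(-math.log(2) * delay_s / half_life_s)


def value_score(expected, observed, tolerance=0.05):
    """1.0 for an exact match, falling linearly to 0 at the tolerance edge."""
    if expected <= 0:
        return 0.0
    gap = abs(observed - expected) / expected
    return max(0.0, 1.0 - gap / tolerance)


def confidence(delay_s, expected, observed, w_time=0.7):
    """Weighted sum of both scores; a zero in either acts as a hard filter."""
    t = temporal_score(delay_s)
    v = value_score(expected, observed)
    if t == 0.0 or v == 0.0:
        return 0.0
    return w_time * t + (1.0 - w_time) * v


def rank(candidates, top_k=50):
    """candidates: iterable of (name, delay_s, expected_amount, observed_amount)."""
    scored = [(confidence(d, e, o), name) for name, d, e, o in candidates]
    scored.sort(reverse=True)
    return [name for score, name in scored if score > 0][:top_k]


shortlist = rank([
    ("fast", 60, 1.0, 1.0),
    ("slow", 3000, 1.0, 1.0),
    ("wrong_value", 60, 1.0, 2.0),
    ("future", -10, 1.0, 1.0),
])
```

In this toy run `fast` ranks ahead of `slow`, while `wrong_value` and `future` are dropped because one of their component scores is zero.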

The Thor25 Cross-Chain Forensics Dataset

Empirical validation leverages Thor25, a rigorously curated benchmark built from 151,000+ real-world, native layer-1 swap records across BTC, ETH, DOGE, and LTC. Three progressive tiers—raw, high-value/fast (Thor25HF), and a mini-benchmark for efficient agentic evaluation—are provided, each with rich ground-truth structure. Thor25 uniquely incorporates annotated flows from the March 2025 Bybit exploit, offering rare labeled data for sophisticated multi-entity laundering patterns and Sybil group tracing.

Experimental Results

Single-Transfer Tracing

LOCARD's baseline performance closely tracks deterministic heuristics, achieving recall rates $B$ 1 across all asset-pair directions and perfect recall for select pairs. The system typically produces Hit@50 scores exceeding $B$ 2, attesting to effective search space narrowing even with intentionally simple scoring. Operationally, traces cost $B$ 3 USD and require 1–2 minutes per trace, primarily due to LLM inference. This efficiency is competitive for high-value forensic tasks.
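
For readers unfamiliar with the metrics, the sketch below computes recall and Hit@k under their common definitions. Whether the paper uses exactly these definitions is an assumption, and the ranked lists and ground-truth sets are invented for illustration.

```python
def hit_at_k(ranked, truth, k):
    """True if any ground-truth item appears among the top-k ranked candidates."""
    return any(candidate in truth for candidate in ranked[:k])


def recall(ranked, truth):
    """Fraction of ground-truth items that appear anywhere in the candidate list."""
    truth = set(truth)
    if not truth:
        return 0.0
    return len(set(ranked) & truth) / len(truth)


def mean_hit_at_k(cases, k):
    """Average Hit@k over (ranked_candidates, ground_truth) pairs."""
    return sum(hit_at_k(ranked, truth, k) for ranked, truth in cases) / len(cases)


# Two invented traces: the first finds its source at rank 3, the second misses.
cases = [
    (["a", "b", "c"], {"c"}),
    (["x", "y"], {"z"}),
]
hit2 = mean_hit_at_k(cases, k=2)
hit3 = mean_hit_at_k(cases, k=3)
```

With these inputs the mean Hit@2 is 0.0 and the mean Hit@3 is 0.5, since only the first trace contains its true source and only at rank 3.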

Group-Transfer Tracing: Bybit Case Study

When challenged to attribute five disparate Bitcoin laundering endpoints from the Bybit exploit to their origin without prior path knowledge, LOCARD accurately converged on the correct Ethereum root address, leveraging dynamic graph intersection and co-occurrence analysis via its group tracing agents. This demonstrates robust, explainable entity-level forensics in live adversarial scenarios.

Figure 5: LOCARD reconstructs an obfuscated laundering subflow from the Bybit hack, uncovering the true common ancestor behind group-dispersed illicit flows.

Discussion and Implications

The core contribution is establishing that agentic, belief-guided workflows can replicate and, crucially, generalize forensic heuristics to domains where static pipelines fail—especially as actors exploit multi-chain, high-entropy, and adversarial techniques. The modular tri-core architecture enables extensibility to new evidence classes, chain environments, and investigative objectives, while structured belief states enforce procedural rigor largely absent in prior LLM-centric agents.

Future directions include federated agent ensembles specializing in diverse ledgers and protocols, adaptive robustness to noise and dusting attacks, and compositional reasoning for regulatory compliance or fraud detection at web scale. Natural next steps include more advanced scoring, OODA-inspired multi-agent orchestration, and explainable AI for analyst-in-the-loop audits.

Conclusion

LOCARD formalizes and operationalizes Agentic Blockchain Forensics, setting a foundation for modeling forensic reasoning as an adaptive, evidence-driven agentic process. Empirical results on the Thor25 benchmark and real attack flows provide strong evidence for the viability of rigorous agentic forensic systems in addressing emergent challenges of adversarial cross-chain activity.

Reference: "LOCARD: An Agentic Framework for Blockchain Forensics" (2604.04211).
