- The paper presents an agentic forensic framework that reconceptualizes investigations as evidence-driven sequential decision processes.
- The paper formalizes single-transfer and group-transfer tracing by integrating heuristics such as temporal order, value proximity, and economic consistency.
- The paper demonstrates that LOCARD's tri-core architecture and structured belief state achieve high recall and cost-effective cross-chain forensic analysis.
LOCARD: An Agentic Framework for Blockchain Forensics
Motivation and Limitations of Static Blockchain Forensics
Contemporary blockchain forensics predominantly relies on static inference pipelines that conflate investigation with deterministic graph traversals or heuristic rules. While effective for singular, intra-chain tracing, these approaches have severe limitations when faced with the escalating complexity of cross-chain fund flows that exploit chain hopping, decentralized bridges, and Sybil fragmentation. Adversarial actors adaptively manipulate transaction flows, obfuscating provenance and ownership, thereby rendering static analysis brittle and insufficient.
To address these gaps, the agentic paradigm reconceptualizes forensic investigation as an evidence-driven, sequential decision process. This perspective enables iterative hypothesis formation, adaptive evidence gathering, strategic backtracking, and explicit management of investigative uncertainty—a marked departure from fixed-scripted logic.
Figure 1: LOCARD system overview, illustrating its agentic tri-core architecture for blockchain investigative automation.
The paper formalizes two primary cross-chain tracing tasks central to forensic workflows: single-transfer and group-transfer tracing. Each is defined over a set of atomic on-chain transfers T, cross-chain bridges B, and assets A distributed across a support set of blockchains C.
This abstraction lays the groundwork for Agentic Blockchain Forensics (ABF), in which an investigation is modeled as a tuple ⟨S,A,O,Φ⟩ of states, actions, observations, and belief updates (within this tuple, A denotes actions rather than the asset set above). In practice, this yields an investigation loop that refines its hypotheses with each cycle of evidence gathering and evaluation.
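The loop implied by the tuple can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the `Investigation` container, the `run` function, and the toy three-hop chain are all hypothetical names introduced here, and the callables stand in for whatever the real agents do.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Investigation:
    """Hypothetical container for the <S, A, O, Phi> tuple."""
    state: dict                             # S: current investigative state
    actions: List[str]                      # A: actions available to the agent
    observe: Callable[[dict, str], dict]    # O: evidence returned by an action
    update: Callable[[dict, dict], dict]    # Phi: belief update from an observation


def run(inv: Investigation, choose, done, max_steps: int = 10):
    """Iterate act -> observe -> update until `done` holds or the budget ends."""
    history = []
    for _ in range(max_steps):
        if done(inv.state):
            break
        action = choose(inv.state, inv.actions)
        observation = inv.observe(inv.state, action)
        inv.state = inv.update(inv.state, observation)
        history.append((action, observation))
    return inv.state, history


# Toy usage: each "lookup" reveals one more hop until the source is reached.
chain = ["dst_addr", "bridge_tx", "src_addr"]
inv = Investigation(
    state={"frontier": 0},
    actions=["lookup"],
    observe=lambda s, a: {"found": chain[s["frontier"] + 1]},
    update=lambda s, o: {"frontier": s["frontier"] + 1, "last": o["found"]},
)
final_state, history = run(
    inv,
    choose=lambda s, acts: acts[0],
    done=lambda s: s.get("last") == "src_addr",
)
```

Keeping the observation and update functions separate mirrors the split between gathering evidence and revising belief, so either can be swapped without touching the loop.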
The LOCARD Framework: Tri-Core Cognitive Architecture
LOCARD is instantiated as an agentic, multi-agent system decomposed into a tri-core architecture: a Strategic Core, an Operational Core, and an Evaluative Core.
Crucially, these cores interact within a Perception–Reasoning–Action (PRA) loop, iteratively refining the investigation context and automatically adapting the search process as new evidence or contradictions emerge. This loose coupling ensures extensibility and supports advanced task decomposition, hierarchical control, and reflective state tracking.
Workflow Instantiation for Tracing Tasks
The agentic workflow is realized via specialized agent roles:
- Orchestrator (Strategic Core): Decomposes investigative goals, synchronizes feedback and state transitions, and ensures procedural compliance.
- Worker (Operational Core): Invokes atomic evidence collection through on-chain artifact lookups, price retrievals, and graph traversals.
- Critic (Evaluative Core): Applies formal and domain-specific checks (e.g., temporal causality, economic consistency), filtering and scoring valid candidates.
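The division of labor among the three roles can be sketched as follows. This is an assumption-laden toy, not LOCARD's code: the class and method names, the in-memory ledger, the widening time windows, and the 2% value tolerance are all hypothetical. It only illustrates how an orchestrator might retry with a wider search when the critic rejects every candidate.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Transfer:
    tx_id: str
    timestamp: int      # seconds since epoch
    usd_value: float


class Worker:
    """Operational core: fetches candidates (here from an in-memory list)."""
    def __init__(self, ledger: List[Transfer]):
        self.ledger = ledger

    def fetch_candidates(self, source: Transfer, window_s: int) -> List[Transfer]:
        return [t for t in self.ledger
                if abs(t.timestamp - source.timestamp) <= window_s]


class Critic:
    """Evaluative core: keeps candidates passing causality and value checks."""
    def __init__(self, value_tolerance: float):
        self.value_tolerance = value_tolerance

    def review(self, source: Transfer, candidates: List[Transfer]) -> List[Transfer]:
        accepted = []
        for c in candidates:
            causal = c.timestamp >= source.timestamp   # output cannot precede input
            rel_gap = abs(c.usd_value - source.usd_value) / source.usd_value
            if causal and rel_gap <= self.value_tolerance:
                accepted.append(c)
        return accepted


class Orchestrator:
    """Strategic core: sequences steps and widens the search on failure."""
    def __init__(self, worker: Worker, critic: Critic,
                 windows_s: Sequence[int] = (600, 3600)):
        self.worker, self.critic, self.windows_s = worker, critic, windows_s

    def trace(self, source: Transfer) -> List[Transfer]:
        for window in self.windows_s:
            candidates = self.worker.fetch_candidates(source, window)
            accepted = self.critic.review(source, candidates)
            if accepted:
                return accepted
        return []


deposit = Transfer("in_1", 1000, 5000.0)
ledger = [
    Transfer("out_a", 900, 5000.0),    # precedes the deposit: not causal
    Transfer("out_b", 2500, 4980.0),   # plausible, but outside the first window
    Transfer("out_c", 1200, 100.0),    # value far off
]
matches = Orchestrator(Worker(ledger), Critic(0.02)).trace(deposit)
```

In this toy run the first window yields no accepted candidate, so the orchestrator retries with the wider one and accepts `out_b`.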
For group-transfer tracing, batch tasking and address-level ancestry aggregation are layered atop the single-transfer workflow. The orchestration automates concurrent exploration of divergent transaction leaves, followed by analytical intersection to expose shared fund sources and group-level entities.
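The intersection step can be illustrated with a small backward graph walk. This is a sketch under simplifying assumptions: the `parents` map (address to funding addresses) is hypothetical and assumed to be already resolved across chains, and plain set intersection stands in for whatever aggregation and co-occurrence logic the system actually uses.

```python
from typing import Dict, Iterable, List, Set


def ancestors(parents: Dict[str, List[str]], start: str) -> Set[str]:
    """All addresses reachable by walking funding edges backwards from `start`."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def common_ancestors(parents: Dict[str, List[str]],
                     endpoints: Iterable[str]) -> Set[str]:
    """Addresses that appear in the ancestry of every endpoint."""
    sets = [ancestors(parents, e) for e in endpoints]
    return set.intersection(*sets) if sets else set()


# Hypothetical funding graph: three endpoints share one root via two hops.
parents = {
    "btc_1": ["hop_a"],
    "btc_2": ["hop_b"],
    "btc_3": ["hop_a"],
    "btc_4": ["unrelated"],
    "hop_a": ["eth_root"],
    "hop_b": ["eth_root"],
}
shared = common_ancestors(parents, ["btc_1", "btc_2", "btc_3"])
```

Here `shared` contains only `eth_root`; adding `btc_4` to the endpoints would empty the intersection, which is why a real system would likely prefer a softer co-occurrence count over a strict intersection.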
Figure 4: Workflow instantiation for both single- and group-transfer cross-chain tracing using the agentic multi-core workflow.
Structured Belief State and State-Aware Reflection
A key innovation is LOCARD's Structured Belief State B_t ∈ {0,1}^N, representing the episodic completion of SOP-constrained forensic milestones. By grounding LLM-based agents with explicit state vectors and constraining action space at each iteration, the system robustly suppresses common reasoning pathologies (e.g., hallucination, early step termination, redundant queries) typical of autonomous agents in high-ambiguity settings.
Belief state transitions are managed via feedback-triggered State-Aware Reflection: only upon receiving validated evidence does the system advance the corresponding state, enforcing rigor and reproducibility absent in ungrounded LLM approaches.
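A binary belief vector with gated transitions might look like the sketch below. The milestone names and the strictly sequential ordering are assumptions made for illustration; the source does not enumerate the milestones or state how the action space is constrained.

```python
from typing import List

# Hypothetical SOP milestones; the source does not list them.
MILESTONES = ("locate_deposit", "identify_bridge",
              "fetch_candidates", "rank_candidates")


class BeliefState:
    """Binary vector B_t in {0,1}^N, one bit per milestone."""

    def __init__(self) -> None:
        self.bits: List[int] = [0] * len(MILESTONES)

    def allowed_actions(self) -> List[str]:
        """Assumed constraint: only the first incomplete milestone is actionable."""
        for name, bit in zip(MILESTONES, self.bits):
            if bit == 0:
                return [name]
        return []

    def reflect(self, action: str, evidence_valid: bool) -> bool:
        """Set the bit for `action` only if it was allowed and validated."""
        if action not in self.allowed_actions() or not evidence_valid:
            return False
        self.bits[MILESTONES.index(action)] = 1
        return True


belief = BeliefState()
skipped = belief.reflect("rank_candidates", True)    # out of order: rejected
unproven = belief.reflect("locate_deposit", False)   # unvalidated: rejected
advanced = belief.reflect("locate_deposit", True)    # validated: bit is set
```

Because the state only moves on validated evidence, an agent that claims a step is done without proof leaves the vector unchanged, and premature termination shows up as unset bits.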
Heuristic Evaluation and Scoring
Candidate linkage evaluation proceeds via:
- Temporal decay filtering (exponential penalty on high-latency links)
- Value-range verification (amount feasibility given asset exchange rates and bridge noise)
- Aggregated confidence scoring (weighted prioritization, with weights that emphasize temporal order)
The resulting ranked shortlist provides high-probability candidates for downstream analyst examination, reducing false positives without deterministic over-commitment.
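The three steps above can be sketched as a small scoring function. All numeric choices here (the 600-second half-life, the 5% value tolerance, the 0.7/0.3 weights, and the linear value score) are illustrative assumptions, not values from the paper; the only property taken from the text is that the time weight exceeds the value weight.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    tx_id: str
    delay_s: float      # seconds between source transfer and candidate
    usd_value: float


def temporal_score(delay_s: float, half_life_s: float = 600.0) -> float:
    """Exponential decay in latency; negative delays violate temporal order."""
    if delay_s < 0:
        return 0.0
    return math.exp(-math.log(2) * delay_s / half_life_s)


def value_score(src_usd: float, cand_usd: float, tolerance: float = 0.05) -> float:
    """1.0 at an exact match, falling linearly to 0.0 at the tolerance edge."""
    rel_gap = abs(cand_usd - src_usd) / src_usd
    return max(0.0, 1.0 - rel_gap / tolerance)


def rank(src_usd: float, candidates: List[Candidate],
         w_time: float = 0.7, w_value: float = 0.3,
         k: int = 50) -> List[Tuple[float, str]]:
    """Filter infeasible candidates, then return the top-k by weighted score."""
    scored = []
    for c in candidates:
        t = temporal_score(c.delay_s)
        v = value_score(src_usd, c.usd_value)
        if t == 0.0 or v == 0.0:    # hard filters: causality and value range
            continue
        scored.append((w_time * t + w_value * v, c.tx_id))
    scored.sort(reverse=True)
    return scored[:k]


shortlist = rank(1000.0, [
    Candidate("a", 60, 995.0),      # fast and close in value
    Candidate("b", 1200, 1000.0),   # exact value but two half-lives late
    Candidate("c", -30, 1000.0),    # precedes the source: filtered out
    Candidate("d", 30, 1200.0),     # value outside tolerance: filtered out
])
```

Returning a ranked shortlist rather than a single best match leaves the final decision to the analyst, in line with the goal of avoiding deterministic over-commitment.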
The Thor25 Cross-Chain Forensics Dataset
Empirical validation leverages Thor25, a rigorously curated benchmark built from 151,000+ real-world, native layer-1 swap records across BTC, ETH, DOGE, and LTC. Three progressive tiers—raw, high-value/fast (Thor25HF), and a mini-benchmark for efficient agentic evaluation—are provided, each with rich ground-truth structure. Thor25 uniquely incorporates annotated flows from the March 2025 Bybit exploit, offering rare labeled data for sophisticated multi-entity laundering patterns and Sybil group tracing.
Experimental Results
Single-Transfer Tracing
LOCARD's baseline performance closely tracks deterministic heuristics, achieving recall rates B1 across all asset-pair directions and perfect recall for select pairs. The system typically produces Hit@50 scores exceeding B2, attesting to effective search space narrowing even with intentionally simple scoring. Operationally, traces cost B3 USD and require 1–2 minutes per trace, primarily due to LLM inference. This efficiency is competitive for high-value forensic tasks.
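For readers unfamiliar with the metric, Hit@k can be computed as below, assuming the common definition: the fraction of queries whose ground-truth match appears among the top k ranked candidates. The function name and the toy data are hypothetical, and the paper's exact evaluation protocol may differ.

```python
from typing import List, Sequence


def hit_at_k(ranked_lists: Sequence[List[str]],
             truths: Sequence[str], k: int = 50) -> float:
    """Fraction of queries whose true match is within the top-k candidates."""
    if not truths:
        return 0.0
    hits = sum(1 for ranked, truth in zip(ranked_lists, truths)
               if truth in ranked[:k])
    return hits / len(truths)


# Toy data: three traces, each with a ranked shortlist and one true match.
ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
truths = ["b", "z", "r"]
score_k2 = hit_at_k(ranked, truths, k=2)
score_k3 = hit_at_k(ranked, truths, k=3)
```

With these toy lists, one of three truths falls within the top 2 and two of three within the top 3.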
Group-Transfer Tracing: Bybit Case Study
When challenged to attribute five disparate Bitcoin laundering endpoints from the Bybit exploit to their origin without prior path knowledge, LOCARD accurately converged on the correct Ethereum root address, leveraging dynamic graph intersection and co-occurrence analysis via its group tracing agents. This demonstrates robust, explainable entity-level forensics in live adversarial scenarios.
Figure 5: LOCARD reconstructs an obfuscated laundering subflow from the Bybit hack, uncovering the true common ancestor behind group-dispersed illicit flows.
Discussion and Implications
The core contribution is establishing that agentic, belief-guided workflows can replicate and, crucially, generalize forensic heuristics to domains where static pipelines fail—especially as actors exploit multi-chain, high-entropy, and adversarial techniques. The modular tri-core architecture enables extensibility to new evidence classes, chain environments, and investigative objectives, while structured belief states enforce procedural rigor largely absent in prior LLM-centric agents.
Future directions include federated agent ensembles specializing in diverse ledgers and protocols, adaptive robustness to noise and dusting attacks, and compositional reasoning for regulatory compliance or fraud detection at web scale. Natural next steps include more advanced scoring, OODA-inspired multi-agent orchestration, and explainable AI for analyst-in-the-loop audit.
Conclusion
LOCARD formalizes and operationalizes Agentic Blockchain Forensics, setting a foundation for modeling forensic reasoning as an adaptive, evidence-driven agentic process. Empirical results on the Thor25 benchmark and real attack flows provide strong evidence for the viability of rigorous agentic forensic systems in addressing emergent challenges of adversarial cross-chain activity.
Reference: "LOCARD: An Agentic Framework for Blockchain Forensics" (2604.04211).