Token Flow Behavior Graph (TFBG)
- Token Flow Behavior Graph (TFBG) is a directed, attributed multigraph representing on-chain token transfers with temporal and behavioral features.
- Its deterministic extraction pipeline normalizes blockchain events into structured node and edge attributes to capture suspicious transactional patterns.
- Embedding TFBG via graph neural networks enables real-time, explainable fraud detection, notably in identifying Rug Pull incidents.
A Token Flow Behavior Graph (TFBG) is a directed, attributed multigraph used to model and analyze the transactional activity of a single crypto-token within blockchain networks. Developed in the context of RPHunter for robust detection of Rug Pull schemes, TFBG serves as a dynamic representation of token flow, capturing both structural and temporal characteristics of wallet and contract interactions. Through an expressive set of node and edge features, including multi-dimensional behavioral, transactional, and network-structural attributes, TFBG enables fine-grained, time-aware modeling of suspicious transaction patterns, particularly those indicative of market manipulation or financial fraud strategies intrinsic to Rug Pull scams (Wu et al., 23 Jun 2025).
1. Formal Mathematical Structure
TFBG is defined as the quadruple
$$G = (V, E, X_V, X_E)$$
where:
- $V$ is the set of wallet and contract nodes.
- $E$ is the set of directed edges, each identified by its source $u$, destination $v$, and transaction timestamp $t$, written $e = (u, v, t)$. The multigraph permits multiple edges between the same pair $(u, v)$ at distinct $t$.
- $X_V : V \to \mathbb{R}^{d_V}$ assigns each node $v$ a $d_V$-dimensional feature vector $x_v$.
- $X_E : E \to \mathbb{R}^{d_E}$ assigns each edge $e$ a $d_E$-dimensional feature vector $x_e$.
The construction is inherently time-aware, preserving the chronological order of token transfers and associating temporal and behavioral metadata with every transaction.
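One way to hold such a structure in memory is sketched below. This is an illustrative container only, not RPHunter's implementation: the class names `Edge` and `TFBG` and the methods `add_transfer` and `edges_between` are hypothetical, and timestamps are assumed to be Unix seconds.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Edge:
    src: str                # sending address (source node)
    dst: str                # receiving address (destination node)
    timestamp: int          # transaction time, assumed Unix seconds
    features: List[float] = field(default_factory=list)  # edge feature vector


@dataclass
class TFBG:
    # node address -> node feature vector (empty until features are computed)
    node_features: Dict[str, List[float]] = field(default_factory=dict)
    # edges kept in insertion order; parallel edges are allowed (multigraph)
    edges: List[Edge] = field(default_factory=list)

    def add_transfer(self, src: str, dst: str, timestamp: int,
                     features: Optional[List[float]] = None) -> Edge:
        """Create the endpoints if needed and append one timestamped edge."""
        for addr in (src, dst):
            self.node_features.setdefault(addr, [])
        edge = Edge(src, dst, timestamp, list(features or []))
        self.edges.append(edge)
        return edge

    def edges_between(self, src: str, dst: str) -> List[Edge]:
        """All parallel edges from src to dst, in insertion order."""
        return [e for e in self.edges if e.src == src and e.dst == dst]


g = TFBG()
g.add_transfer("0xA", "0xB", 100)
g.add_transfer("0xA", "0xB", 160)  # second edge between the same pair
parallel = g.edges_between("0xA", "0xB")
```

Keeping edges in a list rather than a dictionary keyed by node pair is what lets two transfers between the same wallets coexist with different timestamps.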
2. Construction and Feature Engineering Pipeline
The TFBG construction follows a deterministic extraction pipeline from blockchain data as implemented in RPHunter:
- Blockchain Event Parsing: Acquire the first $N$ token-transfer events for the token under scrutiny ($N$ has a fixed default). For each event, retrieve its fields (such as the from and to addresses, rawValue, and timestamp) via blockchain explorer queries.
- Normalization and Transformation: Normalize rawValue to account for token decimals ($\text{value} = \text{rawValue} / 10^{\text{decimals}}$) and apply a log transformation to the normalized value.
- Graph Expansion: For each transfer event, create or reuse nodes $u$ ("from") and $v$ ("to") in $V$, and append edge $(u, v, t)$ to $E$.
- Feature Extraction: Compute and store node and edge attributes at the timestamp of each transfer using transaction history up to that point.
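The parsing, normalization, and expansion steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the event dictionaries and their `timestamp` key are hypothetical, `max_events` stands in for the event cap, and `log1p` is used as the log transformation only because it is defined at zero (the exact transformation is not specified here).

```python
import math


def normalize_value(raw_value: int, decimals: int) -> float:
    """Convert a raw integer amount to token units, then compress it.

    Assumption: log1p is used as the log transformation.
    """
    return math.log1p(raw_value / 10 ** decimals)


def build_graph(events, decimals, max_events=None):
    """Build node and edge collections from the earliest transfer events.

    events: iterable of dicts with keys "from", "to", "rawValue", "timestamp"
            (a hypothetical layout for illustration).
    Returns (nodes, edges), where each edge is (from, to, timestamp, value).
    """
    ordered = sorted(events, key=lambda ev: ev["timestamp"])[:max_events]
    nodes, edges = set(), []
    for ev in ordered:
        nodes.update((ev["from"], ev["to"]))  # create or reuse endpoints
        edges.append((ev["from"], ev["to"], ev["timestamp"],
                      normalize_value(ev["rawValue"], decimals)))
    return nodes, edges


events = [
    {"from": "0xB", "to": "0xC", "rawValue": 3 * 10**18, "timestamp": 220},
    {"from": "0xA", "to": "0xB", "rawValue": 10**18, "timestamp": 100},
]
nodes, edges = build_graph(events, decimals=18)
```

Sorting by timestamp before truncating keeps the earliest transfers and preserves chronological order in the edge list.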
Node Features ($x_v$)
- Network-Structural Group (8 dims): Degree, in-degree, and out-degree centrality; betweenness, closeness, eigenvector, and Katz centrality; clustering coefficient.
- Investment-Behavior Group (6 dims): IfTokenCreator $\in \{0, 1\}$; FundInOutRatio; ShortMaxInAmount, ShortMaxInCount, ShortMaxOutAmount, ShortMaxOutCount over a short moving time window.
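To make the windowed behavior features concrete, the sketch below computes a ShortMax-style statistic with a sliding window. The exact definitions are assumptions made for illustration: ShortMax is taken to be the largest total observed within any window of `window` seconds ending at a transfer, and FundInOutRatio is taken to be total inflow divided by total outflow. The function names are hypothetical.

```python
def short_max_amount(transfers, window):
    """Largest summed amount inside any trailing window of `window` seconds.

    transfers: list of (timestamp, amount) tuples sorted by timestamp.
    Assumed definition; passing amount=1.0 per transfer yields a count.
    """
    best = total = 0.0
    start = 0
    for ts, amount in transfers:
        total += amount
        # drop transfers that fell out of the window ending at ts
        while transfers[start][0] < ts - window:
            total -= transfers[start][1]
            start += 1
        best = max(best, total)
    return best


def fund_in_out_ratio(total_in, total_out):
    """Assumed definition: inflow over outflow, 0.0 when nothing was sent."""
    return total_in / total_out if total_out else 0.0


outgoing = [(0, 5.0), (30, 7.0), (200, 4.0)]
max_out_amount = short_max_amount(outgoing, window=60)
max_out_count = short_max_amount([(ts, 1.0) for ts, _ in outgoing], window=60)
```

The two-pointer window keeps the computation linear in the number of transfers, which matters when features are recomputed at every transfer timestamp.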
Edge Features ($x_e$)
- Time-Series (3 dims): CreationInterval, LatestInterval, IfApprove.
- Transactional (3 dims): GasLimit, TransferValue, HarmonicTransferValue (normalized by harmonic mean of source/target centrality).
- Investment-Behavior (8 dims): Cumulative token and count in/out for involved nodes, max cumulative in/out over the short time window.
All continuous features are min–max scaled using ranges computed on the training set, ensuring consistent embedding magnitudes.
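A minimal sketch of training-set min–max scaling follows. The helper names `fit_min_max` and `apply_min_max` are hypothetical, and mapping constant columns to 0.0 is a choice made here to avoid division by zero.

```python
def fit_min_max(train_rows):
    """Compute per-feature minima and maxima from training rows only."""
    columns = list(zip(*train_rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(row, lo, hi):
    """Scale one feature row with ranges learned on the training set."""
    return [0.0 if h == l else (x - l) / (h - l)
            for x, l, h in zip(row, lo, hi)]


train = [[0.0, 10.0], [4.0, 30.0]]
lo, hi = fit_min_max(train)
scaled = apply_min_max([2.0, 20.0], lo, hi)
```

Because the ranges come from the training set alone, values from unseen tokens can fall outside [0, 1]; clipping them is a separate decision that this sketch leaves open.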
3. Temporal and Dynamic Graph Modeling
TFBG encodes temporal dynamics explicitly by leveraging timestamped edges and time-dependent node/edge statistics:
- Every edge $e = (u, v, t)$ retains its transaction timestamp $t$.
- For any edge $e = (u, v, t)$, the set of earlier edges that share its source $u$ or its destination $v$ (its temporal predecessors) supports dynamic statistics such as the ShortMax and Cumulative features computed within fixed-length time windows.
- The GNN consumes the timestamped graph directly rather than slicing it into discrete snapshots, so that evolving, high-frequency behaviors central to attack strategies, such as liquidity drains, are captured in the learned representations.
This design makes the model sensitive to rapid, patterned transactional shifts typical of market manipulation and scam events.
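The notion of prior edges sharing a source or destination can be sketched as below. Two assumptions are made for illustration: "sharing" is read as having the same source as the current edge's source or the same destination as its destination, and the interval helper measures the gap to the most recent such edge, which may differ from RPHunter's LatestInterval definition. Function names are hypothetical.

```python
def temporal_predecessors(edges, index):
    """Indices of strictly earlier edges sharing the source or destination.

    edges: list of (src, dst, timestamp) tuples.
    """
    src, dst, ts = edges[index]
    return [i for i, (s, d, t) in enumerate(edges)
            if t < ts and (s == src or d == dst)]


def latest_interval(edges, index):
    """Seconds since the most recent temporal predecessor, or None."""
    preds = temporal_predecessors(edges, index)
    if not preds:
        return None
    return edges[index][2] - max(edges[i][2] for i in preds)


edges = [("A", "B", 10), ("A", "C", 20), ("D", "C", 35), ("A", "C", 50)]
preds_of_last = temporal_predecessors(edges, 3)
gap = latest_interval(edges, 3)
```

A very small gap repeated across many outgoing edges from one node is the kind of burst pattern that the windowed statistics are meant to expose.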
4. Embedding via GNN Architectures
In RPHunter, TFBG provides the substrate for a two-branch graph neural architecture:
- Transaction Branch: UAGNN is applied to TFBG, interleaving node and edge updates across two layers.
  - Node updates: Standard GCN propagation with $\hat{A} = \tilde{D}^{-1/2}(A + I)\tilde{D}^{-1/2}$ (adjacency normalization with self-loops), where $\tilde{D}$ is the degree matrix of $A + I$.
  - Edge updates: Concatenate the current edge embedding, the source and target node embeddings, and the mean of the embeddings of the edge's temporal predecessors, then apply a learnable transformation and a nonlinearity.
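The two update ingredients can be illustrated in plain Python. The sketch below shows the standard GCN normalization with self-loops and the assembly of the concatenated input for an edge update; the learnable transformation and nonlinearity are omitted, and all function names are hypothetical rather than taken from UAGNN. A symmetric adjacency matrix is used for simplicity.

```python
import math


def normalized_adjacency(adj):
    """Return D~^(-1/2) (A + I) D~^(-1/2) for a square adjacency matrix."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]  # degrees including self-loops
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def mean_vector(vectors, dim):
    """Element-wise mean; a zero vector when there are no predecessors."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def edge_update_input(edge_emb, src_emb, dst_emb, predecessor_embs):
    """Concatenate edge, endpoint, and mean-predecessor embeddings."""
    return (edge_emb + src_emb + dst_emb
            + mean_vector(predecessor_embs, len(edge_emb)))


a_hat = normalized_adjacency([[0.0, 1.0], [1.0, 0.0]])
concat = edge_update_input([1.0], [2.0], [3.0], [[4.0], [6.0]])
```

In a full model the concatenated vector would be passed through a learned linear layer and an activation; only the input construction is shown here.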
The resulting transaction embedding is obtained by mean pooling over all node and edge embeddings. This representation is fused with code-derived embeddings (from semantic risk code graphs, SRCG) using an attention mechanism to obtain a joint feature vector, which informs the final classification.
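The fusion step is described only as an attention mechanism, so the following is a generic softmax-weighted combination of two embeddings rather than RPHunter's actual fusion layer. The scoring vector `score_vec` is a hypothetical stand-in for learned parameters, and both embeddings are assumed to have the same dimension.

```python
import math


def attention_fuse(tx_emb, code_emb, score_vec):
    """Softmax-weighted sum of a transaction and a code embedding.

    score_vec plays the role of learned attention parameters (assumed).
    Returns (fused_vector, attention_weights).
    """
    scores = [sum(w * x for w, x in zip(score_vec, emb))
              for emb in (tx_emb, code_emb)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [weights[0] * t + weights[1] * c
             for t, c in zip(tx_emb, code_emb)]
    return fused, weights


fused, weights = attention_fuse([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
```

The attention weights indicate how much each branch contributes for a given token, which is one route to the explainability mentioned earlier.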
5. Empirical Observations and Case Analysis
Analysis of TFBG patterns in prominent Rug Pull incidents provides actionable forensic insights:
- In the "MNHA" incident (ETH Mainnet, July 2024), the TFBG exhibited a star topology; the attacker's node displayed extreme outdegree and minimal LatestInterval (seconds) immediately prior to liquidity withdrawal.
- ShortMaxOutAmount increased sharply within minutes of token creation, signaling rapid fund extraction.
- Edges with IfApprove = 0 indicated bypasses of typical approval flows, consistent with malicious contract design.
- Attention weights on the anomalous edges in UAGNN were observed to be substantially higher than average, heavily influencing the detection decision.
Across a dataset of 645 Rug Pulls and 1675 benign instances, TFBG-based detection alone achieved Precision = 71.7%, Recall = 80.7%, and F1 = 76.0% (the Tx-only row of the table in Section 6). Mean node centrality in Rug Pull TFBGs exceeded that of benign tokens, marking network-structural concentration as a salient fraud indicator.
6. Detection Performance and Model Fusion
The integration of TFBG and SRCG embeddings via attention-based fusion yields significant improvements in scam detection:
| Approach | Precision | Recall | F1 |
|---|---|---|---|
| Code-only | 92.7% | 89.1% | 90.8% |
| Tx-only (TFBG) | 71.7% | 80.7% | 76.0% |
| Fusion (RPHunter) | 95.3% | 93.8% | 94.5% |
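As a consistency check on the table, F1 is the harmonic mean of precision and recall, $F_1 = 2PR / (P + R)$. The short sketch below recomputes it from the rounded precision and recall values in each row; the recomputed values agree with the reported F1 to within about 0.1 percentage point, which is the size of difference expected from rounding.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above
reported = {
    "Code-only": (0.927, 0.891, 0.908),
    "Tx-only (TFBG)": (0.717, 0.807, 0.760),
    "Fusion (RPHunter)": (0.953, 0.938, 0.945),
}

recomputed = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}
for name, (_, _, f1) in reported.items():
    print(f"{name}: recomputed F1 = {recomputed[name]:.4f}, reported = {f1:.3f}")
```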
In real-world mainnet deployment (May–July 2024), RPHunter, using fused TFBG and code information, identified 4801 suspicious tokens with a manually verified precision of 91%. Transaction features alone are the weakest of the three configurations, yet adding them to code semantics through structured fusion raises precision, recall, and F1 above the code-only baseline (Wu et al., 23 Jun 2025).
7. Significance and Application Context
TFBG offers a structured representation for automated on-chain fraud analysis that addresses limitations of code-only and transaction-only approaches. Its explicit modeling of temporal, relational, and behavioral regularities makes it responsive to complex, adaptive attack patterns that evade rigid rule-based or purely statistical detection. TFBG-style representations are extensible beyond Rug Pulls, providing a blueprint for future research in explainable crypto forensics and adaptive financial security analytics.