
RPHunter: Advanced Rug Pull Detection

Updated 30 April 2026
  • RPHunter is an advanced detection system that integrates static code and dynamic transaction analysis to identify rug pull scams with high precision.
  • It employs graph-based flow analysis and neural embedding to extract semantic code risk features and capture nuanced token transfer behaviors.
  • Evaluations show improved performance with a 94.5% F1 score, low false positive rates, and timely real-world detection on Ethereum token contracts.

RPHunter is an advanced detection system designed to identify rug pull scams in crypto token ecosystems by fusing static code analysis with dynamic transaction behavior modeling. By integrating a graph-based flow analysis of smart contract code with a granular model of token transfer activity, RPHunter achieves early and high-precision rug pull detection and outperforms existing static analysis, transaction-pattern, and hybrid approaches (Wu et al., 23 Jun 2025).

1. Formalization of Rug Pull Code Risk

RPHunter begins by extracting latent code risks from token contracts through a multi-stage pipeline:

  • Bytecode Acquisition and Decompilation: For a given token contract address, on-chain bytecode is retrieved (Web3.getCode) and decompiled into an intermediate representation (IR) via Gigahorse. The IR encodes program structure as a set of basic blocks, with explicit control-flow and data-flow edges. This yields a Semantic Code Graph (SCG), $SCG = (B, E_{\text{ctrl}} \cup E_{\text{data}})$, where $B$ is the set of basic blocks and $E_{\text{ctrl}}$ and $E_{\text{data}}$ are the control-flow and data-flow edges, respectively.
  • Declarative Relations and Rules: A series of first-order relations is defined over code statements $s$, functions $f$, variables $v$, and mappings $m$. For example:
    • $\operatorname{DF}(v_1, v_2)$: $v_2$ is data-flow dependent on $v_1$
    • A public-function relation: the given function is public and takes the given parameter
    • A tax-manipulation relation: the given variable is a transaction tax that is manipulated in the given code location
  • Risk Typology: Rug pull risks are partitioned into three core classes (Sale Restrict, Variable Manipulation, and Balance Tamper) with eight specialized subtypes. Each sub-risk is formalized as a conjunction that pairs a broad code-pattern predicate with plugin predicates for subtype refinement.
  • Flow Analysis: Declarative rules are evaluated via a taint-propagation algorithm over data and control flow. For each sub-risk, blocks matching the broad code pattern are "tainted," and the taint propagates along the SCG's control-flow and data-flow edges; when the plugin predicates are satisfied, the critical blocks and flows are recorded, defining the risk evidence set for that sub-risk.
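
The flow-analysis step can be illustrated with a minimal sketch of forward taint propagation over a block graph. This is not RPHunter's implementation: the function name `propagate_taint`, the block ids, and the toy labels are all hypothetical, and control-flow and data-flow edges are merged into a single successor map for brevity.

```python
from collections import deque

def propagate_taint(blocks, edges, is_source, is_sink):
    """Forward taint propagation over a block graph (illustrative only).

    blocks:    iterable of block ids
    edges:     dict mapping a block id to its successor block ids
               (control-flow and data-flow edges merged)
    is_source: predicate standing in for the broad code-pattern match
    is_sink:   predicate standing in for the plugin checks
    Returns (tainted, evidence): every block reached from a source,
    and the reached blocks that also satisfy is_sink.
    """
    tainted = {b for b in blocks if is_source(b)}
    queue = deque(sorted(tainted))
    while queue:
        block = queue.popleft()
        for succ in edges.get(block, []):
            if succ not in tainted:
                tainted.add(succ)
                queue.append(succ)
    evidence = {b for b in tainted if is_sink(b)}
    return tainted, evidence

# Toy graph; block ids and labels are hypothetical.
toy_edges = {"b0": ["b1"], "b1": ["b2", "b3"], "b3": ["b4"], "b5": ["b4"]}
toy_labels = {"b1": "reads_owner_flag", "b4": "reverts_transfer"}

tainted, evidence = propagate_taint(
    ["b0", "b1", "b2", "b3", "b4", "b5"],
    toy_edges,
    is_source=lambda b: toy_labels.get(b) == "reads_owner_flag",
    is_sink=lambda b: toy_labels.get(b) == "reverts_transfer",
)
```

In this toy graph, taint starts at `b1` and reaches `b2`, `b3`, and `b4`; only `b4` satisfies the sink predicate, so it forms the evidence set. Blocks `b0` and `b5` are never tainted because propagation follows edges forward only.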

2. Token Flow Behavior Modeling

  • Transaction Event Extraction: For each token, the earliest transfer events (up to a fixed count) are collected, capturing the sender, receiver, normalized value, and timestamp of each.
  • Token Flow Behavior Graph (TFBG): The TFBG is a graph whose nodes are the observed accounts and whose edges are the time-ordered transfers (several edges may share the same sender and receiver). Node and edge features reflect:
    • Node (14D): Network centrality, in/out degrees, clustering, token creator flags, fund flow ratios, short-term activity.
    • Edge (15D): Time series (intervals, approvals), transaction (gas, value, harmonic value), and investment features (cumulative volumes, short-term maxima).

The TFBG structure is designed to expose both network structural anomalies and market manipulation signatures typical of rug pulls.
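
As a rough illustration of how such a graph might be assembled from transfer events, the sketch below builds a multigraph and derives three simple node features. The `Transfer` record, the function names, and the chosen features are assumptions made for this example; they cover only a small part of the 14-dimensional node feature set described above, and the account names and amounts are invented.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Transfer:
    sender: str
    receiver: str
    value: float    # normalized token amount
    timestamp: int  # seconds

def build_tfbg(transfers):
    """Return (nodes, edges); edges are time-ordered and parallel edges are kept."""
    edges = sorted(transfers, key=lambda t: t.timestamp)
    nodes = set()
    for t in edges:
        nodes.add(t.sender)
        nodes.add(t.receiver)
    return nodes, edges

def node_features(nodes, edges):
    """Per node: (in-degree, out-degree, outflow share of total flow)."""
    indeg, outdeg = defaultdict(int), defaultdict(int)
    inflow, outflow = defaultdict(float), defaultdict(float)
    for t in edges:
        outdeg[t.sender] += 1
        indeg[t.receiver] += 1
        outflow[t.sender] += t.value
        inflow[t.receiver] += t.value
    feats = {}
    for n in nodes:
        total = inflow[n] + outflow[n]
        feats[n] = (indeg[n], outdeg[n], outflow[n] / total if total else 0.0)
    return feats

# Hypothetical events: the creator seeds a pool, two buyers receive tokens,
# then the creator sends a second large batch to the pool.
events = [
    Transfer("creator", "pool", 1000.0, 10),
    Transfer("pool", "alice", 50.0, 20),
    Transfer("pool", "bob", 30.0, 30),
    Transfer("creator", "pool", 500.0, 40),
]
nodes, edges = build_tfbg(events)
feats = node_features(nodes, edges)
```

Here `feats["creator"]` is `(0, 2, 1.0)`: the creator only sends, which is the kind of one-sided flow pattern the node features are meant to surface.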

3. Joint Graph Representation and Neural Embedding

RPHunter constructs two heterogeneous graphs for each token:

  • Semantic Risk Code Graph (SRCG): Nodes corresponding to code blocks are labeled ("critical," "invocation," "normal") based on rule activation and call sites, with edges labeled by criticality and dependency.
  • TFBG: Encodes dynamic transfer behaviors as described above.

Graph Embedding Architectures

  • Relational GCN for SRCG: Each relation type (critical, dependent, normal) defines its own adjacency matrix. Node states are propagated layer by layer through these relation-specific adjacencies with relation-specific weights, where the initial node features are BERT-based opcode embeddings.
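
For orientation, the textbook relational GCN layer can be written as below. This is the generic form, not necessarily the exact variant used in RPHunter; the symbols $\mathcal{R}$ (set of relation types), $A_r$ (adjacency for relation $r$), $D_r$ (its degree matrix), $W_r^{(l)}$ and $W_0^{(l)}$ (relation-specific and self-connection weights), and $\sigma$ (a nonlinearity) are notation assumed for this sketch.

```latex
H^{(l+1)} = \sigma\!\left( \sum_{r \in \mathcal{R}} \hat{A}_r \, H^{(l)} \, W_r^{(l)} + H^{(l)} W_0^{(l)} \right),
\qquad \hat{A}_r = D_r^{-1} A_r
```

Here $H^{(0)}$ would hold the per-block opcode embeddings, and each layer mixes neighbor states separately for each relation type before summing.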

  • Unified Aggregation GNN (UAGNN) for TFBG: Alternates node and edge updates:
    • Node: GCN-style message passing.
    • Edge: Each edge updates via local node states and mean aggregation over temporally preceding, incident edges.
    • After 2 rounds, mean-pooling yields the global graph embeddings.
  • Attention Fusion: Both embeddings are projected to a shared dimension and combined using bidirectional attention weights, yielding a fused representation for the final MLP-based binary prediction.
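
The fusion step can be approximated by a simplified sketch in which each projected embedding receives a scalar score and a softmax turns the two scores into blending weights. This is a stand-in, not the paper's bidirectional attention: the scoring vector `q`, the projection matrices, and the function names are all hypothetical.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(h_code, h_txn, W_code, W_txn, q):
    """Project both embeddings to a shared space and blend them.

    Each projected view z is scored as q . tanh(z) (an assumed scoring
    rule); the softmax of the two scores gives weights that sum to 1.
    Returns (fused_vector, (weight_code, weight_txn)).
    """
    z_code = matvec(W_code, h_code)
    z_txn = matvec(W_txn, h_txn)
    scores = [
        sum(qi * math.tanh(zi) for qi, zi in zip(q, z))
        for z in (z_code, z_txn)
    ]
    a_code, a_txn = softmax(scores)
    fused = [a_code * c + a_txn * t for c, t in zip(z_code, z_txn)]
    return fused, (a_code, a_txn)

# Toy inputs: identity projections and a scoring vector that favors
# the first coordinate (all values are made up for illustration).
identity = [[1.0, 0.0], [0.0, 1.0]]
fused, weights = attention_fuse(
    [1.0, 0.0], [0.0, 1.0], identity, identity, q=[1.0, 0.0]
)
```

With these toy inputs the code-side embedding scores higher, so it receives the larger weight. In a trained model the projections and scoring parameters would be learned jointly with the downstream MLP classifier.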

4. Dataset Construction and Evaluation

  • Dataset Source: 1048 publicly reported rug pull incidents were collected from 8 security platforms; after removing cases with unavailable code or incomplete data, 645 remain as positives. Of 1806 manually reviewed benign tokens from TokenScout, 1675 are retained as negatives.
  • Split and Validation: The dataset is split 60:20:20 into training, validation, and test sets; within the training set, five-fold cross-validation is used for robustness.
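
To make the split sizes concrete, the sketch below partitions the 645 + 1675 = 2320 samples 60:20:20 and then forms five folds inside the training portion. The shuffling seed, the helper names, and the non-stratified, interleaved fold assignment are assumptions of this example; the source does not say how the splits were drawn.

```python
import random

def split_indices(n, seed=0, ratios=(0.6, 0.2, 0.2)):
    """Shuffle range(n) and cut it into train/val/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def k_folds(items, k=5):
    """Yield (fit, held_out) pairs; fold i holds out every k-th item from offset i."""
    for i in range(k):
        held_out = items[i::k]
        held_set = set(held_out)
        fit = [x for x in items if x not in held_set]
        yield fit, held_out

n_total = 645 + 1675  # positives + negatives reported above
train, val, test = split_indices(n_total)
folds = list(k_folds(train, k=5))
```

This gives 1392 training, 464 validation, and 464 test samples, with each training sample held out in exactly one of the five folds.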

Metrics and Results

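  • Binary classification metrics:

$$\text{Precision} = \frac{TP}{TP + FP}, \quad \text{Recall} = \frac{TP}{TP + FN}, \quad F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, \quad \text{FPR} = \frac{FP}{FP + TN}, \quad \text{FNR} = \frac{FN}{FN + TP}$$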

On the held-out test folds, RPHunter achieves:

  • Precision: 95.3%
  • Recall: 93.8%
  • F1 Score: 94.5%
  • FPR: 1.8%
  • FNR: 6.2%
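
The reported figures can be cross-checked with a few lines of arithmetic. The sketch below computes the standard metrics from confusion-matrix counts and recomputes F1 from the reported precision and recall; the function names and the toy counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
    }

def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Consistency check on the reported values.
reported_f1 = f1_from(0.953, 0.938)
reported_fnr = 1 - 0.938

# Toy counts, for illustration only.
toy = binary_metrics(tp=90, fp=10, tn=190, fn=10)
```

The recomputed F1 rounds to 94.5% and the FNR equals one minus recall (6.2%), so the reported numbers are internally consistent.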

Performance exceeds two rule-based baselines (Pied-Piper, CRPWarner), two transaction-only models, and the commercial scanner GoPlus.

5. Deployment and Case Studies

  • Ethereum Mainnet Scan: RPHunter was deployed across all ERC-20-like contracts in blocks 19,771,560 to 20,207,949 (May–July 2024), flagging 4,801 tokens as rug pulls.
  • Real-world Precision: Manual sampling of 247 detections found 23 false positives, giving a live precision of approximately 91%.
  • Timeliness: Fast rug pulls were detected: 108 flagged tokens existed for less than 24 hours. In one notable case (MNHA), RPHunter identified the risk before the withdrawal took place.
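
The live precision figure follows directly from the manual sample, treating the 247 reviewed detections as the evaluated positives:

```latex
\text{Precision}_{\text{live}} = \frac{247 - 23}{247} = \frac{224}{247} \approx 0.907
```

This is an estimate from a sample of the 4,801 flagged tokens, not a count over all of them.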

6. Limitations and Component Analysis

  • Flow Analysis Quality: Tied to Gigahorse decompilation fidelity; integrating more precise analyzers (e.g., Vandal or Mythril) could reduce false negatives.
  • Dataset Bias: Current positives are high-confidence, but adaptive scam tactics may evade static rule sets; continual rule/plugin refreshes are required.
  • Off-Chain/Multimodal Blind Spots: Off-chain activities and social/contractual "trust pulls" (not reflected on-chain) escape current detection; fusing social and multi-modal signals is suggested as future work.
  • Ablation Insights:
    • Removing code-risk features: F1 drops ≈6.7%.
    • Excluding node/edge features in TFBG: F1 drops by up to 11.2%.
    • Omission of fusion model: F1 decreases by up to 18.5%.

7. Impact, Novelty, and Future Directions

RPHunter's central contribution is the graph- and attention-based fusion of (i) formally defined code-risk features from declarative flow analysis and (ii) transaction-behavioral features. This combination supports high sensitivity and specificity in real-world, early-stage rug pull detection within heterogeneous, adversarial token environments, and the high F1 score and operational precision indicate practical efficacy. Future directions include integration of higher-fidelity code analyzers, expansion of rule/plugin libraries to adapt to new attack patterns, and incorporation of off-chain data sources to address non-code-based scams (Wu et al., 23 Jun 2025).
