CRPWarner: Contract Risk Analyzer
- CRPWarner is a static analysis tool that detects malicious back-door patterns in Ethereum and EVM-compatible DeFi token contracts.
- It decompiles bytecode, extracts features, and applies Datalog rules to flag three back-door types: hidden mint functions, limiting sell orders, and leaking tokens.
- In evaluation it reached 91.8% precision and 85.9% recall on 69 open-source rug-pull contracts; known limitations include proxy contracts and the gaps inherent in static-only analysis.
CRPWarner (Contract-related Rug Pull Risk Warner) is a static analysis tool designed to detect and warn of contract-related rug pull risks in Ethereum and EVM-compatible DeFi token contracts. By analyzing smart contract bytecode for indicator patterns of malicious functions, CRPWarner provides early detection of possible back-door mechanisms enabling developers to abscond with investor funds. This approach targets a major vector of financial exploitation in decentralized finance ecosystems and seeks to systematically mitigate contract-level fraud (Lin et al., 2024).
1. Terminology and Taxonomy of Rug Pulls
A rug pull is defined as a scam where token developers intentionally abandon a project and abscond with investors’ assets, rendering the token worthless. Manual analysis of 103 real-world rug pull incidents from platforms such as PeckShield, SlowMist, and RugDoc led to a taxonomy distinguishing two primary categories:
- Contract-related Rug Pulls: Scams employing malicious (“back-door”) smart contract functions, including:
  - Hidden Mint Function: Unauthorized minting access; 13 documented cases, $10.4M USD lost.
  - Limiting Sell Order: Restrictions that only allow insiders or the owner to sell; 12 cases, $20.5M lost.
  - Leaking Token: Direct theft via broad transfer authority or fee-draining; 11 cases, $46.2M lost.
- Transaction-related Rug Pulls: Scams without back-door functions, relying on on-chain trade manipulations:
  - Dumping Cryptocurrency: Mass sell-offs; 34 cases, $58.6M lost.
  - Withdrawing Liquidity: Removal of liquidity to render tokens illiquid; 18 cases, $7.8M lost.
  - Abandoning Project after Funding: Developer exit scam; 5 cases, $9.5M lost.
The following table presents a summary taxonomy, event counts, and associated losses:
| Subtype | # Events | Loss (k USD) |
|---|---|---|
| Hidden Mint Function | 13 | 10,376.6 |
| Limiting Sell Order | 12 | 20,527.2 |
| Leaking Token | 11 | 46,178.0 |
| Dumping Cryptocurrency | 34 | 58,638.9 |
| Withdrawing Liquidity | 18 | 7,754.6 |
| Abandoning Project after Funding | 5 | 9,460.5 |
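To compare the two categories directly, the table rows can be aggregated per category. The short Python sketch below only re-adds the figures from the table; the `"contract"` and `"transaction"` labels and the function name are illustrative choices, not terms from the source.

```python
# Rows copied from the taxonomy table: (subtype, category, events, loss in k USD).
ROWS = [
    ("Hidden Mint Function", "contract", 13, 10376.6),
    ("Limiting Sell Order", "contract", 12, 20527.2),
    ("Leaking Token", "contract", 11, 46178.0),
    ("Dumping Cryptocurrency", "transaction", 34, 58638.9),
    ("Withdrawing Liquidity", "transaction", 18, 7754.6),
    ("Abandoning Project after Funding", "transaction", 5, 9460.5),
]


def totals_by_category(rows):
    """Sum event counts and losses (k USD) for each category label."""
    totals = {}
    for _subtype, category, events, loss in rows:
        count, total_loss = totals.get(category, (0, 0.0))
        totals[category] = (count + events, total_loss + loss)
    return totals


if __name__ == "__main__":
    for category, (events, loss) in totals_by_category(ROWS).items():
        print(f"{category}: {events} events, {loss / 1000:.1f}M USD")
```

On these rows, the contract-related subtypes account for 36 events and the transaction-related subtypes for 57.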
2. CRPWarner System Architecture
CRPWarner exclusively employs static, bytecode-based analysis to evaluate deployed contracts for back-door function patterns enabling contract-related rug pulls. The workflow involves:
- Bytecode Decompilation: The Gigahorse decompiler processes EVM bytecode and outputs a control-flow graph (CFG), a three-address-code IR, and basic opcode and operator metrics.
- Feature Extraction: The CFG and IR are translated into Datalog facts such as statements, functions, storage variables, data flows, and control-flow dependencies.
- Information-Flow Analysis: Deductive Datalog rules identify abstractions such as storage access to balances, fee variables, freeze lists, and ownership checks, leading to predicates like `LoadTokenBalances`, `StoreTokenBalances`, `VarToLimitTransfer`, and `PublicFuncForOwner`.
- Malicious-function Classification: Rule-based classification flags contract functions as one of the three contract-related rug pull subtypes using specific Datalog predicates.
- Report Generation: Contracts containing functions matching any rug pull signature receive warnings, including function signatures and subtype rationale.
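The feature-extraction step can be pictured as turning each IR statement into rows of a few relations. The following Python sketch is a simplified illustration and does not reproduce Gigahorse's actual output schema: the `Stmt` shape and the relation names `InFunction`, `Op`, and `StorageAccess` are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """One three-address-code statement (hypothetical, simplified shape)."""
    func: str       # enclosing function name
    op: str         # opcode, e.g. "SLOAD"
    slot: str = ""  # symbolic storage slot, empty for non-storage opcodes


def extract_facts(stmts):
    """Turn IR statements into relational facts: one set of tuples per relation."""
    facts = {"InFunction": set(), "Op": set(), "StorageAccess": set()}
    for i, s in enumerate(stmts):
        sid = f"s{i}"  # statement identifier
        facts["InFunction"].add((sid, s.func))
        facts["Op"].add((sid, s.op))
        if s.op in ("SLOAD", "SSTORE"):
            facts["StorageAccess"].add((sid, s.slot))
    return facts


def to_datalog(facts):
    """Render the facts as Datalog-style text lines, sorted for stable output."""
    lines = []
    for rel in sorted(facts):
        for args in sorted(facts[rel]):
            quoted = ", ".join(f'"{a}"' for a in args)
            lines.append(f"{rel}({quoted}).")
    return lines


if __name__ == "__main__":
    ir = [Stmt("mint", "SLOAD", "balances"), Stmt("mint", "ADD"),
          Stmt("mint", "SSTORE", "balances")]
    print("\n".join(to_datalog(extract_facts(ir))))
```

Keeping facts as plain tuples mirrors how a Datalog engine consumes input relations, so later rules can be written as joins over these sets.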
3. Feature Engineering and Detection Rules
Detection relies on three categories of features:
A. Token-Balance Management Features
- `LoadTokenBalances(stmt)`: SLOAD from mapping(address → uint256).
- `StoreTokenBalances(stmt)`: SSTORE to the same map.
- `LoadAndStoreBalances(func)`: Both load and store opcodes occur within the function body.
- `CheckTokenBalances(func)`: Presence of value checks ensuring sufficient balance (balance ≥ amount).
- `CheckBalancesOfInput(func)`: Validation checks for addresses originating in calldata.
B. Variable-Type Features
- `VarToLimitTransfer(v)`: Storage variable `v` controls transfer permission via a JUMPI guard.
- `VarForFee(v)`: Arithmetic fee variable usage, as in `amount * v / 100`.
C. Function-Level Features
- `PublicFuncForOwner(f)`: Function `f` is public and guarded by `require(msg.sender == owner)`.
- `FunctionModifyStorage(f, v)`: Function modifies storage variable `v`, possibly from calldata.
- `FunctionTransfer(f)`: Reads and writes two distinct balances, indicating token transfer logic.
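As a rough illustration of how two of these features could be derived, the Python sketch below works on a toy statement list. The `(opcode, slot, key)` tuple shape and the symbolic slot name `"balances"` are assumptions for the example; the real tool derives these facts from decompiled bytecode through Datalog rules.

```python
BALANCES = "balances"  # hypothetical symbolic name for the token-balance mapping


def _keys(stmts, opcode):
    """Mapping keys touched by `opcode` on the balance map."""
    return {key for op, slot, key in stmts if op == opcode and slot == BALANCES}


def load_and_store_balances(stmts):
    """Approximates LoadAndStoreBalances: the body both reads and writes balances."""
    return bool(_keys(stmts, "SLOAD")) and bool(_keys(stmts, "SSTORE"))


def function_transfer(stmts):
    """Approximates FunctionTransfer: two distinct balances are read and written."""
    return len(_keys(stmts, "SLOAD") & _keys(stmts, "SSTORE")) >= 2


if __name__ == "__main__":
    transfer = [("SLOAD", BALANCES, "from"), ("SSTORE", BALANCES, "from"),
                ("SLOAD", BALANCES, "to"), ("SSTORE", BALANCES, "to")]
    mint = [("SLOAD", BALANCES, "to"), ("SSTORE", BALANCES, "to")]
    print(function_transfer(transfer), function_transfer(mint))
    print(load_and_store_balances(mint))
```

A mint-like body touches a single balance, so it satisfies the load-and-store feature without looking like a transfer.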
Detection Rules (Datalog-style pseudocode):
- Hidden Mint Function:
```
hiddenMint(f) :-
    PublicFuncForOwner(f),
    LoadAndStoreBalances(f),
    not CheckTokenBalances(f).
```
- Limiting Sell Order:
```
limitSell(f) :-
    PublicFuncForOwner(f),
    VarToLimitTransfer(v),
    FunctionModifyStorage(f, v).
```
- Leaking Token (direct transfer):
```
leakDirect(f) :-
    PublicFuncForOwner(f),
    FunctionTransfer(f),
    CheckBalancesOfInput(f).
```
- Leaking Token (fee-drain):
```
leakFee(f) :-
    PublicFuncForOwner(f),
    VarForFee(v),
    FunctionModifyStorage(f, v).
```
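To make the rule semantics concrete, the four rules can be evaluated by hand over small sets of feature facts. This Python sketch mirrors the pseudocode using plain set operations; it is not the tool's Datalog implementation, and the function and variable names in the usage example (`mint`, `setBlacklist`, `feeRate`, and so on) are invented for illustration.

```python
def classify(features):
    """Evaluate the four rules over feature facts (sets keyed by predicate name)."""
    owner_funcs = features.get("PublicFuncForOwner", set())
    load_store = features.get("LoadAndStoreBalances", set())
    balance_checked = features.get("CheckTokenBalances", set())
    transfer = features.get("FunctionTransfer", set())
    input_checked = features.get("CheckBalancesOfInput", set())
    limit_vars = features.get("VarToLimitTransfer", set())
    fee_vars = features.get("VarForFee", set())
    modifies = features.get("FunctionModifyStorage", set())  # (function, variable) pairs

    findings = {}
    for f in sorted(owner_funcs):  # every rule requires PublicFuncForOwner(f)
        labels = []
        if f in load_store and f not in balance_checked:
            labels.append("hiddenMint")
        if any((f, v) in modifies for v in limit_vars):
            labels.append("limitSell")
        if f in transfer and f in input_checked:
            labels.append("leakDirect")
        if any((f, v) in modifies for v in fee_vars):
            labels.append("leakFee")
        if labels:
            findings[f] = labels
    return findings


if __name__ == "__main__":
    toy = {
        "PublicFuncForOwner": {"mint", "setBlacklist", "setFee"},
        "LoadAndStoreBalances": {"mint", "transfer"},
        "CheckTokenBalances": {"transfer"},
        "FunctionTransfer": {"transfer"},
        "VarToLimitTransfer": {"blacklist"},
        "VarForFee": {"feeRate"},
        "FunctionModifyStorage": {("setBlacklist", "blacklist"), ("setFee", "feeRate")},
    }
    print(classify(toy))
```

In this toy input, `transfer` is never flagged because it is not owner-only, which is the common precondition shared by all four rules.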
4. Experimental Evaluation and Formal Metrics
CRPWarner’s performance is evaluated via two principal research questions on dedicated datasets:
RQ1: Open-source Rug-Pull Contracts (69 contracts)
- Manual auditing for each flagged function.
- Subtype-level results:
| Subtype | Precision | Recall | F₁ |
|---|---|---|---|
| Hidden Mint | 94.7 % | 90.0 % | 92.3 % |
| Limiting Sell Order | 93.1 % | 90.0 % | 91.5 % |
| Leaking Token | 87.5 % | 77.8 % | 82.4 % |
| Overall | 91.8 % | 85.9 % | 88.7 % |
False positives primarily stemmed from benign functions with opcode similarity to flagged patterns; false negatives arose in contracts with complex or nonstandard balance map layouts.
RQ2: Large-scale Scan of 13,484 ERC-20 Contracts
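- CRPWarner flagged 4,168 contracts (30.9% of those scanned). A contract can be flagged for more than one subtype, so the subtype counts sum to more than 4,168:
  - Hidden Mint: 2,775 contracts (20.6%)
  - Limiting Sell: 2,796 contracts (20.7%)
  - Leaking Token: 1,155 contracts (8.6%)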
- Random sampling validation (n = 272, 95% confidence, ±10% interval):
| Subtype | Samples | TP | FP | Precision |
|---|---|---|---|---|
| Hidden Mint | 92 | 78 | 14 | 84.8 % |
| Limiting Sell | 92 | 79 | 13 | 85.9 % |
| Leaking Token | 88 | 74 | 14 | 84.1 % |
| Overall | 272 | 231 | 41 | 84.9 % |
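The source states only the confidence level and interval, not how the per-subtype sample sizes were chosen. One common approach that yields sizes of this order is Cochran's formula with a finite-population correction; the Python sketch below assumes that formula, a worst-case proportion of 0.5, and truncation to a whole number, all of which are assumptions rather than details from the source.

```python
import math


def sample_size(population, z=1.96, margin=0.10, p=0.5):
    """Cochran's sample size with finite-population correction, truncated.

    z=1.96 corresponds to 95% confidence; margin is the half-width of the
    interval; p=0.5 is the most conservative assumed proportion.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return math.floor(n0 / (1 + (n0 - 1) / population))


if __name__ == "__main__":
    # Flagged-contract counts per subtype from the large-scale scan.
    flagged = {"Hidden Mint": 2775, "Limiting Sell": 2796, "Leaking Token": 1155}
    sizes = {name: sample_size(n) for name, n in flagged.items()}
    print(sizes, sum(sizes.values()))
```

Under these assumptions the function returns 92, 92, and 88 for the three flagged populations, which matches the sample column of the table and its total of 272.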
A representative zero-day detection was demonstrated on the Indo Token (IDRT) contract, which contained a previously unflagged public mint back-door.
Evaluation Metrics use standard definitions:
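- Precision: $\text{Precision} = \frac{TP}{TP + FP}$
- Recall: $\text{Recall} = \frac{TP}{TP + FN}$
- $F_1$: $F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$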
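The standard definitions can be checked against the reported tables with a few lines of Python. This is a minimal sketch; the function names are illustrative, and the inputs in the usage example are taken from the Hidden Mint rows of the two result tables.

```python
def precision(tp, fp):
    """Fraction of flagged items that are true positives."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of actual positives that were flagged."""
    return tp / (tp + fn)


def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


if __name__ == "__main__":
    # Hidden Mint, large-scale sample: 78 true positives, 14 false positives.
    print(f"{100 * precision(78, 14):.1f} %")
    # Hidden Mint, open-source dataset: precision 94.7 %, recall 90.0 %.
    print(f"{100 * f1(0.947, 0.900):.1f} %")
```

The two printed values reproduce the 84.8 % precision and 92.3 % F₁ reported for Hidden Mint.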
5. Limitations
Four principal limitations constrain CRPWarner’s coverage and recall:
- Proxy Contracts: Static bytecode analysis often targets a proxy, whereas the actual logic contract is invoked via delegatecall at runtime. This can result in missed risk indicators unless the proxy’s target is resolved, which is nontrivial; future enhancements may require dynamic runtime or trace-based contract resolution.
- Pattern Coverage and Adaptiveness: The rule-based approach is inherently limited to known exploit patterns and back-door idioms. Emerging scam archetypes are not detected until added as new rules. Ongoing community engagement and rulebase updates are necessary to maintain relevance.
- Static-only Gaps: Pure static analysis cannot reason about behaviors that depend on runtime state or on how a contract changes over time, such as owner renouncement or time-based permission changes. Hybrid techniques that add symbolic or concolic execution, or that use live call traces, could improve recall.
- Potential Evasion: Obfuscated or bytecode-reordered functions may evade established pattern matching, suggesting an ongoing adversarial arms race.
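The source does not describe how proxy targets would be resolved. As a narrow illustration of the proxy problem, the Python sketch below handles only EIP-1167 minimal proxies, whose runtime bytecode embeds the logic-contract address between a fixed prefix and suffix, so the target can be read statically. Upgradeable proxies that keep the target in storage (for example under EIP-1967) need chain state and are out of scope here; the function name is hypothetical.

```python
# Fixed byte sequences of the EIP-1167 minimal proxy runtime code.
PREFIX = bytes.fromhex("363d3d373d3d3d363d73")
SUFFIX = bytes.fromhex("5af43d82803e903d91602b57fd5bf3")
ADDRESS_LEN = 20


def minimal_proxy_target(runtime_code: bytes):
    """Return the embedded logic-contract address, or None if not an EIP-1167 clone."""
    expected_len = len(PREFIX) + ADDRESS_LEN + len(SUFFIX)
    if (len(runtime_code) == expected_len
            and runtime_code.startswith(PREFIX)
            and runtime_code.endswith(SUFFIX)):
        start = len(PREFIX)
        return "0x" + runtime_code[start:start + ADDRESS_LEN].hex()
    return None


if __name__ == "__main__":
    logic = bytes.fromhex("11" * 20)  # placeholder address for the example
    print(minimal_proxy_target(PREFIX + logic + SUFFIX))
    print(minimal_proxy_target(bytes.fromhex("6080604052")))
```

A scanner could run such a check before analysis and, on a match, analyze the bytecode at the returned address instead of the proxy itself.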
6. Future Research Directions
Future prospects for CRPWarner and contract risk analysis tools include the integration of dynamic and hybrid analysis methods to address the static-only gaps, development of more robust logic-contract resolution in proxy deployments, and rapid, collaborative expansion of exploit-pattern definitions. A plausible implication is that as attacker techniques evolve, a continuously adaptive and community-driven approach to feature and rule engineering will be critical for sustaining detection efficacy (Lin et al., 2024).