Insecurity Through Obscurity: Veiled Vulnerabilities in Closed-Source Contracts (2504.13398v2)

Published 18 Apr 2025 in cs.CR

Abstract: Most blockchains cannot hide the binary code of programs (i.e., smart contracts) running on them. To conceal proprietary business logic and to potentially deter attacks, many smart contracts are closed-source and employ layers of obfuscation. However, we demonstrate that such obfuscation can obscure critical vulnerabilities rather than enhance security, a phenomenon we term insecurity through obscurity. To systematically analyze these risks on a large scale, we present SKANF, a novel EVM bytecode analysis tool tailored for closed-source and obfuscated contracts. SKANF combines control-flow deobfuscation, symbolic execution, and concolic execution based on historical transactions to identify and exploit asset management vulnerabilities. Our evaluation on real-world Maximal Extractable Value (MEV) bots reveals that SKANF detects vulnerabilities in 1,030 contracts and successfully generates exploits for 394 of them, with potential losses of $10.6M. Additionally, we uncover 104 real-world MEV bot attacks that collectively resulted in $2.76M in losses.

Summary

The paper introduces skanf, a tool that deobfuscates EVM bytecode by converting indirect jumps into a switch table for clearer control flows.
The paper employs concolic execution with historical transaction seeds to detect adversary-controlled asset transfer vulnerabilities in smart contracts.
The study evaluated 6,554 MEV bot contracts, identifying over 1,000 vulnerable contracts with potential losses exceeding $9.0 million USD.

This paper investigates the security risks associated with closed-source and obfuscated smart contracts on the Ethereum Virtual Machine (EVM), arguing that obfuscation can hide critical vulnerabilities rather than enhance security, a phenomenon termed "insecurity through obscurity." The primary focus is on asset management vulnerabilities, which allow attackers to steal tokens (like ERC-20s) from vulnerable contracts. The research uses Maximal Extractable Value (MEV) bots as a key case study due to their economic significance and common use of obfuscation techniques (2504.13398).

The authors identify three main challenges in analyzing such contracts:

Control-Flow Obfuscation: Techniques like using indirect jumps (where the jump destination is computed at runtime) make it difficult for static analysis tools and decompilers to understand the contract's execution flow.
Lack of Fine-Grained Analysis: Simple pattern matching is insufficient. Detecting asset management vulnerabilities requires deep analysis of how external calls (CALL instructions) are constructed and whether an attacker can manipulate critical parameters like the recipient address or transfer amount, even if parts of the call (like the function selector) are fixed.
Performance Issues: The complex logic often found in MEV bots (involving interactions with DEXs, flash loans, etc.) leads to path explosion in symbolic execution and performance bottlenecks in static analysis.

To address these challenges, the paper introduces skanf, a novel EVM bytecode analysis tool with a three-stage workflow:

Control Flow Deobfuscation:

Problem: Indirect jumps (JUMP/JUMPI with runtime-dependent destinations) hinder analysis.
Solution: skanf leverages the EVM requirement that all valid jump destinations must be marked with a JUMPDEST instruction. It statically identifies all JUMPDEST locations in the bytecode. Then, it instruments the bytecode by replacing indirect jumps with direct jumps to a generated switch table. This table contains conditional checks comparing the runtime destination value against all known JUMPDEST addresses, effectively converting indirect jumps into a series of direct, conditional jumps. This makes the control flow graph explicit and analyzable by subsequent stages and standard tools. The switch table is typically placed at a high bytecode offset (e.g., 0xe000) to avoid collision with existing code.
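The JUMPDEST identification step can be sketched in a few lines. This is a minimal illustration, not skanf's actual implementation; the function name `find_jumpdests` is hypothetical. The one subtlety it shows is that a 0x5b byte inside the immediate data of a PUSH instruction is not a valid destination, so the scan must skip PUSH operands.

```python
def find_jumpdests(bytecode: bytes) -> list[int]:
    """Return the offsets of every JUMPDEST that is a real instruction."""
    JUMPDEST = 0x5B
    PUSH1, PUSH32 = 0x60, 0x7F
    offsets = []
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == JUMPDEST:
            offsets.append(pc)
        if PUSH1 <= op <= PUSH32:
            # PUSHn carries n immediate bytes; skip them so that data
            # bytes equal to 0x5b are not mistaken for JUMPDEST.
            pc += op - PUSH1 + 1
        pc += 1
    return offsets


# PUSH1 0x5b ; JUMPDEST ; PUSH2 0x5b5b ; JUMPDEST ; STOP
code = bytes.fromhex("605b5b615b5b5b00")
print(find_jumpdests(code))  # [2, 6]
```

Only offsets 2 and 6 are reported; the three 0x5b bytes that appear as PUSH operands are ignored.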

Example Transformation: An indirect jump, whose destination (runtime_value) is computed at runtime and sits on top of the stack when JUMP executes, is transformed into code that pushes the switch table offset (0xe000) and jumps there. In simplified form, the switch table then has entries like:

// At 0xe000 (start of switch table)
DUP1 // duplicate runtime_value
PUSH 0x0a00 // known JUMPDEST
EQ
PUSH 0xf000 // jump to cleanup gadget if equal
JUMPI
// ... next check for another JUMPDEST (e.g., 0x0b00) ...

// At 0xf000 (cleanup gadget)
SWAP1 // if original was JUMPI, handle condition
POP // remove original runtime_value
JUMP // jump to the actual destination (e.g., 0x0a00)
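The idea behind the switch table can also be modeled outside the EVM. The sketch below is an illustration under stated assumptions, not the paper's rewriting code: `build_switch_table` and `dispatch` are hypothetical names, the gadget labels are placeholders rather than real offsets, and ending the table with INVALID is an assumption (a jump to a non-JUMPDEST would have failed in the original code anyway).

```python
def build_switch_table(jumpdests):
    """Emit one compare-and-branch entry per known JUMPDEST, as mnemonic text."""
    lines = []
    for dest in jumpdests:
        lines += [
            "DUP1",                              # copy the runtime destination
            f"PUSH2 0x{dest:04x}",               # candidate constant destination
            "EQ",                                # does the runtime value match?
            f"PUSH2 <gadget_for_0x{dest:04x}>",  # placeholder label (assumption)
            "JUMPI",                             # branch to the gadget on a match
        ]
    lines.append("INVALID")  # assumption: no candidate matched, so abort
    return lines


def dispatch(runtime_dest, jumpdests):
    """Model the table at run time: resolve a dynamic value to a constant target."""
    for dest in jumpdests:
        if runtime_dest == dest:
            return dest
    raise ValueError(f"0x{runtime_dest:x} is not a known JUMPDEST")


known = [0x0A00, 0x0B00]
for line in build_switch_table(known):
    print(line)
print(hex(dispatch(0x0B00, known)))
```

Because every branch target in the emitted table is a constant, a static analyzer can enumerate all successors of the former indirect jump, at the cost of one comparison per known JUMPDEST.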

Concolic Execution Powered by Historical Transactions:
- Problem: Pure symbolic execution is often too slow for complex contracts. Generating good seed inputs for concolic execution manually is difficult for closed-source contracts.
- Solution: skanf automatically extracts seed inputs from the target contract's historical on-chain transactions. Successful past transactions often exercise relevant code paths, including those involving external CALLs for asset transfers. These historical transaction details (caller, origin, calldata, value, block number) provide high-quality concrete inputs for concolic execution.
- During concolic execution, skanf performs taint analysis to track which parts of the CALL instruction's parameters (target address, function selector, arguments) are influenced by attacker-controlled inputs (typically originating from calldata).
- A potential vulnerability is flagged if a CALL instruction is reachable, and its critical parameters are either adversary-controllable (tainted) or fixed but risky (e.g., target is WETH address, function is transfer). Particular attention is paid to whether the first argument (recipient address for transfer/approve) can be controlled.
- A fallback to pure symbolic execution is used if concolic execution yields no results. This mode explores paths more broadly but is slower. It attempts different configurations for tx.origin and msg.sender to bypass access controls.
- Identified potential vulnerabilities are validated by simulating the execution path with concrete inputs on a local EVM fork to filter out false positives.
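The taint-based flagging rule described above can be sketched as follows. This is a toy model, not skanf's engine: the `Value` class, the helper names, and the exact predicate in `flag_call` are assumptions made for illustration. The WETH address and the transfer/approve selectors are the standard Ethereum mainnet and ERC-20 values.

```python
from dataclasses import dataclass

WETH = 0xC02AAA39B223FE8D0A0E5C4F27EAD9083C756CC2  # mainnet WETH9 contract
RISKY_TARGETS = {WETH}
RISKY_SELECTORS = {0xA9059CBB, 0x095EA7B3}  # transfer, approve
ADDRESS_MASK = (1 << 160) - 1


@dataclass(frozen=True)
class Value:
    concrete: int          # value observed when replaying the seed transaction
    tainted: bool = False  # True if derived from attacker-controlled calldata


def calldataload(calldata: bytes, offset: int) -> Value:
    """Read a 32-byte word from calldata; the result is always tainted."""
    word = calldata[offset:offset + 32].ljust(32, b"\x00")
    return Value(int.from_bytes(word, "big"), tainted=True)


def bit_and(a: Value, b: Value) -> Value:
    """Taint propagates through arithmetic: tainted if either operand is."""
    return Value(a.concrete & b.concrete, a.tainted or b.tainted)


def flag_call(target: Value, selector: Value, recipient: Value) -> bool:
    """Flag a CALL whose target and selector are controllable or fixed-but-risky
    and whose first argument (the recipient) is attacker-controlled."""
    target_ok = target.tainted or target.concrete in RISKY_TARGETS
    selector_ok = selector.tainted or selector.concrete in RISKY_SELECTORS
    return target_ok and selector_ok and recipient.tainted


# Hypothetical seed calldata: a 4-byte selector followed by one address word.
seed = bytes.fromhex("deadbeef") + (0x1234).to_bytes(32, "big")
recipient = bit_and(calldataload(seed, 4), Value(ADDRESS_MASK))
print(flag_call(Value(WETH), Value(0xA9059CBB), recipient))      # True
print(flag_call(Value(WETH), Value(0xA9059CBB), Value(0x1234)))  # False
```

The second call is not flagged because the recipient is a constant baked into the contract, so an attacker cannot redirect the transfer even though the target and selector are risky.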
Exploit Generation and Validation:
- Goal: Synthesize a concrete transaction that triggers the vulnerability and successfully transfers assets to an attacker address.
- Method: skanf uses the vulnerability information (controllable parameters, required tx.origin if any) from the previous stage. It sets the controllable CALL parameters adversarially (e.g., the target to the WETH contract, the function to transfer, the recipient to the attacker's address, and the amount to the contract's balance). It then uses constraint solving (continuing symbolic/concolic execution after the vulnerable CALL) to find calldata values that satisfy all path conditions required for the transaction to complete successfully (i.e., reach a STOP or RETURN). It assumes tx.origin constraints can be bypassed via phishing.
- Validation: The generated exploit transaction is executed on a local EVM fork simulating the state at the relevant block height. Success is verified by checking for successful transaction completion and the presence of expected Transfer or Approval event logs.
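To make the adversarial parameter setting and the log check concrete, the sketch below builds the inner ERC-20 transfer payload and tests a list of logs for a Transfer event. It is an illustration only: the function names, the attacker address, and the log dictionary layout are assumptions, and the constraint-solving step that produces the outer calldata is not shown. The selector and event topic are the standard ERC-20 values.

```python
TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")  # transfer(address,uint256)
# keccak256("Transfer(address,address,uint256)"), the ERC-20 Transfer event topic
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def encode_transfer(recipient: str, amount: int) -> bytes:
    """ABI-encode transfer(recipient, amount): 4-byte selector plus two 32-byte words."""
    addr = bytes.fromhex(recipient.removeprefix("0x"))
    if len(addr) != 20:
        raise ValueError("recipient must be a 20-byte address")
    return TRANSFER_SELECTOR + addr.rjust(32, b"\x00") + amount.to_bytes(32, "big")


def has_transfer_log(logs) -> bool:
    """Check whether any log's first topic is the ERC-20 Transfer event."""
    return any(log["topics"] and log["topics"][0] == TRANSFER_TOPIC for log in logs)


attacker = "0x" + "11" * 20  # hypothetical attacker address
payload = encode_transfer(attacker, 5 * 10**18)
print(payload.hex())
print(has_transfer_log([{"topics": [TRANSFER_TOPIC]}]))
```

In a full pipeline, a payload like this would be what the vulnerable CALL forwards to the token contract, and the log check would run against the receipt returned by the local fork.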

Evaluation and Findings:

Dataset: 6,554 real-world MEV bot contracts.
Deobfuscation (RQ1): skanf significantly improved code coverage for obfuscated contracts compared to the baseline Gigahorse tool. For 90% of contracts with initial coverage <50%, skanf increased coverage. For 5 out of the top 10 most obfuscated MEV bots, coverage went from <10% to 100%.
Vulnerability Detection (RQ2):
- skanf identified 1,028 vulnerable contracts using concolic execution (vs. 719 using symbolic only).
- It significantly outperformed state-of-the-art tools: Mythril (89 true positives), ETHBMC (0), JACKAL (18).
- skanf generated working exploits for 373 of the 1,028 vulnerable contracts.
- The estimated potential loss from these 373 exploits exceeds $9.0 million USD (based on historical maximum balances).
Real-World Attacks (RQ3):
- The paper discovered 40 real-world attacks ("MEV Phishing"), largely unreported until then, targeting 12 MEV bot contracts and resulting in ~$900,000 in losses.
- These attacks often involved tricking the searcher (victim) into calling a malicious contract (token-based or via refund mechanisms), thereby bypassing tx.origin checks.
- skanf successfully identified vulnerabilities in all 12 victim contracts, suggesting these losses could potentially have been prevented.
- The paper also notes cases where contracts lack access control entirely, possibly reflecting a gas cost vs. security trade-off.
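Why a tx.origin check fails against phishing can be shown with a small model. The classes below are hypothetical stand-ins for contracts, not code from the paper or from any real bot: tx.origin is the account that signed the transaction and stays the same across nested calls, while msg.sender is the immediate caller.

```python
class Bot:
    """Toy MEV bot whose only access control is a tx.origin check."""

    def __init__(self, owner: str, balance: int):
        self.owner = owner
        self.balance = balance

    def withdraw(self, tx_origin: str, msg_sender: str, to: str) -> int:
        if tx_origin != self.owner:  # msg_sender is never checked
            raise PermissionError("tx.origin is not the owner")
        amount, self.balance = self.balance, 0
        return amount  # amount sent to `to`


class MaliciousToken:
    """Toy attacker contract that the owner is tricked into calling."""

    def __init__(self, bot: Bot, attacker: str):
        self.bot = bot
        self.attacker = attacker

    def transfer(self, tx_origin: str) -> int:
        # Nested call: tx.origin is still the victim, msg.sender is this contract.
        return self.bot.withdraw(tx_origin, msg_sender="malicious_token", to=self.attacker)


bot = Bot(owner="searcher", balance=100)

try:
    bot.withdraw(tx_origin="attacker", msg_sender="attacker", to="attacker")
except PermissionError as err:
    print("direct attack blocked:", err)

stolen = MaliciousToken(bot, "attacker").transfer(tx_origin="searcher")
print("stolen via phishing:", stolen)
```

In this model, comparing msg_sender against the owner instead would reject the nested call, since the immediate caller is the malicious contract rather than the searcher.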

Contributions:

A novel EVM bytecode deobfuscation technique using switch table rewriting.
A practical concolic execution approach leveraging historical transactions as automatic seed inputs.
The skanf tool implementing these techniques for detecting asset management vulnerabilities.
Large-scale evaluation demonstrating high vulnerability rates and potential losses ($9M+) in real-world MEV bots.
Discovery and analysis of 40 real-world MEV phishing attacks, highlighting the practical impact of these vulnerabilities.

The authors practiced responsible disclosure for the vulnerabilities found. The work highlights the dangers of relying on obscurity for security in smart contracts and provides a practical tool (skanf) for developers and auditors to proactively identify and mitigate these risks.

Authors (4)

Tweets

https://twitter.com/0xFanZhang/status/1914399074251428242

https://twitter.com/syang2ng/status/1914390769764286723

https://twitter.com/yaish_aviv/status/1914397887150088235

https://twitter.com/syang2ng/status/1914389645359489246

https://twitter.com/YaleACL/status/1914396662077816907

https://twitter.com/FSFG/status/1914436908236537902