Insecurity Through Obscurity: Veiled Vulnerabilities in Closed-Source Contracts (2504.13398v2)
Abstract: Most blockchains cannot hide the binary code of programs (i.e., smart contracts) running on them. To conceal proprietary business logic and to potentially deter attacks, many smart contracts are closed-source and employ layers of obfuscation. However, we demonstrate that such obfuscation can obscure critical vulnerabilities rather than enhance security, a phenomenon we term insecurity through obscurity. To systematically analyze these risks on a large scale, we present SKANF, a novel EVM bytecode analysis tool tailored for closed-source and obfuscated contracts. SKANF combines control-flow deobfuscation, symbolic execution, and concolic execution based on historical transactions to identify and exploit asset management vulnerabilities. Our evaluation on real-world Maximal Extractable Value (MEV) bots reveals that SKANF detects vulnerabilities in 1,030 contracts and successfully generates exploits for 394 of them, with potential losses of \$10.6M. Additionally, we uncover 104 real-world MEV bot attacks that collectively resulted in \$2.76M in losses.
Summary
- The paper introduces skanf, a tool that deobfuscates EVM bytecode by converting indirect jumps into a switch table for clearer control flows.
- The paper employs concolic execution with historical transaction seeds to detect adversary-controlled asset transfer vulnerabilities in smart contracts.
- The study evaluated 6,554 MEV bot contracts, identifying over 1,000 vulnerabilities with potential losses exceeding $9.0 million USD.
This paper investigates the security risks associated with closed-source and obfuscated smart contracts on the Ethereum Virtual Machine (EVM), arguing that obfuscation can hide critical vulnerabilities rather than enhance security – a phenomenon termed "insecurity through obscurity." The primary focus is on asset management vulnerabilities, which allow attackers to steal tokens (like ERC-20s) from vulnerable contracts. The research uses Maximal Extractable Value (MEV) bots as a key case study due to their economic significance and common use of obfuscation techniques (2504.13398).
The authors identify three main challenges in analyzing such contracts:
- Control-Flow Obfuscation: Techniques like using indirect jumps (where the jump destination is computed at runtime) make it difficult for static analysis tools and decompilers to understand the contract's execution flow.
- Lack of Fine-Grained Analysis: Simple pattern matching is insufficient. Detecting asset management vulnerabilities requires deep analysis of how external calls (`CALL` instructions) are constructed and whether an attacker can manipulate critical parameters like the recipient address or transfer amount, even if parts of the call (like the function selector) are fixed.
- Performance Issues: The complex logic often found in MEV bots (involving interactions with DEXs, flash loans, etc.) leads to path explosion in symbolic execution and performance bottlenecks in static analysis.
To address these challenges, the paper introduces skanf, a novel EVM bytecode analysis tool with a three-stage workflow:
- Control Flow Deobfuscation:
- Problem: Indirect jumps (`JUMP`/`JUMPI` with runtime-dependent destinations) hinder analysis.
- Solution: skanf leverages the EVM requirement that all valid jump destinations must be marked with a `JUMPDEST` instruction. It statically identifies all `JUMPDEST` locations in the bytecode. Then, it instruments the bytecode by replacing indirect jumps with direct jumps to a generated switch table. This table contains conditional checks comparing the runtime destination value against all known `JUMPDEST` addresses, effectively converting indirect jumps into a series of direct, conditional jumps. This makes the control flow graph explicit and analyzable by subsequent stages and standard tools. The switch table is typically placed at a high code offset (e.g., 0xe000) to avoid collision with existing code.
Example Transformation: An indirect jump like `PUSH runtime_value; JUMP` is transformed into code that pushes the switch table address (`0xe000`) and jumps there. The switch table then has entries like:

```
// At 0xe000 (start of switch table)
DUP1          // duplicate runtime_value
PUSH 0x0a00   // known JUMPDEST
EQ
PUSH 0xf000   // jump to cleanup gadget if equal
JUMPI
// ... next check for another JUMPDEST (e.g., 0x0b00) ...

// At 0xf000 (cleanup gadget)
SWAP1         // if original was JUMPI, handle condition
POP           // remove original runtime_value
JUMP          // jump to the actual destination (e.g., 0x0a00)
```
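The two static steps of the deobfuscation stage, locating every `JUMPDEST` and emitting one comparison per destination, can be sketched in a few lines of Python. This is a minimal illustration and not skanf's implementation: the function names `find_jumpdests` and `switch_table`, the textual assembly output, and the trailing `INVALID` for an unmatched destination are assumptions made for the example. The opcode values (`JUMPDEST` = 0x5b, `PUSH1`..`PUSH32` = 0x60..0x7f) are standard EVM encodings.

```python
JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F


def find_jumpdests(code: bytes) -> list:
    """Return the offsets of all JUMPDEST opcodes, skipping PUSH immediates."""
    dests = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            dests.append(pc)
        if PUSH1 <= op <= PUSH32:
            # PUSHn carries n bytes of data that must not be read as opcodes.
            pc += op - PUSH1 + 1
        pc += 1
    return dests


def switch_table(dests, gadget=0xF000) -> list:
    """Emit one 'compare and branch' entry per known destination (hypothetical layout)."""
    lines = []
    for dest in dests:
        lines += [
            "DUP1",                    # keep a copy of the runtime destination
            f"PUSH2 {dest:#06x}",      # candidate JUMPDEST offset
            "EQ",
            f"PUSH2 {gadget:#06x}",    # cleanup gadget that performs the real jump
            "JUMPI",
        ]
    # Assumption: an unmatched destination halts, as an invalid jump would.
    lines.append("INVALID")
    return lines


# 0x5b at offset 2 is PUSH1 data, so only offsets 0 and 3 are real JUMPDESTs.
code = bytes.fromhex("5b605b5b00")
table = switch_table(find_jumpdests(code))
```

Skipping push immediates matters because a 0x5b byte inside push data is not a valid jump target, and treating it as one would add spurious edges to the recovered control flow graph.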
- Concolic Execution Powered by Historical Transactions:
- Problem: Pure symbolic execution is often too slow for complex contracts. Generating good seed inputs for concolic execution manually is difficult for closed-source contracts.
- Solution: skanf automatically extracts seed inputs from the target contract's historical on-chain transactions. Successful past transactions often exercise relevant code paths, including those involving external `CALL`s for asset transfers. These historical transaction details (caller, origin, calldata, value, block number) provide high-quality concrete inputs for concolic execution.
- During concolic execution, skanf performs taint analysis to track which parts of the `CALL` instruction's parameters (target address, function selector, arguments) are influenced by attacker-controlled inputs (typically originating from `calldata`).
- A potential vulnerability is flagged if a `CALL` instruction is reachable, and its critical parameters are either adversary-controllable (tainted) or fixed but risky (e.g., target is WETH address, function is `transfer`). Particular attention is paid to whether the first argument (recipient address for `transfer`/`approve`) can be controlled.
- A fallback to pure symbolic execution is used if concolic execution yields no results. This mode explores paths more broadly but is slower. It attempts different configurations for `tx.origin` and `msg.sender` to bypass access controls.
- Identified potential vulnerabilities are validated by simulating the execution path with concrete inputs on a local EVM fork to filter out false positives.
- Exploit Generation and Validation:
- Goal: Synthesize a concrete transaction that triggers the vulnerability and successfully transfers assets to an attacker address.
- Method: skanf uses the vulnerability information (controllable parameters, required `tx.origin` if any) from the previous stage. It sets controllable `CALL` parameters adversarially (e.g., target = WETH, function = `transfer`, `to` = attacker address, `amount` = contract's balance). It then uses constraint solving (continuing symbolic/concolic execution after the vulnerable `CALL`) to find `calldata` values that satisfy all path conditions required for the transaction to complete successfully (i.e., reach a `STOP` or `RETURN`). It assumes `tx.origin` constraints can be bypassed via phishing.
- Validation: The generated exploit transaction is executed on a local EVM fork simulating the state at the relevant block height. Success is verified by checking for successful transaction completion and the presence of expected `Transfer` or `Approval` event logs.
Evaluation and Findings:
- Dataset: 6,554 real-world MEV bot contracts.
- Deobfuscation (RQ1): skanf significantly improved code coverage for obfuscated contracts compared to the baseline Gigahorse tool. For 90% of contracts with initial coverage <50%, skanf increased coverage. For 5 out of the top 10 most obfuscated MEV bots, coverage went from <10% to 100%.
- Vulnerability Detection (RQ2):
- skanf identified 1,028 vulnerable contracts using concolic execution (vs. 719 using symbolic only).
- It significantly outperformed state-of-the-art tools: Mythril (89 true positives), ETHBMC (0), JACKAL (18).
- skanf generated working exploits for 373 of the 1,028 vulnerable contracts.
- The estimated potential loss from these 373 exploits exceeds $9.0 million USD (based on historical maximum balances).
- Real-World Attacks (RQ3):
- The paper discovered 40 largely unreported real-world attacks ("MEV Phishing") targeting 12 MEV bot contracts, resulting in ~$900,000 in losses.
- These attacks often involved tricking the searcher (victim) into calling a malicious contract (token-based or via refund mechanisms), thereby bypassing `tx.origin` checks.
- skanf successfully identified vulnerabilities in all 12 victim contracts, suggesting these losses could potentially have been prevented.
- The paper also notes cases where contracts lack access control entirely, possibly reflecting a gas cost vs. security trade-off.
Contributions:
- A novel EVM bytecode deobfuscation technique using switch table rewriting.
- A practical concolic execution approach leveraging historical transactions as automatic seed inputs.
- The `skanf` tool implementing these techniques for detecting asset management vulnerabilities.
- Large-scale evaluation demonstrating high vulnerability rates and potential losses ($9M+) in real-world MEV bots.
- Discovery and analysis of 40 real-world MEV phishing attacks, highlighting the practical impact of these vulnerabilities.
The authors practiced responsible disclosure for the vulnerabilities found. The work highlights the dangers of relying on obscurity for security in smart contracts and provides a practical tool (`skanf`) for developers and auditors to proactively identify and mitigate these risks.