Why Does My Transaction Fail? A First Look at Failed Transactions on the Solana Blockchain (2504.18055v1)

Published 25 Apr 2025 in cs.SE

Abstract: Solana is an emerging blockchain platform, recognized for its high throughput and low transaction costs, positioning it as a preferred infrastructure for Decentralized Finance (DeFi), Non-Fungible Tokens (NFTs), and other Web 3.0 applications. In the Solana ecosystem, transaction initiators submit various instructions to interact with a diverse range of Solana smart contracts, among which are decentralized exchanges (DEXs) that utilize automated market makers (AMMs), allowing users to trade cryptocurrencies directly on the blockchain without the need for intermediaries. Despite the high throughput and low transaction costs of Solana, the advantages have exposed Solana to bot spamming for financial exploitation, resulting in the prevalence of failed transactions and network congestion. Prior work on Solana has mainly focused on the evaluation of the performance of the Solana blockchain, particularly scalability and transaction throughput, as well as on the improvement of smart contract security, leaving a gap in understanding the characteristics and implications of failed transactions on Solana. To address this gap, we conducted a large-scale empirical study of failed transactions on Solana, using a curated dataset of over 1.5 billion failed transactions across more than 72 million blocks. Specifically, we first characterized the failed transactions in terms of their initiators, failure-triggering programs, and temporal patterns, and compared their block positions and transaction costs with those of successful transactions. We then categorized the failed transactions by the error messages in their error logs, and investigated how specific programs and transaction initiators are associated with these errors...

Summary

  • The paper presents an empirical analysis of failed Solana transactions by categorizing error types and quantifying bot versus human failure rates.
  • It reveals that bot accounts experience a 58.43% failure rate compared to 6.22% for human accounts, underscoring the challenges posed by automated trading.
  • The study finds that 77.95% of failures stem from ten key programs, with AMMs and DEX aggregators being major contributors.

Analysis of Failed Transactions on the Solana Blockchain

The paper "Why Does My Transaction Fail? A First Look at Failed Transactions on the Solana Blockchain" presents an empirical investigation of failed transactions on the Solana blockchain. The research focuses on characterizing failed transactions, identifying error types, and examining programmatic and user-driven factors contributing to transaction failures. This essay analyzes the paper's findings and discusses their implications.

Characteristics of Failed Transactions

The paper provides a detailed macro- and micro-level analysis of failed transactions on Solana. It classifies transaction initiators into bot and human accounts, revealing distinct behaviors and failure rates: bot accounts fail 58.43% of the time, compared with just 6.22% for human accounts. This disparity reflects the prevalence of high-frequency, automated trading among bots, which contributes to network congestion.

The paper also identifies which programs trigger failures: ten programs account for 77.95% of all failed transactions, with AMMs and DEX aggregators as the major contributors. Notably, Raydium Liquidity Pool and Jupiter Aggregator V6 are significant sources of failures, consistent with their central roles in the ecosystem.
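
Both of these breakdowns reduce to simple aggregations over a per-transaction table. A minimal pandas sketch, assuming a hypothetical DataFrame whose `initiator_type`, `program`, and `failed` columns are illustrative rather than taken from the paper:

```python
import pandas as pd

# Hypothetical per-transaction records; column names and values are illustrative.
txs = pd.DataFrame({
    "initiator_type": ["bot", "bot", "human", "human", "bot"],
    "program": ["RaydiumLP", "JupiterV6", "JupiterV6", "SystemProgram", "RaydiumLP"],
    "failed": [True, True, False, False, True],
})

# Failure rate by initiator class (the paper reports 58.43% for bots vs. 6.22% for humans).
print(txs.groupby("initiator_type")["failed"].mean())

# Share of all failures attributable to the ten most failure-prone programs
# (the paper finds ten programs account for 77.95% of failures).
failed = txs[txs["failed"]]
top10_share = failed["program"].value_counts(normalize=True).head(10).sum()
print(f"Top-10 program share of failures: {top10_share:.2%}")
```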

Hourly analysis of transaction volume and failure rates reveals a positive correlation between the two, along with 24-hour cyclical patterns, indicating consistent daily operational rhythms and recurring stress points on the network.
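
A sketch of this kind of temporal analysis, using a synthetic hourly series (in practice the series would be built by resampling per-transaction timestamps; all names here are assumptions):

```python
import numpy as np
import pandas as pd

# Synthetic hourly series standing in for real resampled data, e.g.
# txs.set_index("timestamp").resample("h")["failed"].mean().
rng = pd.date_range("2024-06-01", periods=24 * 7, freq="h", tz="UTC")
hourly = pd.DataFrame({
    "tx_volume": np.random.poisson(50_000, len(rng)),
    "failure_rate": np.random.uniform(0.2, 0.6, len(rng)),
}, index=rng)

# Correlation between hourly volume and failure rate (the paper reports it is positive).
print(hourly["failure_rate"].corr(hourly["tx_volume"]))

# Lag-24 autocorrelation of the failure rate; values near 1 would indicate
# the 24-hour cycle the paper observes (this synthetic data shows none).
print(hourly["failure_rate"].autocorr(lag=24))
```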

Error Types Causing Transaction Failures

The paper groups the observed errors into ten categories, with the top three (price or profit not met, invalid status, and validity expiration) accounting for 84.90% of all errors. These most common errors point to issues in business-logic validation, state management, and infrastructure reliability on Solana.

For instance, the "price or profit not met" error reflects challenges in profit-driven transactions, often associated with slippage and unmet market conditions. Invalid status errors indicate frequent interactions with liquidity pools and the stateful nature of specific programs, while validity expiration errors highlight temporal challenges in transaction processing.
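
Categorization of this kind amounts to bucketing raw error log messages by keyword. A minimal sketch with illustrative patterns (the paper's actual thematic-analysis codebook is not published, so every rule below is an assumption):

```python
import re

# Illustrative keyword rules for the paper's top three error categories;
# the real categorization was done via manual thematic analysis.
CATEGORY_PATTERNS = {
    "price or profit not met": re.compile(r"slippage|minimum.*(out|received)|profit", re.I),
    "invalid status": re.compile(r"invalid (status|state)|not initialized", re.I),
    "validity expiration": re.compile(r"blockhash not found|expired|deadline", re.I),
}

def categorize(log_message: str) -> str:
    """Map a raw error log message to a coarse error category."""
    for category, pattern in CATEGORY_PATTERNS.items():
        if pattern.search(log_message):
            return category
    return "other"

# Paraphrased example of a slippage-style failure message.
print(categorize("Program log: Error: exceeded desired slippage limit"))
# -> price or profit not met
```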

Contributions of Programs and Users to Failures

The paper further investigates the contributions of specific programs and account types to distinct error types. For instance, Jupiter Aggregator V6 predominantly faces "price or profit not met" errors, reflecting frequent arbitrage and trading activity, while Raydium Liquidity Pool suffers primarily from "invalid status" errors, often due to sniper-bot activity.

Analysis of bot and human accounts reveals that bots encounter a broader range of errors compared to human users, aligning with their complex high-frequency interactions. The research highlights how these differing behaviors result in varying error type distributions, providing insights into user and program interaction patterns on Solana.
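
The error-type distribution per initiator class is, in effect, a row-normalized cross-tabulation. A minimal sketch with hypothetical labels:

```python
import pandas as pd

# Hypothetical per-failure records: initiator class and categorized error type.
failures = pd.DataFrame({
    "initiator_type": ["bot", "bot", "bot", "human", "human"],
    "error_type": ["price or profit not met", "invalid status",
                   "validity expiration", "out of funds", "invalid status"],
})

# Each row is one initiator class's mix of error types; bots should show
# a broader spread than humans, per the paper's findings.
dist = pd.crosstab(failures["initiator_type"], failures["error_type"], normalize="index")
print(dist.round(2))
```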

Conclusion

This paper serves as the first large-scale empirical exploration of transaction failures on Solana, offering a granular analysis of error types and contributors. The findings underscore the necessity for ecosystem-wide strategies to mitigate bot activity, enhance system reliability, and improve user experience. Future research can build on these insights to develop automated diagnostic tools, optimize transaction efficiency, and explore cross-platform implications for blockchain networks.


Knowledge Gaps

Below is a concise, actionable list of what remains missing, uncertain, or unexplored in the paper, intended to guide future research.

  • Sampling design and representativeness: The one-day-per-week stratified sampling (53 days) may miss multi-day events and bursty phenomena (e.g., “memecoin mania”), limiting the robustness of temporal analyses and autocorrelation findings. A continuous, full-year dataset or event-aware sampling is needed.
  • RPC-source bias: All data were retrieved via a single Alchemy RPC endpoint; potential node-specific biases, missing logs, or rate-limiting effects were not assessed. Cross-provider, multi-node validation would strengthen reliability.
  • Exclusion of vote transactions: While intentional, not analyzing vote transactions leaves unexplored how consensus traffic interacts with (and potentially amplifies) non-vote failure dynamics and congestion.
  • Unknown initiator accounts dominate: Of 12,712,516 accounts, only 2,162,908 were labeled (803,136 bots; 1,359,772 humans), leaving ~10.55M accounts as “unknown.” Key conclusions about bot vs. human failure rates may not generalize. Methods to reduce “unknown” (e.g., semi-supervised learning, richer features) are needed.
  • Small ground-truth and limited features for bot classification: The RF model was trained on only 200 manually labeled accounts using coarse behavioral features (frequency, volume, intervals). Incorporating richer signals (e.g., wallet fingerprinting, program interaction graphs, timing jitter analysis, tip usage) and larger labeled sets could improve accuracy and reduce misclassification.
  • Attribution of responsibility to the outermost program: Errors were attributed to the outermost program in the call stack, which may misassign failures originating in nested calls (e.g., aggregators delegating to AMMs). A call-graph-aware attribution is needed to apportion blame across composing programs.
  • Error extraction coverage gaps: Approximately 365M failed transactions (~24% of failures) were excluded due to incomplete or missing log messages. Understanding why logs are missing (program logging practices, RPC limits, truncation) and recovering or triangulating error causes is a key gap.
  • Long-tail error types underexplored: Thematic analysis focused on the top 173 high-frequency messages; rare but critical errors (security-sensitive, protocol-level) may be overlooked. Systematic coverage of low-frequency errors is needed.
  • Limited decoding of numeric error codes: Many Solana programs emit “custom program error: 0xNN” without descriptive text. The paper did not systematically map these codes to program-specific error registries/IDLs; doing so could sharpen error categorization and root-cause analysis.
  • Lack of instruction-level failure analysis: The paper does not identify which specific instructions within transactions fail most often (e.g., account initialization vs. swaps vs. transfers), hindering precise remediation targets.
  • No analysis of account-lock conflicts: Solana’s parallel execution relies on account read/write locks; failures due to contention (“account in use,” lock conflicts) are not quantified or linked to program types or times of day.
  • Priority-fee mechanics unexamined: The effects of compute unit price, priority fees, and Jito tips on success/failure, block position, and latency are discussed but not measured. Collecting per-transaction tip amounts and CU price settings is essential to quantify scheduling outcomes (a minimal per-transaction extraction sketch follows this list).
  • Compute unit limit vs. usage: The paper analyzes CUs consumed but not the sender-specified CU limit per transaction, leaving under-/over-provisioning and its relationship to failure rates unresolved.
  • Validator- and leader-level heterogeneity: Failure rates were not stratified by leader schedule, validator client/version, or region. Per-leader/validator analyses could reveal scheduling policies, tip adoption, and congestion hotspots.
  • Pre/post-upgrade causal effects: The observed failure-rate reduction post June 16, 2024 is attributed to upgrades, but no causal inference (e.g., interrupted time series, difference-in-differences) is performed. Formal evaluation of protocol changes is needed.
  • Economic impact of failures: The aggregate SOL spent on failed transactions, user-level loss metrics, and ecosystem-level resource waste (e.g., total consumed CUs by failed txs per block/program) are not quantified.
  • “Expected” vs. “problematic” failures: Protective failures (e.g., slippage checks) are conflated with undesirable failures. A taxonomy distinguishing intentional safeguards from misconfigurations, contention, or bugs would focus mitigation efforts.
  • Market-state linkage: Price/profit-not-met errors likely depend on volatility, liquidity, and pool states; the paper does not correlate error incidence with market conditions, oracle updates, or AMM liquidity dynamics.
  • Front-running/MEV characterization: Bots are said to trigger invalid status errors via front-running/manipulation, but MEV strategies, outcomes, and their share of failures are not measured. Instrumentation to detect and quantify MEV-related failures is an open need.
  • Program-version evolution: Differences between Jupiter V4 vs. V6 and high-failure unnamed programs are reported but not tied to code changes, deployment epochs, or configuration updates. Linking failures to program versions/commits would identify regressions.
  • Wallet/frontend effects: Human failures (e.g., out-of-funds) may be frontend-driven (poor fee estimation, slippage defaults). The paper does not stratify by wallet/application, leaving UI/UX improvement opportunities unquantified.
  • Geographic/temporal drivers of daily cycles: The 24-hour periodicity is documented but not explained (e.g., overlap with major market hours, bot scheduling, validator maintenance windows). Attribution of cyclical drivers remains open.
  • Cross-chain comparison: The paper cites Ethereum’s lower failure rate but does not perform controlled cross-chain analyses (error types, economic conditions, fee markets, scheduler designs), limiting generalization of findings.
  • Reproducibility and data availability: The curated dataset is described but not publicly linked, and data collection/processing pipelines are not fully documented for replication. Publishing datasets, parsers, and classification code would enable external validation.
  • Mitigation evaluation: Recommendations are high-level; there is no empirical assessment of concrete mitigations (e.g., slippage advisories, dynamic CU estimation, program-level prechecks, anti-spam gating) on failure reduction. Experimental pilots or A/B tests are needed.
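
Several of these gaps (missing error logs, per-transaction fees, compute-unit usage, priority-fee settings) begin with per-transaction extraction. A minimal sketch against Solana's public JSON-RPC `getTransaction` method; the endpoint URL and signature are placeholders, and the extracted fields are standard transaction metadata rather than the paper's pipeline:

```python
import requests

RPC_URL = "https://api.mainnet-beta.solana.com"  # placeholder; any Solana RPC endpoint
SIGNATURE = "<base58 transaction signature>"     # placeholder

resp = requests.post(RPC_URL, json={
    "jsonrpc": "2.0",
    "id": 1,
    "method": "getTransaction",
    "params": [SIGNATURE, {"encoding": "json", "maxSupportedTransactionVersion": 0}],
}, timeout=30)
meta = (resp.json().get("result") or {}).get("meta") or {}

record = {
    "failed": meta.get("err") is not None,   # a non-null err marks a failed transaction
    "error": meta.get("err"),                # structured error, e.g. an InstructionError
    "fee_lamports": meta.get("fee"),         # total fee paid, including any priority fee
    "compute_units": meta.get("computeUnitsConsumed"),
    "log_messages": meta.get("logMessages"), # may be missing/truncated, as the paper notes
}
print(record)
```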