
Blockchain is Watching You: Profiling and Deanonymizing Ethereum Users (2005.14051v2)

Published 28 May 2020 in cs.CR and cs.CY

Abstract: Ethereum is the largest public blockchain by usage. It applies an account-based model, which is inferior to Bitcoin's unspent transaction output model from a privacy perspective. Due to its privacy shortcomings, several privacy-enhancing overlays have recently been deployed on Ethereum, such as non-custodial, trustless coin mixers and confidential transactions. In our privacy analysis of Ethereum's account-based model, we describe several patterns that characterize only a limited set of users and successfully apply these quasi-identifiers in address deanonymization tasks. Using Ethereum Name Service identifiers as ground truth information, we quantitatively compare algorithms in a recent branch of machine learning, the so-called graph representation learning, as well as time-of-day activity and transaction fee based user profiling techniques. As an application, we rigorously assess the privacy guarantees of the Tornado Cash coin mixer by discovering strong heuristics to link the mixing parties. To the best of our knowledge, we are the first to propose and implement Ethereum user profiling techniques based on quasi-identifiers. Finally, we describe a malicious value-fingerprinting attack, a variant of the Danaan-gift attack, applicable to the confidential transaction overlays on Ethereum. By incorporating user activity statistics from our data set, we estimate the success probability of such an attack.

Citations (67)

Summary

  • The paper identifies quasi-identifiers such as address reuse and transaction patterns to profile and deanonymize Ethereum users using graph representation learning techniques.
  • Applying graph representation learning methods like Diff2Vec and Role2Vec significantly enhances the effectiveness of Ethereum user deanonymization tasks.
  • Analysis reveals vulnerabilities in privacy tools like Tornado Cash and user behavior, proposing practical steps like randomized mixing and limiting address reuse to improve privacy.

An Analysis of Ethereum Privacy Mechanisms: Profiling and Deanonymizing Users

The paper "Blockchain is Watching You: Profiling and Deanonymizing Ethereum Users" examines privacy weaknesses in the Ethereum blockchain. Unlike Bitcoin's UTXO model, Ethereum uses an account-based model, which the authors argue is weaker from a privacy perspective. The paper studies how users can be profiled and deanonymized under this model, combining empirical analysis with methods for quantifying the privacy risks Ethereum users face, with a particular focus on non-custodial mixing services and user behavior patterns.

Key Contributions

  1. Quasi-identifiers for Deanonymization: The authors identify quasi-identifiers for Ethereum users based on address reuse, time-of-day activity, transaction fees, and transaction graph analysis. The paper then assesses how much user information these patterns disclose and how they can be exploited in deanonymization tasks.
  2. Graph Representation Learning: The authors apply graph representation learning to the transaction graph and quantitatively compare several algorithms on deanonymization tasks, using Ethereum Name Service identifiers as ground truth. They report that methods such as Diff2Vec and Role2Vec considerably improve deanonymization performance.
  3. Privacy Evaluations of Tornado Cash: The paper evaluates the Tornado Cash coin mixer, which utilizes zkSNARKs to provide privacy for Ethereum users. While this tool is an attempt to address privacy issues, the authors identify vulnerabilities stemming from user behavior, such as insufficient randomness in mixing intervals and the reuse of withdrawal addresses.
  4. Proposed Ethereum Profiling Techniques: The paper introduces profiling techniques, including time-of-day and gas price modeling, that can link multiple Ethereum addresses to a single user. An empirical study of nearly 3300 Ethereum addresses demonstrates the viability of these techniques.
  5. Danaan-Gift Attack as a Value Fingerprinting Strategy: The authors describe a value-fingerprinting attack, a variant of the Danaan-gift attack, that targets confidential transaction overlays on Ethereum such as the AZTEC protocol. The attack lets an adversary track transaction trails despite these privacy-enhancing tools, and the authors estimate its success probability from user activity statistics in their data set.
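
To make the time-of-day quasi-identifier from item 4 concrete, the following is a minimal sketch of how such a profile could be built and compared. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 24-bin UTC histogram, and the use of cosine similarity as the comparison measure are all choices made here for the example.

```python
import math
from collections import Counter
from datetime import datetime, timezone


def hourly_profile(timestamps):
    """Return a 24-bin normalized histogram of transaction hours (UTC).

    `timestamps` is an iterable of Unix timestamps in seconds.
    An address with no transactions gets an all-zero profile.
    """
    counts = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 24
    return [counts.get(hour, 0) / total for hour in range(24)]


def cosine_similarity(p, q):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def rank_candidates(target_timestamps, candidates):
    """Rank candidate addresses by similarity to the target's hourly profile.

    `candidates` maps an address string to its list of Unix timestamps.
    Cosine similarity is an assumption of this sketch, not the paper's metric.
    """
    target = hourly_profile(target_timestamps)
    scored = [
        (address, cosine_similarity(target, hourly_profile(stamps)))
        for address, stamps in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_candidates` with one address's timestamps and a dictionary of candidate addresses returns the candidates ordered from most to least similar activity pattern. On its own a daily activity profile is a weak signal; the paper treats it as one quasi-identifier to be combined with others, such as transaction fees.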

Practical and Theoretical Implications

On the theoretical side, the results challenge the presumed privacy of blockchain transactions and call for rethinking how privacy is understood and implemented on account-based models like Ethereum. On the practical side, they urge service providers and users to reconsider how Ethereum wallets and mixing services are designed and used. Because the identified weaknesses stem from current user behavior and mixing strategies, modifications such as enforcing randomized mixing intervals and limiting address reuse could significantly improve privacy.
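
The two behavioral issues named above can be sketched in a few lines. The example below is hypothetical: `MixerEvent`, `linked_by_address_reuse`, and `random_delay_seconds` are names introduced here, the event records are assumed to have been extracted from mixer contract logs by some other means, and a uniform delay is only one possible way to randomize the interval between deposit and withdrawal. It is not a description of how Tornado Cash or the paper's tooling works.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class MixerEvent:
    """A deposit or withdrawal observed at a mixer (hypothetical record type)."""
    address: str    # depositor address, or recipient address for a withdrawal
    timestamp: int  # Unix time in seconds


def linked_by_address_reuse(deposits, withdrawals):
    """Return withdrawals whose recipient address also appears as a depositor.

    Such reuse links the two sides of the mix trivially, regardless of the
    cryptography used by the mixer. Addresses are compared case-insensitively.
    """
    depositors = {event.address.lower() for event in deposits}
    return [w for w in withdrawals if w.address.lower() in depositors]


def random_delay_seconds(min_seconds, max_seconds):
    """Sample a waiting time uniformly from [min_seconds, max_seconds].

    Uses the operating system's CSPRNG via `secrets`. The uniform
    distribution is an assumption of this sketch.
    """
    if min_seconds > max_seconds:
        raise ValueError("min_seconds must not exceed max_seconds")
    return min_seconds + secrets.randbelow(max_seconds - min_seconds + 1)
```

A wallet could, for instance, call `random_delay_seconds(3600, 86400)` before scheduling a withdrawal and refuse to withdraw to any address flagged by `linked_by_address_reuse`. Whether a given delay range is sufficient depends on how many other users are active in the same mixer during that window.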

Future Directions

Future work could examine additional layers of privacy risk, such as the impact of off-chain data or network-level metadata collection. Combining on-chain analysis with off-chain metadata is a difficult but necessary direction for improving Ethereum's privacy mechanisms. Wallet software could also be revised to automate good anonymity practices, giving users more direct control over their privacy.

Conclusion

This paper revisits the privacy assumptions of the Ethereum blockchain and shows how much user activity on it can reveal. By combining empirical data analysis with graph-based machine learning models, it clarifies where current privacy mechanisms fall short and how they might adapt as the blockchain ecosystem evolves.
