A Longitudinal Analysis of Misinformation, Polarization, and Toxicity on Bluesky Following Its Public Launch
This paper investigates the impact of Bluesky's transition from an invite-only model to a publicly accessible social media platform. Using a longitudinal approach, it analyzes user interactions, language use, network structure, political leanings, misinformation spread, and toxicity, offering insight into the dynamics of a decentralized social media environment.
Methodology
Data were collected through Bluesky's Firehose endpoint, which streams user activities in real time. The study period spanned January 9 to March 4, 2024, covering both the pre- and post-public-launch phases. Language classification, community detection via the Louvain algorithm, and cross-referencing of shared links against the NewsGuard and Media Bias/Fact Check databases formed the backbone of the analyses; toxicity was assessed with the multilingual Detoxify model.
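To make the pipeline concrete, here is a minimal sketch of these analysis steps under stated assumptions: the input files and column names are hypothetical, `credibility.csv` stands in for the licensed NewsGuard / Media Bias/Fact Check data, the Louvain step uses the networkx implementation rather than whatever the authors used, and the 60-point "low credibility" cutoff follows common NewsGuard practice rather than the paper itself.

```python
# Minimal sketch of the analysis steps described above (assumed file names,
# column names, and credibility threshold; not the authors' actual code).
import pandas as pd
import networkx as nx
from urllib.parse import urlparse
from detoxify import Detoxify

# Posts and reposts captured from the Firehose (hypothetical schema).
posts = pd.read_csv("posts.csv")        # columns: author_did, text, url
reposts = pd.read_csv("reposts.csv")    # columns: reposter_did, author_did

# Community detection with the Louvain algorithm on the repost network.
G = nx.DiGraph()
G.add_edges_from(zip(reposts["reposter_did"], reposts["author_did"]))
communities = nx.community.louvain_communities(G.to_undirected(), seed=42)
top5 = sorted(communities, key=len, reverse=True)[:5]

# Multilingual toxicity scoring with Detoxify on a sample of post texts.
sample = posts["text"].dropna().sample(1000, random_state=0).tolist()
toxicity = Detoxify("multilingual").predict(sample)["toxicity"]

# Cross-referencing shared domains against a credibility list.
# "credibility.csv" is a placeholder for the licensed NewsGuard / MBFC data;
# scores below 60 are treated as low-credibility (a common convention).
ratings = pd.read_csv("credibility.csv").set_index("domain")["score"]
domains = posts["url"].dropna().map(lambda u: urlparse(u).netloc.removeprefix("www."))
low_cred_share = (ratings.reindex(domains).dropna() < 60).mean()

print(f"{len(communities)} communities; largest has {len(top5[0])} accounts")
print(f"mean toxicity: {sum(toxicity) / len(toxicity):.3f}")
print(f"share of rated links to low-credibility domains: {low_cred_share:.2%}")
```

Running Louvain on the undirected projection of the repost graph is a simplification: modularity-based community detection is most commonly applied to undirected, weighted networks, and the paper does not specify how directionality was handled.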
Key Findings
- User and Network Dynamics: The platform's public opening on February 6, 2024, brought a surge of new users and interactions. Original posts dominated user activity relative to reposts, contrary to patterns typical of centralized social media platforms. The density of the follower network declined slightly after the launch, yet both the size of the largest strongly connected component and the average degree doubled, suggesting increased user connectivity (see the network-metrics sketch after this list).
- Language Distribution: Japanese and English were the predominant languages. Japanese-language activity rose sharply after the public launch, suggesting that decentralized platforms such as Bluesky can hold particular appeal for specific linguistic communities.
- Political Leanings: The platform's user base predominantly leaned left politically, as determined by the political bias of shared domains and interactions within the largest detected communities.
- Misinformation Trends: Low-credibility content was negligible, and high-credibility domains were shared far more often. Despite this small share, superspreaders of low-credibility content were identified, a pattern consistent with prior findings on similar platforms. Overall, few users were involved in sharing such content, highlighting the relative reliability of information circulating on Bluesky.
- Toxicity and Moderation: Bluesky maintained low toxicity levels across language groups, though English-speaking users showed slightly higher levels. Only 0.5% of users were subject to moderation actions, with some accounts flagged or removed for policy violations, reflecting active moderation despite the platform's decentralized nature.
- Community Insights: The five largest reposting communities spanned a wide range of interests, from art-centric to politically oriented discussions, and were segmented primarily by language. Notably, toxicity was higher in communities where English predominated.
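As a companion to the network-dynamics point above, the sketch below shows how the reported follower-network statistics (density, size of the largest strongly connected component, average degree) could be computed with networkx; the edge-list file and its columns are assumptions, not artifacts from the paper.

```python
# Sketch of the follower-network metrics mentioned above (assumed input schema).
import pandas as pd
import networkx as nx

follows = pd.read_csv("follows.csv")    # columns: follower_did, followee_did (assumed)
G = nx.DiGraph()
G.add_edges_from(zip(follows["follower_did"], follows["followee_did"]))

density = nx.density(G)
largest_scc = max(nx.strongly_connected_components(G), key=len)
avg_degree = sum(deg for _, deg in G.degree()) / G.number_of_nodes()

print(f"density={density:.2e}  |largest SCC|={len(largest_scc)}  avg degree={avg_degree:.2f}")
```

Computing these metrics separately for the pre- and post-launch snapshots of the follower graph would reproduce the kind of before/after comparison reported above.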
Implications and Future Directions
This paper offers critical insights into decentralized social media dynamics and suggests that Bluesky's model encourages original content creation while fostering an environment with relatively low toxicity. The research underscores the platform's potential role in political discourse, given its left-leaning user base.
Practically, these findings provide a foundation for enhancing content moderation strategies within decentralized platforms. The presence of superspreaders and the spread of low-credibility content, albeit limited, highlight areas for improved vigilance. The paper suggests decentralized networks can be managed effectively with the right moderation tools, although they are not entirely immune to manipulative behaviors.
Theoretically, the dynamics observed here contribute to a broader understanding of social networks, the impact of decentralization, and online behavior patterns. Future work should examine longer-term trends to verify whether these initial findings hold and explore how moderation strategies scale. Additionally, studying user retention and activity in languages not covered by this study could provide a more holistic view of how decentralization shapes online interactions.
Ultimately, this research offers a nuanced perspective on decentralized social media, presenting empirical evidence on user behavior and content dynamics—key considerations for the future of online interactions and the development of safe, user-controlled digital spaces.