Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash 97 tok/s
Gemini 2.5 Pro 58 tok/s Pro
GPT-5 Medium 38 tok/s
GPT-5 High 37 tok/s Pro
GPT-4o 101 tok/s
GPT OSS 120B 466 tok/s Pro
Kimi K2 243 tok/s Pro
2000 character limit reached

BEYONDWORDS is All You Need: Agentic Generative AI based Social Media Themes Extractor (2503.01880v1)

Published 26 Feb 2025 in cs.CL, cs.AI, and cs.SI

Abstract: Thematic analysis of social media posts provides a major understanding of public discourse, yet traditional methods often struggle to capture the complexity and nuance of unstructured, large-scale text data. This study introduces a novel methodology for thematic analysis that integrates tweet embeddings from pre-trained LLMs, dimensionality reduction using and matrix factorization, and generative AI to identify and refine latent themes. Our approach clusters compressed tweet representations and employs generative AI to extract and articulate themes through an agentic Chain of Thought (CoT) prompting, with a secondary LLM for quality assurance. This methodology is applied to tweets from the autistic community, a group that increasingly uses social media to discuss their experiences and challenges. By automating the thematic extraction process, the aim is to uncover key insights while maintaining the richness of the original discourse. This autism case study demonstrates the utility of the proposed approach in improving thematic analysis of social media data, offering a scalable and adaptable framework that can be applied to diverse contexts. The results highlight the potential of combining machine learning and Generative AI to enhance the depth and accuracy of theme identification in online communities.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.