Who Started It? Identifying Root Sources in Textual Conversation Threads (1809.03648v2)
Abstract: In textual conversation threads, as found on many popular social media platforms, each user comment either starts a new thread of discussion or replies to a previous comment. An individual who makes an original comment, termed the "root source", is a topic initiator or even an information source, and identifying such individuals is of particular interest. The reply structure of comments is not always available (e.g., during the proliferation of a news event), so identifying root sources is a nontrivial task. In this paper, we develop a generative model based on marked multivariate Hawkes processes and introduce a novel concept, "root source probability", to quantify the uncertainty in attributing possible root sources to each comment. We then derive a dynamic-programming-based algorithm to compute root source probabilities efficiently. Experiments on synthetic and real-world data show that our method identifies root sources that match ground truth and human intuition.
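
To give a feel for what a root source probability could look like, here is a minimal sketch under strong simplifying assumptions. It is not the paper's algorithm. It assumes a univariate, unmarked Hawkes process with an exponential kernel, whereas the abstract describes a marked multivariate model. The function name `root_source_probabilities` and the parameters `mu` (background rate), `alpha` (branching ratio), and `beta` (decay rate) are hypothetical choices made for illustration. The sketch relies on the standard branching view of a Hawkes process: each comment is either spontaneous (its own root) or triggered by an earlier comment, in which case it inherits that comment's root.

```python
import math


def root_source_probabilities(times, mu, alpha, beta):
    """Return R where R[i][k] approximates P(root of event i is event k).

    Assumptions (illustrative, not from the paper): `times` is sorted in
    ascending order, the background rate is `mu`, and the triggering kernel
    is alpha * beta * exp(-beta * dt).
    """
    n = len(times)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Contribution of each earlier event j to the intensity at times[i].
        weights = [
            alpha * beta * math.exp(-beta * (times[i] - times[j]))
            for j in range(i)
        ]
        total = mu + sum(weights)
        # Event i is its own root if it came from the background rate.
        R[i][i] = mu / total
        # Otherwise it inherits the root of its parent j; reusing the
        # already computed rows R[j] is the dynamic-programming step.
        for j, w in enumerate(weights):
            p_parent = w / total
            for k in range(j + 1):
                R[i][k] += p_parent * R[j][k]
    return R
```

Each row of the result is a probability distribution over candidate roots, so `R[i]` quantifies the uncertainty in attributing a root source to comment `i`. The double loop over parents and roots makes this sketch cubic in the number of comments, which is fine for illustration only.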