Papers
Topics
Authors
Recent
Search
2000 character limit reached

Out of vocabulary words decrease, running texts prevail and hashtags coalesce: Twitter as an evolving sociolinguistic system

Published 17 Sep 2015 in cs.SI | (1509.05096v1)

Abstract: Twitter is one of the most popular social media. Due to the ease of availability of data, Twitter is used significantly for research purposes. Twitter is known to evolve in many aspects from what it was at its birth; nevertheless, how it evolved its own linguistic style is still relatively unknown. In this paper, we study the evolution of various sociolinguistic aspects of Twitter over large time scales. To the best of our knowledge, this is the first comprehensive study on the evolution of such aspects of this OSN. We performed quantitative analysis both on the word level as well as on the hashtags since it is perhaps one of the most important linguistic units of this social media. We studied the (in)formality aspects of the linguistic styles in Twitter and find that it is neither fully formal nor completely informal; while on one hand, we observe that Out-Of-Vocabulary words are decreasing over time (pointing to a formal style), on the other hand it is quite evident that whitespace usage is getting reduced with a huge prevalence of running texts (pointing to an informal style). We also analyze and propose quantitative reasons for repetition and coalescing of hashtags in Twitter. We believe that such phenomena may be strongly tied to different evolutionary aspects of human languages.

Citations (7)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.