A Method for Estimating Individual Socioeconomic Status of Twitter Users (arXiv:2203.11636v2)
Abstract: The rise of social media has opened countless opportunities to explore social science questions with new data and methods. However, research on socioeconomic inequality remains constrained by limited individual-level socioeconomic status (SES) measures in digital trace data. Following Bourdieu, we argue that the commercial and entertainment accounts Twitter users follow reflect their economic and cultural capital. Adapting a political science method for inferring political ideology, we use correspondence analysis to estimate the SES of 3,482,652 Twitter users who follow the accounts of 339 brands in the United States. We validate our estimates with data from the Facebook Marketing API, self-reported job titles on users' Twitter profiles, and a small survey sample. The results show reasonable correlations with the standard proxies for SES, alongside much weaker or non-significant correlations with other demographic variables. The proposed method opens new opportunities for innovative social research on inequality on Twitter and similar online platforms.
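The core computation named in the abstract, correspondence analysis of a user-by-brand follow matrix, can be illustrated with a minimal sketch. This is generic textbook correspondence analysis on toy data, not the authors' pipeline: the follow matrix and the brand names are hypothetical, only the first dimension is extracted (by power iteration, to avoid external dependencies), and whether that dimension tracks SES in real data is what the validation steps in the abstract are meant to establish.

```python
import math

def correspondence_analysis_dim1(matrix, iters=500):
    """First correspondence-analysis dimension of a non-negative count matrix.

    Returns (row_scores, col_scores, sigma): principal coordinates for rows
    and columns, plus the leading singular value. Assumes every row and every
    column has at least one non-zero entry and that rows and columns are not
    statistically independent (otherwise the residual matrix is all zeros).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    total = float(sum(sum(row) for row in matrix))
    # Row and column masses (marginal proportions).
    r = [sum(row) / total for row in matrix]
    c = [sum(matrix[i][j] for i in range(n_rows)) / total for j in range(n_cols)]
    # Standardized residuals: (observed - expected) / sqrt(expected).
    S = [[(matrix[i][j] / total - r[i] * c[j]) / math.sqrt(r[i] * c[j])
          for j in range(n_cols)] for i in range(n_rows)]

    def times_S(v):
        # Matrix-vector product S v (one value per row).
        return [sum(S[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]

    # Power iteration on S^T S for the leading right singular vector.
    v = [float(j + 1) for j in range(n_cols)]  # arbitrary deterministic start
    for _ in range(iters):
        u = times_S(v)
        w = [sum(S[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    u = times_S(v)                           # equals sigma * left singular vector
    sigma = math.sqrt(sum(x * x for x in u))
    row_scores = [u[i] / math.sqrt(r[i]) for i in range(n_rows)]
    col_scores = [sigma * v[j] / math.sqrt(c[j]) for j in range(n_cols)]
    return row_scores, col_scores, sigma

# Hypothetical toy data: rows are users, columns are brands, 1 = user follows brand.
brands = ["brand_a", "brand_b", "brand_c", "brand_d"]
follows = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]

user_scores, brand_scores, sigma = correspondence_analysis_dim1(follows)
for name, score in zip(brands, brand_scores):
    print(f"{name}: {score:+.3f}")
for idx, score in enumerate(user_scores):
    print(f"user_{idx}: {score:+.3f}")
```

In this toy matrix the first dimension places users 0 to 2 on one side and users 3 to 5 on the other, mirroring their follow patterns. The sign of the axis is arbitrary, so its direction has to be fixed by external information, for example brands with known audience profiles.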
Authors: Yuanmo He, Milena Tsvetkova