
From Division to Unity: A Large-Scale Study on the Emergence of Computational Social Science, 1990-2021

Published 11 Dec 2024 in cs.CY and cs.DL | (2412.08087v2)

Abstract: We present a comprehensive study on the emergence of Computational Social Science (CSS) - an interdisciplinary field leveraging computational methods to address social science questions - and its impact on adjacent social sciences. We trained a robust CSS classifier using papers from CSS-focused venues and applied it to 11 million papers spanning 1990 to 2021. Our analysis yielded three key findings. First, there were two critical inflections in the rise of CSS. The first occurred around 2005 when psychology, politics, and sociology began engaging with CSS. The second emerged in approximately 2014 when economics finally joined the trend. Sociology is currently the most engaged with CSS. Second, using the density of yearly knowledge embeddings constructed by advanced transformer models, we observed that CSS initially lacked a cohesive identity. From the early 2000s to 2014, however, it began to form a distinct cluster, creating boundaries between CSS and other social sciences, particularly in politics and sociology. After 2014, these boundaries faded, and CSS increasingly blended with the social sciences. Third, shared data-driven methods homogenized CSS papers across disciplines, with politics and economics showing the most alignment due to the combined influence of CSS and causal identification. Nevertheless, non-CSS papers in sociology, psychology, and politics became more divergent. Taken together, these findings highlight the dynamics of division and unity as new disciplines emerge within existing knowledge landscapes. A live demo of CSS evolution can be found in https://evolution-css.netlify.app/

Summary

  • The paper introduces a novel ensemble classifier using Word2Vec embeddings to analyze 11 million papers over 32 years.
  • It identifies two critical growth phases, with rapid diffusion post-2005 and marked unity post-2014 across multiple social science disciplines.
  • The study provides actionable insights into CSS's integration patterns, highlighting its transformative impact on sociology, political science, and economics.

Emergence of Computational Social Science: A Large-Scale Analysis

Introduction

The paper "From Division to Unity: A Large-Scale Study on the Emergence of Computational Social Science, 1990-2021" analyzes the rise of Computational Social Science (CSS) and its impact on the broader social sciences. The study tracks the diffusion of CSS from 1990 onward and finds that its influence grew after about 2005 and more strongly after about 2014. These inflection points coincide with the broader adoption of data-driven methods in social science research. The analysis covers 11 million papers across 32 years and measures how CSS has been integrated into psychology, sociology, economics, and political science.

Methodology

The authors train a CSS classifier on papers from CSS-focused venues and apply it to a dataset of 11 million papers sourced from the Microsoft Academic Graph (MAG). Both CSS and non-CSS papers are curated across different periods and disciplines.

To identify CSS papers, the authors used the "Awesome Computational Social Science" list to establish ground-truth labels. Word embeddings generated with Word2Vec models then served as input to the classifier. The classifier itself is an ensemble combining linear models (Support Vector Machine, Logistic Regression) and non-linear models (Random Forest, Gradient Boosting Decision Tree), and it reaches a ROC-AUC of approximately 0.9958.
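The two ideas in this paragraph, combining several classifiers and scoring the result with ROC-AUC, can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' implementation: the summary does not say how the four models are combined, so averaging their predicted probabilities (soft voting) is an assumption, and the scores, labels, and the names `soft_vote` and `roc_auc` are hypothetical.

```python
# Illustrative sketch only: toy scores, not the paper's models or data.

def soft_vote(prob_lists):
    """Average per-paper CSS probabilities across several classifiers."""
    n_models = len(prob_lists)
    return [sum(column) / n_models for column in zip(*prob_lists)]


def roc_auc(labels, scores):
    """ROC-AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = CSS paper) and per-model probabilities for six papers.
labels = [1, 1, 1, 0, 0, 0]
svm_p = [0.9, 0.6, 0.4, 0.5, 0.2, 0.1]
logreg_p = [0.8, 0.7, 0.5, 0.3, 0.4, 0.2]
rf_p = [0.7, 0.8, 0.6, 0.4, 0.3, 0.1]
gbdt_p = [0.9, 0.5, 0.7, 0.6, 0.2, 0.2]

ensemble_p = soft_vote([svm_p, logreg_p, rf_p, gbdt_p])
svm_auc = roc_auc(labels, svm_p)
ensemble_auc = roc_auc(labels, ensemble_p)
print(f"SVM alone: {svm_auc:.3f}, ensemble: {ensemble_auc:.3f}")
```

In this toy case the single model misranks one positive/negative pair while the averaged scores separate the classes, which is the usual motivation for an ensemble. ROC-AUC measures ranking quality over all thresholds, so it is a different quantity from precision at a fixed threshold.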

Results

The Growth of CSS

The paper identifies two critical inflection points in the growth of CSS, around 2005 and around 2014. Psychology, political science, and sociology drove the early growth, whereas economics caught up post-2014 due to the proliferation of machine learning and AI techniques.

Figure 1: CSS in the embedding space. Panel (a) illustrates the cosine similarity between the central embeddings of CSS papers and non-CSS papers across different years and fields. Panel (b) depicts the dynamics of the normalized density of CSS papers over time.
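The quantity described for panel (a), the cosine similarity between the central embeddings of two groups of papers, can be sketched as follows. This is a hedged illustration that takes "central embedding" to mean the per-dimension mean vector, which is an assumption. The 2-D vectors, the years, and the names `centroid` and `cosine` are hypothetical stand-ins for the paper's actual embeddings.

```python
import math


def centroid(vectors):
    """Mean vector of a group of embeddings (assumed 'central embedding')."""
    n = len(vectors)
    return [sum(dim) / n for dim in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 2-D embeddings keyed by year; real embeddings would have hundreds of dimensions.
css_by_year = {
    2005: [[1.0, 0.0], [1.0, 0.2]],
    2020: [[1.0, 1.0], [0.8, 1.0]],
}
non_css_by_year = {
    2005: [[0.0, 1.0], [0.2, 1.0]],
    2020: [[1.0, 0.8], [1.0, 1.0]],
}

similarity = {
    year: cosine(centroid(css_by_year[year]), centroid(non_css_by_year[year]))
    for year in sorted(css_by_year)
}
for year, value in similarity.items():
    print(year, round(value, 3))
```

A rising value over time would correspond to CSS and non-CSS papers moving closer together in the embedding space, and a falling value to a sharper boundary between them.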

A striking finding is the transition in the identity of CSS, which initially lacked cohesion. From the early 2000s to 2014, it formed a distinct cluster within the scientific landscape. After 2014 these boundaries began to fade, and CSS blended more seamlessly into adjacent social sciences.

Evolution Dynamics

The study employs a compelling visual approach using SPECTER2 to track CSS's embedding trajectory. During the early stages, CSS existed without distinct identity boundaries but began exhibiting unique characteristics, forming a discernible cluster in the embedding space by 2014. The authors observe a notable trend towards unity post-2014, with a significant diffusion into non-CSS domains, as evidenced by increased similarity indices and clustering measures.
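One simple way to quantify whether a set of papers forms a cohesive cluster is the mean pairwise cosine similarity of their embeddings. The sketch below uses that as a stand-in. It is not the density or clustering measure used in the paper, which this summary does not specify, and the toy vectors and the name `cohesion` are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def cohesion(vectors):
    """Mean cosine similarity over all unordered pairs (a proxy for cluster tightness)."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy embeddings: one tightly grouped set and one scattered set.
tight = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0]]
scattered = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

tight_score = cohesion(tight)
scattered_score = cohesion(scattered)
print(round(tight_score, 3), round(scattered_score, 3))
```

Computed per year over CSS papers, a score like this would rise while a field consolidates into a cluster. Comparing it with the between-group similarity would indicate whether the cluster is also separating from, or merging with, its neighbours.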

Implications and Future Directions

The findings underscore CSS as both a unifying and divisive force in social sciences. While data-driven methodologies introduced collective alignment among CSS papers, non-CSS domains appeared increasingly distinct, emphasizing efforts within fields like sociology to maintain unique methodological traditions amidst CSS's rising prominence.

Future research should address the exclusion of communication science from this study, given its notable convergence with political science in recent years. Incorporating post-2021 data, particularly data reflecting GenAI advancements, could yield further insight into CSS's evolving paradigms. Analyzing how CSS is received in traditional outlets, and the demographics of its authors, could also provide deeper context.

Conclusion

The paper presents a thorough quantitative evaluation of CSS's journey from a nascent collective of computational methods to a comprehensive interdisciplinary force. The dual role of CSS in blending and distinguishing disciplines signifies its complex legacy and pivotal role in shaping future social science inquiries. The robust methodological framework and exhaustive data analysis offer a critical reference point for understanding the intricate dynamics of knowledge diffusion and the evolution of scientific fields.
