How the Taiwanese Do China Studies: Applications of Text Mining (1801.00912v3)
Abstract: With the rapid evolution of the cross-strait situation, "Mainland China" as a subject of social science research has recently prompted calls for "Rethinking China Study" among the intelligentsia. This essay applies an automatic content analysis tool (CATAR) to the journal "Mainland China Studies" (1998-2015) to observe research trends by clustering the text of each paper's title and abstract. The 473 articles published by the journal were clustered into seven salient topics. The number of publications per topic over time (measured both as "volume of publications" and as "percentage of publications") shows that the journal has two major topics, while the other topics vary widely over time. The contributions of this study are: 1. Each "independent" study can be grouped into a meaningful topic; a small-scale experiment verified that this topic clustering is feasible. 2. The essay reveals the salient research topics and their trends for the Taiwanese journal "Mainland China Studies". 3. Various topical keywords were identified, providing easy access to past studies. 4. The yearly trends of the identified topics could be viewed as a signature of future research directions.
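The two trend measures named in the abstract, "volume of publications" and "percentage of publications" per topic and year, can be illustrated with a minimal sketch. It assumes each article has already been assigned a year and a topic label by some clustering step. The records, the topic names, and the function `topic_trends` are hypothetical illustrations, not data from the journal and not part of CATAR.

```python
from collections import Counter, defaultdict

# Hypothetical (year, topic) pairs, one per article; not data from the journal.
articles = [
    (1998, "topic A"),
    (1998, "topic B"),
    (1998, "topic A"),
    (1999, "topic A"),
    (1999, "topic C"),
]


def topic_trends(records):
    """Return {year: {topic: (count, share_percent)}} for (year, topic) records."""
    # Count articles per topic within each year (the "volume of publications").
    per_year = defaultdict(Counter)
    for year, topic in records:
        per_year[year][topic] += 1

    trends = {}
    for year, counts in sorted(per_year.items()):
        total = sum(counts.values())
        # Share of that year's articles per topic (the "percentage of publications").
        trends[year] = {
            topic: (n, 100.0 * n / total) for topic, n in sorted(counts.items())
        }
    return trends


trends = topic_trends(articles)
for year, topics in trends.items():
    for topic, (n, pct) in topics.items():
        print(f"{year}  {topic}: {n} article(s), {pct:.1f}% of that year")
```

Reporting both measures is useful because the raw count depends on how many articles the journal published in a given year, whereas the percentage shows a topic's relative weight within that year.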