The Many Dimensions of Political Discourse on Taiwan among Chinese netizens: an analysis of 20 million Weibo posts

Huan-Kai Tseng, Osbern Huang, Waybe Lee and Yu-tzung Chang (National Taiwan University)

Abstract: Can microblog data serve as a useful substitute for internet polls in gauging public opinion on politically sensitive issues in an authoritarian context, where conducting such polls is politically infeasible? This study takes a data-driven approach, leveraging a variety of data science methods to draw inferences from Chinese Weibo posts on a sensitive issue of domestic political relevance: Taiwan. Drawing on a dataset of 20 million Weibo posts, we identify important dimensions of discourse and detect changes over time in attitudes toward Taiwan’s election outcome; this study also makes the first attempt to “profile” user features for each category of topics through unsupervised deep learning. Weibo, China’s largest social media platform, is often referred to as the Chinese version of Twitter. Its large user population and comparatively even demographic and geographic distribution make Weibo a rather representative online forum for the expression of public opinion. Many scholars have credited Weibo as a catalyst for China’s rising grassroots political participation, an instrument for political accountability, an arena for contentious political action, and an incubator of nationalist sentiment. Some issue domains, owing to their politically sensitive nature, have drawn sustained public attention yet are also subject to constant filtering by the state. Among them, discourse on Taiwan, a breakaway province having close ethno-cultural affinity but governed by a democratic political system, has often provoked heated discussion and nationalist sentiment. In the absence of systematic studies and large-scale political survey data to compare against, however, one cannot know to what extent these politically related posts truly reflect the general perception of Taiwan among Chinese citizens, or whether a specific set of personal attributes is associated with such attitudes.
We argue that the rich textual data contained in Weibo posts, as well as their temporal evolution, can help identify salient topics and gauge public sentiment in lieu of a large-scale internet survey. We retrieve over 20 million posts made between September and December 2018 (amidst Taiwan’s gubernatorial election) along with user features, and classify this large corpus of text into 13 latent topics using a combination of structural topic models (STM) and a pre-trained contemporary China-Taiwan political dictionary. We find that, in line with much of the received wisdom, the lingering unification-independence debates, the U.S.-China-Taiwan strategic triangle, and explosive political events such as Taiwan’s election outcome all feature prominently. However, other issues, such as democracy, entertainment, travel, bilateral trade and investment, and even gender equality and same-sex marriage (which formed the backbone of Taiwan’s 10-issue referendum held during the 2018 election), also stand out as salient topics in this time frame. Our analysis also reveals intertemporal variation in topic salience and public mood. The salience of politically related topics declines swiftly relative to non-political topics. Positive sentiment is found toward unification, the election outcome (in which the pro-independence party lost), and, surprisingly, democracy and same-sex marriage, while negative sentiment is exhibited toward independence and Taiwan’s major political parties. Yet the overall positive sentiment also vanishes quickly, against the backdrop of a nearly constant negative sentiment toward Taiwan. We also profile personal attributes by main topic category: we concatenate text and post-level features into a feature layer, feed it into an 8-layer LSTM with t-Distributed Stochastic Neighbor Embedding (t-SNE), classify topic labels using K-means clustering, and predict cluster membership from personal features using a kernel SVM.
We find that users whose names contain keywords related to state-owned news media, and users who are male, have fewer followers, and live in more prosperous cities or regions, are more responsive to posts in the unification and nationalism categories, while female users with more followers and users of iOS devices (an indicator of personal affluence) tend to be more interested in posts pertaining to democracy, entertainment, and travel. Our results not only suggest the feasibility of using Weibo data as a suitable substitute for political surveys but also shed light on a multi-dimensional understanding of Chinese citizens’ perceptions of Taiwan. We also discuss the reach and limitations of this method.
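
As a rough illustration of how topic salience can be tracked over time, the sketch below tags each post with a topic by keyword matching and then computes each topic's monthly share of tagged posts. It is a deliberately simplified stand-in, not the structural topic model used in the study: the `TOPIC_KEYWORDS` dictionary, the function names, and the sample posts are all hypothetical and chosen only for demonstration.

```python
from collections import Counter, defaultdict

# Hypothetical seed dictionary (topic label -> keywords). Illustrative only;
# the study's actual dictionary and its 13 topics are not reproduced here.
TOPIC_KEYWORDS = {
    "unification": ["统一", "unification"],
    "independence": ["台独", "independence"],
    "travel": ["旅游", "travel"],
}


def tag_topic(text):
    """Return the topic whose keywords occur most often in text, or None."""
    lowered = text.lower()
    counts = {
        topic: sum(lowered.count(kw) for kw in keywords)
        for topic, keywords in TOPIC_KEYWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def topic_share_by_month(posts):
    """Map each month (YYYY-MM) to the share of tagged posts per topic.

    posts: iterable of (iso_date, text) pairs, with iso_date as YYYY-MM-DD.
    Posts that match no keyword are left out of the denominator.
    """
    monthly = defaultdict(Counter)
    for iso_date, text in posts:
        topic = tag_topic(text)
        if topic is not None:
            monthly[iso_date[:7]][topic] += 1
    shares = {}
    for month, counter in monthly.items():
        total = sum(counter.values())
        shares[month] = {t: n / total for t, n in counter.items()}
    return shares


# Made-up sample posts, for demonstration only.
posts = [
    ("2018-11-25", "unification talk after the election"),
    ("2018-11-26", "independence debate"),
    ("2018-12-02", "travel tips for Taipei"),
    ("2018-12-03", "more travel photos"),
    ("2018-12-04", "weather today"),
]
shares = topic_share_by_month(posts)
```

Comparing the per-month shares across the sampling window gives a crude picture of which topics gain or lose salience; a topic model such as STM would instead estimate topic proportions from the full text and could condition them on covariates such as posting date.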
