Contrastive Multiple Component Analysis (cMCA): Applying the Contrastive Learning Method to Identify Political Subgroups

Tzu-Ping Liu and Takanori Fujiwara (University of California, Davis)

Abstract: Ideal point estimation and dimensionality reduction have long been used to simplify and cluster complex, high-dimensional political data (e.g., roll-call votes, surveys, and texts) for use in (preliminary) analysis and visualization. These methods often work by finding directions, or principal components (PCs), on which either the data vary the most or respondents make the fewest decision errors. However, these PCs, which usually reflect the left-right political spectrum (Coombs 1964), are sometimes uninformative in explaining significant differences in the distribution of the data (e.g., how to categorize a set of highly moderate voters). To tackle this prevalent issue, we adopt an emerging analysis approach called contrastive learning. For example, cPCA, the contrastive version of principal component analysis (PCA) (Abid et al. 2018), works by first splitting the data into predefined groups, such as by partisanship, and then deriving PCs on which the target group varies the most but the background group varies the least. As a result, cPCA can often find 'hidden' patterns, such as subgroups within the target group, that PCA cannot reveal when some variables are the dominant source of variation across the groups. We contribute to the field of contrastive learning by extending it to multiple correspondence analysis (MCA) in order to enable analysis of the data social scientists often encounter, namely binary, ordinal, and nominal variables. We demonstrate the utility of contrastive MCA (cMCA) by analyzing three different surveys: the 2015 Cooperative Congressional Election Study, the 2012 UTokyo-Asahi Elite Survey, and the 2018 European Social Survey. Our results suggest that, first, in cases where ordinary MCA depicts differences between groups, cMCA can further capture the characteristics that are enriched in a target group relative to others; second, in cases where MCA does not show clear differences, cMCA can successfully identify meaningful directions and subgroups that traditional methods overlook.
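
The cPCA idea described in the abstract (directions on which the target group varies the most but the background group varies the least) can be illustrated with a minimal sketch. It is not the authors' implementation and it covers only numeric cPCA, not cMCA. It assumes the common formulation in which the directions are the leading eigenvectors of the target covariance minus a weighted background covariance; the function name `cpca_directions`, the contrast weight `alpha`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def cpca_directions(target, background, alpha=1.0, n_components=2):
    """Leading eigenvectors of cov(target) - alpha * cov(background).

    Hypothetical helper for illustration; alpha=0 reduces to ordinary PCA
    on the target group.
    """
    # np.cov centers each group on its own mean; rowvar=False treats
    # columns as variables and rows as observations.
    cov_target = np.cov(target, rowvar=False)
    cov_background = np.cov(background, rowvar=False)

    # Variance shared with the background is subtracted out, so only
    # variation that is enriched in the target group remains large.
    contrast = cov_target - alpha * cov_background

    # The contrast matrix is symmetric, so eigh applies. It returns
    # eigenvalues in ascending order; keep the largest ones.
    eigenvalues, eigenvectors = np.linalg.eigh(contrast)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    return eigenvectors[:, order]


# Synthetic data (an assumption, not from any survey):
# feature 0 has large variance in BOTH groups (a dominant shared axis),
# feature 1 separates two subgroups that exist only in the target group.
rng = np.random.default_rng(0)
n = 1000
background = np.column_stack([
    rng.normal(0, 5, n),
    rng.normal(0, 1, n),
    rng.normal(0, 1, n),
])
subgroup = rng.integers(0, 2, n)
target = np.column_stack([
    rng.normal(0, 5, n),
    rng.normal(0, 1, n) + np.where(subgroup == 1, 4.0, -4.0),
    rng.normal(0, 1, n),
])

pca_dirs = cpca_directions(target, background, alpha=0.0)   # ordinary PCA
cpca_dirs = cpca_directions(target, background, alpha=1.0)  # contrastive

print("PCA first direction: ", np.round(pca_dirs[:, 0], 2))
print("cPCA first direction:", np.round(cpca_dirs[:, 0], 2))
```

In this toy setup the first ordinary PC follows the shared high-variance feature, while the first contrastive direction follows the feature that splits the target group into subgroups. In practice `alpha` has to be tuned, and categorical survey items would need the MCA-style treatment the abstract describes rather than raw covariances.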
