# PolMeth XXXVII Live Program

### View all paper presentations on one page.

Times in UTC-4 (Eastern Daylight Time).  The authors expected to present the papers are indicated in bold font.

## TUESDAY JULY 14, 2020

Panel I
• 2020 Jul 14
Virtual Room 2: Experimental Designs
12:00pm to 01:30pm

### Chair: Justin Esarey (Wake Forest University)

Co-Host: Anwar Mohammed (McMaster University)

## Using Eye-Tracking to Understand Decision-Making in Conjoint Experiments

### Discussant: Anton Strezhnev (New York University)

Conjoint experiments enjoy increasing popularity in political and social science, but there is a paucity of research on respondents' underlying decision-making processes. We leverage eye-tracking methodology and a conjoint experiment, administered to a subject pool consisting of university students and local community members, to examine how respondents process information when completing conjoint surveys. Our study has two main findings. First, we find a positive correlation between attribute importance measures inferred from the stated choice data and attribute importance measures based on eye movement. This validation test supports the interpretation of common conjoint metrics, such as Average Marginal Component Effects and marginal R² values, as valid measures of attribute importance. Second, when we experimentally increase the number of attributes and profiles in the conjoint table, respondents on average view a larger absolute number of cells but a smaller fraction of the total cells displayed, and the patterns in which they search between cells change conditionally. At the same

• 2020 Jul 14
Virtual Room 1: Spatial Analysis
12:00pm to 01:30pm

### Chair: Neal Beck (New York University)

Co-Host: Sophie Borwein (University of Toronto)

## Causal Inference for Policy Diffusion

### Discussant: Yiqing Xu (Stanford University)

Understanding why governments adopt policies and how policy innovations diffuse from one government to others is a central goal in all subfields of political science. Despite numerous methodological developments in the policy diffusion literature, unfortunately, fundamental issues of causal inference have been left unaddressed for decades. As a result, little is known about which substantive findings in the literature have causal interpretations. To improve causal inferences in policy diffusion studies, we make three contributions. First, we define a variety of causal effects relevant to policy diffusion questions and clarify assumptions required for causal identification. Second, we provide a general estimation method by extending the standard event history analysis commonly used in practice. Finally, we propose a sensitivity analysis method that can assess the potential influence of unmeasured confounding on causal conclusions. We illustrate the general applicability of the proposed approach using a diffusion study of abortion policies. Open-source software will be made available for implementing our methods.

## Network Event History Analysis for Modeling Public Policy Adoption with Latent Diffusion Networks

### Author(s): Bruce

• 2020 Jul 14
Virtual Room 4: Text-as-Data
12:00pm to 01:30pm

### Chair: Suzanna Linn (Penn State University)

Co-Host: Justin Savoie (University of Toronto)

## A comparison of methods in political science text classification: Transfer learning language models for politics

### Discussant: Leah Windsor (Institute for Intelligent Systems, University of Memphis)

Automated text classification has rapidly become an important tool for political analysis. Recent advancements in natural language processing (NLP) enabled by advances in deep learning now achieve state of the art results in many standard tasks for the field. However, these methods require large amounts of both computing power and text data to learn the characteristics of the language, resources which are not always accessible to political scientists. One solution is a transfer learning approach, where knowledge learned in one area or source task is transferred to another area or a target task. A class of models that embody this approach are language models, which demonstrate extremely high levels of performance in multiple natural language understanding tasks. We investigate the feasibility of the use of these models and their performance in the political science domain by comparing multiple text classification methods

• 2020 Jul 14
Virtual Room 3: Sample Selection
12:00pm to 01:30pm

### Chair: Ludovic Rheault (University of Toronto)

Co-Host: Regan Johnston (McMaster University)

## How You Ask Matters: Interview Requests as Network Seeds

### Discussant: Jennifer Bussell (University of California, Berkeley)

When recruiting interview subjects is the goal, building rapport is conventionally heralded as the superior method. Cold-emails, in contrast, are often dismissed as inferior for their low response rate. Our study suggests that this stance is mistaken. When it is elites who are to serve as interview subjects, we argue that cold-emails can yield tremendous benefits that have thus far been overlooked. More specifically, we posit that when paired with network effects, which are rooted in the linkages among elites, cold-emails can outperform the standard but costly interview solicitation method of building rapport with subjects. In a series of experiments and simulations, we show that small changes to the wording of cold-emails translate into greater network coverage, thereby offering researchers a richer set of insights from their interview subjects.

## How Much Should You Trust Your Power Calculation Results? Power Analysis as an Estimation Problem

### Discussant: Clayton Webb (University of Kansas)

Plenary Session Monday
• 2020 Jul 14
Plenary Session
12:30pm to 01:30pm

## WEDNESDAY JULY 15, 2020

Panel II
• 2020 Jul 15
Virtual Room 1: Data Access
12:00pm to 01:30pm

### Chair: Suzanna Linn (Penn State University)

Co-Host: Anwar Mohammed (McMaster University)

## Statistically Valid Inferences from Privacy Protected Data

### Discussant: James Honaker (Harvard University)

Unprecedented quantities of data that could help social scientists understand and ameliorate the challenges of human society are presently locked away inside companies, governments, and other organizations, in part because of worries about privacy violations. We address this problem with a general-purpose data access and analysis system with mathematical guarantees of privacy for individuals who may be represented in the data and statistical validity guarantees for researchers seeking population-level insights from it. We build on the standard of "differential privacy" but, unlike most such approaches, we also correct for the serious statistical biases induced by privacy-preserving procedures, provide a proper accounting for statistical uncertainty, and impose minimal constraints on the choice of data analytic methods and types of quantities estimated. Our algorithm is easy to implement, simple to use, and computationally efficient; we also offer open source software to illustrate all our methods.

## Hidden in Plain Sight? Detecting Electoral Irregularities Using Statutory Results

### Author(s): Zach Warner, J

• 2020 Jul 15
Virtual Room 4: Applications
12:00pm to 01:30pm

### Chair: John Londregan (Princeton University)

Co-Host: Mikaela Karstens (Penn State University)

## The Political Ideologies of Organized Interests: Large-Scale, Social Network Estimation of Interest Group Ideal Points

### Discussant: In Song Kim (MIT)

Interest group influence is pervasive in American politics, impacting the function of every branch. Core to the study of interest groups, both theoretically and empirically, is the ideology of the group, yet relatively little is known on this front for the vast expanse of them. By leveraging ideal point estimation and network science, we provide a novel measure of interest group ideology for nearly 15,000 unique groups across 95 years, which provides the largest and longest measure of interest group ideologies to date. We make methodological and measurement contributions using exact matching and hand-validated fuzzy string matching to identify amicus curiae signing organizations who have given political donations and then impute and cross-validate ideal points for the organizations based on the network structure of amicus cosigning. Our empirical investigation provides insights into the dynamics of interest group macro-ideology, ideological issue domains and ideological differences between donor and non-donor organizations

• 2020 Jul 15
Virtual Room 3: Panel and Spatial Analysis
12:00pm to 01:30pm

### Chair: Ludovic Rheault (University of Toronto)

Co-Host: Regan Johnston (McMaster University)

## A Bayesian Method for Modeling Dynamic Network Influence With TSCS Data

### Discussant: Matthew Blackwell (Harvard University)

With fast accumulations of network data, modeling time-varying network influence is necessary and important to relax the unrealistic constant-effect assumption and to deepen our understanding of the changing dynamic between networks and social behavior. However, even in static settings, the identification of network influence remains a challenging problem due to the complicated entanglement of network interdependence, homophily (selection), and common shocks. To identify and explain dynamic network influence, this paper proposes a multilevel Spatio-Temporal model with a multifactor error structure. Network influence is allowed to vary, and network structural features could enter the group-level regression and further explain the variation. The multifactor term is included to capture unobserved time-varying homophily and heterogeneous time trends. We apply Bayesian shrinkage for factor-selection to achieve sufficient bias-correction and avoid overfitting. The Bayesian Spatio-Temporal model is highly flexible and can accommodate a wide variety of network types. Monte Carlo experiments show the model performs well in recovering the true trajectory of network influence. Besides, the

• 2020 Jul 15
Virtual Room 2: Causal Inference
12:00pm to 01:30pm

### Chair: Kenichi Ariga (University of Toronto)

Co-Host: Ilayda Onder (Penn State University)

## Casual Inference, or How I Learned to Stop Worrying and Love Hypothesis Testing

### Discussant: Luke Keele (University of Pennsylvania)

Many social scientists now consider it necessary for an empirical research design to achieve identification of a causal relationship as defined by the Rubin (1974) causal model or the closely related Pearl (2009) model. In this paper, we argue that no empirical estimand can take a meaningful causal interpretation without a supporting theoretical structure, even if that estimand is strongly identified by a careful research design; that is, an identified research design is necessary but not sufficient for a causal inference. An atheoretical estimand might be "causal" in the narrow sense that changes in the dependent variable are ascribable to the treatment in the specific data used in the study, but not in the sense of providing predictive or explanatory guidance for treatment effects in any other situation in the past, present, or future. For instance, when objects of study strategically interact with one another, the straightforward application of common causal inference research designs

• 2020 Jul 15
12:30pm to 01:30pm

## Business Meeting of the Society for Political Methodology

Open to registered members of the Society.

## THURSDAY JULY 16, 2020

12:00PM to 1:30PM EDT (9:00AM to 10:30AM PDT) Panel III
• 2020 Jul 16
Virtual Room 1: Machine Learning
12:00pm to 01:30pm

### Chair: Suzanna Linn (Penn State University)

Co-Host: Anwar Mohammed (McMaster University)

## Experimental Evaluation of Computer-Assisted Human Decision Making: Application to Pretrial Risk Assessment Instrument

### Discussant: Jonathan Mummolo (Princeton University)

Despite an increasing reliance on computerized decision making in our day-to-day lives, human beings still make highly consequential decisions. As frequently seen in business, healthcare, and public policy, recommendations produced by statistical models and machine learning algorithms are provided to human decision-makers in order to guide their decisions. The prevalence of such computer-assisted human decision making calls for the development of a methodological framework to evaluate its impact. Using the concept of principal stratification from the causal inference literature, we develop a statistical methodology for experimentally evaluating the causal impacts of machine recommendations on human decisions. We also show how to examine whether machine recommendations improve the fairness of human decisions. We apply the proposed methodology to the randomized evaluation of a pretrial risk assessment instrument (PRAI) in the criminal justice system. Judges use the PRAI when deciding which arrested individuals should be released and, for those ordered released, the corresponding bail amounts

• 2020 Jul 16
Virtual Room 4: Text and Image Data
12:00pm to 01:30pm

### Chair: Kevin Quinn (University of Michigan)

Co-Host: Ilayda Onder (Penn State University)

## Untangling Mixtures in Judicial Opinions

### Discussant: Brandon Stewart (Princeton University)

Within the small deliberative voting body of the U.S. Supreme Court, prior work has regularly looked to the voting alignments of the justices in order to understand bargaining power and who wins or loses. However, little consensus has emerged over who exerts control over the Court's actual output--judicial opinions. Rather, three distinct perspectives have emerged: (1) The opinion author controls opinion content; (2) the median member of the court controls opinion content; (3) the median of the majority coalition controls the content of the opinion. In this paper, by contrast, we look for evidence of influence where it theoretically operates -- in the opinions themselves -- and identify the unique contributions of separate justices to the collective output of the Court. To do so, we build on a well-established tradition within text-as-data research on the challenge of authorship attribution, and develop a novel authorship model that leverages writing characteristics to predict the authors of individual sentences. More specifically, we introduce an approach that evaluates authorship across multiple competitive

• 2020 Jul 16
Virtual Room 3: Experimental Designs
12:00pm to 01:30pm

### Chair: Ludovic Rheault (University of Toronto)

Co-Host: Regan Johnston (McMaster University)

## Elements of External Validity: Framework, Design, and Analysis

### Discussant: Daniel Hopkins (University of Pennsylvania)

External validity of randomized experiments has been a focus of long-standing methodological debates in the social sciences. However, in practice, discussions of external validity often differ in their definitions, goals, and assumptions, without making them clear. Moreover, while many applied studies recognize external validity as a potential limitation, few have explicit designs or analyses aimed at externally valid inferences. In this article, we propose a framework, design, and analysis to address two central goals of external validity inferences: (1) assess whether the direction of causal effects is generalizable (sign-validity), and (2) generalize the magnitude of causal effects (effect-validity). First, we propose a formal framework of external validity to decompose it into four components, X-, Y-, T-, and C-validity (units, outcomes, treatments, and contexts) and clarify the source of potential biases. Second, we present assumptions required to make externally valid causal inferences, and we propose experimental designs to make such assumptions more plausible. Finally, we introduce a multiple-testing procedure to address sign-validity and general

• 2020 Jul 16
Virtual Room 2: Panel and Spatial Analysis
12:00pm to 01:30pm

### Chair: Justin Esarey (Wake Forest University)

Co-Host: Md Mujahedul Islam (University of Toronto)

## How Wide is the Ethnic Border?

### Discussant: Florian Hollenbach (Texas A&M University)

We explore the relationship between ethnic heterogeneity and within- and cross-country barriers to trade. We develop a spatial model of trade in which observable productivity shocks directly affect local prices. These local shocks propagate through the trading network differentially, depending on unobserved trading frictions. Coupling data describing monthly commodity prices in 227 cities across 42 African countries, remotely sensed weather data, and spatial data describing the locations of ethnic-group homelands, we estimate this model to quantify the costs traders incur when crossing ethnic and national borders. We show that ethnic borders induce a friction approximately half the magnitude of national borders, indicating that ethnic heterogeneity is an impediment to the development of efficient national markets. Through counterfactual experiments, we quantify the effect of these frictions on consumer welfare and the extent to which colonial-era political borders have hindered African economic integration. In all, our paper suggests that trade impediments caused by ethnic heterogeneity are a substantial channel through which ethnic fractionization

2:30PM to 4:00PM EDT (11:30AM to 1:00PM PDT) Panel IV
• 2020 Jul 16
Virtual Room 1: Hierarchical Models
02:30pm to 04:00pm

### Chair: Ludovic Rheault (University of Toronto)

Co-Host: Mikaela Karstens (Penn State University)

## Fast and Accurate Estimation of Non-Nested Binomial Hierarchical Models Using Variational Inference

### Discussant: Justin Grimmer (Stanford University)

Estimating non-linear hierarchical models can be computationally burdensome in the presence of large datasets and many non-nested random effects. Popular inferential techniques may take hours to fit even relatively straightforward models. This paper provides two contributions to scalable and accurate inference. First, I propose a new mean-field algorithm for estimating logistic hierarchical models with an arbitrary number of non-nested random effects. Second, I propose “marginally augmented variational Bayes” (MAVB) that further improves the initial approximation through a post-processing step. I show that MAVB provides a guaranteed improvement in the approximation quality at low computational cost and induces dependencies that were assumed away by the initial factorization assumptions. I apply these techniques to a study of voter behavior. Existing estimation took hours whereas the algorithms proposed run in minutes. The posterior means are well-recovered even under strong factorization assumptions. Applying MAVB further improves the approximation by partially correcting the under-estimated variance. The proposed methodology is implemented in an open source software package

• 2020 Jul 16
Virtual Room 4: Instrumental Variables
02:30pm to 04:00pm

### Chair: Jonathan N. Katz (California Institute of Technology)

Co-Host: Justin Savoie (University of Toronto)

## An omitted variable bias framework for sensitivity analysis of instrumental variables

### Discussant: Jacob Montgomery (WUSTL)

We develop an omitted variable bias framework for sensitivity analysis of instrumental variable (IV) estimates that is immune to "weak instruments," naturally handles multiple "side-effects" and "confounders," exploits expert knowledge to bound sensitivity parameters, and can be easily implemented with standard software. In particular, we introduce sensitivity statistics for routine reporting, such as robustness values for IV estimates, describing the minimum strength that omitted variables need to have to change the conclusions of a study. We show how these depend upon the sensitivity of two familiar auxiliary estimates–the effect of the instrument on the treatment (the "first-stage") and the effect of the instrument on the outcome (the "reduced form")–and how an extensive set of sensitivity questions can be answered from those alone. Next, we provide tools that fully characterize the sensitivity of point-estimates and confidence intervals to violations of the standard IV assumptions. Finally, we offer formal bounds on the worst damage caused by these violations by means of comparisons with the explanatory

• 2020 Jul 16
Virtual Room 3: Panel Data
02:30pm to 04:00pm

### Chair: Suzanna Linn (Penn State University)

Co-Host: Ilayda Onder (Penn State University)

## Bayesian Causal Inference With Time-Series Cross-Sectional Data: A Dynamic Multilevel Latent Factor Model with Hierarchical Shrinkage

### Discussant: Neal Beck (New York University)

This paper proposes a Bayesian causal inference method based on estimating posterior predictive distributions of counterfactuals with TSCS data. To construct the prediction model, we fully take advantage of the flexibility of multilevel modeling and Bayesian model specification to reduce dependence on modeling assumptions. We start with a multilevel dynamic factor model and adopt a Bayesian Lasso-like hierarchical shrinkage strategy for stochastic model-specification selection. Counterfactual imputation based on the posterior predictive distribution generalizes the classic synthetic control approach by assigning observation-specific weights to features of the treated units and exploiting high-order relationships between treated and control time series. With empirical posterior distributions of counterfactuals, it is convenient and intuitive to make causal

• 2020 Jul 16
Virtual Room 2: Causal Inference
02:30pm to 04:00pm

### Chair: Justin Esarey (Wake Forest University)

Co-Host: Md Mujahedul Islam (University of Toronto)

## Retrospective causal inference via elapsed time-weighted matrix completion, with an evaluation on the effect of the Schengen Area on the labour market of border regions

### Discussant: James Bisbee (Princeton University)

We propose a strategy of retrospective causal inference in panel data settings where (1) there is a continuous outcome measured before and after a single binary treatment; (2) there exists a group of units exposed to treatment during a subset of periods (switch-treated) and a group of units always exposed to treatment (always-treated), but no group that is never exposed to treatment; and (3) the elapsed treatment duration, z, differs across groups. The potential outcomes under treatment for the switch-treated in the pre-treatment period are missing and we impute these values via nuclear-norm regularized least squares using the observed (i.e., factual) outcomes. The imputed values can be interpreted as the counterfactual outcomes of the switch-treated had they been always-treated. Differencing the counterfactual outcomes from the factual outcomes can be interpreted as the effect of not having assigned treatment to the switch-treated in

## FRIDAY JULY 17, 2020

12:00PM to 1:30PM EDT (9:00AM to 10:30AM PDT) - Panel V
• 2020 Jul 17
Virtual Room 1: Covariate Balancing
12:00pm to 01:30pm

### Chair: Ludovic Rheault (University of Toronto)

Co-Host: Anwar Mohammed (McMaster University)

## Balancing covariates in randomized experiments using the Gram-Schmidt Walk

### Discussant: Marc Ratkovic (Princeton University)

The paper introduces a class of experimental designs that allows experimenters to control the robustness and efficiency of their experiments. The designs build on a recently introduced algorithm in discrepancy theory, the Gram-Schmidt walk. We provide a tight analysis of this algorithm, allowing us to prove important properties of the designs it produces. These designs aim to simultaneously balance all linear functions of the covariates, and the variance of an estimator of the average treatment effect is shown to be bounded by a quantity that is proportional to the loss function of a ridge regression of the potential outcomes on the covariates. No regression is actually conducted, and one may see the procedure as regression adjustment by design. The class of designs is parameterized so as to give experimenters control over the worst-case performance of the treatment effect estimator. Greater covariate balance is attained by allowing for a less robust design in terms of worst-case variance. We

• 2020 Jul 17
Virtual Room 3: Text-as-Data & Item Response Theory
12:00pm to 01:30pm

### Chair: Suzanna Linn (Penn State University)

Co-Host: Regan Johnston (McMaster University)

## Troubles in/with Text: Combining qualitative and NLP approaches to analyzing government archives from the UK Troubles in Northern Ireland

### Discussant: Arthur Spirling (New York University)

Natural language processing (NLP) offers a range of tools to test conceptual trends in large text corpora. However, NLP scholars have typically developed and trained models on carefully curated data, focusing on settings where natural sources of annotation lead to straightforward application of supervised learning. Can NLP tools be used on complicated and imperfect real-world data (e.g., data that is idiosyncratic to a specific place, time, and purpose; and/or data that has undergone imperfect optical character recognition digitization processes) to address complex political science questions? To begin answering this methodological question, this research combines qualitative and computational NLP approaches

• 2020 Jul 17
Virtual Room 2: Conjoint Designs
12:00pm to 02:15pm

### Chair: Peter J. Loewen (University of Toronto)

Co-Host: Md Mujahedul Islam (University of Toronto)

## Avoiding Measurement Error Bias in Conjoint Analysis

### Discussant: Naoki Egami (Columbia University)

Conjoint analysis is a survey research methodology spreading fast across the social sciences and marketing due to its widespread applicability and apparent capacity to disentangle many causal effects with a single survey experiment. Unfortunately, conjoint designs are also especially prone to measurement error, revealed by surprisingly low levels of intra-coder reliability, which can exaggerate, attenuate, or give incorrect signs for causal effect estimates. We show that measurement error bias is endemic in applications, and so assuming its absence, as many studies implicitly do, is not defensible. With replications of

• 2020 Jul 17
10:30am to 12:00pm

## Working as a Data Scientist Outside Academia

### Chair: Ludovic Rheault (University of Toronto)

Co-Host: Justin Savoie (University of Toronto)

Join our special guests for a timely and stimulating talk on non-academic careers. Our speakers will share hands-on experience about working as data scientists in the private sector. We will conclude with an audience question period.