Please see below for a video archive of the TRIADS Speaker Series, which showcased cutting-edge data science research from leading scholars at Washington University and beyond.
Caitlin McMurtry: Vaccine hesitancy and trust in government during public health emergencies
Abstract: Many studies separately examine political polarization, trust in government and public institutions, and vaccine hesitancy in the U.S. during the COVID-19 pandemic. Few, however, place their findings in historical context or explore the relationship between these variables. Drawing upon nearly seven decades of archival polling data, we use meta-analysis and meta-regression techniques to examine political polarization as it pertains to vaccine hesitancy and trust. We evaluate how attitudes about vaccines, the federal government, and the Centers for Disease Control and Prevention (CDC) among partisans changed during the coronavirus pandemic and how these attitudes compare to previous public health crises and disease outbreaks. Preliminary findings indicate that political polarization during COVID-19 is several times greater than during any other public health emergency in modern American history. Fortunately, high levels of polarization do not appear endemic to disease outbreaks, meaning they may be preventable in the future. These findings inform ongoing efforts to improve vaccination rates across the nation, future pandemic readiness, and the importance of restoring confidence in public institutions.
Laura Nelson: Beyond Protests: Using Computational Text Analysis to Explore a Greater Variety of Social Movement Activities
Abstract: Social movement scholars use protest events as a way to quantify social movements, and have most often used large, national newspapers to identify those events. This has introduced known and unknown biases into our measurement of social movements. We know that national newspapers tend to cover larger and more contentious events and organizations. Protest events are furthermore a small part of what social movements actually do. Without other readily available options to quantify social movements, however, big-N studies have continued to focus on protest events via a few large newspapers. With advances in digitized data and computational methods, we now no longer have to rely on large newspapers or focus only on protests to quantify important aspects of social movements. In this paper we use the environmental movement as a case study, analyzing data from a wide range of local, regional, and national newspapers in the United States to quantify multiple facets of social movements. We argue that the incorporation of more data and new methods to quantify information in text has the potential to transform the way we both conceive of and measure social movements in three ways: (1) the type of focal social movement organization included, (2) the type of tactics and issues covered, and (3) the ability to go beyond protest events as the primary unit of analysis. In addition to demonstrating ways that the focus on counting protest events has introduced specific biases in the type of tactics, issues, and organizations covered in social movement research, we argue that computational methods can help us extract and count meaningful aspects of social movements well beyond event counts. In short, the infusion of new data and methods into social movements, peace, and conflict studies could lead us to a substantial shift in the way we quantify social movements, from protest events to everything that occurs outside of them.
Yevgeniy Vorobeychik: Is it (computationally) hard to steal an election?
Abstract: The integrity of elections is central to democratic systems. However, a myriad of malicious actors aspire to influence election outcomes for financial or political benefit. The issue of election vulnerability to malicious manipulation has been studied in the computational social choice literature from a computational complexity perspective. However, the traditional study of election control models voter preferences as rankings over candidates. This provides no natural way to reason about manipulations by attackers of specific issues that influence such preferences. Spatial theory of voting offers a social choice model that explicitly captures voter and candidate positions on issues, with voter preferences over candidates determined by their relative distance in issue space. We study the problem of election manipulation within the framework of spatial voting theory, by considering three models of malicious manipulation: 1) changing which issues are salient to voters (issue selection control, or ISC), 2) changing the relative importance of issues (issue significance manipulation, or ISM), and 3) changing how voters perceive where a particular candidate stands on issues (issue perception control, or IPC). All three models capture different aspects of the impact that political advertising or even misinformation can have on voter behavior. In all cases, the manipulation problem is NP-hard in general, even when there are only 2 candidates. ISC remains hard even when we have a single voter if issues are real-valued, and even with 3 or more voters if issues are binary. ISM and IPC, however, have considerably more interesting structure. In particular, we find that one crucial element is opinion diversity: if voter views are highly diverse, election control is hard, whereas when diversity is limited, with only a constant number of groups of voters with essentially identical views, control becomes tractable. Finally, we consider the problem of election manipulation by spreading misinformation over social networks, and close with a discussion of potential mitigations.
Jesus Fernandez-Villaverde: Taming the Curse of Dimensionality: Old Ideas and New Strategies
Abstract: A fundamental challenge in dealing with data and models in economics and other social sciences is the curse of dimensionality, i.e., the exponential increase in computational complexity of problems as dimensions grow. This talk will review recent advances in approaches to tame the curse of dimensionality. Some are old ideas that have been rediscovered. Others are new and build on recent developments in hardware and algorithms. In particular, we will discuss: (1) better numerical algorithms (e.g., continuous-time methods, deep learning); (2) better software implementations (e.g., functional programming, flexible data structures, advances in massive parallelization); and (3) better hardware designs (e.g., GPUs, TPUs and other AI accelerators, FPGAs, quantum computing). We will use many examples from my research during the last few years to motivate and illustrate these ideas.
Michael Esposito: Historical redlining and contemporary racial disparities in neighborhood life expectancy
Abstract: While evidence suggests a durable relationship between redlining and population health, we currently lack an empirical account of how this historical act of racialized violence produced contemporary inequities. In this paper, we use a mediation framework to evaluate how redlining grades influenced later life expectancy and the degree to which contemporary racial disparities in life expectancy between Black working-class neighborhoods and white professional-class neighborhoods can be explained by past HOLC mapping. Life expectancy gaps between differently graded tracts are driven by economic isolation and disparate property valuation that developed within these areas in subsequent decades. Still, only a small fraction of the total disparity between contemporary Black and white neighborhoods is explained by HOLC grades. We discuss the role of HOLC maps in analyses of structural racism and health, positioning them as only one feature of a larger public-private project conflating race with financial risk. Policy implications include targeting resources to formerly redlined neighborhoods, but also the larger project of dismantling racist theories of value that are deeply embedded in the political economy of place.
Deen Freelon: Operation Dumpster Fire; or, toward balance in the detection and profiling of low-quality content online
Abstract: Mis- and disinformation, conspiracy theories, hyperpartisan distortions, and similar phenomena (collectively, low-quality content) have grown into a major focus area for social science. Many of the quantitative studies in this area rely on blacklists of low-quality web domains (Infowars.com, naturalnews.com, thegatewaypundit.com, and the like) to measure how much low-quality content exists and is viewed or shared on social media. While such studies have contributed much to our understanding of low-quality content, few of them empirically incorporate substantial amounts of high-quality content. Doing so may open new avenues for understanding low-quality content: for example, we could develop a taxonomy of disinformation attractors (individuals, places, institutions, and ideas that are frequent subjects of disinformation). We could also generate linguistic profiles of low-quality content, identifying specific words, phrases, and types of language that are statistically associated with it. Our research team is currently developing software, under the temporary designation Operation Dumpster Fire, to accomplish these and more research tasks related to low-quality content. This presentation will explore the project’s theoretical underpinnings and technical architecture, and will offer a feature demonstration.