We first review the literature on models and algorithms. We then review the literature on polarization.
A.1 Models and algorithms
While the study of influence in networks is very broad [
16], we focus on models and algorithms for game-theoretic settings. Irfan and Ortiz propose
Linear Influence Games (LIGs) [
30], a type of 2-action graphical game [
35]. In an LIG, every node (or player) represents an individual with a binary action (1 or
\(-1\)) and a threshold level representing their “stubbornness.” There is an underlying network structure among the nodes. The weight of each edge (
u,
v) represents the amount of influence that node
u exerts on node
v. These influence weights may be positive or negative,
7 and are not required to be symmetric (meaning that node
u may exert more or less influence on node
v than it receives). The best response of a node depends on its threshold level and the net influence on it. The net influence is found by calculating (1) the sum of all incoming influences from nodes playing action 1 and (2) the sum of all incoming influences from nodes playing action
\(-1\), and then subtracting the second sum from the first. If this net influence exceeds the node’s threshold, the best response for that node is 1; if it falls below the threshold, the best response is
\(-1\). In the case of a tie, the node is indifferent between the two actions.
Instantiating an LIG, then, requires a matrix of influence weights
\({\mathbf {W}} \in {\mathbf {R}}^{n \times n}\) and a threshold vector
\({\mathbf {b}} \in {\mathbf {R}}^n\). Each outcome of an LIG is a joint action
\({\mathbf {x}}\), the vector of the actions of all players. For every individual player
i,
\({\mathbf {x}}_{-i}\) is the vector of all actions except the action of
i. Each player has an influence function
\(f_i ({\mathbf {x}}_{-i}) \equiv \sum _{j \ne i} x_j w_{ij} - b_i\) and a payoff function
\(u_i({\mathbf {x}}) \equiv x_i f_i ({\mathbf {x}}_{-i})\). A joint action
\({\mathbf {x}}^*\) is a
pure-strategy Nash equilibrium (PSNE) when every individual is playing their best response
\(x_i^*\)—that is, when no player has an incentive to unilaterally deviate from their chosen action. With the United States Senate as an example, each node is an individual senator, and each edge is the influence that a senator has upon another senator. A senator will vote
yea (1) if their threshold has been met given all incoming influences from other senators, or
nay (
\(-1\)) if not. When all senators are playing their best responses in
\({\mathbf {x}}\), the system is stable, and the network is in a PSNE. The LIG model is further explained in the "
The LIG model" section.
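As a minimal sketch of the definitions above (not the cited authors' implementation), the following Python snippet checks the PSNE condition by brute force. The three-player weights and thresholds are hypothetical, and the convention that `W[i][j]` stores the influence of player j on player i is an assumption that follows the indexing of \(f_i\) above.

```python
from itertools import product

def influence(W, b, x, i):
    """f_i(x_{-i}) = sum over j != i of x_j * w_ij, minus the threshold b_i."""
    return sum(W[i][j] * x[j] for j in range(len(x)) if j != i) - b[i]

def is_psne(W, b, x):
    """x is a PSNE when no player gains by flipping their action.

    Player i's payoff is x_i * f_i(x_{-i}); flipping x_i negates it, so
    x_i is a best response exactly when x_i * f_i(x_{-i}) >= 0.
    A tie (f_i = 0) leaves the player indifferent.
    """
    return all(x[i] * influence(W, b, x, i) >= 0 for i in range(len(x)))

def all_psne(W, b):
    """Enumerate all 2^n joint actions; feasible only for very small n."""
    return [x for x in product((-1, 1), repeat=len(b)) if is_psne(W, b, x)]

# Hypothetical game: three players who all influence each other positively.
W = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
b = [0, 0, 0]
equilibria = all_psne(W, b)
print(equilibria)  # [(-1, -1, -1), (1, 1, 1)]
```

In this toy game only the two unanimous outcomes are stable, since any lone dissenter faces a net influence of magnitude 2 against their action.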
While the matrix of influence weights and vector of threshold values necessary for instantiating an LIG could be generated manually for very small instances, Honorio and Ortiz develop a method of instantiating an arbitrarily large LIG from raw, binary-action data via machine learning [
26]. Only voting records are made available to the learning program; no other information is involved. Given these data, the program generates the influence weights
w and influence thresholds
b which define a game
G. The program seeks to instantiate an LIG in which a high proportion of the real-world data is captured as PSNE, without allowing so many PSNE that nearly any joint action would be an equilibrium. Counting the ground-truth joint actions that are PSNE is computationally easy, but computing the total number of PSNE in a game is NP-hard, and therefore infeasible on large datasets. Under a number of simplifying assumptions, they approximate the problem using
convex loss minimization. In this formulation, the parameters of the game are chosen to minimize the average error, that is, the proportion of ground-truth joint actions that are not captured as PSNE. This algorithm is explained in the "
Machine Learning" section.
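To make the idea concrete, here is a rough sketch of one convex surrogate: an independent logistic loss per player, minimized by plain gradient descent. Treating this as the objective is an assumption made for illustration; the exact loss, the regularization that keeps the number of PSNE small, and the solver used in the cited work are not reproduced here. The voting records are hypothetical.

```python
import math

def fit_player(data, i, steps=200, lr=0.5):
    """Gradient descent on the average logistic loss for player i.

    Returns (w, b) such that f_i(x_{-i}) = sum_{j != i} w[j] * x[j] - b.
    """
    n = len(data[0])
    w, b = [0.0] * n, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * n, 0.0
        for x in data:
            f = sum(w[j] * x[j] for j in range(n) if j != i) - b
            # Derivative of log(1 + exp(-x_i * f)) with respect to f.
            g = -x[i] / (1.0 + math.exp(x[i] * f))
            for j in range(n):
                if j != i:
                    grad_w[j] += g * x[j]
            grad_b -= g  # because df/db = -1
        w = [wj - lr * gj / len(data) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(data)
    return w, b

def fit_game(data):
    """Fit every player separately and stack the results into (W, b)."""
    fitted = [fit_player(data, i) for i in range(len(data[0]))]
    return [w for w, _ in fitted], [bi for _, bi in fitted]

def training_error(W, b, data):
    """Proportion of observed joint actions that are not PSNE of (W, b)."""
    def is_psne(x):
        return all(
            x[i] * (sum(W[i][j] * x[j] for j in range(len(x)) if j != i) - b[i]) >= 0
            for i in range(len(x))
        )
    return sum(not is_psne(x) for x in data) / len(data)

# Hypothetical records: three players who always vote together.
data = [(1, 1, 1), (-1, -1, -1)] * 5
W, b = fit_game(data)
error = training_error(W, b, data)
```

On this toy dataset the learned weights between players are positive and every observed joint action is a PSNE of the fitted game, so the training error is zero.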
The majority of research in analyzing and predicting legislative votes has not been in the game theory space. Rather, roll-call data are most often used in ideal point models, which estimate the ideal point of each legislator on a scale from conservative to liberal extremes. Clinton et al. proposed Bayesian methods for ideal point estimation, which can be solved using Markov Chain Monte Carlo (MCMC) simulations [
9]. In contrast to prior methods, this MCMC-calculated Bayesian method is computationally efficient at large scale; other methods required small populations or made statistical compromises in order to be feasible. Regardless of the method used, ideal points lie on an arbitrary scale from negative to positive. In practice, a negative ideal point represents a “liberal” polarity, while a positive ideal point represents a “conservative” polarity. Their work is widely cited in later ideal point models which expand upon the original concept.
While the importance of roll-call data is widely recognized, it is also recognized that each vote takes place within a broader context with important characteristics. Gerrish and Blei extend the traditional ideal point model, which relies solely on roll-call data, to account for the topics of bills [
18]. Using a
Latent Dirichlet Allocation (LDA) topic model, Gerrish and Blei integrate bill topics and political tone into their ideal point model. LDA topic models identify patterns in words, but labeling and interpreting these patterns is left to the researchers. These bill topics may be, for example, national recognition (“people”, “month”, “recognize”, “history”, “week”, and “woman”) or healthcare (“care”, “applicable”, “coverage”, “hospital”, and “eligible”). They find that the model performs especially well when bills have bipartisan support or disapproval, or when bills face clearly partisan support and disapproval, but loses accuracy when bills receive mixed, nonpartisan support. Topic modelling is not the only method of inferring bill topics: the Congressional Research Service (CRS) assigns subject codes to every bill, out of close to a thousand possible codes. In an ensuing study on ideal point models, Gerrish and Blei note that using CRS subject codes rather than an LDA topic model also provided a good basis for their ideal point model [
19].
In a recent paper, Irfan and Gordon add context to the LIG models [
28]. By combining social interactions and context, they develop a model which performs better than the purely behavioral model. They learn the ideal points of each senator while learning parameters for the LIG, and account for disparities in polarity across bill topics by utilizing the subject codes of each bill. They expand the influence function of every senator
i to include the ideal point of that senator (
\(p_i\)), and the polarity
\(a_l\) of a bill
l. The product of these two terms is added to the otherwise unchanged influence. When the signs of the polarity of the bill and the ideal point of the senator are the same (e.g.,
\(-1.5\) and
\(-0.5\), meaning that both are liberal leaning), the signs cancel, increasing senator
i’s payoff for voting
yea; when they differ, a negative value is added, decreasing senator
i’s payoff for voting
yea.
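Written out, the description above suggests the following form. The symbol \(f_i^{(l)}\) and the exact placement of the added term are reconstructed from the prose here and should be checked against [28].

```latex
\[
f_i^{(l)}({\mathbf {x}}_{-i}) \equiv \sum _{j \ne i} x_j w_{ij} - b_i + p_i a_l
\]
```

With the illustrative values \(-1.5\) and \(-0.5\), the added term is \((-1.5)(-0.5) = 0.75 > 0\), which raises the payoff of voting yea. If the signs differ, say \(-1.5\) and \(0.5\), the added term is \(-0.75\), which lowers it.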
Some researchers have taken other approaches to modeling congressional behavior. Woon utilizes both ideal points and game-theoretic concepts to analyze how bill sponsorship and co-sponsorship affect the content senators write in a bill [
51]. Woon argues that, when sponsoring a bill, legislators balance two opposing forces. One pushes them toward writing median language because they want a bill to pass without complications, and the other toward writing highly polarized language because they wish to signal their beliefs to their constituents. As such, a legislator
L will propose a bill with location
y within a one-dimensional policy space. Woon also assumes that another legislator,
P, will be
pivotal in allowing a bill’s passage. That pivotal legislator may choose either
y or the status quo,
q.
P’s choice is known as the policy outcome and is denoted by
x. The passage of a bill depends on legislators
L and
P’s utility functions, which consider the distance between
x and the ideal points of
L and
P, respectively. In addition,
L’s utility function also considers the weight
w that
L places on being close to
y, which is known as
L’s position-taking. Woon extends the model to account for co-sponsorship of other legislators, each with their own utility functions. While our research focuses on legislative votes rather than policy proposals, Woon’s research affirms the validity of combining contextual data and game-theoretic models, and puts forth bill sponsorship and co-sponsorship as another direction of future research.
Bill sponsorship and co-sponsorship are not the only means by which legislators may signal their preferences for a bill prior to voting. Desmarais et al. build upon prior bill co-sponsorship research to introduce co-participation in press events (called the joint press events network) as an indicator of voting behavior [
14]. Using linear regression, they show a statistically significant positive relationship between press event co-participation and roll-call votes. While not focused on the computational aspects of congressional research, this study highlights the observation that “[l]egislation is often the end product of a lengthy collaborative effort.” Studies like this attempt to uncover ostensibly hidden mechanisms within that lengthy effort. This contrasts starkly with the behavioral, game-theoretic approach, which makes no assumptions about the underlying mechanism or process, viewing it instead as a “black box”. This lack of assumptions is one of the key benefits of the game-theoretic approach.
Recently, a group of mathematicians took a very different approach to analyzing congressional voting networks from roll-call data. Glonek et al. introduce the Graph Labeling Semi-Supervised (GLaSS) method [
25], a random-walk-based graph labeling method. They model both the House and Senate (from 1935 to 2017, in different trials) as a graph from roll-call data, where nodes are Democratic or Republican legislators (other parties are ignored), and their labels correspond to their parties. While every legislator’s party affiliation is known for validation purposes, the only labelled nodes in the graph are the Democratic and Republican party leaders; all other nodes are unlabelled. With the GLaSS method, those nodes are labelled based on the expected time to absorption in a discrete-time Markov chain (DTMC), where absorbing states are labelled nodes (i.e., party leaders) and transient states are unlabelled nodes (i.e., other legislators). By comparing the labels generated by the GLaSS method to the ground-truth labels of legislators, they measure polarization in Congress. When party affiliation can be accurately predicted by voting trends, Congress is more polarized; when there is some uncertainty, it is less so. Their results show that the U.S. Congress has become remarkably polarized in the past decade, with the model able to accurately predict every senator’s affiliation in each term of Congress since 2007. In contrast to Glonek et al.’s stochastic process-based approach, we model strategic interactions among senators in a game-theoretic fashion that allows us to infer joint behavioral outcomes. Additionally, Glonek et al.’s method relies on a model of binary party affiliation and considers nodes as labelled only by party affiliation rather than named as individual senators, which prevents further analysis of the model’s network structure.
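The absorption-time computation at the heart of such a method can be sketched as follows. This is only one plausible reading, not the GLaSS algorithm itself: it assumes each unlabelled node receives the party of the leader that a random walk is expected to reach sooner, computing the expected hitting time to each leader separately by solving \((I - Q)\,t = \mathbf{1}\), where \(Q\) is the transition matrix restricted to the non-target states. The four-node path graph and its unit weights are hypothetical.

```python
def solve(A, rhs):
    """Solve A t = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def expected_hitting_times(weights, target):
    """Expected number of steps for a random walk to first reach `target`."""
    n = len(weights)
    others = [i for i in range(n) if i != target]
    # Row-normalize edge weights into transition probabilities.
    P = [[w / sum(row) for w in row] for row in weights]
    # Restrict to non-target states and solve (I - Q) t = 1.
    A = [[(1.0 if i == j else 0.0) - P[i][j] for j in others] for i in others]
    return dict(zip(others, solve(A, [1.0] * len(others))))

# Hypothetical path graph 0 - 1 - 2 - 3 with the two leaders at the ends.
weights = [[0, 1, 0, 0],
           [1, 0, 1, 0],
           [0, 1, 0, 1],
           [0, 0, 1, 0]]
leaders = {0: "D", 3: "R"}
times = {party: expected_hitting_times(weights, node)
         for node, party in leaders.items()}
labels = {v: min(times, key=lambda party: times[party][v])
          for v in range(len(weights)) if v not in leaders}
print(labels)  # {1: 'D', 2: 'R'}
```

Here node 1 needs 5 expected steps to reach the leader at node 0 but 8 to reach the leader at node 3, so it is labelled "D"; node 2 is the mirror image.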
A.2 Literature review: polarization
While modularity [
23,
41,
42] has been widely used as a measure of polarization in networks, it is often not a definitive measure. Guerra et al. present a novel metric based on the edges incident on the boundary nodes [
22]. Like most other metrics of polarization, their metric is also structural in the sense that it does not take into account potentially different network structures among the same population induced by different behavioral contexts. One of the main goals of this paper is to analyze polarization within behavioral context.
Closely related to this paper is Waugh et al.’s work on polarization in Congress [
52]. They first compute a weighted network among the members of Congress by counting how many times each pair of members voted the same way. They then compute the modularity of this network as a measure of polarization. Their work can be contrasted with McCarty et al.’s ideal point-based approach [
40], where the absolute value of the differences in mean ideal points of the two parties serves as a measure of polarization. In fact, our approach may be mistaken as a combination of these two approaches. First, we do compute influence networks among the senators, but these networks are learned from behavioral data. Moreover, there are positive as well as negative edge weights in our networks, whereas Waugh et al.’s networks have only non-negative edge weights by definition [
52]. Second, the richer model [
28] which we use combines influence networks with ideal points in such a way that we cannot talk about either networks or ideal points in isolation of the other.
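For comparison, the two baseline measures discussed above can be sketched in a few lines of Python. The sketch makes simplifying assumptions: modularity is evaluated for the given party partition rather than maximized over partitions, the standard weighted Newman modularity formula is used, and the votes and ideal points are hypothetical.

```python
def agreement_network(votes):
    """A[i][j] = number of roll calls on which members i and j voted alike."""
    n = len(votes[0])
    A = [[0] * n for _ in range(n)]
    for roll_call in votes:
        for i in range(n):
            for j in range(n):
                if i != j and roll_call[i] == roll_call[j]:
                    A[i][j] += 1
    return A

def modularity(A, community):
    """Weighted modularity Q of the partition given by `community`."""
    degree = [sum(row) for row in A]
    two_m = sum(degree)  # total edge weight, each edge counted twice
    q = 0.0
    for i in range(len(A)):
        for j in range(len(A)):
            if community[i] == community[j]:
                q += A[i][j] - degree[i] * degree[j] / two_m
    return q / two_m

def party_mean_gap(ideal_points, community):
    """Absolute difference between the two parties' mean ideal points."""
    groups = {}
    for point, label in zip(ideal_points, community):
        groups.setdefault(label, []).append(point)
    first, second = (sum(g) / len(g) for g in groups.values())
    return abs(first - second)

# Hypothetical data: four members, three roll calls (columns are members).
party = ["D", "D", "R", "R"]
votes = [[1, 1, -1, -1],
         [1, 1, -1, -1],
         [1, 1, 1, -1]]
A = agreement_network(votes)
Q = modularity(A, party)
gap = party_mean_gap([-1.0, -0.5, 0.5, 1.0], party)
```

On this toy input, Q equals 10/49 (about 0.204) and the gap in party means is 1.5. Note that the agreement counts are non-negative by construction, unlike the signed influence weights learned in our model.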
Zhang et al. [
53] study polarization in the U.S. Congress, the same setting as ours. However, their work is based on co-sponsorship networks, which are observed directly from data. In contrast, ours is based on networks of influence, which are learned from roll-call and bill-text data. Furthermore, one of the central aspects of our work is to show that polarization in the Senate varies according to the spheres of legislation. We do not touch on the rise in polarization in the Senate over time, which by now is a well-settled matter [
15].
Behavioral aspects of polarization among political parties have been studied before, but at an empirical level. Garcia et al. analyze multiplex networks consisting of comments, likes, and supports levels among multiple political parties in Switzerland [
17]. In contrast, ours is a model-based approach where polarization can be considered an inference question.
At a broader level, there have been numerous studies on political polarization. The edited volume by Hopkins and Sides [
27] presents a comprehensive treatise from three different perspectives: why American politics is polarized, how it became polarized, and what we can do about it (including whether the alternatives are any better). As a specific example, Conover et al. [
10] give evidence of polarization on Twitter based on retweet networks. Interestingly, the opposite happens in mention networks (where ideologically opposing individuals mention each other to start conversations).
Not surprisingly, Twitter provides a trove of data that has been used in several other studies. Notably, Morales et al. [
39] give a framework to estimate
a polarization index using a model of opinion generation. Unlike other generative models of opinion propagation [
49], their focus is on the distribution of opinions and not the dynamics of opinions. We briefly reviewed their model in “
Toward richer models” section. One major difference between Morales et al.’s work and ours is how we get to the behavioral distribution (or PSNE in our case). In our models, we do not have predefined elite and listener nodes and do not perform DeGroot-style iterative updates [
12]. Furthermore, the complexity of interdependent actions in a PSNE and the multiplicity of PSNE make a direct application of the polarization index to our setting challenging (see Footnote 6).
Whereas Morales et al. apply the polarization index to a case study of tweets in the aftermath of Venezuelan leader Hugo Chávez’s death, their basic idea has been generalized to any Twitter topic by Garimella et al. [
21]. Of course, there are methodological differences between the two studies. Garimella et al.’s random walk-based algorithm to measure polarization is promising for large-scale networks. In contrast to these studies, we use machine learning to learn the networks of influence from voting data. Also, our behavioral model is strictly game-theoretic.
There has also been some interesting work on the behavioral choice of individuals in a polarized environment. Bakshy et al. [
4] use large-scale Facebook data to show that the consumption of politically “hard content” is largely controlled by individuals’ own choices and not by algorithmically fed news rankings.
On the computational side, algorithmic approaches to polarization extend beyond modularity. Al Amin et al. [
1] give a matrix factorization-based algorithm to uncover polarization in Twitter networks.