Why Weight A Cluster-Theoretic Approach to Political Districting Abstract Political districting has been one of the most contentious issues within American politics over the last two centuries. Since the landmark case of Baker u. Carr, in which the United States Supreme Court ruled that the constitutionality of a state's legislated districting is within the jurisdiction of a federal court, many within academia have attempted to produce a rigorous system for determining a set of districts for a given state. In this paper, we attempt to improve upon these past efforts. We propose both a modified form of classical K-means clustering and an interesting algorithm called the shortest-splitline algorithm to accomplish impartial redistricting. As an example, we apply our methods to redistricting the state of New York, and, as further examples, to Texas and Colorado. Both methods use only population density data and state boundaries as inputs and run in a feasible amount of time. Our criteria for successful redistrict- ing include contiguity, compactness, and sufficiently uniform population The K-means method produces districts similar to convex polygons and the splitline method guarantees that the resulting districts have piecewise linear boundaries. The K-means method has the advantage of allowing eeding of the district centers. The centers of the generated districts then roughly correlate to the existing districts, by proper seeding, but the re-
Why Weight? A Cluster-Theoretic Approach to Political Districting February 17, 2007 Abstract Political districting has been one of the most contentious issues within American politics over the last two centuries. Since the landmark case of Baker v. Carr, in which the United States Supreme Court ruled that the constitutionality of a state’s legislated districting is within the jurisdiction of a federal court, many within academia have attempted to produce a rigorous system for determining a set of districts for a given state. In this paper, we attempt to improve upon these past efforts. We propose both a modified form of classical K-means clustering and an interesting algorithm called the shortest-splitline algorithm to accomplish impartial redistricting. As an example, we apply our methods to redistricting the state of New York, and, as further examples, to Texas and Colorado. Both methods use only population density data and state boundaries as inputs and run in a feasible amount of time. Our criteria for successful redistricting include contiguity, compactness, and sufficiently uniform population. The K-means method produces districts similar to convex polygons and the splitline method guarantees that the resulting districts have piecewise linear boundaries. The K-means method has the advantage of allowing seeding of the district centers. The centers of the generated districts then roughly correlate to the existing districts, by proper seeding, but the resulting boundaries are vastly simpler. 1
Control No. 1036 2of18 Contents 1 Introduction 11 Plan of atta 1.2 Defining Simpleness 1.3 Towards a Suitable Conception of Compactness 1.4 Defining Fairness 2 Applying the Theory of Data Clustering 3 The K-means Algorithm 3.1 Standard Algorithm 3.2 Weighted Algorithm 33445677899 4 Splitline algorithm 4.2 Demonstration 5 An Application: Considering the Congressional Districting of New York State 11 5.1K- Algorithm 11 5.2 Splitline Algorithm 6 Conclusions 6.1 Towards a Suitable Definition of Compactnes 13 a Various Definitions of Compactness 15 B Numerical Results 16 1 Compactness quotient results B 2 Population distribution error
Control No. 1036 2 of 18 Contents 1 Introduction 3 1.1 Plan of Attack . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2 Defining Simpleness . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.3 Towards a Suitable Conception of Compactness . . . . . . . . . . 4 1.4 Defining Fairness . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2 Applying the Theory of Data Clustering 6 3 The K-means Algorithm 7 3.1 Standard Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.2 Weighted Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 8 4 Splitline Algorithm 9 4.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 4.2 Demonstration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 5 An Application: Considering the Congressional Districting of New York State 11 5.1 K-means Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 11 5.2 Splitline Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 12 6 Conclusions 12 6.1 Towards a Suitable Definition of Compactness . . . . . . . . . . . 13 A Various Definitions of Compactness 15 B Numerical Results 16 B.1 Compactness quotient results . . . . . . . . . . . . . . . . . . . . 16 B.2 Population distribution error . . . . . . . . . . . . . . . . . . . . 17
Control No. 1036 f18 1 Introduction In the words of Ted Harrington, chair of political science at the University of North Carolina There is no issue that is more sensitive to politicians of all colors and ideological perst than redistricting. It will determine who wi and loses for eight years [You88 The writers of the constitution created the House of Representatives with the intention that it would be the branch of government most responsive to the people. The reality is just the opposite. Though representatives are elected every 2 years, instead of every 4 or 6 years, almost 400 of the 435 seats of the House are not contested as a result of the extraordinary power of gerryman- dering. With the immensely detailed amount of data and unlimited computing power available to politicians today, gerrymandering has been elevated to an art. With only the requirements that districts be connected and all have equal population, it is possible to pinpoint candidates and place them in a different district than their neighbors Too03 Though undemocratic, gerrymandering is nearly always legal(see, for in stance,Bac86) and has been used to obtain striking results. In 2002 only 4 incumbent representatives lost their bid for reelection -the lowest total ever Too03. We will argue that it is certainly true that any attempt to fairly restruc ture legislative districts needs to ignore the human factors that overwhelmingly determine the current redistricting process. Defining some measure of compact ness is essential to ensure fair districts. Both methods we describe produce districts that at first glance are clearly simpler than the existing ones Restructuring the districts with no regard to the current layout would be ore difficult to implement. We will use the centers of the existing districts seeds for our clustering algorithm. Thus, the new districts have some correla- tion to the existing districts, but their boundaries will be determined in a fair nanner. The core of many districts will be roughly the same, while the bound- aries will be dramatically simpler. This will effectively counteract the effects of gerrymandering, without being overly difficult to put into use immediately. 1.1 Plan of Attack Our goal is to develop an algorithmic process for dividing an arbitrary region into k legislative districts, which satisfy some heuristic definition of fairness. In order to do so we must do the following Define terms. Crucial to creating a model is defining the somewhat ambiguous terms fairness and simpleness Define metrics for comparing algorithms
Control No. 1036 3 of 18 1 Introduction In the words of Ted Harrington, chair of political science at the University of North Carolina, There is no issue that is more sensitive to politicians of all colors and ideological persuasions than redistricting. It will determine who wins and loses for eight years [You88]. The writers of the constitution created the House of Representatives with the intention that it would be the branch of government most responsive to the people. The reality is just the opposite. Though representatives are elected every 2 years, instead of every 4 or 6 years, almost 400 of the 435 seats of the House are not contested as a result of the extraordinary power of gerrymandering. With the immensely detailed amount of data and unlimited computing power available to politicians today, gerrymandering has been elevated to an art. With only the requirements that districts be connected and all have equal population, it is possible to pinpoint candidates and place them in a different district than their neighbors [Too03]. Though undemocratic, gerrymandering is nearly always legal (see, for instance, [Bac86]) and has been used to obtain striking results. In 2002 only 4 incumbent representatives lost their bid for reelection — the lowest total ever [Too03]. We will argue that it is certainly true that any attempt to fairly restructure legislative districts needs to ignore the human factors that overwhelmingly determine the current redistricting process. Defining some measure of compactness is essential to ensure fair districts. Both methods we describe produce districts that at first glance are clearly simpler than the existing ones. Restructuring the districts with no regard to the current layout would be more difficult to implement. We will use the centers of the existing districts as seeds for our clustering algorithm. Thus, the new districts have some correlation to the existing districts, but their boundaries will be determined in a fair manner. The core of many districts will be roughly the same, while the boundaries will be dramatically simpler. This will effectively counteract the effects of gerrymandering, without being overly difficult to put into use immediately. 1.1 Plan of Attack Our goal is to develop an algorithmic process for dividing an arbitrary region into k legislative districts, which satisfy some heuristic definition of fairness. In order to do so, we must do the following: • Define terms. Crucial to creating a model is defining the somewhat ambiguous terms fairness and simpleness. • Define metrics for comparing algorithms
Control No 4of18 Figure 1: The compactness quotients of the circle, square, and gerrymander are 1,丌/4≈0.79,and23x/576≈0.13, respectively 1.2 Defining Simpleness Ve say that district A is more simple than district B if district A is contiguous and district A is more compact than district B Contiguity. We say that a district is contiguous if it is arcwise-connected hat is, if one can travel from any point a to any other point b in district A hile remaining entirely within district A. If A contains regions separated by bodies of water, A is contiguous if all regions are connected by water and each region is arcwise-connected Compactness Intuitively, we say that a district is compact if it does not meander exces- sively. This is a hard concept to formalize; many authors give only a hasty definition of compactness, and some have even argued that compactne is ambiguous to the point of being irrelevant in a serious treatment of dis- tricting. Nonetheless, we will now attempt to develop a suitable definition of compactness 1.3 Towards a Suitable Conception of Compactness In [You88, Young gives compelling reasons for abandoning all of the definitions of compactness mentioned in the second appendix. Interestingly enough, Young does not consider the following adjusted version of the Schwartzberg Test, which is alluded to in GN7o
Control No. 1036 4 of 18 Figure 1: The compactness quotients of the circle, square, and gerrymander are 1, π/4 ≈ 0.79, and 23π/576 ≈ 0.13, respectively. 1.2 Defining Simpleness We say that district A is more simple than district B if district A is contiguous, and district A is more compact than district B. • Contiguity. We say that a district is contiguous if it is arcwise-connected; that is, if one can travel from any point a to any other point b in district A while remaining entirely within district A. If A contains regions separated by bodies of water, A is contiguous if all regions are connected by water and each region is arcwise-connected. • Compactness. Intuitively, we say that a district is compact if it does not meander excessively. This is a hard concept to formalize; many authors give only a hasty definition of compactness, and some have even argued that compactness is ambiguous to the point of being irrelevant in a serious treatment of districting. Nonetheless, we will now attempt to develop a suitable definition of compactness. 1.3 Towards a Suitable Conception of Compactness In [You88], Young gives compelling reasons for abandoning all of the definitions of compactness mentioned in the second appendix.. Interestingly enough, Young does not consider the following adjusted version of the Schwartzberg Test, which is alluded to in [GN70]:
Control No. 1036 f18 Definition 1. We say that district A is more compact than district B if (Perimeter A)2(PerimeterS)2 Call the quantity 4 Area/ Perimeter the compactness quotient For a circle of radius r, this ratio is equal t 4丌 It is well-known that the shape with the largest ratio of area to squared perimeter is the circle( see, for instance, Fol02 ) Because of this, the quantity Area 4丌 Perimeter is restricted to the interval [0, 1] As seen in figure 1, a compactness quotient of 0. 13 is visually quite bad Using the fact given in Bou88 that the area of a non-self-intersecting closed N-gon(with the k-th vertex taken in counterclockwise order equal to(ak, yk)) we have calculated the compactness quotients of several actual districts by ap proximating their boundaries by piecewise linear segments. The results illustrate the inappropriate nature of the districts currently in place. Two of New Yorks nore sprawling districts, the &th and 28th, produced compactness quotients of 0.097 and 0.101, respectively -even worse then the gerrymander shown in figure 1! The two most compact districts in New York, the 26th and 21st had compactness quotients of 0.406 and 0.498, respectively. We decided that the mean for any state should be at least. 6. With this condition the average district in every state would be better than the best districts currently in New York. Furthermore we insist that 25 should be more than 2 standard deviations from the mean. It is not possible to require that all districts be greater than. 25 as several districts will inevitably end up having most of their border coincide with the border of the state 1. 4 Defining fairness Almost all unfairness occurs when political and social measures factor into redis- tricting decisions. Practices such as concentrating supporting voters in a single district, diluting opposing voters over several districts, placing two incumbents n the same district and forcing them to run against each other, and isolating minorities have been seen many times before(see Too03 and Hay 96 ) and are all the result of districing being controlled by those who attempt to skew voting patterns. In general, one can summarize past districting patterns in the ollowing way:
Control No. 1036 5 of 18 Definition 1. We say that district A is more compact than district B if 4π AreaA (PerimeterA) 2 > 4π AreaB (PerimeterB) 2 . Call the quantity 4π Area /Perimeter2 the compactness quotient. For a circle of radius r, this ratio is equal to 4π · πr2 (2πr) 2 = 1. It is well-known that the shape with the largest ratio of area to squared perimeter is the circle (see, for instance, [Fol02]). Because of this, the quantity 4π · Area Perimeter2 is restricted to the interval [0, 1]. As seen in figure 1, a compactness quotient of 0.13 is visually quite bad. Using the fact given in [Bou88] that the area of a non-self-intersecting closed N-gon (with the k-th vertex taken in counterclockwise order equal to (xk, yk)) is equal to 1 2 N X−1 i=1 (xiyi+1 − xi+1yi), we have calculated the compactness quotients of several actual districts by approximating their boundaries by piecewise linear segments. The results illustrate the inappropriate nature of the districts currently in place. Two of New York’s more sprawling districts, the 8th and 28th, produced compactness quotients of 0.097 and 0.101, respectively — even worse then the gerrymander shown in figure 1! The two most compact districts in New York, the 26th and 21st, had compactness quotients of 0.406 and 0.498, respectively. We decided that the mean for any state should be at least .6. With this condition the average district in every state would be better than the best districts currently in New York. Furthermore we insist that .25 should be more than 2 standard deviations from the mean. It is not possible to require that all districts be greater than .25 as several districts will inevitably end up having most of their border coincide with the border of the state. 1.4 Defining Fairness Almost all unfairness occurs when political and social measures factor into redistricting decisions. Practices such as concentrating supporting voters in a single district, diluting opposing voters over several districts, placing two incumbents in the same district and forcing them to run against each other, and isolating minorities have been seen many times before (see [Too03] and [Hay96]), and are all the result of districing being controlled by those who attempt to skew voting patterns. In general, one can summarize past districting patterns in the following way: