《数学建模》美赛优秀论文：2007 A H Genetic Algorithm for Non-Partisan Legislative Districting.pdf

CoNTENTS 1. Introduction 2.1. Population Specifications 2.2. Geographic Simplicit 3. Our Method 3.1. Source for data 3.2. Other possibilities for district-drawing 23334467 4. Data 5. Evaluation of our Method 12 5.1. Defects 5.2. Merits 13 6. Appendix 1-Probability factors 7. Appendix 2-Other data R pretences 1. INTRODUCTION Gerrymandering is one of the oldest and most successful methods of disenfran chisement available to Congress. This much-maligned practice is essential to main- taining the 98% re-election rate currently enjoyed by incumbents 5. Obviously many people arent happy with the status quo, as it can render an individuals vote meaningless as his representation is determined not by popular sentiment, but by the incumbents drawing up the districts. What is needed is a new method for determining legislative districts, one out of the hands of incumbents What should an ideal district-drawing algorithm accomplish? This is certainly a vexing political question. Should it take the truly unbiased approach and ignore ny and all political considerations? We can just randomly create districts, blind to both political ambitions and political fairness. Others may feel that steps should be taken to ensure that elected representatives fairly represent their constituents Should we set up the districts so that a 60% Republican state will likely have 60% of its districts be majority Republican? Perhaps this just returns us to a different type of gerrymandering, done for the cause of"fair representation, but gerrymandering nonetheless This dilemma can only be solved by political scientists, not mathematicians. How- ever, a mathematician should be able to create a district-drawing algorithm which can accomodate certain 'preferences'in drawing districts, whatever they may be It is then up to social scientists to ascertain what these preferences should be. Our goal is to produce just such an algorithm. We will then test the algorithm by having it select districts for New York that are 'geographically simple'in a specific sense to be defined later I Average re-election rate of House incumbents over election cycles 2000, 2002, and 2004

Team 1035 2 Contents 1. Introduction 2 2. Specifications 3 2.1. Population Specifications 3 2.2. Geographic Simplicity 3 3. Our Method 4 3.1. Source for data 4 3.2. Other possibilities for district-drawing 6 4. Data 7 5. Evaluation of our Method 12 5.1. Defects 12 5.2. Merits 13 6. Appendix 1 - Probability factors 13 7. Appendix 2 - Other data 14 References 15 1. Introduction Gerrymandering is one of the oldest and most successful methods of disenfranchisement available to Congress. This much-maligned practice is essential to maintaining the 98% re-election rate currently enjoyed by incumbents1 [5]. Obviously, many people aren’t happy with the status quo, as it can render an individual’s vote meaningless as his representation is determined not by popular sentiment, but by the incumbents drawing up the districts. What is needed is a new method for determining legislative districts, one out of the hands of incumbents. What should an ideal district-drawing algorithm accomplish? This is certainly a vexing political question. Should it take the truly unbiased approach and ignore any and all political considerations? We can just randomly create districts, blind to both political ambitions and political fairness. Others may feel that steps should be taken to ensure that elected representatives fairly represent their constituents. Should we set up the districts so that a 60% Republican state will likely have 60% of its districts be majority Republican? Perhaps this just returns us to a different type of gerrymandering, done for the cause of ‘fair representation’, but gerrymandering nonetheless. This dilemma can only be solved by political scientists, not mathematicians. However, a mathematician should be able to create a district-drawing algorithm which can accomodate certain ‘preferences’ in drawing districts, whatever they may be. It is then up to social scientists to ascertain what these preferences should be. Our goal is to produce just such an algorithm. We will then test the algorithm by having it select districts for New York that are ‘geographically simple’ in a specific sense to be defined later. 1Average re-election rate of House incumbents over election cycles 2000, 2002, and 2004. 2

2. SPECIFICATIONS We require our district-drawing algorithm to satisfy the following criteria: 2.1. Population Specifications In the 1962 decision of Baker v. Carr, the Us Supreme Court asserted the abil ity to decide on the constitutionality of state district divisions 2. Although th Supreme Court has not specified the exact terms of what is acceptable, past rulings have suggested the following: " congressional redistricting plans with overall range percentage variances of up to. 73 percent have been approved based upon identi- fiable state objectives"[1. Here, the overall range percentage variance is defined by the maximum difference in population between any pair of districts, divided by the " ideal district population(which is just the population of the state divided by the number of districts). We would like to note, however, that a lower deviation percentage of 0.69% has been rejected in the past [ 1]. Pending further legal clarifi cation, we set a range variance of less than or equal to 0.7 percent as our goal for opulation range variance 2.2. Geographic Simplicity. We will require our districts to satisfy two criteria. The first requirement is ab- solute each district must be connected. we will later describe our method for enforcing this. Our second requirement will be to have each district contain as many interior points as possible. Here, an interior point means a grid point which does not neighbor a grid point of a different region on either the east, west, north or south. As with any notion of geometric simplicity, this one is motivated by esthetic reasons: intuitively, discretized versions of squares and circles contain a arge number of interior grid points The recognition that others may not agree with our particular definition of simplic ty is an important reason why we wish our algorithm to accommodate different district division criteria easily Let us be more specific about what a district-drawing algorithm should do. We will be given a region(which represents the state map), along with a population density function defined on the state. In our algorithm the region and density function will be discretized, but a priori one thinks of them as approximately continuous. We will discuss the validity of the discretization later In addition, the aforementioned social scientists may find other variables to be of nterest in drawing up districts. There are many ways in which such data can be naturally presented. For example, we may wish to have our districts follow county lines when possible. In this case, county lines should be included in our naturally presen ap. On the other hand. information about political leanings is state boundary d as a density function Pp on the region (.e, for each point on the state, pd(a)dA is the percentage of people living in differential area dA sur rounding a who are Democrats, with analogous functions PR, pI for Republicans nd Independents). We will later show how these possibilities and others can be

Team 1035 3 2. Specifications We require our district-drawing algorithm to satisfy the following criteria: 2.1. Population Specifications. In the 1962 decision of Baker v. Carr, the US Supreme Court asserted the ability to decide on the constitutionality of state district divisions [2]. Although the Supreme Court has not specified the exact terms of what is acceptable, past rulings have suggested the following: “congressional redistricting plans with overall range percentage variances of up to .73 percent have been approved based upon identi- fiable state objectives” [1]. Here, the overall range percentage variance is defined by the maximum difference in population between any pair of districts, divided by the ‘ideal’ district population (which is just the population of the state divided by the number of districts). We would like to note, however, that a lower deviation percentage of 0.69% has been rejected in the past [1]. Pending further legal clarifi- cation, we set a range variance of less than or equal to 0.7 percent as our goal for population range variance. 2.2. Geographic Simplicity. We will require our districts to satisfy two criteria. The first requirement is absolute: each district must be connected. We will later describe our method for enforcing this. Our second requirement will be to have each district contain as many interior points as possible. Here, an interior point means a grid point which does not neighbor a grid point of a different region on either the east, west, north, or south. As with any notion of geometric simplicity, this one is motivated by aesthetic reasons: intuitively, discretized versions of squares and circles contain a large number of interior grid points. The recognition that others may not agree with our particular definition of simplicity is an important reason why we wish our algorithm to accommodate different district division criteria easily. Let us be more specific about what a district-drawing algorithm should do. We will be given a region (which represents the state map), along with a population density function defined on the state. In our algorithm the region and density function will be discretized, but a priori one thinks of them as approximately continuous. We will discuss the validity of the discretization later. In addition, the aforementioned social scientists may find other variables to be of interest in drawing up districts. There are many ways in which such data can be naturally presented. For example, we may wish to have our districts follow county lines when possible. In this case, county lines should be included in our state boundary map. On the other hand, information about political leanings is naturally presented as a density function ρD on the region (i.e., for each point x on the state, ρD(x)dA is the percentage of people living in differential area dA surrounding x who are Democrats, with analogous functions ρR, ρI for Republicans and Independents). We will later show how these possibilities and others can be 3

Team 1035 FIGURE 2. Source for population density figures, New York City district consists of 1 grid point. The rest of the map is not initially assigned a district. Now we loop through all grid points on the map, performing the following tasks for each point If the grid point does not belong to a district, go on to the next grid point If it does belong to a district, say district A, consider each of its neighboring grid points to the west, east, north, and south. If a neighboring grid point does not belong to a district, assign it to district A. This way unused grid oints on the map are quickly taken up If a neighboring grid point does belong to a district, say district B, calculate the total population of districts A and B, as well as the potential change in total number of interior grid points if the neighboring grid point were to be re-assigned to district B. Calculate a probability for re-assignment based on these two numbers, and "Hip a coin'based on this probability to determine whether or not the neighboring grid point gets re-assigned to district A See Appendix 1 for the actual probability formula used After the above steps have been performed for each grid point, repeat the whole loop again and again, until(hopefully) the population numbers stabilize. In reali our discretization prevents the populations of the regions from becoming exactly equal, so we must introduce a cut-off. In our algorithm, this is done as follows: when calculating a potential re-assignment of a grid point(r, y) from district A to district B, look at the relative population difference A PaB between regions A and B, given by APAB-(PA+PB, where PA, PB are the current populations of districts A and B. If this number is below 0.01, our probability formula gives zero probability for re-assignment. This way, theoretically, our simulation should stop hen the relative population differences of neighboring districts is smaller than 2% Recall that our goal for range variance was 0.7%, so we are essentially giving up this goal. This was a result of recognizing that our discretization was simply not fine

Team 1035 5 Figure 2. Source for population density figures, New York City district consists of 1 grid point. The rest of the map is not initially assigned a district. Now we loop through all grid points on the map, performing the following tasks for each point: • If the grid point does not belong to a district, go on to the next grid point. • If it does belong to a district, say district A, consider each of its neighboring grid points to the west, east, north, and south. If a neighboring grid point does not belong to a district, assign it to district A. This way unused grid points on the map are quickly taken up. • If a neighboring grid point does belong to a district, say district B, calculate the total population of districts A and B, as well as the potential change in total number of interior grid points if the neighboring grid point were to be re-assigned to district B. Calculate a probability for re-assignment based on these two numbers, and ‘flip a coin’ based on this probability to determine whether or not the neighboring grid point gets re-assigned to district A. See Appendix 1 for the actual probability formula used. After the above steps have been performed for each grid point, repeat the whole loop again and again, until (hopefully) the population numbers stabilize. In reality, our discretization prevents the populations of the regions from becoming exactly equal, so we must introduce a cut-off. In our algorithm, this is done as follows: when calculating a potential re-assignment of a grid point (x, y) from district A to district B, look at the relative population difference ∆PAB between regions A and B, given by ∆PAB = PA−PB (PA+PB) , where PA, PB are the current populations of districts A and B. If this number is below 0.01, our probability formula gives zero probability for re-assignment. This way, theoretically, our simulation should stop when the relative population differences of neighboring districts is smaller than 2%. Recall that our goal for range variance was 0.7%, so we are essentially giving up on this goal. This was a result of recognizing that our discretization was simply not fine 5